Duplicate Content and the Canonical Tag

Most website owners with any awareness of SEO will know of the issues associated with duplicate content. They will know (legal copyright issues aside) that there is little or no benefit in copying swags of content verbatim from other sites, because the search engines will simply ignore that content (or, in worse cases, penalise the site for it).

It has long been recognised that URLs such as the following are very likely pointing to the same physical page (although this is not necessarily the case):

www.mysite.com
mysite.com
www.mysite.com/index.html
mysite.com/index.php

Search engines are also aware of this and are usually pretty smart about identifying (and usually ignoring) these types of potential duplicate content issues.

Traditionally these types of problems (known as "canonicalisation") have been relatively easy to resolve through the use of mod_rewrite rules or redirects. However, these largely "technical" solutions have been a bit difficult for less technically minded website owners.
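To make the redirect approach a little more concrete, here is a minimal Python sketch of the decision a rewrite rule or redirect makes. It is not taken from any particular server setup: the preferred host (www.mysite.com), the list of index files and the function names canonical_url and redirect_for are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the www form is the preferred host, and
# index.html / index.php are default documents that should be hidden.
PREFERRED_HOST = "www.mysite.com"
BARE_HOST = "mysite.com"
INDEX_FILES = {"index.html", "index.php"}


def canonical_url(url):
    """Map a URL variant onto the single preferred form."""
    # Add a scheme when the URL is written without one, e.g. "mysite.com".
    parts = urlsplit(url if "://" in url else "http://" + url)

    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # Drop a trailing index file so "/index.php" becomes "/".
    directory, _, filename = parts.path.rpartition("/")
    if filename in INDEX_FILES:
        path = directory + "/"
    else:
        path = parts.path or "/"

    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if the URL should be redirected, else None."""
    requested = url if "://" in url else "http://" + url
    target = canonical_url(url)
    return None if requested == target else (301, target)


if __name__ == "__main__":
    for variant in ["www.mysite.com", "mysite.com",
                    "www.mysite.com/index.html", "mysite.com/index.php"]:
        print(variant, "->", canonical_url(variant))
```

Under these assumptions, all four example URLs listed earlier end up at the same address, which is exactly what a permanent (301) redirect is meant to achieve.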

As more and more site owners use content management systems to create and manage their sites, and in particular with the prevalence of blogging applications such as WordPress, it is becoming increasingly easy to have multiple paths (URLs) to a single physical page and thus "inadvertently" create duplicate content on your site.

For example, if you create a single blog post and include it within 3 different categories on your blog, there will very likely be at least 3 different paths to this post, and probably many more once you include home page, archive and tag based methods of accessing individual posts.
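A quick Python sketch shows how fast those paths multiply. The URL patterns below (/category/..., /tag/..., /year/month/...) are a made-up scheme for illustration, not the actual structure of WordPress or any other blogging application, and the function name post_urls and the sample post are equally hypothetical.

```python
def post_urls(slug, categories, tags, year, month,
              base="http://www.mysite.com"):
    """List every address under which one post could be reached.

    The URL patterns are hypothetical; a real blog's permalink
    settings will differ.
    """
    urls = [f"{base}/{slug}/"]  # the permalink we would treat as preferred
    urls += [f"{base}/category/{c}/{slug}/" for c in categories]
    urls += [f"{base}/tag/{t}/{slug}/" for t in tags]
    urls.append(f"{base}/{year}/{month:02d}/{slug}/")  # date archive
    return urls


if __name__ == "__main__":
    # One post, three categories, one tag (all sample values).
    urls = post_urls("my-post", ["seo", "blogging", "wordpress"],
                     ["duplicate-content"], 2009, 2)
    for url in urls:
        print(url)
```

With this sample data a single post is reachable at six addresses, all serving the same content.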

These types of duplicate content issues have traditionally been much harder to fix.

In recognition of this, the major search engines (Google, Yahoo, Microsoft, and Ask) have recently got their heads together and collaborated on a new tag (called the canonical tag) to make it easier for site owners to reduce duplicate content clutter.

With the new canonical tag, website owners are able to publicly specify the preferred version of a URL.

For example, if the default home page of your website is www.mysite.com and a duplicate page is located at www.mysite.com/homepage.htm, then by adding:

<link rel="canonical" href="http://www.mysite.com" />

within the head section (between the <head> tags) of the www.mysite.com/homepage.htm page, you tell the search engines to treat the two URLs as the same page.
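If you want to check which canonical URL a page actually declares, a few lines of Python will do it. This is only a sketch using the standard library's html.parser module; the class name CanonicalFinder, the helper find_canonical and the sample page are assumptions for illustration, and it makes no claim about how any search engine reads the tag.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel may hold several space-separated values, in any letter case.
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # Sample markup standing in for www.mysite.com/homepage.htm.
    page = """<html><head>
    <title>Home</title>
    <link rel="canonical" href="http://www.mysite.com" />
    </head><body>Welcome</body></html>"""
    print(find_canonical(page))
```

Running the same check across each duplicate address of a page is a simple way to confirm they all point at one preferred URL.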

The major search engines listed above have all adopted the tag and take it into account when crawling and indexing your site.

Whilst the new tag may not completely resolve duplicate content issues on the web, it will certainly make it easier for site owners to reduce them on their own sites.

For more information on how to use the canonical tag, see the Google Webmaster Central Blog.

About Darryl

I dig helping grow and build profitable online businesses. I'm addicted to coffee, and a Rugby (All Blacks) and AFL tragic. I call Brisbane home and love the sun, beach and smart people. Follow me @ireckon
