What is a Canonical URL?
A canonical URL is your preferred URL. That doesn’t really tell you much, so we’re here to explain canonical URLs in a little more detail. A single page can usually be reached through several variations of its URL and, to avoid issues like duplicate content, Google asks that the webmaster chooses one to use. Let us look at our homepage and see what variations are available:
Further variations can be built on top of these stems, and which ones exist can also depend on the type of server you use. They are often tacked on to the end of the stems and can include /index.html and /index.asp; even a forward slash at the end of a URL will count as a separate page. As an example, a variant might be weareyellowball.com/index.html/.
The big problem is that whilst these may all seem like fairly trivial variations that you see regularly, they all count as separate web pages. In fact, you could have different content on every single one of these URLs. If the content is the same (which is most likely), it causes a duplicate content issue: because they are considered different web pages, they all compete against each other to rank.
A canonical URL is one that has been chosen, either by the webmaster or by a default setting on their CMS, to act as the main URL, with 301 redirects on the other variations automatically and permanently redirecting to the canonical. Google can also automatically canonicalise a URL, but you do not appear to have any control over which one it chooses; it simply tries to choose the right one. For example, if you entered www.weareyellowball.com into your web browser, it would automatically redirect to the canonical URL, which is weareyellowball.com.
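To make the idea concrete, here is a minimal Python sketch of how a site might map every variation onto one canonical URL and decide whether a 301 is needed. The https scheme, the list of index file names and the choice to drop trailing slashes are assumptions made for illustration, not settings taken from the Yellowball site, and the function names are hypothetical. It expects absolute URLs (with a scheme) that belong to the site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the canonical scheme and host, and which
# file names count as "index" files that should be stripped.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "weareyellowball.com"
INDEX_FILES = ("index.html", "index.asp", "index.php")


def canonicalise(url):
    """Return the canonical form of an absolute URL belonging to this site."""
    parts = urlsplit(url)
    # Splitting on "/" and discarding empty pieces removes leading,
    # trailing and doubled slashes in one step.
    segments = [s for s in parts.path.split("/") if s]
    # Drop a trailing index file such as /index.html.
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    path = "/" + "/".join(segments)
    # The host is always replaced by the canonical one (www is dropped).
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path, parts.query, ""))


def redirect_for(url):
    """Return (301, target) if the URL is not canonical, otherwise None."""
    target = canonicalise(url)
    return None if url == target else (301, target)


if __name__ == "__main__":
    for variant in (
        "http://www.weareyellowball.com",
        "http://weareyellowball.com/index.html/",
        "https://weareyellowball.com/",
    ):
        print(variant, "->", redirect_for(variant))
```

Because every variation is normalised by the same function, all of them end up pointing at a single target, which is the behaviour the 301 redirects are meant to achieve.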
Canonical URLs usually refer to the homepage, where the canonical URL is also known as the canonical domain. However, you want to make sure that you set a preferred domain so that you do not have all of these variations occurring for each one of your web pages.
Default settings on servers can often create duplicate URLs, which you will then have to deal with by redirecting the duplicates and placing rel=canonical tags on them in order to identify your canonical URL. The most common server-generated URLs are (using the Yellowball domain):
SEO Considerations and Best Practice for Canonical URLs
A 301 redirect is a permanent redirect from one page to another, effectively merging the two pages. More information can be found in our guide to 301 redirects. All variations of a URL or domain should be 301 redirected to the canonical.
The canonical URL tag is designed for robots rather than the user (it is a rel attribute on a link element). The user will still be able to view the page in question. The canonical tag should be added to the HTML head (<head>) of a page; it tells robots that this page is a duplicate of another one and which page contains the original information. As a result, the search engines should then consider all inbound link juice and content metrics to be attributable to the original page.
You should only really have to apply either a 301 or a canonical tag to each page (we prefer to simply 301 redirect the page), although if you wanted to make sure, you could add both!
If you have multiple versions of a URL or domain, you run the risk of other websites linking to these variations, which will in turn reduce the number of direct links coming to your preferred domain. 301 redirects to the canonical will pass this link juice to the preferred URL and help search visibility.
Through Google’s Search Console you can set your preferred domain via site settings, which will indicate to Google which version of your domain you would like indexed and ranking in search results. However, this will not prevent users from viewing other URLs, and as such Google advises 301 redirecting these variations to the preferred domain (a.k.a. the canonical domain).
In a blog post, Matt Cutts gives clear instructions NOT to use the URL Removal tool in Google’s Search Console to remove versions of a domain. For example, if you wanted www.weareyellowball.com to be the preferred domain, you should not use the URL Removal tool to remove weareyellowball.com.
Finally, websites that have complex filtering systems or search functions may automatically create different URLs for pages that appear on different areas of the site. E-commerce sites are a classic example of this where products appear across multiple categories and the CMS creates multiple URLs.
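As a rough illustration of how filtered URLs can be collapsed back to one address, the Python sketch below strips filter parameters from a URL so that every filtered view resolves to the same canonical target. The parameter names (sort, colour, size) and the /shop/shoes path are hypothetical; a real site would need to list whichever parameters its own CMS generates.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical filter parameters that change the view but not the content.
FILTER_PARAMS = {"sort", "colour", "size"}


def strip_filters(url):
    """Return the URL with known filter parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in FILTER_PARAMS
    ]
    # Any fragment is dropped as well, since it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(strip_filters("https://weareyellowball.com/shop/shoes?colour=red&sort=price"))
    print(strip_filters("https://weareyellowball.com/shop/shoes?id=7&sort=price"))
```

The result could be used either as the target of a 301 redirect or as the href of the canonical tag on each filtered page.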
Technical Implementation of Canonical URL Tag Attribute
<link rel="canonical" href="insert canonical URL here" />
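If you want to check that a page actually declares the canonical you expect, a small script can read the tag back out of the HTML. The following is a sketch using only Python's standard library; the class and function names are hypothetical, and the sample markup uses https://weareyellowball.com/ purely as an example value.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> it encounters."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here
        # by HTMLParser's default handling.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://weareyellowball.com/" />'
        "</head><body></body></html>"
    )
    print(find_canonical(page))
```

The sketch only reports the first canonical tag it finds; a page that declares more than one would need a stricter check.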
Be careful not to create an infinite loop between www.weareyellowball.com and www.weareyellowball.com/index.html or www.weareyellowball.com/index.php. Because Apache uses the same file for these URLs, the redirects just carry on, creating an infinite loop. For information on how to prevent this, please see https://moz.com/blog/apache-redirect-an-index-file-to-your-domain-without-looping.
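One way to catch this kind of mistake before it goes live is to walk a set of redirect rules and check that none of them lead back to a URL already visited. The Python sketch below is a simplified model, not a description of how Apache itself processes requests: the redirects dictionary is a hypothetical stand-in for your rules, and the second entry in the looping example represents the server mapping the bare domain back onto its index file.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Follow a redirect map from `start` and return the chain of URLs visited.

    Raises RuntimeError if a URL repeats (a loop) or max_hops is exceeded.
    """
    chain = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in chain:
            raise RuntimeError("redirect loop: " + " -> ".join(chain + [current]))
        chain.append(current)
        if len(chain) - 1 > max_hops:
            raise RuntimeError("too many redirects")
    return chain


if __name__ == "__main__":
    # A safe rule: the index file redirects to the bare path and stops there.
    safe = {"/index.html": "/"}
    print(follow_redirects("/index.html", safe))

    # A looping pair: "/" is sent back to the index file, which redirects again.
    looping = {"/index.html": "/", "/": "/index.html"}
    try:
        follow_redirects("/index.html", looping)
    except RuntimeError as error:
        print(error)
```

A check like this only validates the rules you feed it, so it complements rather than replaces testing the live URLs in a browser.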