As discussed in the article about duplicate URLs, canonical URLs can be used to set a preferred URL for content that can be reached through multiple URLs. It is one of the methods Google advises for handling your URLs. Joomla 2.5 did not support canonical URLs, but Joomla 3 does. This should be a good thing, but when you search for the keyword combination Joomla 3 canonical URL, you encounter a lot of posts on the Joomla.org forum discussing incorrectly set canonical URLs. Let me explain the concept of canonical URLs a bit more, and their possible usage in Joomla.
The concept of canonical URLs
Canonical URLs were introduced by the major search engines in 2009 as a concept to help webmasters fight duplicate content on their own websites. Many CMSs used to have issues with multiple URLs for the same content because of the nature of the CMS. This included Joomla, but also WordPress, Magento, etc. Without canonicals, search engines struggle to decide which URL to include in their index. Often, they simply include both (probably diluting the value of both), or the less preferred one. Print views and email views can also be duplicates of the source URL. All of this can lead to lower rankings. With canonical URLs, you can choose the preferred version among the duplicates of the same piece of content. As a real world example, say you have these 2 URLs in your site:
You can now choose which one would be the nicest one to appear in the search results (obviously the first one). You do this by adding the following piece of code to the head section of your page source:
<link href="http://......com/blogl" rel="canonical" />
It should at least be implemented on the non-preferred version, but usually it is implemented on both URLs. Having the preferred URL point to itself is a good thing too, as it also prevents URLs with parameters from being indexed as separate URLs. Say you have a site where visitors can set the font size. The URL could then look something like this:
If the page already has a canonical pointing to its own URL without the parameter, search engines will treat the parameterized variant as the same page, so this causes no issues.
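To see which canonical a page actually declares, you can read it straight from the HTML. Below is a minimal Python sketch using only the standard library. The names `CanonicalFinder` and `find_canonical` and the host `example.com` are made up for illustration; they are not part of Joomla or of any extension.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only look at <link> tags, and stop after the first canonical found.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may contain several space-separated tokens.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page source; example.com is a placeholder host.
page = (
    '<html><head><link href="http://example.com/blog" rel="canonical" />'
    "</head><body></body></html>"
)
print(find_canonical(page))
```

If you feed it the source of a page requested with an extra parameter, the returned value shows whether the canonical still points to the clean URL.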
Implementation in Joomla
In Joomla 2.5 and earlier, canonical URLs do not exist, so unless you use an extension, you go without them. They were introduced in Joomla 3. This solves quite a few issues, like the one in the example above (the one with and without the index.php part in the URL). The implementation is such that all URLs receive a canonical: correctly detected preferred URLs usually point to themselves, while non-preferred URLs sometimes point to the preferred ones.
Unfortunately the implementation is not yet fully correct. This is a work in progress that might be fixed in upcoming Joomla releases (there is an initiative by Hannes Papenberg to work on this). Until then, we need to live with some limitations.
One notorious issue (solved before Joomla 3.3) occurred on sites where the homepage was set up as a list of featured articles. The canonical URL of the homepage pointed to another page. What the homepage was basically telling Google was: don't index the homepage, but index the page with "?view=featured" instead. Luckily this is solved, but other issues remain. Let's discuss a few:
Canonical URL for non-SEF URLs
As you may know, even if you have SEF URLs switched on, the non-SEF URL is still accessible. Say you browse to the non-SEF URL for the article with id = 1. It will be accessible like this:
- /index.php?option=com_content&view=article&id=1
You would hope that the canonical URL for this page would point to the SEF version of the article, but this is definitely not implemented yet. You would then at least expect it to point to its own URL, but it doesn't. Instead it points to:
- /component/content/article?id=1
It simply points to a semi-SEF version of the URL. What's more, if you now check the canonical URL of that page, it points to yet another page:
- /component/content/article?id=article
This page does not even exist; it returns a 404 error. Fortunately this mainly affects non-SEF URLs, but it still shows the implementation is far from flawless. So, as long as you have a simple site with menu items linked to single articles, you will probably not suffer, but on sites with more complex views it is good to be aware of the issue. There are more examples, like incorrect canonicals for paginated blog and list views of categories.
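The chain described above can be traced with a few lines of Python. This is only a sketch under stated assumptions: the dictionary is a hand-made snapshot that uses the three paths from this example, a URL missing from the dictionary stands for a 404 page, and the name `follow_canonicals` is made up. On a real site you would fill the dictionary by fetching each page.

```python
from typing import Dict, List, Tuple

# Hypothetical snapshot: each existing URL maps to the canonical it declares.
# A URL that is not a key in this dict is treated as a 404 page.
CANONICAL_OF: Dict[str, str] = {
    "/index.php?option=com_content&view=article&id=1": "/component/content/article?id=1",
    "/component/content/article?id=1": "/component/content/article?id=article",
}


def follow_canonicals(
    start: str, canonical_of: Dict[str, str], max_hops: int = 10
) -> Tuple[List[str], str]:
    """Follow canonical links from start and report where the chain ends."""
    chain = [start]
    current = start
    for _ in range(max_hops):
        if current not in canonical_of:
            return chain, "broken"  # the chain ends on a page that does not exist
        target = canonical_of[current]
        if target == current:
            return chain, "ok"  # the page points to itself, as it should
        if target in chain:
            return chain, "loop"  # canonicals point back to an earlier page
        chain.append(target)
        current = target
    return chain, "too long"


chain, status = follow_canonicals(
    "/index.php?option=com_content&view=article&id=1", CANONICAL_OF
)
for url in chain:
    print(url)
print(status)
```

A healthy page yields a chain of length one with status "ok"; anything else deserves a closer look.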
Solutions for now
As is often the case, there are solutions available. Realize that some of these solutions simply remove canonical URLs completely, which also removes the positive effects they have! Always check very carefully what you are doing. Canonicals are a very powerful tool, but using the feature incorrectly can cause a lot of trouble!
- One option is to prevent Joomla from creating canonical URLs, by creating an override for the file /plugins/system/sef/sef.php on line 51.
- A bit of a dirty technique, but if it just concerns one or two pages on the whole site, you could consider using NoNumber's ReReplacer plugin to set the correct tag...
- The major SEF extensions (SH404SEF, MijoSEF, etc.) all set a canonical URL. They also allow you to customize it, which gives you a lot of freedom.
- Recently a really small free plugin has been published which seems to do just what we need here: the Canonical plugin by Styleware. Take care though, I have seen sites where this plugin did not deliver, with bad consequences.
- Another plugin (paid) is Canonical Links All in One. I have not seen this one in action yet.
- You can unset the canonical tag in your template's index.php file using the following code (solution from Robert Went):
// Remove every canonical <link> tag that was added to the document head
$doc = JFactory::getDocument();
foreach ($doc->_links as $key => $link) {
    if ($link['relation'] == 'canonical') {
        unset($doc->_links[$key]);
    }
}
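Whichever solution you pick, it pays to check afterwards what each page ends up declaring. The sketch below labels the canonical of a single page. It assumes absolute URLs on both sides, and the function name, the labels and the host `example.com` are invented for this illustration.

```python
from typing import Optional
from urllib.parse import urlsplit


def classify_canonical(page_url: str, canonical: Optional[str]) -> str:
    """Label the canonical a page declares, relative to the page's own URL."""
    if not canonical:
        return "missing"  # e.g. after removing the tag completely
    page, target = urlsplit(page_url), urlsplit(canonical)
    if page == target:
        return "self"
    same_page = (page.scheme, page.netloc, page.path) == (
        target.scheme,
        target.netloc,
        target.path,
    )
    if same_page and not target.query:
        # Same page, but the canonical drops the parameters of the request.
        return "self-without-parameters"
    return "other"  # points somewhere else: verify this is intended


# Hypothetical examples; example.com is a placeholder host.
print(classify_canonical("http://example.com/blog", None))
print(classify_canonical("http://example.com/blog?fontsize=large", "http://example.com/blog"))
print(classify_canonical("http://example.com/", "http://example.com/?view=featured"))
```

The last call mirrors the old homepage problem: the canonical adds a parameter the page itself does not have, so it is reported as "other".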
Joomla 3 is still under active development, so let's hope these problems will be fixed and we will no longer need these workarounds. By the way, this article by Yoast is also an interesting read.