Avoid duplicate URLs

Like the www or non-www issue, any kind of duplicate content on your site can hurt your search engine rankings. Of course, you have to make sure that your content is unique, not copied from somewhere else or reused in other parts of your site. But you also have to make sure that the same page cannot be accessed through multiple URLs.

Many open source CMSs can have this issue, and Joomla is one of them. Even when you have SEF links turned on in your Joomla global configuration, the non-SEF URL still exists. This means 2 URLs with the same content, and often there are more. Duplicate URLs can exist for the following reasons:

  • www or non-www issues, as discussed in the previous article.
  • Non-SEF URL still reachable, despite SEF URLs being activated, like this:
    /index.php?option=com_content&view=article&id=2 (and many more)
  • Pages ending with index.html, index.php, etc., which show the same information as the page without the index part.
  • Parameters in the URL, like ..../page1?font-size=large
  • Trailing slashes
  • Sometimes even uppercase versus lowercase issues
  • In Joomla specifically: The same article reached from multiple menu-items.

Having pages reachable from multiple URLs could harm your rankings, so it's best to prevent this. This can be done in many different ways. Some techniques can be used on their own, but you can also combine them to get rid of your duplicates completely:

1: Correct menu set-up

One very common cause of duplicates is linking one article from multiple menu-items. This is easily done: sometimes an article that is reached from the main menu must also be reachable from a footer menu-item. In each case, Joomla builds a URL based on the menu-item. Let's compare 2 examples:

  1. If you have a menu called Products, with a submenu-item for each product, the URL for the article Chair would be /index.php/product/chair
  2. If you make the same article reachable from the footer, but directly (no submenu), the URL is /index.php/chair

Apart from some small differences, like the breadcrumb path or module assignments, these pages are identical, and they are a real duplicate content issue. Partly, this is because of the way Joomla works, but you can work around it in many cases:

  • Sometimes the Main Menu is repeated in a footer position. As long as it is exactly the same, simply publish the Main Menu in the footer position again; don't create a new menu with identical links.
  • Often you really need a menu with different links. In this case, consider not creating a new link, but using a menu-item of the type Menu Item Alias (under System Links). This simply takes you to the destination article of the original menu-item, so no new URLs are created!

With some creativity, you can sometimes even think of more solutions like this.

2: Set a canonical tag to the correct page

Set a canonical tag pointing to the correct page, so that the non-SEF URL is not indexed. There are ways to achieve this by hand, but that is only advisable for experienced users: doing it wrongly might have the opposite effect. The easiest way is probably to use an extension. Most SEF extensions offer solutions for this.

If you set the tag correctly, all possible duplicates of a Joomla page have the tag in the head section, as does, for example, the page you are currently looking at. It can be reached in 2 ways: through its non-SEF URL and through its SEF URL.

The non-SEF URL is currently rerouted, but if it wasn't, configuring a canonical URL would tell Google that it is the same page as the SEF URL:

<link href="/Checklist/avoid-duplicate-url-s" rel="canonical"/>

Using this technique, you can prevent duplicate URLs from being indexed by Google, even when they are still accessible.

The only related option you can set in Joomla itself is in the settings of the System - SEF plugin, which allows you to set a Site Domain. However, this is only useful if you make the same website available through multiple domains (parked domains).

Note that since Joomla 3.4, a canonical tag is only applied to non-canonical URLs. The preferred URL itself does not need a canonical tag pointing to itself, of course. You should be aware that Joomla 3.2 had some issues with how canonical URLs are treated. These were addressed in 3.2.1, but some of them still apply, so you may need to use an extension to set canonical URLs as you wish.

3: Use 301 redirects

Using 301 redirects means that you tell anyone who accesses such a URL: "This link has permanently moved, please go here." So if somebody goes to:

http://joomlaseo.com/index.php?option=com_content&Itemid=125&catid=15&id=18&lang=en&view=article

they are forwarded to:

http://joomlaseo.com/Checklist/avoid-duplicate-url-s

You can set up 301 redirects either in your .htaccess file or with an extension like ReDJ, which is a very nice and simple extension for this. More on 301 redirects can be found in the article about re-routing old URLs.
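
As a rough illustration of the .htaccess route, the sketch below redirects the non-SEF address shown above to its SEF address. It assumes that mod_rewrite is available and that the rules are placed above Joomla's own SEF rules in the root .htaccess file. The article id and target path are simply taken from the example, so adapt them to your own pages.

```apache
# Sketch only: 301-redirect one non-SEF article URL to its SEF address.
# Assumption: this block sits above Joomla's own SEF section.
RewriteEngine On

# All three conditions must match the query string (they are ANDed).
RewriteCond %{QUERY_STRING} (^|&)option=com_content(&|$)
RewriteCond %{QUERY_STRING} (^|&)view=article(&|$)
RewriteCond %{QUERY_STRING} (^|&)id=18(&|$)
# The trailing "?" on the target drops the old query string.
RewriteRule ^index\.php$ /Checklist/avoid-duplicate-url-s? [R=301,L]
```

A rule like this covers a single article only. For more than a handful of pages, an extension is usually easier to maintain than a long list of hand-written rules.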

4: Set up rules in .htaccess

Using your Joomla .htaccess file, you can solve quite a few of your duplicate URL issues (provided URL rewriting is on). We already discussed how to reroute www and non-www URLs, but you can also use it to get rid of trailing slashes:

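RewriteEngine On
# Leave requests for real files and directories alone
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Permanently redirect /some/page/ to /some/page
RewriteRule ^(.+)/$ http://%{HTTP_HOST}/$1 [R=301,L]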

Again, test whether the trailing slash is indeed removed, and whether your site still works. Always be careful with .htaccess changes! Similar issues can arise because of parameters, like one that sets a font size (for example ..../page1?font-size=large), leading Google to think that 2 different pages exist.
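
One possible way to handle such a parameter in .htaccess is sketched below. It assumes that the site lives in the document root, that font-size is the only parameter in the URL, and that the parameter does not need to reach the server. If your template relies on it to switch font sizes, this redirect would break that feature. The parameter name comes from the example earlier in this article.

```apache
# Sketch only: redirect /page1?font-size=large to /page1.
# Assumption: place this above Joomla's own SEF section.
RewriteEngine On

# Act only when the whole query string is the single font-size parameter.
RewriteCond %{QUERY_STRING} ^font-size=[^&]*$
# The trailing "?" on the target removes the query string.
RewriteRule ^(.*)$ /$1? [R=301,L]
```

As with the trailing-slash rule, check afterwards that the parameter is really gone from the address bar and that the rest of the site still works.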

5: Set up robots.txt

You can set up your robots.txt file in such a way that it disallows search engines from crawling any URL with a query string, i.e. a '?'; see the article about robots.txt for the code. This prevents issues with duplicate URLs caused both by non-SEF URLs and by real query strings, like the font-size parameter mentioned earlier.

6: Use an extension

For smaller sites, preventing issues can easily be done by configuring .htaccess, robots.txt, and possibly a small extension for 301 redirects, but for larger sites, using a SEF extension is probably more efficient. It takes some time to learn how these extensions work, so start by trying one out on a site that is not that important. If used correctly, it will ban all duplicate URL issues from your site. However, if used incorrectly, it could have the opposite effect.

Check the extensions section of this site for information about well-known SEF extensions and others.

7: Google Webmaster Tools

Using Google Webmaster Tools is an alternative way of getting rid of duplicate URLs. Preferably, you should use the techniques discussed above to prevent issues from showing up in Webmaster Tools, and even if they do show up, first go back and review your set-up. However, sometimes you may not be able to prevent duplicates from showing up.

Please note: Don't panic when you see issues like this as warnings in Webmaster Tools. Especially with new sites, Google often encounters these issues, but usually, especially with parameters, it learns that this is not a separate page, and the warnings disappear after a few weeks. Deal with the remaining issues, but remember that this is an advanced topic. For more information, read our article on this subject.

Some other ways to deal with duplicate URLs in Joomla are discussed in this recent article in the Joomla Magazine.

About this site

Joomlaseo.com is fully built and written by Simon Kloostra, SEO Specialist and Webdesigner from the Netherlands. I have also published a book and blogs for companies like OStraining, TemplateMonster, SEMrush and others.