Ever heard of orphan pages? These are pages that sit outside the navigation paths of your site but can still be reached by thorough crawlers such as Google. Often you do not even know they are there, yet they can seriously harm your SEO efforts. Make sure you are aware of this risk.
First of all: having orphan pages on your site is not necessarily bad in itself, as long as you are aware of them and mitigate the risks in your workflow. You may think this article won't affect you, but I'm pretty sure it does. Let me give you some examples, and then look at your site again. Almost every site has at least a few of these pages, and that is usually not much of a worry. However, some sites have loads of them, and then you may have issues.
There are plenty of situations where you have orphan pages:
The big problem with all these situations is that the set of URLs Google has indexed for your site becomes heavily polluted, either with duplicated blocks of content or with pages that have hardly any real content (so-called thin-content pages).
Basically, to find your orphan pages, you need to ask Google. You can do so in two ways. If your site is not too big, just go to Google.com and type in the following query: site:example.com. This brings up a list of all URLs that Google knows for your site. Even Google's list may not always be 100% complete, but it will do for our purpose:
This looks perfectly fine: Google will probably start the list with the important pages. However, as you scroll down, look out for obscure stuff. A result like the one below would make me suspicious:
In this case, it turned out to be an HTML page generated by a slider extension. The site contained hundreds of them, while there were only a few dozen "real" pages. Run the same check for your site and compare the search results with what you would expect to see.
For smaller sites this method works fine, but for larger sites it helps to use a tool. Personally, I use the Website Auditor by SEO Powersuite (free to use with some limitations). It crawls your entire site, and in the advanced settings you can also have it check the Google index for pages that are outside the navigation. Once it has finished, check the list of URLs and look at the column "Orphan pages":
You can also easily export the list for further investigation.
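If you prefer to do the comparison yourself, the idea behind an orphan check can be sketched in a few lines of Python. This is only a minimal illustration, not how Website Auditor works internally: the `links` dictionary and the `known` list are hypothetical inputs that you would have to build from your own crawl and from the exported list of indexed URLs.

```python
from collections import deque


def find_orphans(links, start, known_urls):
    """Return the URLs in known_urls that cannot be reached from start.

    links:      dict mapping a URL to the list of URLs it links to
                (hypothetical input, e.g. built from a crawl export)
    start:      the URL where navigation begins, usually the home page
    known_urls: every URL you know exists, e.g. from a site: query
    """
    reachable = {start}
    queue = deque([start])
    # Breadth-first walk over the internal links, starting at the home page.
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    # Anything known but never reached through navigation is an orphan.
    return sorted(set(known_urls) - reachable)


# Made-up example data: the slider page links to the home page,
# but no page in the navigation links to the slider page.
links = {
    "/": ["/blog", "/contact"],
    "/blog": ["/blog/orphan-pages", "/"],
    "/slider/slide-1": ["/"],
}
known = ["/", "/blog", "/blog/orphan-pages", "/contact", "/slider/slide-1"]

print(find_orphans(links, "/", known))  # ['/slider/slide-1']
```

Note that a page linking out to your navigation does not make it reachable; only links pointing to it count.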
If you found any orphan pages, what should you do? It depends on the nature of the pages you found. The image slider case is a headache and requires cumbersome work, such as 301-redirecting each useless page to the page where the slider sits. Better still would be to ditch extensions like this altogether.
However, if you found pages that you deliberately set up to build up pages from modules or similar, you can quite easily solve this without any damage. As an example of valid use, take a look at the documentation for the PWT extensions, like PWT SEO: extensions.perfectwebteam.com/pwt-seo/documentation:
This looks like one page / article, but actually we created an article for every section. The table of contents on the right is used for in-page navigation using anchor links. So this page is built up from 20 or so articles that all have a separate URL. Google has only indexed the main documentation page, though. The solution: give each article used for this page a Robots setting of Noindex, Nofollow or Noindex, Follow (Publishing tab):
You see, the solution is really easy. You can simply keep building your sites the way you did, as long as you are aware of what can happen. Also, make sure that your sitemap excludes pages with a Noindex tag. Otherwise you are giving Google conflicting instructions. Most situations with orphan pages are caused by this type of set-up and can easily be fixed.
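To spot such conflicting instructions, you could compare your sitemap with the robots meta tag of each page. The sketch below is one possible way to do that in Python, assuming you have already collected the sitemap URLs and the HTML of each page yourself. The function names and the `pages` example data are made up for illustration, and it only looks at the robots meta tag, not at an X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # "Noindex, Follow" becomes {"noindex", "follow"}
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindex(html):
    """True if the page's robots meta tag contains a noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def sitemap_conflicts(sitemap_urls, html_by_url):
    """Return the sitemap URLs whose HTML is marked noindex."""
    return [url for url in sitemap_urls if is_noindex(html_by_url.get(url, ""))]


# Hypothetical example: a main page plus one section article set to Noindex, Follow.
pages = {
    "/documentation": '<head><meta name="robots" content="index, follow"></head>',
    "/documentation/section-1": '<head><meta name="robots" content="noindex, follow"></head>',
}

print(sitemap_conflicts(["/documentation", "/documentation/section-1"], pages))
# ['/documentation/section-1']
```

Every URL in the result should either be removed from the sitemap or have its Noindex setting reconsidered.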
There are always other situations that require alternative solutions. One of the more frequent examples: even if you have not configured the front-end sign-in page, it will be there. Just type in http://example.com/index.php?option=com_users&view=login for your site. Often this page is indexed by Google. To remove it, create a menu item of type Login and set it to Noindex.
You see, there are a lot of possible issues, but most can be solved quite easily. So, check your site and see if you can improve your SEO in a few simple steps!
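If you have exported a list of indexed URLs, a small helper can flag this login page for you. The snippet below is a rough sketch: the function name and the `indexed` list are made up for illustration, and it only recognises the raw query-string form of the URL shown above, not a search-engine-friendly variant of it.

```python
from urllib.parse import urlparse, parse_qs


def is_login_url(url):
    """True if the URL is the raw com_users login page (option/view query form)."""
    query = parse_qs(urlparse(url).query)
    return query.get("option") == ["com_users"] and query.get("view") == ["login"]


# Hypothetical list of indexed URLs, e.g. taken from a tool export.
indexed = [
    "http://example.com/",
    "http://example.com/index.php?option=com_users&view=login",
    "http://example.com/index.php?option=com_content&view=article&id=5",
]

print([url for url in indexed if is_login_url(url)])
# ['http://example.com/index.php?option=com_users&view=login']
```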
Joomlaseo.com is fully built and written by Simon Kloostra, SEO Specialist and Webdesigner from the Netherlands. I have also published the Joomla 3 SEO & Performance SEO book. In addition, I sometimes blog for companies like OStraining, TemplateMonster, SEMrush and others. I have also published a few articles in the monthly Joomla Community Magazine.