Joomla sitemap setup
- Why do we need sitemaps?
- HTML and XML sitemaps
- Sitemap basics
- Online and desktop sitemap generators
- Joomla Sitemap extensions
- Image and video sitemaps
- Multilingual sites
- What to include in your sitemaps
- Inform Google of your sitemap
- Advanced tip 1: URL-rewrite the path
- Advanced tip 2: Prevent Google from indexing the XML file
- Advanced tip 3: Use your sitemap for multilingual versions of pages
Once your site is populated with data and pages, you should create a sitemap and submit it to Google, Bing, and other search engines. Note that a sitemap will not in itself raise your rankings in Google. Sitemaps mainly serve as a way to inform Google about your site structure and content. Especially for new sites, and for new content on existing sites, this usually helps a little to get pages indexed quickly. In this article, I will discuss sitemap basics, sitemap generators, Joomla extensions, how to inform Google about your sitemaps, and finally some more advanced topics.
Why do we need sitemaps?
The main reason we use sitemaps is to inform Google about the important pages of our site. Note that I emphasized the word "important" here. This is really key: almost every site has a large portion of so-called utility pages, like login screens, terms and conditions, lost-password forms, etcetera. Those pages are important for the proper functioning of the site, but they do not add value to the SEO of the site. So, you don't want Google to index them. First of all, you should noindex them (which you can do with robots instructions), but also make sure not to include them in your sitemap.
Based on the sitemap, Google will have a general idea of which pages are important and which ones are not. This does not guarantee that the listed URLs get indexed, but it does help. So, sitemaps are not a magical fix-all solution for your indexation problems, but they are one of the many boxes you need to tick for proper SEO. Note that sitemaps are especially useful for large sites. Sites with just a few pages could usually do without one, though I nearly always create one anyway.
HTML and XML sitemaps
We have to distinguish between XML and HTML sitemaps. The HTML sitemap is mainly for your users, so that they have a nice overview of your site. However, users should already have this overview thanks to good navigation, based on the menu and the internal links. In that case, HTML sitemaps are superfluous, so I hardly ever use them. Actually, I only do so in case a client insists on having one.
Much more important for your SEO is the XML sitemap. Note that it will not by itself raise your rankings in Google, but your pages will usually be included in the Google index faster if you use one.
Sitemap basics
A sitemap is basically a file called sitemap.xml (though this is not a required naming convention) which is usually (but not necessarily) placed in the root of your site. This is where search engines would routinely expect it to be sitting (unless told otherwise in your robots.txt file). The sitemap.xml file lists the URLs of your Joomla website in a logically structured XML file. This helps Google and other search engines to determine the structure of your website and how to reach all parts of it. Actually, you could see this as a part of the internal link-building process. The file roughly looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
</urlset>
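To make the structure of such a file a bit more concrete, here is a minimal Python sketch that builds a sitemap from a list of URLs. The function name build_sitemap and the example.com URLs are placeholders chosen for illustration, not something a Joomla extension provides; a real sitemap may also carry optional elements such as lastmod.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    # Register the sitemap namespace as the default, so tags carry no prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical URLs, used only to show the output format.
    print(build_sitemap(["https://www.example.com/", "https://www.example.com/blog"]))
```

Each page ends up as one url element with a loc child, which is all the protocol strictly requires.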
Online and desktop sitemap generators
For small sites with relatively little content, or sites that don't change too much, the easiest solution might be to generate a sitemap using an online sitemap generator and just place the file in the root of your site. An example of an XML sitemap generator is www.xml-sitemaps.com. Just supply the URL of your home page, and the tool will crawl your site just like Googlebot would. After crawling, you can download the sitemap in various formats (open and check it first, to verify it looks fine). Make sure you only include URLs you want to have included in Google. If necessary, edit the file and remove things like login pages. For Google, use either the sitemap.xml version or the gzipped version, sitemap.xml.gz. You can also download an HTML version that you can use to offer a sitemap page to your users.
By the way, you can also generate sitemaps with the SEO tool Screaming Frog SEO Spider. It's a free desktop tool with many advanced SEO features. For 10 more of these (including commercial ones), check this post: semrush.com/blog/10-awesome-visual-proven-sitemap-generator-tools.
The disadvantage of this solution is that you have a static sitemap. So, for sites with changing content, a Joomla sitemap extension is a better choice.
Joomla Sitemap extensions
Though you can sometimes get by with creating sitemaps manually, you can also use one of the many Joomla sitemap extensions. For larger and frequently changing sites, you really need one of those. Often, they also offer additional features, like automatic sitemap generation and submission to search engines.
OSmap is a popular free extension, but there are some more. Jsitemap is especially nice. It is a commercial extension, but very popular. It was even listed as the top-rated extension on the JED for a while, with only 5-star reviews, so it's probably a safe choice. OSmap is well-suited for simpler sites, where every article is linked to a menu item. Articles not directly associated with a menu item are then not included in the sitemap. Jsitemap does include those articles, so it may be better suited for sites with many blog views. A third extension that is very good is PWT Sitemap (from the same developer as ACL Manager and PWT SEO). Finally, 4SEO offers sitemaps as well.
Whichever one you use, make sure it supports your content types. If you only use core Joomla articles, you will be fine, but for 3rd party extensions, this is not always the case. Say you have a webshop: the sitemap extension should then be able to understand the URLs of your shop extension (Virtuemart, Hikashop, etc.), so check this before you install it. Often these extensions create fine sitemaps with just the default settings, but it always pays to go through the configuration to fine-tune the output. You can often exclude items (articles, categories) that only serve a service purpose, like terms & conditions, login pages, etcetera. Good extensions automatically exclude noindexed pages, saving you the hassle.
Image and video sitemaps
If images are important for your website, you might even add the URLs of the images in an image sitemap, though for many sites this might not be necessary. In an image sitemap, each image URL is listed inside the url entry of the page it appears on.
The image sitemap should also be submitted to your Search Console account. Sites with many videos can even have a video sitemap. Unlike basic XML sitemaps, image and video sitemaps are supported by hardly any online generator, so if you need these, it is best to use a sitemap extension that supports them. 4SEO, OSmap, PWT Sitemap, and Jsitemap all support image sitemaps. Video sitemaps are only supported by Jsitemap.
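As a rough illustration of the image sitemap format, the Python sketch below nests image entries inside the url entry of the page that shows them, using Google's image sitemap namespace. The function name build_image_sitemap and all example.com URLs are hypothetical; the extensions mentioned above generate this kind of output for you.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Namespace Google documents for image sitemap extensions.
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # One <image:image> element per image, each with its own <image:loc>.
            image = ET.SubElement(entry, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical page and image URLs.
    print(build_image_sitemap({
        "https://www.example.com/gallery": [
            "https://www.example.com/images/a.jpg",
            "https://www.example.com/images/b.jpg",
        ],
    }))
```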
Multilingual sites
Note that if your site is multilingual, it is wise to have separate sitemaps per language. The most well-known sitemap extensions all support this. As an example, if you have OSmap installed, with English, French, and German languages, you will automatically have three sitemaps, one per language.
You can then submit these sitemaps on a per-language basis in Google Search Console.
What to include in your sitemaps
Many extensions and online sitemap generators include all your URLs by default. However, this may not always be the best way to go. If you carefully examine your site, there may be URLs that do not really add value for Google. Think of pages like your terms and conditions, login pages, etcetera. If you take care only to include the really valuable pages and leave the fluff out, "Google will consider the pages that are in the sitemap as more important and Google will crawl it sooner" (see www.thesempost.com/google-links-partial-sitemaps-crawling). Many extensions offer the possibility to select your URLs to include per menu or per article/category, so make sure to only include your valuable menu items then.
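If you maintain a generated sitemap by hand, the filtering described here can be sketched in a few lines of Python. The excluded path fragments below are assumptions for illustration only; use whatever paths your own utility pages actually have.

```python
from urllib.parse import urlparse

# Hypothetical path fragments of utility pages that add no SEO value.
EXCLUDED_PATH_PARTS = ("/login", "/terms-and-conditions", "/lost-password")


def keep_for_sitemap(url):
    """Return True when the URL does not point at a known utility page."""
    path = urlparse(url).path.lower()
    return not any(part in path for part in EXCLUDED_PATH_PARTS)


def filter_sitemap_urls(urls):
    """Keep only the URLs that are worth listing in the sitemap."""
    return [url for url in urls if keep_for_sitemap(url)]


if __name__ == "__main__":
    crawled = [
        "https://www.example.com/",
        "https://www.example.com/login",
        "https://www.example.com/blog/joomla-seo",
        "https://www.example.com/terms-and-conditions",
    ]
    for url in filter_sitemap_urls(crawled):
        print(url)
```

In a sitemap extension you would achieve the same result through its exclude settings rather than with code.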
Inform Google of your sitemap
Make sure to submit the location of your sitemap in Google's Search Console. If you do not do so, Google may never find it, and all your efforts so far would be wasted. You can also specify the location of your sitemap.xml file in your robots.txt file, by adding a line that starts with Sitemap: followed by the full URL of the sitemap (this is especially useful if the file is not located in the site root).
Repeat this for every sitemap you have. You can have multiple XML sitemaps, and the image and video sitemaps should be submitted as well (for multilingual sites, you can use separate properties in Search Console for each language). After some days you should see that the submitted URLs are included in the index. You can check this in Google Search Console as well.
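To show what such robots.txt entries look like, and to check that they are readable by a standard parser, here is a small Python sketch using the standard library's robots.txt parser. The file contents and the example.com sitemap URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; each sitemap gets its own "Sitemap:" line
# containing the full URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/

Sitemap: https://www.example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap-images.xml
"""


def sitemaps_in_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for sitemap_url in sitemaps_in_robots(ROBOTS_TXT):
        print(sitemap_url)
```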
Advanced tip 1: URL-rewrite the path
The following tips are definitely not necessary, but for advanced usage, they can be useful. The first tip is for users who create their sitemaps with extensions like OSmap. If you do, you will usually have a non-standard location for the file (like index.php?option=com_osmap&view=xml&tmpl=component&id=1), not in the root of the site where search engines would usually expect to find it. You can rewrite the path in your .htaccess file so that the sitemap is found at exactly that location:
RewriteRule ^sitemap\.xml$ index.php?option=com_osmap&view=xml&tmpl=component&id=1 [L]
This will ensure that the sitemap can be reached from the URL /sitemap.xml. Thanks to Rene Kreijveld for the tip.
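One detail worth knowing about the rewrite pattern: in a regular expression an unescaped dot matches any character, so writing the dot as \. keeps the rule limited to the literal file name. The Python sketch below only mimics that matching behaviour with its own regex engine; it is not how Apache evaluates the rule, and the sample paths are invented.

```python
import re

# Pattern with an unescaped dot: the dot matches any single character.
UNESCAPED = re.compile(r"^sitemap.xml$")
# Pattern with an escaped dot: only a literal "." matches.
ESCAPED = re.compile(r"^sitemap\.xml$")


def compare(path):
    """Return (unescaped_matches, escaped_matches) for a request path."""
    return bool(UNESCAPED.match(path)), bool(ESCAPED.match(path))


if __name__ == "__main__":
    # Hypothetical request paths, without the leading slash.
    for path in ("sitemap.xml", "sitemapsxml", "sitemap-xml"):
        print(path, compare(path))
```

In practice the difference rarely matters for a sitemap, but the escaped form states the intent exactly.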
Advanced tip 2: Prevent Google from indexing the XML file
This should definitely not worry you, but Google will often treat your sitemap as an ordinary piece of content and index it. You will usually only see it for very exact searches, or when you use the site:website.com command in Google.
You can prevent the file from being indexed though. Just place the following code in your .htaccess file (make sure to include the correct path to the file):
<Files "sitemap.xml">
Header set X-Robots-Tag "noindex"
</Files>
For more information on this, see labnol.org/internet/xml-sitemaps-noindex/18041
Advanced tip 3: Use your sitemap for multilingual versions of pages
If you have a multilingual site, you will usually use the rel="alternate" hreflang="xx-XX" attributes in the head section of your site to indicate multilingual versions of your pages. This is how the Joomla core works, so you don't have to do this yourself. However, if you have a separate site for each language in separate installations, you could use your sitemap to indicate the multilingual versions of a page, by listing every language version as an xhtml:link element inside that page's url entry.
You would probably have to create this setup manually since installations will usually be independent, but it could be a way to inform Google that you have a site with duplicate content in multiple languages. By the way, you can use this technique within a single multilingual site as well, but then there are easier techniques you can apply.
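Since such a setup would have to be built by hand, here is a hedged Python sketch of what generating those entries could look like. It follows the sitemap hreflang format Google documents (an xhtml:link element per language version, repeated in the url entry of every version); the function name build_hreflang_sitemap, the language codes, and the example domains are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(page_versions):
    """page_versions is a list of dicts mapping an hreflang code to the URL
    of that language version of one page."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for versions in page_versions:
        # Every language version gets its own <url> entry, and each entry
        # lists all alternates, including itself.
        for url in versions.values():
            entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
            for lang, alt_url in versions.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": lang, "href": alt_url},
                )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical English and German versions of one page on separate sites.
    print(build_hreflang_sitemap([
        {"en": "https://www.example.com/about", "de": "https://www.example.de/ueber-uns"},
    ]))
```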