2. Protocol proposed by Google in 2005
Sitemap = "map of a website"
It’s a file that shows the architecture of a
website
Formats used:
◦ Text
◦ XML
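As a rough illustration of the text format: a plain-text Sitemap simply lists one absolute URL per line. The sketch below assumes that layout; the helper name `build_text_sitemap` and the sample URLs are hypothetical and not taken from the slides.

```python
def build_text_sitemap(urls):
    """Return a plain-text Sitemap body: one absolute URL per line."""
    # Drop blank entries and surrounding whitespace before joining.
    cleaned = [u.strip() for u in urls if u.strip()]
    return "\n".join(cleaned) + "\n"


# Hypothetical pages used only for illustration.
pages = ["http://www.example.com/", "http://www.example.com/?id=who"]
print(build_text_sitemap(pages), end="")
```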
3. Facilitate the indexing of a website
◦ Makes the site easier for crawlers to explore
◦ File used exclusively by search engines
4. Useful if menus are in Flash or in JavaScript
◦ These are not HTML links, so the pages behind them are not indexed
Useful for a website with many pages
◦ Many URLs, so indexing takes a long time without a
Sitemap
5. 12 websites:
◦ 6 with Sitemap submitted to Google and Yahoo
◦ 6 without Sitemap
Results:
◦ Average time before the crawlers visit, with a Sitemap:
Google: 14 min
Yahoo: 245 min
◦ Average time before the crawlers visit, without a Sitemap:
Google: 1375 min
Yahoo: 1773 min
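To put these averages in proportion, the short sketch below computes how many times faster the crawlers arrived when a Sitemap was submitted. It uses only the four figures above; the dictionary layout and the `speedup` helper are illustrative choices.

```python
# Average minutes before the crawler's visit, as reported on the slide.
avg_minutes = {
    "Google": {"with": 14, "without": 1375},
    "Yahoo": {"with": 245, "without": 1773},
}


def speedup(engine):
    """Ratio of the average delay without a Sitemap to the delay with one."""
    t = avg_minutes[engine]
    return t["without"] / t["with"]


for engine in avg_minutes:
    print(f"{engine}: about {speedup(engine):.1f}x faster with a Sitemap")
```

On these numbers the ratio is roughly 98 for Google and roughly 7 for Yahoo. With only six sites per group, the ratios are indicative rather than precise.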
6. By hand:
◦ Requires knowledge of XML:
<url>
  <loc>http://www.example.com/?id=who</loc>
  <lastmod>2009-09-22</lastmod>
  <changefreq>monthly</changefreq>
  <priority>0.8</priority>
</url>
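For readers who would rather not type the XML by hand, here is a minimal sketch that produces the same kind of entry with Python's standard library. The `build_sitemap` helper is hypothetical. The `urlset` wrapper and its namespace come from the sitemaps.org protocol, not from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (not shown on the slide).
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a Sitemap XML string from a list of dicts describing pages."""
    urlset = ET.Element("urlset", xmlns=NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # Emit only the fields that are present, in the protocol's order.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


xml_text = build_sitemap([
    {
        "loc": "http://www.example.com/?id=who",
        "lastmod": "2009-09-22",
        "changefreq": "monthly",
        "priority": 0.8,
    }
])
print(xml_text)
```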
7. Many generators online
Extensions for CMS
◦ Google XML Sitemaps (WordPress)
◦ Xmap (Joomla)
◦ Site Map (Drupal)
8. If the file is at the root of the website
The crawler reads through the file
If (content of a page has changed)
{
Page is re-indexed with the new content;
}
Else
{
Crawler moves on to another page or website;
}
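The decision above could be sketched as follows. This is not how any real search engine is implemented. It assumes the crawler keeps the date of its last visit per URL and compares it with `<lastmod>`. The function name `pages_to_reindex`, the `last_crawl` mapping, and the second sample URL are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def pages_to_reindex(sitemap_xml, last_crawl):
    """Return the URLs whose <lastmod> is newer than the last recorded crawl."""
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if loc is None:
            continue
        previous = last_crawl.get(loc)
        # Unknown pages or pages without a date are crawled to be safe.
        if lastmod is None or previous is None:
            changed.append(loc)
        # Only the date part (YYYY-MM-DD) of <lastmod> is compared.
        elif date.fromisoformat(lastmod[:10]) > previous:
            changed.append(loc)
    return changed


sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/?id=who</loc><lastmod>2009-09-22</lastmod></url>
  <url><loc>http://www.example.com/?id=what</loc><lastmod>2009-01-10</lastmod></url>
</urlset>"""

last_crawl = {
    "http://www.example.com/?id=who": date(2009, 9, 1),
    "http://www.example.com/?id=what": date(2009, 9, 1),
}
print(pages_to_reindex(sample, last_crawl))
# → ['http://www.example.com/?id=who']
```

Pages that are not returned are skipped, which is where the savings described on the following slides would come from.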
9. Bandwidth savings:
◦ If the content is unchanged, the crawler moves on to another
website
◦ The crawler can only know this if a Sitemap is provided
10. Computing time savings:
◦ Thanks to the Sitemap
Thanks to this tool, Google does better
than other search engines