Ken Moss, General Manager at Live Search, announced today “that Google, Microsoft and Yahoo! are coming together in support of the SiteMaps protocol. The goal of this effort is to improve search results for customers around the world. This protocol enables site owners everywhere to tell search engines about the content on their site instead of having to rely solely on crawl algorithms to find it.”
We can provide site owners with one simple way to share information with every search engine. You just publish a sitemap, and every engine is instantly able to read and use the data to more effectively index your site. Since this is a free, widely supported protocol, our hope is that this will foster an even broader community of developers building support for it.
We are 100% behind this protocol - this kind of collaboration will help improve the search experience for all of our customers, and we are working hard to release full support in 2007. We are starting to alpha test with internal partners such as MSDN and Microsoft Support now.
He further points to SiteMaps.org for the gritty details:
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.
Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.
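To make the crawler side of that description concrete, here is a minimal sketch of how a Sitemap-aware client might read the URL list and its optional metadata. It is an illustration only, not how any particular search engine does it. The `SAMPLE` document, the `read_sitemap` and `local_name` helpers, and the sitemaps.org 0.9 namespace used in the sample are assumptions made for the example; the parser ignores the namespace so that it does not depend on the schema version.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sitemap used only for this illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/about</loc>
  </url>
</urlset>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_sitemap(xml_text):
    """Return one dict per <url> entry, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root:
        if local_name(url.tag) != "url":
            continue
        entry = {local_name(child.tag): (child.text or "").strip() for child in url}
        # <loc> is the only field an entry cannot do without; skip entries lacking it.
        if entry.get("loc"):
            entries.append(entry)
    return entries


entries = read_sitemap(SAMPLE)
for entry in entries:
    print(entry["loc"], entry.get("lastmod", "(no lastmod)"))
```

The metadata fields come back as plain strings and may be absent, which matches the point above that they are hints rather than guarantees.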
Google provides detailed information in Webmaster Tools, including a sample of the XML:
[xml]
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
[/xml]
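For a site with more than a handful of pages, a file like the sample above would normally be generated rather than written by hand. The following is a rough sketch using only Python's standard library. The `build_sitemap` helper and the shape of the `pages` list are hypothetical names chosen for this example, and the namespace is simply copied from the sample; check the schema version your target search engines expect before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample; treat the version as an assumption.
NAMESPACE = "http://www.google.com/schemas/sitemap/0.84"

# Child elements of <url>, in the order used by the sample.
FIELDS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Build a sitemap string from a list of dicts; only 'loc' is required."""
    urlset = ET.Element("urlset", xmlns=NAMESPACE)
    for page in pages:
        if "loc" not in page:
            raise ValueError("every sitemap entry needs a 'loc' URL")
        url = ET.SubElement(urlset, "url")
        for field in FIELDS:
            if field in page:
                # ElementTree escapes characters such as '&' in URLs on output.
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    {
        "loc": "http://www.example.com/",
        "lastmod": "2005-01-01",
        "changefreq": "monthly",
        "priority": 0.8,
    },
])
print(sitemap_xml)
```

The output is a single unindented line of XML with the same elements and values as the sample; saving it to a file and prepending an XML declaration is left out of the sketch.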