What is Sitemap auto discovery?

Answer:

4/21/2007  Sitemap auto discovery via the robots.txt protocol

By: Melanie Prough

All the crawlers currently recognize the robots.txt protocol, so auto discovery was the natural evolution. The top three engines, Yahoo, Google, and Ask.com, have announced their support for declaring a sitemap in robots.txt. So, supposedly, there is no more need to submit sitemaps manually, but I would still submit new sitemaps for a few months to be safe. You can read the 4/11/07 post from Vanessa Fox concerning the development and the protocol. I played around with this for several hours and, to my dismay, could not validate the robots.txt file after adding the sitemap. After much searching, posting, and reading, I found some help and suggestions. Putting everything I read into practice, below is how to add your sitemap without a syntax error.

Sitemap: http://www.your_domain.com/sitemap.xml

User-agent: *
Disallow: /cgi-bin/

OK, first thing: if your robots.txt file starts with a title comment such as

# Robots.txt file for www.your_domain.com

Then leave a blank line under that title line before adding the Sitemap line. The Sitemap line above follows the sitemaps.org protocol. If you do not leave a blank line between the title line and the Sitemap line, the file will not validate in Google's Webmaster Tools. To avoid any other possible syntax issues, I also left a blank line after the Sitemap directive. In theory, the blank lines around the Sitemap line mean nothing to a robot.
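The layout described above can be sketched in a few lines of Python. This is only an illustration, not the validator used by Google's Webmaster Tools: the helper names build_robots_txt and discovered_sitemaps are made up for this example, and the check relies on Python's standard urllib.robotparser module (its site_maps() method needs Python 3.8 or later), which may not behave the same way as any search engine's parser.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(host, sitemap_url, disallow=("/cgi-bin/",)):
    """Return robots.txt text laid out as: title comment, blank line,
    Sitemap line, blank line, then the User-agent record.

    Hypothetical helper for illustration only.
    """
    lines = [
        f"# Robots.txt file for {host}",
        "",  # blank line between the title comment and the Sitemap line
        f"Sitemap: {sitemap_url}",
        "",  # blank line after the Sitemap directive
        "User-agent: *",
    ]
    # Disallow lines follow User-agent directly, with no blank line between them.
    lines += [f"Disallow: {path}" for path in disallow]
    return "\n".join(lines) + "\n"


def discovered_sitemaps(robots_text):
    """Return the Sitemap URLs that Python's standard robots.txt parser finds."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when the file has no Sitemap line.
    return parser.site_maps() or []


if __name__ == "__main__":
    text = build_robots_txt(
        "www.your_domain.com", "http://www.your_domain.com/sitemap.xml"
    )
    print(text)
    print(discovered_sitemaps(text))
```

Running the script prints the generated file followed by the list of sitemap URLs the parser picked up. It only confirms that one parser can read the Sitemap line, so it is still worth checking the real file in each engine's own webmaster tools.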

I went ahead and got on board with this, and I will keep this article up to date as the stats change in either direction. Adopting this at such an early stage is a risk, but also an opportunity for a lower-PR site to get a leg up.

Melanie Prough [PRO Webs, Inc.]

Feel free to reprint as long as credit and links remain intact.

© 2003-2012 PRO-Webs, Inc. Woodbine, GA 31569-2051