robots.txt

  • Posted by PRO-Webs, Support
  • 02 May 2009

A robots.txt file implements the robot exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol. It is a convention for asking web spiders and other web robots not to crawl all or part of a website that is otherwise publicly viewable. Compliance is voluntary: the file only affects robots that choose to cooperate. Robots are often used by search engines to categorize and index websites, or by webmasters to proofread source code. The standard complements Sitemaps, a robot inclusion standard for websites.
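As a rough illustration of how a cooperating robot might consult these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the www.example.com URLs, and the "ExampleBot" user agent are all made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site: every robot is asked
# to stay out of /private/, and a sitemap is advertised.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A cooperating robot asks before fetching each URL.
allowed_public = parser.can_fetch(
    "ExampleBot", "https://www.example.com/index.html")
allowed_private = parser.can_fetch(
    "ExampleBot", "https://www.example.com/private/report.html")

print("public page allowed: ", allowed_public)
print("private page allowed:", allowed_private)
```

Nothing in the file enforces the rule. A robot that never performs this check can still request the /private/ pages.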

Note that blocking a URL in your robots.txt will NOT prevent indexing, only crawling. A blocked page can still be indexed, for example when other pages link to it, but because it is never crawled, its listing lacks supporting meta information.
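The difference between crawling and indexing can be pictured with a toy model. The sketch below is an assumption-laden illustration, not how any particular search engine works: the make_index_entry function, the record layout, and the example URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: all robots are asked not to crawl /members/.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /members/"])


def make_index_entry(url: str, anchor_text: str) -> dict:
    """Build a toy index record for a URL found through a link elsewhere."""
    if rules.can_fetch("ExampleBot", url):
        # A real crawler would download the page here and read its
        # title and meta tags; this sketch only records that it may.
        return {"url": url, "crawled": True, "meta": "read from the page"}
    # Crawling is blocked, so the page is never fetched, but the URL
    # is still known from the link and can still be listed.
    return {"url": url, "crawled": False, "meta": None,
            "label": anchor_text}


blocked = make_index_entry(
    "https://www.example.com/members/list.html", "Member list")
open_page = make_index_entry(
    "https://www.example.com/about.html", "About us")

print(blocked)
print(open_page)
```

In this model the blocked URL still receives an entry, just one without any information taken from the page itself.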

© 2003-2012 PRO-Webs, Inc. Woodbine, GA 31569-2051