robots.txt
- Posted by PRO-Webs, Support
- 02 May 2009
- R
robots.txt is the file used by the robot exclusion standard, also known as the Robots Exclusion Protocol or robots.txt protocol. It is a convention for asking voluntarily cooperating web spiders and other web robots not to crawl all or part of a website that is otherwise publicly viewable. Robots are often used by search engines to categorize and index websites, or by webmasters to proofread source code. The standard complements Sitemaps, a robot inclusion standard for websites.
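As a rough illustration of how a cooperating robot might consult these rules, here is a minimal sketch using Python's standard `urllib.robotparser` module. The rules, the `example.com` URLs, and the crawler names `AnyCrawler` and `ExampleBot` are made up for this example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for an imaginary site; paths and bot names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A generic crawler may fetch public pages but not anything under /private/.
print(parser.can_fetch("AnyCrawler", "https://example.com/index.html"))
print(parser.can_fetch("AnyCrawler", "https://example.com/private/report.html"))

# ExampleBot is asked to stay away from the whole site.
print(parser.can_fetch("ExampleBot", "https://example.com/index.html"))
```

The check is purely advisory: nothing stops a robot that ignores the file, which is why the protocol only works with voluntarily cooperating crawlers.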
Note that blocking a URL in your robots.txt will NOT prevent indexing, only crawling. Blocked pages can still be indexed, but because they are not crawled they lack supporting meta information.

