Robots.txt: Blocking It from Being Indexed / Ranked

Over at WebmasterWorld, a thread is discussing how to prevent Google from ranking robots.txt files in the SERPs. Google currently shows 182,000 robots.txt files in its index (query [inurl:robots.txt filetype:txt]). Many of them have decent PageRank, while others have no backlinks at all.

A question brought up by a forum member:

At any rate, this does bring up the crazy question, how can you remove a robots.txt file from Google’s index? If you use robots.txt to block it, that would mean that googlebot should not even request robots.txt - an insane loop. And of course, you don’t use meta tags in a robots.txt file.

To spell out why every standard removal route is closed:

  • you can’t use robots.txt to block robots.txt (truly insane: a search engine would have to crawl the file in order to learn that it isn’t allowed to crawl it; see the sketch after this list);
  • you can’t use meta robots tags, since robots.txt is plain text, not HTML;
  • you can’t remove the file through Google Webmaster Tools, because that tool requires either a robots.txt block or a meta tag (both impossible, per above) or a 404 header, which also won’t happen because the file really does exist. [SEJ]
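For illustration, here is what a self-blocking robots.txt would look like. It is a sketch only, because a compliant crawler has to fetch the file before it can read the rule, so the rule can never stop that first request:

  User-agent: *
  Disallow: /robots.txt

And even if the loop weren’t a problem, Disallow only stops crawling, not indexing: a disallowed URL can still show up in Google’s index as a URL-only listing.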

Another board member suggests sending an X-Robots-Tag HTTP response header to keep the file out of the index, via an Apache .htaccess rule:

  <FilesMatch "robots.txt">
  Header set X-Robots-Tag "noindex, nofollow"
  </FilesMatch>
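With that in place (and assuming Apache with mod_headers enabled, which the Header directive requires), you can check that the header is actually being sent; example.com stands in for your own domain:

  curl -I http://example.com/robots.txt

The response should include a line like "X-Robots-Tag: noindex, nofollow". The appeal of this approach is that Googlebot can still fetch and obey the file; it is only told not to index the URL.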

Matt Cutts offered a suggestion of his own, in a comment posted yesterday on his blog:

why not use a noindex directive in the robots.txt file?
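Presumably that means the unofficial Noindex: directive, which is undocumented and which Googlebot has only ever honored experimentally, so treat this as a sketch rather than a supported solution:

  User-agent: Googlebot
  Noindex: /robots.txt

Since Noindex: would suppress indexing rather than crawling, Googlebot could still fetch and obey the file, which sidesteps the chicken-and-egg problem above.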