Last week we reported that Google was to add an "Unavailable After" META Tag. Since then, we've spoken to Dan Crow of Google, who provided more information on how to use it, as well as information on a new way to send robots blocking info within HTTP headers.
The "unavailable_after" Meta tag will allow you to tell Google that a page should expire from the search results at a specific time. For example, if you have a page that you would like to be removed from the search results at 6pm EST on July 23, 2007, you would add the following Meta tag:
<META NAME="GOOGLEBOT" CONTENT="unavailable_after: 23-Jul-2007 18:00:00 EST">
Once Google crawls the page and sees this Meta tag, it will take about a day for the page to be removed from the search results. Also, this tag is only supported in web search. To remove the page completely from Google, including the cached copies, use the Google removal tool.
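If your pages come from a template, the tag can be generated rather than typed by hand. The following is only a rough Python sketch: the function name `unavailable_after_meta` and the `tz_abbrev` parameter are made up for illustration, and it simply copies the date layout from the example above instead of validating against any format Google documents.

```python
from datetime import datetime

# English month abbreviations, spelled out so the result does not
# depend on the machine's locale settings.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def unavailable_after_meta(expires: datetime, tz_abbrev: str = "EST") -> str:
    """Build a GOOGLEBOT meta tag in the same layout as the example above.

    `tz_abbrev` is a hypothetical parameter: the caller supplies the
    time zone label, and no time zone conversion is performed here.
    """
    stamp = "{:02d}-{}-{} {:02d}:{:02d}:{:02d}".format(
        expires.day, MONTHS[expires.month - 1], expires.year,
        expires.hour, expires.minute, expires.second)
    return ('<META NAME="GOOGLEBOT" CONTENT="unavailable_after: '
            + stamp + " " + tz_abbrev + '">')


print(unavailable_after_meta(datetime(2007, 7, 23, 18, 0, 0)))
```

Called with 6pm on July 23, 2007, this prints the same tag shown above.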
Google has also added support for blocking non-HTML documents that can't contain meta tags, such as PDF files, audio, XLS documents and so on. This is done through a new X-Robots-Tag directive sent in the HTTP header used to serve the file.
Here are examples of how to use the X-Robots-Tag:
Don't index this document:
X-Robots-Tag: noindex
Don't show cache or snippet in search results for this file:
X-Robots-Tag: noarchive, nosnippet
Don't index after a specific date (i.e. the "unavailable_after" tag):
X-Robots-Tag: unavailable_after: 7 Jul 2007 16:30:00 GMT
You can also combine these directives.
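How you send the header depends on your web server. As a rough illustration only, here is a Python sketch that uses the standard library's `http.server` to attach X-Robots-Tag headers to non-HTML files. The names `NON_HTML_EXTENSIONS`, `x_robots_tag_values` and `RobotsTagHandler` are hypothetical, the extension list is an assumed policy, and sending `unavailable_after` as its own header line is a cautious choice on our part, not something confirmed in the conversation above.

```python
from http.server import SimpleHTTPRequestHandler
from typing import List, Optional

# Assumed policy for this sketch: file types that cannot carry a meta tag.
NON_HTML_EXTENSIONS = (".pdf", ".xls", ".mp3")


def x_robots_tag_values(directives: List[str],
                        unavailable_after: Optional[str] = None) -> List[str]:
    """Return the values to send, one per X-Robots-Tag header line.

    Simple directives are joined with commas, as in the examples above.
    The unavailable_after directive goes on a separate header line.
    """
    values = []
    if directives:
        values.append(", ".join(directives))
    if unavailable_after:
        values.append("unavailable_after: " + unavailable_after)
    return values


class RobotsTagHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds X-Robots-Tag to non-HTML files."""

    def end_headers(self):
        # Simplification: query strings on the path are not stripped.
        if self.path.lower().endswith(NON_HTML_EXTENSIONS):
            for value in x_robots_tag_values(["noarchive", "nosnippet"]):
                self.send_header("X-Robots-Tag", value)
        super().end_headers()
```

To try it locally, pass `RobotsTagHandler` to `http.server.HTTPServer` and request a PDF from the served directory. On a production server you would normally set the header in the server configuration instead.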