Loren reports on Colin Cochrane's finding: "Over the weekend, Yahoo's Delicious (del.icio.us) social bookmarking property has been blocking spiders and bots from non-Yahoo search engines from crawling the site and identifying new web pages, sites and bookmarks." Colin adds that "This isn't a simple robots.txt exclusion, but rather a 404 response that is now being served based on the requesting User-Agent."
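The behavior Colin describes, serving a 404 based on who is asking, can be sketched as a simple server-side check on the User-Agent header. This is an illustrative guess at the logic, not del.icio.us' actual code; the bot list mirrors the crawlers named in their robots.txt:

```python
# Hypothetical sketch of User-Agent cloaking: known search-engine
# crawlers get a 404, while ordinary visitors get the real page.
BLOCKED_AGENTS = ("googlebot", "slurp", "teoma", "msnbot")

def status_for(user_agent: str) -> int:
    """Return the HTTP status such a server might send for this User-Agent."""
    ua = user_agent.lower()
    if any(bot in ua for bot in BLOCKED_AGENTS):
        return 404  # pretend the page does not exist
    return 200      # everyone else gets the normal response

print(status_for("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # 404
print(status_for("Mozilla/4.0 (compatible; MSIE 7.0)"))       # 200
```

The key point is that this differs from a robots.txt exclusion: a polite crawler obeys robots.txt voluntarily, whereas a 404 actively hides the page from anyone presenting a crawler's User-Agent.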
I took a look at del.icio.us' robots.txt and found that it contained Disallow rules targeting Googlebot, Slurp, Teoma, and msnbot.
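A robots.txt that singles out those four crawlers would look something like this (the paths here are illustrative, not del.icio.us' actual directives):

```
User-agent: Googlebot
Disallow: /

User-agent: Slurp
Disallow: /

User-agent: Teoma
Disallow: /

User-agent: msnbot
Disallow: /
```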
Since robots.txt was blocking these search engine spiders, I tried accessing del.icio.us with my User-Agent switcher set to each of the disallowed User-Agents, and received the same 404 response for every one.
Colin also found that Delicious pages listed in Google are missing their cached copies, titles, descriptions and other information.
Tags: Yahoo!, Search Engine, Spider, Crawling, Search Bots, Delicious, Bookmark, Google, Ask.com, MSN, Slurp, Teoma