Loren reports that Colin Cochrane found the following: “Over the weekend, Yahoo’s Delicious (del.icio.us) social bookmarking property has been blocking spiders and bots from non-Yahoo search engines from crawling the site and identifying new web pages, sites and bookmarks.” The report adds that ‘This isn’t a simple robots.txt exclusion, but rather a 404 response that is now being served based on the requesting User-Agent.’
I took a look at del.icio.us’ robots.txt and found that it was disallowing Googlebot, Slurp, Teoma, and msnbot for the following:
Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
Seeing that the robots.txt was blocking these search engine spiders, I tried accessing del.icio.us with my User-Agent switcher set to each of the disallowed User-Agents and received the same 404 response for each one.
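To make the difference between a robots.txt exclusion and a User-Agent-based 404 concrete, here is a minimal Python sketch. It rebuilds a robots.txt from the Disallow lines quoted above and asks the standard library's parser what each crawler may fetch. The way the User-agent lines are grouped, the `ExampleBot` name, and the helper names `allowed` and `fetch_status` are assumptions for illustration, not a copy of the real file or of Colin's method.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen
from urllib.robotparser import RobotFileParser

# Reconstructed from the Disallow lines quoted in the post. Grouping all four
# crawlers under one rule set is an assumption about the file's layout.
ROBOTS_TXT = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: Teoma
User-agent: msnbot
Disallow: /inbox
Disallow: /subscriptions
Disallow: /network
Disallow: /search
Disallow: /post
Disallow: /login
Disallow: /rss
"""


def allowed(user_agent, path, robots_txt=ROBOTS_TXT):
    """Return True if robots.txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


def fetch_status(url, user_agent):
    """Request url with a spoofed User-Agent header and return the HTTP status.

    This mimics a User-Agent switcher. It is defined but not called here,
    because it needs network access and the site's behaviour may have changed.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    try:
        with urlopen(request) as response:
            return response.status
    except HTTPError as error:
        return error.code  # e.g. 404 when the server rejects the crawler


if __name__ == "__main__":
    for agent in ("Googlebot", "Slurp", "Teoma", "msnbot", "ExampleBot"):
        print(agent, "/search:", allowed(agent, "/search"), "/:", allowed(agent, "/"))
```

Under these assumptions the listed crawlers are only barred from the seven paths, and the home page stays fetchable for all of them, so rules like these would not on their own explain a 404 across the whole site. That is consistent with the claim that the response is chosen on the server according to the requesting User-Agent.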
Colin also found that Delicious pages listed in Google are lacking a cache, title, description and other information.

About the Author: DG