DataSift Historics Unlocks a Two-Year Archive of Data from the Twitter Firehose

On February 28, DataSift launched a new online service called "Historics," which offers filtering and search capabilities and lets enterprises and entrepreneurs mine Twitter's archive of the past two years of global tweets for pay-as-you-go analysis.

The DataSift website lists a number of bundles, ranging from $1,000 per month for a 'lite' offering to $15,000 per month for companies wanting access to more data.

The cost of Historics will depend on how far back in the tweet archive a business wants to go.

DataSift Historics Pricing Chart

Businesses are expected to use Historics to see how their brands are perceived on social networks, said DataSift chief marketing officer Tim Barker. He said the firm had been working on the offering for two years in partnership with Twitter.

Information Age reports that the Historics service "sits on a Hadoop cluster with over half a petabyte's worth of storage." DataSift adds context such as sentiment measures, link content, Klout ratings, gender, and location, and provides an interface as well as an API to help customers filter precisely the information they need out of Twitter's 250 million daily tweets. "You need to build what DataSift has built to consume that," Barker stated. "[Customers] typically want a drink or a gallon of the firehose, not the whole piece."
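To make the idea of filtering enriched tweets concrete, here is a minimal Python sketch. It does not use DataSift's API or query language; the `EnrichedTweet` record, its field names, and the `filter_tweets` and `average_sentiment` helpers are all hypothetical, chosen only to mirror the kinds of context listed above (sentiment, Klout rating, gender, location).

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class EnrichedTweet:
    # Hypothetical schema: these fields mirror the kinds of context the
    # article mentions, not DataSift's actual data format.
    text: str
    sentiment: int            # below 0 is negative, above 0 is positive
    klout: int                # influence score
    gender: Optional[str]
    location: Optional[str]   # e.g. a country code


def filter_tweets(
    tweets: Iterable[EnrichedTweet],
    keyword: str,
    min_klout: int = 0,
    location: Optional[str] = None,
) -> Iterator[EnrichedTweet]:
    """Yield tweets that mention `keyword` and pass the optional filters."""
    needle = keyword.lower()
    for tweet in tweets:
        if needle not in tweet.text.lower():
            continue
        if tweet.klout < min_klout:
            continue
        if location is not None and tweet.location != location:
            continue
        yield tweet


def average_sentiment(tweets: List[EnrichedTweet]) -> Optional[float]:
    """Return the mean sentiment score, or None for an empty list."""
    if not tweets:
        return None
    return sum(t.sentiment for t in tweets) / len(tweets)


# Made-up sample data standing in for a slice of the archive.
archive = [
    EnrichedTweet("Loving the new Acme phone", 4, 55, "female", "GB"),
    EnrichedTweet("Acme phone battery is awful", -3, 62, "male", "US"),
    EnrichedTweet("Lunch was great today", 2, 70, None, "GB"),
    EnrichedTweet("acme launch event was fun", 2, 20, None, "GB"),
]

matches = list(filter_tweets(archive, "acme", min_klout=50))
print(len(matches), average_sentiment(matches))
```

A brand team could run a filter of this shape over tweets from the weeks around a past product launch and compare the average sentiment before and after. That is the kind of "drink" from the firehose Barker describes, as opposed to consuming the whole stream.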

"It's not only the two years of 140 character tweets we will be offering in the archive, but we will be [including] information on users' location, or with information on the links that that they posted in tweets, for example," Barker said. "Companies have analyzed data off Twitter before but they don't get details of locations and sentiment. In any case this is the first time a two-year archive of tweets has been made readily available as a service for businesses."

He said DataSift already had 1,000 companies on the waiting list for the Historics platform, including 200 of the Fortune 500. "We have been absolutely inundated by interested companies," he added.

Most of DataSift's customers will use Historics for relatively mundane purposes, such as measuring responses to past product launches and finding out which tactics worked best and worst. The most valuable information coming out of Twitter is what is being tweeted today, but access to a two-year archive lets companies and entrepreneurs find patterns from the past that help them make more sense of today's tweet stream.

Barker said he did not expect there to be privacy concerns about the platform.

"Every social network has to be diligent about privacy of course but Twitter was created to be open and to be valuable to research and development," he said. "Also, if a user deletes a Tweet, it gets deleted from our service and will not be passed on to businesses."

DataSift is a spin-off from TweetMeme, a Twitter search engine that was set up "when there were literally 20 people on Twitter," says founder and CTO Nick Halstead. Thanks to that early relationship, DataSift is now one of only two companies licensed to access the Twitter firehose (the complete stream of all Tweets).

DataSift Historics will be broadly available in April.