The Content Classifier in the File Classification Infrastructure extracts text from files using IFilters, the same mechanism the Search Indexer relies on. Here is a list of file types that have a corresponding IFilter installed on a clean Windows Server 2008 R2 installation with no other software on it. Other free and commercial IFilters also exist. You can start at http://ifilter.org/ to find more IFilters.
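To check whether a given extension has an IFilter registered on your own server, you can follow the registry entries yourself. The sketch below is a simplified illustration, not how the Content Classifier or the Search Indexer actually resolves filters: it assumes the common layout where the extension key under HKEY_CLASSES_ROOT has a PersistentHandler subkey, and that handler's CLSID key has a PersistentAddinsRegistered entry for the IFilter interface ID. Other lookup paths (for example through the file type's document class) are not covered. The function names `find_ifilter_clsid` and `read_hkcr_default` are made up for this example.

```python
import sys

# Interface ID of IFilter, used as a subkey name under PersistentAddinsRegistered.
IID_IFILTER = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def find_ifilter_clsid(extension, read_default):
    """Return the CLSID of the IFilter registered for `extension`, or None.

    `read_default(subkey)` is a caller-supplied function (hypothetical helper)
    that returns the default value of HKEY_CLASSES_ROOT\\<subkey>, or None if
    the key or value does not exist. Only the direct
    extension -> PersistentHandler -> IFilter path is followed (an assumption).
    """
    handler = read_default(extension + "\\PersistentHandler")
    if not handler:
        return None
    subkey = "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IID_IFILTER
    return read_default(subkey) or None


def read_hkcr_default(subkey):
    """Read the default value of a key under HKEY_CLASSES_ROOT (Windows only)."""
    import winreg  # imported here so the module still loads on other systems

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _ = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        # Key or default value is missing.
        return None


if __name__ == "__main__" and sys.platform == "win32":
    for ext in (".txt", ".docx", ".pdf"):
        print(ext, find_ifilter_clsid(ext, read_hkcr_default))
```

An extension that prints None here has no filter on this direct path, which is usually the sign that a separate IFilter needs to be installed for that file type.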
More info: Storage Team