Adobe helps Google and Yahoo search engines to crawl Flash files

Adobe announced that Google and Yahoo are adding search capabilities that will let users search the content of files encoded in Adobe’s Flash file format, SWF. Adobe provided some of its Flash technology to Google and Yahoo in order to improve the indexing of SWF files, especially Flex applications. “Although search engines already index static text and links within SWF files, RIAs and dynamic web content have been generally difficult to fully expose to search engines because of their changing states.”

Until now, the search engine giants have largely ignored the dynamic content inside SWF files, but Adobe has worked with both companies to make sure that their search technology can look inside existing and future SWF content, including text, hyperlinks, audio and video content. To make this possible, Adobe decided to share Flash Player technology that allows search engines to walk through a SWF file and simulate user interactions.

Google's Webmaster Blog explains:

We've developed an algorithm that explores Flash files in the same way that a person would, by clicking buttons, entering input, and so on. Our algorithm remembers all of the text that it encounters along the way, and that content is then available to be indexed. We can't tell you all of the proprietary details, but we can tell you that the algorithm's effectiveness was improved by utilizing Adobe's new Searchable SWF library.
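To make the idea of "exploring a file the way a person would" more concrete, here is a minimal Python sketch. It is an illustration only, not Google's algorithm (whose details are proprietary) and not the Searchable SWF library. The `APP` dictionary, its state names and the `collect_text` function are all hypothetical: the sketch assumes an interactive file can be modeled as a set of states, each with some visible text and some actions (such as button clicks) that lead to other states.

```python
from collections import deque

# Hypothetical model of an interactive file: every state exposes visible
# text plus actions (e.g. button clicks) that lead to other states.
APP = {
    "home": {
        "text": ["Welcome"],
        "actions": {"click_products": "products", "click_about": "about"},
    },
    "products": {
        "text": ["Widgets", "Gadgets"],
        "actions": {"click_home": "home"},
    },
    "about": {
        "text": ["Contact us"],
        "actions": {"click_home": "home"},
    },
}


def collect_text(app, start):
    """Visit every state reachable from `start` (breadth-first) and
    return all text encountered, in order of first appearance."""
    seen = {start}          # states already queued, to avoid loops
    queue = deque([start])
    found = []
    while queue:
        state = queue.popleft()
        for text in app[state]["text"]:
            if text not in found:
                found.append(text)
        # "Simulate" each available interaction by following it.
        for target in app[state]["actions"].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return found


print(collect_text(APP, "home"))
# ['Welcome', 'Widgets', 'Gadgets', 'Contact us']
```

The `seen` set is what keeps the walk from looping forever when states link back to each other, which is one reason applications with many changing states are harder to crawl than static pages.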

There are three main limitations at present, and we are already working on resolving them:

1. Googlebot does not execute some types of JavaScript. So if your web page loads a Flash file via JavaScript, Google may not be aware of that Flash file, in which case it will not be indexed.
2. We currently do not attach content from external resources that are loaded by your Flash files. If your Flash file loads an HTML file, an XML file, another SWF file, etc., Google will separately index that resource, but it will not yet be considered to be part of the content in your Flash file.
3. While we are able to index Flash in almost all of the languages found on the web, currently there are difficulties with Flash content written in bidirectional languages. Until this is fixed, we will be unable to index Hebrew language or Arabic language content from Flash files.
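The first limitation can be illustrated with a small Python sketch. It is a rough approximation under stated assumptions, not a description of how Googlebot works: the `SwfFinder` class and `find_static_swf` function are hypothetical names, and the check simply looks for SWF references in `object`, `embed` and `param` tags of the static HTML, without executing any JavaScript. A Flash file that a script writes into the page is therefore invisible to it.

```python
from html.parser import HTMLParser


class SwfFinder(HTMLParser):
    """Collects SWF URLs referenced directly in static HTML markup."""

    def __init__(self):
        super().__init__()
        self.swf_urls = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "embed":
            url = attr_map.get("src")
        elif tag == "object":
            url = attr_map.get("data")
        elif tag == "param" and attr_map.get("name") == "movie":
            url = attr_map.get("value")
        else:
            url = None
        # Simplification: only URLs ending in ".swf" are recognized.
        if url and url.lower().endswith(".swf"):
            self.swf_urls.append(url)


def find_static_swf(html):
    """Return SWF URLs visible without running any script."""
    finder = SwfFinder()
    finder.feed(html)
    finder.close()
    return finder.swf_urls


# Flash file referenced directly in the markup.
static_page = '<object data="intro.swf" type="application/x-shockwave-flash"></object>'

# Same file, but written into the page by JavaScript at load time.
scripted_page = "<script>document.write('<embed src=\"intro.swf\">');</script>"

print(find_static_swf(static_page))    # ['intro.swf']
print(find_static_swf(scripted_page))  # []
```

The second call finds nothing because the parser treats the contents of the `script` element as opaque text, which is roughly the situation of a crawler that does not execute the script.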