PCAST Recommends Open Data for Federal Agencies


The President's Council of Advisors on Science and Technology (PCAST) has been studying the challenges of conserving the nation's ecosystems. Its report, titled "Sustaining Environmental Capital: Protecting Society and the Economy," was presented to President Obama on July 18, 2011, according to Alon Halevy, Senior Staff Research Scientist at Google Research.

The full report is now available to the public and is embedded below (first embed).

The press release announcing the report (see second embed below) summarizes its recommendations:

The Federal Government should launch a series of efforts to assess thoroughly the condition of U.S. ecosystems and the social and economic value of the services those ecosystems provide, according to a new report by the President's Council of Advisors on Science and Technology (PCAST), an independent council of the Nation's leading scientists and engineers. The report also recommends that the Nation apply modern informatics technologies to the vast stores of biodiversity data already collected by various Federal agencies in order to increase the usefulness of those data for decision- and policy-making.

One of the key challenges in assessing the condition of ecosystems is that much of the data pertaining to these systems is locked up in individual databases. Even though this data is often collected using government funds, it is not always available to the public, and in other cases it is available but not in a usable format. This is a classic example of a data integration problem, one that occurs in many other domains.

"The report calls for creating an ecosystem, EcoINFORMA, around data. The crucial piece of this ecosystem is to make the relevant data publicly available in a timely manner and, most importantly, in a machine-readable form. Publishing data embedded in a PDF file is a classic example of what does not count as being machine-readable. For example, if you're publishing a tabular data set, then a computer program should be able to directly access the metadata (e.g., column names, date collected) and the data rows without having to heuristically extract it from surrounding text," Halevy stated.
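To make the idea of machine-readable tabular data concrete, here is a minimal Python sketch. It is not drawn from the PCAST report or from EcoINFORMA: the JSON metadata layout, the column names, the sample values, and the `load_dataset` helper are all hypothetical, chosen only to show a program reading column names, a collection date, and data rows directly, with no text scraping involved.

```python
import csv
import io
import json

# Hypothetical metadata published alongside the data (layout is an assumption).
METADATA_JSON = """
{
  "title": "Example species counts",
  "date_collected": "2011-01-01",
  "columns": [
    {"name": "site_id", "type": "string"},
    {"name": "species", "type": "string"},
    {"name": "count", "type": "integer"}
  ]
}
"""

# Hypothetical data rows; the values are made up for illustration.
DATA_CSV = """site_id,species,count
A1,mallard,12
A1,heron,3
B2,mallard,7
"""

# Map the declared column types to Python conversion functions.
CASTS = {"string": str, "integer": int}


def load_dataset(metadata_text, csv_text):
    """Return (metadata, rows), with each row typed according to the metadata."""
    metadata = json.loads(metadata_text)
    reader = csv.DictReader(io.StringIO(csv_text))

    # The header row must match the columns declared in the metadata.
    declared = [column["name"] for column in metadata["columns"]]
    if reader.fieldnames != declared:
        raise ValueError(f"header {reader.fieldnames} does not match {declared}")

    casts = {column["name"]: CASTS[column["type"]] for column in metadata["columns"]}
    rows = [
        {name: casts[name](value) for name, value in row.items()}
        for row in reader
    ]
    return metadata, rows


if __name__ == "__main__":
    metadata, rows = load_dataset(METADATA_JSON, DATA_CSV)
    print("Collected on:", metadata["date_collected"])
    for row in rows:
        print(row)
```

The same information locked inside a PDF would require heuristics to locate the table, guess the column boundaries, and infer the types, which is the kind of extraction Halevy describes as not counting as machine-readable.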

"Once the data is published, it can be discovered by search engines. Data from multiple sources can be combined to provide additional insight, and the data can be visualized and analyzed by sophisticated tools. The main point is that innovation should be pursued by many parties (academics, commercial, government), each applying their own expertise and passions," added Halevy.
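As a rough illustration of combining data from multiple sources, the Python sketch below joins two small record sets on a shared key and then aggregates the result. The two "sources," their field names, and the values are hypothetical and made up for this example; `join_on` and `total_by` are illustrative helpers, not part of any tool named in the report.

```python
# Hypothetical records from one source: species counts per monitoring site.
species_counts = [
    {"site_id": "A1", "species": "mallard", "count": 12},
    {"site_id": "A1", "species": "heron", "count": 3},
    {"site_id": "B2", "species": "mallard", "count": 7},
]

# Hypothetical records from a second source: habitat type per site.
site_info = [
    {"site_id": "A1", "habitat": "wetland"},
    {"site_id": "B2", "habitat": "river"},
]


def join_on(left, right, key):
    """Inner-join two lists of dicts on a shared key field."""
    index = {row[key]: row for row in right}
    return [
        {**left_row, **index[left_row[key]]}
        for left_row in left
        if left_row[key] in index
    ]


def total_by(rows, group_field, value_field):
    """Sum value_field over rows, grouped by group_field."""
    totals = {}
    for row in rows:
        group = row[group_field]
        totals[group] = totals.get(group, 0) + row[value_field]
    return totals


if __name__ == "__main__":
    combined = join_on(species_counts, site_info, "site_id")
    print(total_by(combined, "habitat", "count"))
```

Neither source alone can answer "how many animals were counted per habitat type," but the join can, and it only works because both sources expose a consistent, machine-readable site identifier.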

Here's the full press release:

And, here's the full PCAST report:

[Source: Public Sector & Elections Lab]