Application Development Trends: Using Government Data
Recent initiatives have set out to make government data more readily available to information consumers -- including app developers.
The White House kicked off the federal Open Government Initiative in late 2009. The program calls for agencies to “expand access to information by making it available online in open formats.” Individual efforts under the open government umbrella include Data.gov, a clearinghouse of sorts that at press time offered 378,000-plus raw and geospatial data sets, along with 103 government mobile apps.
Open data received a reinforcing push earlier this year with the publication of the Federal Digital Strategy. The policy calls for agencies to design new IT systems for openness and “expose high-value data” as APIs. For established systems, the policy requires agencies to identify at least two significant customer-facing systems with high-value data and make that data available through web APIs.
Developers are currently making use of government data for apps in sectors ranging from healthcare to education. App makers think government open data policies are a step in the right direction, but note that there’s plenty of room for improvement.
More and Better Data for Application Development
Sheetal Shah, health solutions architect at Avanade, a business technology solutions and managed services provider, says the Open Data Initiative has begun to bear fruit.
Shah has been leveraging government health and population data. “The volume [of data] has significantly gone up and the quality is starting to get much better,” he says.
Data is becoming easier to obtain as well. Shah notes that getting at some government data previously would have required a Freedom of Information Act request.
An Avanade team recently participated in the Medicare Claims Data Developer Challenge, building a tool that uses data from the Department of Health and Human Services (HHS) and the U.S. Census Bureau. Shah says that working with government data on the developer challenge was relatively easy.
But the team encountered some challenges -- the volume of data, for example. The health data acquired through HHS’ HealthData.gov repository amounted to hundreds of thousands of rows of data. “Getting the data and staging it in our database and doing manipulation and pivoting off of the Census data required some work,” Shah says.
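The kind of staging and pivoting Shah describes might look roughly like the sketch below. It is an illustration only, not the Avanade team's tool: the column names (`county_fips`, `claims`, `population`) and the sample rows are assumptions invented for the example, and real HHS and Census files are laid out differently.

```python
from collections import defaultdict


def claims_per_1000(claim_rows, census_rows):
    """Sum claim counts per county, then normalize by Census population."""
    totals = defaultdict(int)
    for row in claim_rows:
        totals[row["county_fips"]] += int(row["claims"])

    # Index the Census rows by the shared key so each lookup is cheap.
    population = {r["county_fips"]: int(r["population"]) for r in census_rows}

    rates = {}
    for fips, total in totals.items():
        pop = population.get(fips)
        if pop:  # skip counties with no (or zero) population on record
            rates[fips] = round(1000 * total / pop, 2)
    return rates


# Hypothetical sample rows standing in for the downloaded files.
claim_rows = [
    {"county_fips": "01001", "claims": "120"},
    {"county_fips": "01001", "claims": "80"},
    {"county_fips": "01003", "claims": "300"},
]
census_rows = [
    {"county_fips": "01001", "population": "50000"},
    {"county_fips": "01003", "population": "200000"},
]

rates = claims_per_1000(claim_rows, census_rows)
```

With hundreds of thousands of rows, the same join would more likely run inside the database, but the shape of the work is the same: aggregate on a shared key, then normalize against the Census figures.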
CollegeCalc.org, meanwhile, taps Department of Education data to create college cost estimation tools. A spokesman representing CollegeCalc’s development team says the Department of Education does a good job publishing data on its IPEDS Data Center in .csv and other standard formats.
“It’s a very flexible site which allows very specific custom tables to be built and exported,” he says. “In terms of application building, a software developer with an average skill set would have no problem importing this data to a database and developing a custom application around it.” The spokesman says the CollegeCalc website was built in a matter of days with Department of Education data providing the foundation.
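As a rough sketch of the import step the spokesman describes, the Python below loads a .csv extract into a SQLite table and runs one query against it. The columns, table name and sample rows are hypothetical; an actual IPEDS export would have its own field names, and this is not CollegeCalc's code.

```python
import csv
import io
import sqlite3

# Hypothetical extract; a real export would have different columns.
SAMPLE_CSV = """institution,state,tuition
Example State University,OH,9800
Sample College,OH,31200
Placeholder Tech,TX,12400
"""


def load_csv(conn, csv_file):
    """Create the table if needed and insert every row from the CSV file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS costs "
        "(institution TEXT, state TEXT, tuition INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    rows = [(r["institution"], r["state"], int(r["tuition"])) for r in reader]
    conn.executemany("INSERT INTO costs VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the example
loaded = load_csv(conn, io.StringIO(SAMPLE_CSV))

# Once the data is in a table, the application can query it like any other.
avg_by_state = dict(
    conn.execute("SELECT state, AVG(tuition) FROM costs GROUP BY state")
)
```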
Open Data Needs Improvement
Michael Shoag, director of government services at Forum One Communications, a digital communications firm, says the government now makes data available in a variety of formats including PDF files, online tables, images, dynamic maps and charts, and raw data files. But the key, he notes, is to make the raw data available.
“Showing key results as charts, graphs or maps is a great way to tell stories with the data -- to help people understand what it means,” Shoag explains. “If that is not partnered with the ability to download the raw data, however, it is much less useful. Researchers, corporations and academics often need to mine the raw data to find answers, verify theories and tell results.”
In some cases, Shoag notes, people must use an online tool to generate dozens of reports, and aggregate the data into one file to get the data they need to begin their research.
Ideally, government agencies would make their raw data available online, Shoag says. APIs are useful for data in flux. “If the data changes frequently, then agencies should include APIs to enable developers to pull in the newest data dynamically,” he says.
The best-case scenario would also have government agencies providing consistency of taxonomies across data sets. But Shoag says obtaining consensus on taxonomies can prove “extremely time consuming and should not be a barrier to making data available.”
Agencies are beginning to make some progress toward the next level of data availability. In July, for example, the Census Bureau released its first-ever public API, which lets developers design web and mobile apps using demographic and socio-economic data. In August, Census reported that 860 developers received keys to access the agency’s data sets.
But some developers have reservations regarding government-provided APIs. “I’m hesitant to rely on live queries from my application to a government API as they offer no SLA guarantee; hence it’s more reliable to download and store data within the context of a custom application,” the CollegeCalc spokesman says.
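One way to picture the download-and-store approach is the sketch below, in which the application only ever reads a local snapshot and a separate refresh job updates that snapshot when the remote source cooperates. The function names, the JSON snapshot format and the stand-in data sources are all assumptions made for illustration; the article does not describe how CollegeCalc implements this.

```python
import json
import tempfile
from pathlib import Path


def refresh_snapshot(fetch, snapshot_path):
    """Run out of band (say, nightly). Returns True if the snapshot was updated."""
    try:
        data = fetch()
    except Exception:
        # Broad catch for the sketch: any failure leaves the old snapshot in place.
        return False
    Path(snapshot_path).write_text(json.dumps(data))
    return True


def read_snapshot(snapshot_path):
    """What the application calls at request time; it never touches the network."""
    return json.loads(Path(snapshot_path).read_text())


# Stand-ins for a remote data source (hypothetical, no real API is called).
def working_source():
    return [{"school": "Example State University", "tuition": 9800}]


def failing_source():
    raise ConnectionError("service unavailable")


snapshot_file = Path(tempfile.mkdtemp()) / "costs.json"
first = refresh_snapshot(working_source, snapshot_file)   # snapshot written
second = refresh_snapshot(failing_source, snapshot_file)  # outage: old copy kept
data = read_snapshot(snapshot_file)
```

The trade-off is the one the developers in this article raise: the application stays up regardless of the remote service, but its data is only as fresh as the last successful refresh.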
In addition, he says he encountered a few data sets on Data.gov available via web services APIs, but found the data delivered through those services to be out of date. It would be tremendously beneficial if the government were to offer high-availability web service APIs with current data, he says. This approach would allow more rapid application development without the need to replicate data locally.
“As it stands now, the timeliness and availability of APIs make their use a non-option for application development,” he says.