Most Cities, States and Federal Agencies are working on some type of Open Data initiative. The most common is an “Open Data Portal” that makes it easy to grab and use datasets.
Some cities also use Open Data to publish performance metrics, as the Seattle Police Department and Louisville’s LouieStat do.
Civic Leaders working on these initiatives cite promoting transparency in Government, improving performance and providing data for innovation as reasons why Open Data is so important.
As a Product Manager, it’s helpful to be familiar with what’s out there and how you can play around with these datasets to better understand how your product may benefit.
Before you dive into querying APIs, check out a few of these projects to see the end result of building something with Open Data.
500 Cities Project
Ok, now let’s dig into some datasets you can play with.
Socrata’s Open Data Network
Socrata hosts over one hundred different data catalogs for governments, non-profits, and NGOs around the world. Check out their Open Data Network, where you can search for datasets.
For example, here’s a page about San Bernardino County Employment. Click “View API” to end up on a page giving you data and an API call you can paste into your browser or Postman.
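If you would rather script the call than paste it into a browser, here is a minimal sketch of building a Socrata (SODA) query URL in Python. The domain `data.example.gov` and the dataset ID `abcd-1234` are placeholders, not real values; swap in whatever the “View API” page shows for your dataset. The `$limit` and `$where` parameters are standard SODA query options, but the column name in the filter is made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def soda_url(domain, dataset_id, limit=5, where=None):
    """Build a SODA query URL for a dataset hosted on a Socrata portal."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    # Keep "$" unescaped so the URL is easy to read and paste elsewhere.
    query = urlencode(params, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


def fetch_rows(url):
    """Download the rows as a list of dicts (needs network access)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


# Placeholder domain and dataset ID: replace with real ones before fetching.
url = soda_url("data.example.gov", "abcd-1234", limit=3)
print(url)

# Hypothetical column name, shown only to illustrate a filter.
filtered = soda_url("data.example.gov", "abcd-1234", where="year > 2015")
print(filtered)
```

Once the placeholders are replaced, `fetch_rows(url)` should return the same JSON you would see in the browser or Postman.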
Namara has organized a bunch of public datasets into a beautiful UI. Create a free account, sign in, create a new project, then click Open Data in the left column to search and add datasets to your project. You can view and manipulate the table data, or call the data using their API.
In your project settings, you can generate an API key. Then, in each dataset you can click “API Info” and get the data_set_id and version_id.
You can use ProPublica to request data about Congress such as a list of Recent Bills and Member Voting records.
You’ll need to request an API key by emailing email@example.com then pass that in the X-API-KEY header.
For example, to query Rep. Jared Polis’s voting record:
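A rough sketch of what that request could look like in Python, using only the standard library. The path follows the Congress API's `members/{member-id}/votes.json` pattern as I understand it, so double-check it against ProPublica's docs. `MEMBER_ID` and `YOUR_API_KEY` are placeholders: the member ID is the representative's ID from the API's member listing, and the key is the one you receive by email.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "https://api.propublica.org/congress/v1"


def member_votes_request(member_id, api_key):
    """Prepare (but do not send) a request for a member's recent votes."""
    url = f"{BASE_URL}/members/{member_id}/votes.json"
    # The API key travels in a request header, not in the URL.
    return Request(url, headers={"X-API-Key": api_key})


def fetch_json(request):
    """Send the request and decode the JSON body (needs network access)."""
    with urlopen(request, timeout=30) as resp:
        return json.load(resp)


# Placeholders: substitute the member's ID and your own API key.
req = member_votes_request("MEMBER_ID", "YOUR_API_KEY")
print(req.full_url)
```

With real values filled in, `fetch_json(req)` would return the voting record as a Python dict that you can explore or dump into a spreadsheet.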
Open Data and the big IaaS Platforms
Another approach is to check out the Public Datasets baked into AWS, Google Cloud Platform and IBM Bluemix.
This is a great example of using Google BigQuery on NYC Public Datasets.
AWS hosts a bunch of Open Data in S3 buckets.
IBM, as part of the NOAA Big Data Project, has built an easy way to download tons of data.
A few hashtags to search around on are #govtech, #opendata, #opengovdata and #opengov. Follow people like @Josh_A_New, @JoshData, @DataInnovation, and the @SunFoundation.
Here are a few links related to Open Data policy and relevant news.
Some history on U.S. Federal Open Data Policy
The DATA Act, passed in 2014, was America’s first open data law. It directs the federal government to transform all spending information into open data.
Conversation on the future of Open Data as Administrations change and the Preserving Government Data Act of 2017
The OPEN Government Data Act “directs all federal agencies to publish their information as machine-readable data, using searchable, open formats and requires every agency to maintain a centralized Enterprise Data Inventory that lists all data sets, and also mandates a centralized inventory for the whole government (data.gov)”.
Open Data 500 US presents interesting survey results showing what kinds of companies use which agencies’ data.