Responsible Uses of Open Data
This week I attended the Responsible Uses of Open Data in Government and the Private Sector conference at NYU. As someone who teaches with open data and lectures on the use of open data in government, I have a keen interest in learning about the latest thinking about the costs and benefits of open data.
The conference seemed tilted towards academics and, in particular, academics with a background in law. As an open data practitioner, this made the event more abstract than I’d hoped, with some predictable confusion over the use of particular terms of art.
The challenge for those of us who work to train local government officials in creating and managing public open data is to translate the concerns and theories of open data into actionable frameworks for policies and procedures.
The problem with having an academic audience is that concepts tend to become simplified almost to the point of being unusable. One important example of this for me was the assumption that the sharing of open data comes from two large entities, “Government” and “Private Sector”.
In reality, there are many levels of government in the US: city, county, state, and federal. Within each is a multiplicity of agencies and offices that all produce data and disseminate it to everyone, not just to the “public” but also to other government agencies. In fact, government employees are among the biggest users of open data.
In my own classes, I’ve had students tell me that it’s easier to get data about their agency from the NYC open data portal than from their colleagues in another office. When it comes to getting access to data from other agencies or levels of government, open data is a far more efficient way of getting essential information for action than going through official channels, particularly for time-sensitive analysis.
The “Private Sector” likewise is a nuanced field, from start-ups using public data to determine buildings that are likely candidates for solar panel installations to applications that use public GTFS feeds to give transit directions (like Google Maps, Embark, Citymapper, KickMap, and a host of other applications). They aren’t all greedy corporations intent on harvesting data for nefarious purposes. In some cases, they are enlightened participants in the public debate and release their data to the public. Others do so under legal or regulatory compulsion.
It’s naive to assume that a business would release one of its most valuable assets, its data, unless there were a compelling reason to do so. Social media companies like Google, Twitter, and Foursquare do so as a way to support adoption and use of their platforms or to monetize their data through usage charges. This isn’t the case for most businesses. Unless there’s a carrot of social benefit or positive marketing to be had, we should expect that governments will need to use the stick of laws and regulations to get the data that matters to stakeholders.
When it comes to the public, there is also a wide spectrum with respect to consumers of open data. We all use open data in some way every day, whether it’s the weather reports that come to us courtesy of data shared by the National Weather Service, transit alerts shared from our local transportation departments, or street maps to guide us to our destinations courtesy of our local planning departments.
Despite near universal use of open data, only a small percentage of public users are able to digest open data in its raw form and make sense of it. The people in this small percentage (which I consider myself part of) are key to making the data available and useful for the larger majority and to unlocking the value of open data, whether it’s analyzing the pattern of parking tickets, the daily experience of taxi drivers in NYC, or the pattern of development in Bushwick.
Importantly, these individuals have something to contribute back to the suppliers of open data, whether it’s corrections that address data quality issues, feedback on the format of the data being released (PDFs are where data goes to die), or other work that helps improve the overall utility of open data. The public has a greater role to play than just being consumers of whatever data governments and private industry decide to pitch over the firewall.
My hope for future sessions is that there will be more practitioners from the various levels of government (local, state, and federal), as well as those engaged in the actual use of open data in their work, like Enigma.io and CityMart. We need more collaboration between those charged with releasing data to the public, those who work with the data itself, and those familiar with the larger legal, ethical, and technical framework in which these activities take place. Tilting too far toward any one group runs the risk of silencing important voices that must be heard if we are to fashion clear, practical principles in an area that is proving critical to the new and emerging paradigms of governing in the 21st century.