Plenario: Changing How We Use Open Data
Originally published on 28th July 2015
The majority of open data portals online today are confined to information from a single city government or political jurisdiction. For researchers, policymakers, or other data portal users, this can create problems: we know that urban landscapes are complex, interconnected places that do not exist within the bounds of a single government entity. What if, say, I want to see a map of recent traffic accidents in Manhattan, and understand if weather conditions have an effect? Or what if I want to see if there’s a connection in Chicago between sanitation complaints and environmental inspections?
Finding the answers to such questions is not easy: it would require looking at datasets from the City of New York and NOAA, and from the City of Chicago and Cook County, respectively. Each question requires an awkward compare-and-contrast between data from separate government portals. What’s a researcher to do?
The solution is relatively straightforward: work with “One Database, One Map.” So goes the tagline on Plenario, the University of Chicago-designed, open-access online data hub that makes the way we view, understand, and use open data drastically more convenient.
A “hub of hubs”
Plenario breaks free from borders to provide data from datasets and portals around the country, all on the same continuum of time and space (or, simply put, one map). This means that with one query, users can access, combine, download, and visualize disparate sets of data all in the same place. Plenario currently has data from the portals of Chicago, New York, Los Angeles, San Francisco, Boston, Austin, Washington, D.C., Illinois, and New York State, among others. Since its design accommodates data from any open portal, international data can also be imported. Glasgow’s University of Strathclyde, for instance, has installed Plenario to integrate data from multiple UK cities.
Given the platform’s reach, what does Plenario look like from a user’s point of view? For an example, let’s say that I’m looking for the “story” of Chicago’s Uptown neighborhood, for the first six months of 2015. How many business licenses were granted during this timeframe? How does that contrast with the neighborhood’s graffiti complaints? Or, how about that original question—what does a comparison between sanitation complaints and environmental inspections look like?
Step one would be to specify my timeframe, and step two would be to draw a polygon around the neighborhood’s boundaries:
And step three? See what Plenario can provide:
In less than a minute, I know that 143 business licenses were granted, with a considerable spike occurring in early June; I also know that 472 graffiti complaints and 21 sanitation complaints were registered, and only one environmental inspection occurred.
My observations are not meant to draw any conclusions, of course. Instead, they show the remarkable ease with which I can obtain, visualize, and understand a wide range of data for a given place and time.
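The three steps above (pick a timeframe, draw a polygon, count what falls inside) can be illustrated with a minimal Python sketch. This is not Plenario's actual code or API: the function names, the rectangle standing in for a neighborhood boundary, and the sample records are all made up for illustration, and the point-in-polygon check uses a generic ray-casting test.

```python
from datetime import date


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    `polygon` is a list of (lon, lat) vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def count_events(records, start, end, polygon):
    """Count records per dataset that fall inside the timeframe and polygon."""
    counts = {}
    for dataset, when, lon, lat in records:
        if start <= when <= end and point_in_polygon(lon, lat, polygon):
            counts[dataset] = counts.get(dataset, 0) + 1
    return counts


# Hypothetical rectangle standing in for a hand-drawn neighborhood boundary.
AREA = [(-87.68, 41.95), (-87.64, 41.95), (-87.64, 41.98), (-87.68, 41.98)]

# Made-up records: (dataset name, date, longitude, latitude).
RECORDS = [
    ("business_licenses", date(2015, 6, 2), -87.66, 41.96),    # in area, in range
    ("graffiti_complaints", date(2015, 3, 14), -87.65, 41.97),  # in area, in range
    ("graffiti_complaints", date(2015, 8, 1), -87.65, 41.97),   # outside timeframe
    ("graffiti_complaints", date(2015, 2, 1), -87.70, 41.97),   # outside polygon
]

result = count_events(RECORDS, date(2015, 1, 1), date(2015, 6, 30), AREA)
print(result)
```

The point of the sketch is that once every record carries a time and a location, datasets from entirely different sources can be filtered and counted with the same two conditions.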
My brief run-through on the “story” of Chicago’s Uptown neighborhood also reaches towards the bigger picture of what Plenario looks to do: lead to more meaningful, predictive-type data analytics and modeling. For researchers, Plenario’s ability to quickly synthesize vast amounts of data is particularly significant—it allows them to ask questions with as few constraints as possible, changing their way of thinking about and addressing urban issues in the process.
Starting a New Platform
Launched in late 2014 at the Code for America Summit and funded via grants from the National Science Foundation and MacArthur Foundation, the platform was created and is currently managed by the University of Chicago’s Urban Center for Computation and Design (UrbanCCD), as well as researchers from the school’s Computation Institute and Harris School of Public Policy. Its design comes from Chicago-based software developers DataMade.
UrbanCCD, housed within the University of Chicago’s Computation Institute, was established in 2012 as a means to continue the Computation Institute’s marriage between the University and Argonne National Laboratory, specifically focusing on urban sciences. UrbanCCD’s mission is to design computational research tools and resources to foster interdisciplinary and cross-sector collaboration for better urban understanding, planning and design. The center’s Director, Charlie Catlett, has been a key mind behind Plenario’s development.
Plenario can also trace its roots back to the WindyGrid application, its conceptual precursor currently in use at the City of Chicago. Former Chicago CIO Brett Goldstein, a Plenario project leader and fellow at the Harris School, designed WindyGrid during his time with the city as an open-source situational awareness tool. WindyGrid was first launched during the 2012 NATO summit in Chicago, allowing enhanced coordination for the city’s emergency services.
Goldstein’s push to bring together previously disparate, hard-to-combine data for the City of Chicago is evident in Plenario, but on a grander scale. Plenario uses Amazon Web Services, a cloud-computing platform, as its scalable back-end infrastructure, which means that storage and computing power for more datasets are not a concern. Its open-source, cloud-based implementation also makes Plenario readily replicable, which has enabled the team to prototype new capabilities driven by use cases. One such use case is the City of San Francisco’s Sustainable Systems Framework, which uses Plenario to track and build sustainable infrastructure projects and measure progress in jurisdictions across the city.
A Platform with a Community
While there’s plenty of room to expand, a significant number of datasets have been added to Plenario since its launch. Perhaps most interesting, though, is who has been contributing to the platform’s growth. When Plenario began, additional data could only be added to Plenario by someone with administrative access—which was limited to UrbanCCD’s development team.
Since the launch, UrbanCCD has opened Plenario up to anyone who wishes to contribute data. “Anyone” doesn’t mean only those in major cities, either: users are free to upload any publicly available URL for a data source that is either a Socrata or CKAN dataset or in CSV format, so long as the data includes the fundamental dimensions of time and location. Jonathan Giuffrida, one of the minds at UChicago working on Plenario, describes the platform’s goal as a “one-stop shop for all things open data, available for all users.”
This ability has implications for a wide range of users. For example, if an entomologist is conducting a longitudinal study on bee population trends in urban areas and wants to plot her collected data on a map, she does not have to build an application from scratch. Instead, she can upload her data to Plenario, saving herself time and money and exposing her research to a greater number of people. She also gains access to the wealth of data already in Plenario, such as weather data, that could enhance her own work.
Moreover, if that entomologist chooses to, she can join a community of users who wish to use Plenario regularly and stay informed about updates, changes, and new ways the platform can be used.
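The contribution requirement described above, that a dataset must include the fundamental dimensions of time and location, can be illustrated with a small Python sketch. It is only an assumption about how such a check might look, not Plenario's actual validation logic: the accepted column names and the sample bee-count data are hypothetical.

```python
import csv
import io

# Hypothetical column-name conventions; a real portal may accept others.
TIME_COLUMNS = {"date", "datetime", "timestamp", "observed_date"}
LAT_COLUMNS = {"lat", "latitude"}
LON_COLUMNS = {"lon", "lng", "long", "longitude"}


def has_time_and_location(csv_text):
    """Return True if the CSV header has a time column and a lat/lon pair."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    names = {name.strip().lower() for name in header}
    has_time = bool(names & TIME_COLUMNS)
    has_location = bool(names & LAT_COLUMNS) and bool(names & LON_COLUMNS)
    return has_time and has_location


# Made-up sample data for the entomologist example.
BEE_COUNTS = "observed_date,latitude,longitude,bee_count\n2015-05-01,41.96,-87.66,37\n"
NO_LOCATION = "observed_date,bee_count\n2015-05-01,37\n"

print(has_time_and_location(BEE_COUNTS))
print(has_time_and_location(NO_LOCATION))
```

A check like this only inspects the header row. Confirming that the values themselves parse as dates and coordinates would be a natural next step.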
As an open-source platform, Plenario’s communal aspect is one of its most important features. UrbanCCD and its partners are actively looking to start a conversation with the outside world on how Plenario can be most helpful – for city managers, app developers, researchers, or any person interested in exploring open data.
Giuffrida notes that a possibly game-changing addition is on the horizon for Plenario as well: adding shapefile data, such as zip codes, census tracts, or city wards, to enhance the platform’s search criteria. Currently, users’ only option for choosing a geographical domain for a query is to manually draw it with a polygon tool. By adding search features that account for these areas, as the team has prototyped with San Francisco’s Sustainable Systems Framework, Plenario becomes especially useful for users who do research or compile reports within specific jurisdictions.
The platform will also soon allow users to add plugins to Plenario’s core API, so that they can deliver customized versions of Plenario for their own or more specialized uses. These versions can then be shared via Plenario’s page on the code-sharing site GitHub.
Otherwise, Plenario will continue to grow as a “hub of hubs” – and elevate our current data landscape to a new level of operating and understanding.