Open Data and Improving Governance: issues of measurement
I was speaking this afternoon at the Institute for International Economic Policy at George Washington University, at a conference on “Known-Knowns and Unknowns about the Internet: Measuring the Economic, Social, and Governance Impact of the Web”. The input I offered, for a session under the title ‘Has the Internet Helped Citizens and Policymakers Improve Governance? Are the effects measure-able? What new innovations might be helpful?’, was around how we approach measuring the impacts of open data. Below are the notes I prepared for the talk: the actual delivery strayed from these a bit – but the version below is probably a bit clearer and more concise.
You can also find a recording of the whole panel session here, including a fantastic talk from my Web Foundation colleague Bhanupriya Rao on No-tech, low-tech and high-tech for transparency.
Open Data and Improving Governance: issues of measurement
This talk will focus in on Open Data on the Internet, and through that explore one route by which the Internet is involved in changing governance. It will look at three issues: definitions; the role of measurement; and emerging impacts from a recent study of open data – the Open Data Barometer.
Definitions: Open Data
We need to start by defining our terms: what do we mean by open data, and by governance, and, as a result, what kind of measurement makes sense? It is important to have a strong and focussed analytical definition of open data, to avoid it becoming an all-encompassing idea. Definitions are important in developing measurement. The session titles at the Known Knowns and Unknowns conference use the terms ‘web’ and ‘Internet’ interchangeably – yet for many kinds of intervention these are not one and the same.
Similarly, we see a lot of confusion in the open data field between ideas of open data, big data, linked data and so forth – and so drilling down into an analytical concept of open data is an important starting point for both research and practice. It is particularly important to do this with data, as just about anything can be represented as data – and if we’re not careful we end up confusing form and content. For example, consider the host of mobile phone apps that use open transport data and that generate economic impacts for both consumers and producers. If we attribute their creation to the openness of the data alone, rather than at least partially to the fact that it is data about transport (and, moreover, usually data about transport in an urban centre with good public transport systems), then we end up anticipating that the next dataset ‘opened up’ will see similar returns and impacts just by virtue of being open data. But if that next dataset is data on cattle movements from a department of agriculture, we might not see quite so many smart-phone apps emerging.
When we disentangle ‘dataset form’ from dataset function and subject matter we can more intelligently ask about the potential impacts.
So. What is the form of open data? There are three elements we use in our operational definition in the ODDC research network:
1) Proactive Publishing – the idea that governments (or other parties) should put data online without being asked for it;
2) Machine readability – the idea that the data should be possible to process with a computer – not just read on screen – but possible to sort, sift through, filter and generally manipulate without high technical barriers. In practice this means using standard file formats which can be accessed without expensive software, and which maintain the granularity of the data.
3) Permission to re-use – the idea that there should not be legal restrictions to prevent someone re-sharing or re-using the data they have been given access to. Often government data is placed under copyrights or IP protections that prohibit re-use, and so the open data movement has advocated for the use of clear license statements that, at most, require re-users to attribute the source of the data and that place no other restrictions on those that wish to work with a dataset.
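The three elements above can be read as a simple conjunctive test: a dataset only has the form of open data when all three hold. The following is a minimal sketch of that reading in Python. The class, field and function names are hypothetical, chosen for illustration, and are not taken from the ODDC research instruments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical description of one published dataset."""
    name: str
    proactively_published: bool  # put online without anyone having to ask
    machine_readable: bool       # standard format, granular, no costly software
    reuse_permitted: bool        # at most an attribution requirement


def missing_elements(record: DatasetRecord) -> List[str]:
    """List which of the three elements of the definition are not met."""
    checks = [
        ("proactive publishing", record.proactively_published),
        ("machine readability", record.machine_readable),
        ("permission to re-use", record.reuse_permitted),
    ]
    return [label for label, ok in checks if not ok]


def meets_open_data_form(record: DatasetRecord) -> bool:
    """True only when all three elements hold together."""
    return not missing_elements(record)


# Illustrative records (invented for this example, not real survey data).
timetable = DatasetRecord("bus timetables", True, True, True)
budget_pdf = DatasetRecord("budget as scanned PDF", True, False, True)

print(meets_open_data_form(timetable))   # True
print(missing_elements(budget_pdf))      # ['machine readability']
```

Note that this test says nothing about subject matter: a dataset on cattle movements and a dataset on bus timetables can both pass it, which is exactly why form needs to be kept separate from function when asking about impact.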
It is important in our definitions to be aware of different legal and cultural practices around the world, understanding open data as a socio-technical construct. For example, in some countries government data is assumed to be open regardless of the presence or absence of a license statement; in others, state data is copyright by default, and explicit licenses are needed to give re-users the certainty that they have permission to build upon and market products that use the data.
The second important definitional pre-requisite for addressing the questions in focus in this session is a definition of ‘governance’. Wikipedia here demonstrates the ability of a crowdsourced product to provide the best concise definition, describing governance as concerned with “decisions that define expectations, grant power, or verify performance.”
Now – it might be possible to construct a predominantly descriptive or positivist account of how open data improves the verification of government performance against pre-set objectives or rules. But any discussion of how the Internet and open data improve governance with respect to decisions that define expectations and grant power is necessarily normative. Deciding when we have an improvement in the setting of expectations or the granting of power involves taking a stand over what counts as improvement. We may be able to agree at one level about negatives whose removal constitutes improvement, such as extreme corruption; but when it comes to a positive vision of what open data should do to governance, we find quite divergent views.
Let me illustrate this point by setting out three distinct theories of change for how open data might affect how power is assigned, each belonging to a different tradition of political theory.
- Firstly, there is the idea that open data enables citizen oversight of their governments and addresses information asymmetries – enabling citizens as voters to better control elected officials as their agents in power. Here, the electoral mechanisms of governance in an electoral democracy, or indeed, the pressures of public opinion in a constrained autocracy, are left unchanged, but the ability of journalists, pressure groups and citizens to punch through the veil of secrecy around state decisions can drive decisions more in line with citizen interests.
- Secondly, we have the idea of a consumer-democracy – one in which citizens engage in governance through individual consumption choices and selection of public services. This is the theory of change prioritised by David Cameron in his recent speech at the Open Government Partnership – where it is argued that, through open data, citizens can gain more detailed, and personalised, information on public services, and can make more informed choices about which services to select from the ‘marketplace’ of services – thus using market mechanisms to drive better services. Here, ‘governance’ happens through the operation of the market and distributed choices of individuals.
- Lastly, we have the idea of co-production and more collaborative governance in which open data supports groups made up of citizens, civil society organisations and entrepreneurs to work together with each other, and with the state, to improve policy making and practice. Here, governance is improved when it is more inclusive, and when more people participate in determining the outcome of collectively held power.
These are of course not the only theories of change – but they outline the divergent ways in which we might approach the question of what ‘improved governance’ is. Indeed – policy makers and citizens might have very different ideas in any situation of what improved governance looks like: we might hypothesise that more transparent and responsive services are compatible with more efficient and low-cost services, but whether this is true is an empirical question.
The reason for this detour into political theories is not simply to problematise the term ‘governance’, but also to highlight that our approach to measurement also involves taking a stand on normative questions, and to provide a basis for outlining the stand I propose taking.
As Bhanupriya has outlined, many of the kinds of improved governance we want to see involve empowering and enabling groups at the grassroots to engage in policy processes, both acting locally, and speaking out for shared national and global frameworks that better allow them to act locally in the ways that meet their needs. These are not about top-down governance, or enabling policy makers to better control service delivery with a bird’s-eye view.
The site of acting then, if we are to have actionable measurements on open data and governance, is not simply at the level of policy – but is also at the level of grassroots practice. The measurements we make need to allow grassroots groups to both understand ways of engaging with open data as a resource for improving governance locally, and to make appropriate and effective claims on national and international actions to support them with the data they need for better decision making and monitoring of implementation.
ODDC and The Open Data Barometer
Now – having said all this – we might feel that it adds too much complexity to the process of developing measurement frameworks, and that research into the impacts of open data on governance is then necessarily solely a space for action research – with no general measurement possible. But this would not engage fully with the problematisation. Global measurements will be made – and so we should work to make sure that, where they are, they are sensitive to practitioner needs at the grassroots, and are a resource for practice – whilst also enabling cross-cutting and comparative global research that can illuminate macro-level trends and feed into national and local policy and practice.
That framed our exploration with the Open Data Barometer, a study launched by Sir Tim Berners-Lee at the Open Government Partnership meeting in London a few weeks ago that takes a multidimensional look at the readiness of 77 states to secure benefits from open data, the implementation of open data policy via the proxy of dataset publication, and emerging impacts of open data by the proxy of media and academic coverage of it.
It was a pilot study, but one we hope provides strong foundations for future work to understand the governance effects and impacts of opening data. Methodologically it uses an expert survey, combined with a number of secondary indicators – used to create sub-indices and an overall Barometer index number for countries to support overall comparison, and comparison along a number of different dimensions. I want to highlight three key considerations and points of learning from the development of the Barometer:
- We build on learning from qualitative work to look at different aspects of readiness. For the last year the Web Foundation have been running a research network on Exploring the Emerging Impacts of Open Data in Developing Countries – which you can find at www.opendataresearch.org – and in this we’ve been working with research partners across the developing world to look at the use of open data in affecting governance. Through this, the importance of a number of different aspects of government readiness has been emphasised, including the importance of Right to Information (RTI) laws alongside open data laws; this qualitative work has also brought up issues around the importance of civil society intermediaries. Working with these qualitative insights we sought to find indicators and expert survey questions that would help us understand appropriate aspects of the context around open data in different countries.
- We distinguish different kinds of data. Prior studies of open data publication have used a list of datasets based on those felt to be important in London and Washington, rather than looking for datasets that represent the breadth of government activity, and the breadth of theories of change about how open data operates. We put together a list of 14 dataset categories and asked our expert researchers to assess whether this data was available, online, machine readable, openly licensed and so on. In our analysis we cluster these datasets according to those most likely to be used as part of an ‘accountability stack’, those most often used in ‘innovation’, and those with a strong impact on ‘social policy’.
- We look at impact based on asking for narratives of change. This speaks to the question of whether effects of open data are measurable. Right now, that measurement is very difficult. Conceptually, open data can be used to achieve a wide range of impacts, so if we had gone in trying to look for one particular kind of data use – data in participatory budgeting for example – we would have risked missing lots of other potential impacts of open data. We’re still looking to find better methods here – but the approach we took was to ask our expert researchers to look for media mentions of open data having effects in a variety of domains: political, economic, environmental and so on, and to rate the breadth and depth of impacts cited.
In this talk I won’t go in-depth into the actual Barometer results, as you can find those at www.opendatabarometer.org but I’ll briefly mention a few findings:
- Open data policy has rapidly spread across the globe – over 50% of our sample of countries had an open data policy, many with strong senior government backing.
- However, open data availability is low – just 71 of the 1,078 datasets we looked at were available as open data, and in general, publication of data that meets all the criteria for open data I set out above was concentrated in a small number of states. Politically contentious datasets such as government spending, land registries and company registries were least likely to be available.
- Impacts right now are very low – the average score on our 10-point impact scale was below 2 for every category, and remained below 3 even when we took out countries with no open data or open data policy. In terms of the kinds of impacts researchers could locate cited – Transparency and Accountability impacts were most common, with impacts on environmental sustainability, and the inclusion of marginalised groups, least likely to be cited.
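To put the availability finding in proportion, the short calculation below works out the share of assessed datasets that were fully open. It is a back-of-envelope sketch only: it assumes every one of the 77 countries was assessed on all 14 dataset categories, which is what yields a total of 1,078, and the variable and function names are illustrative.

```python
# Figures quoted in the talk.
COUNTRIES = 77            # states covered by the Barometer
DATASET_CATEGORIES = 14   # dataset categories assessed per country
FULLY_OPEN = 71           # datasets meeting all the open data criteria


def open_share(open_count: int, total: int) -> float:
    """Fraction of assessed datasets that are fully open."""
    if total <= 0:
        raise ValueError("total must be positive")
    return open_count / total


# Assumption: each country was assessed on every category.
total_assessed = COUNTRIES * DATASET_CATEGORIES
share = open_share(FULLY_OPEN, total_assessed)

print(f"{total_assessed} datasets assessed, {share:.1%} fully open")
# prints: 1078 datasets assessed, 6.6% fully open
```

On this reading, fewer than one in fifteen of the datasets examined met the full definition, despite more than half of the countries having an open data policy.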
Returning to the questions that frame this panel: Has the Internet helped citizens and policymakers improve governance? Are the effects measure-able? What new innovations might be helpful?
It seems fair to state as a basic assumption that information does change governance. On the flight here I was reading work by Elinor Ostrom on governance of the commons, which emphasises the central role of information in governance. Changing how information flows does impact governance – but assessing whether that impact is positive or negative involves normative questions. Right now – the impact of open data on effectively altering the flow of information across society is limited. We see highly-used apps in a limited number of settings like transport, but beyond that we see relatively few datasets that are truly available as open data. When we dig into many of the anecdotes shared about open data impacts, it often turns out that wider contextual variables are much more important in determining outcomes than the particular open properties of the data itself. And yet, policy seems to focus on a replication of a standard model of open data publication as the primary intervention.
Ultimately in taking measurement forward I’d like to suggest we look in two directions. Firstly, we need to drill down thematically, focussing on data in context in particular settings, looking to generate actionable knowledge for practitioners in these sectors that will help them to advocate bottom-up for open data – rather than focussing on over-generalised measurement that promises generalised open data impacts without understanding differences of subject matter and context. Secondly, we need to explore developing rigorous shared case study methodologies that can enable us to build measurement and research through controlled cross-case comparisons, informing macro-level assessments, but focussing on micro-level effects and theories of change around open data.
This is something we’re hoping to focus on more in the ODDC project in the coming months – so do join us on the network LinkedIn group if you would like to explore this more.