10 Challenges for Open Data
It has been six years since Tim Berners-Lee called for more raw data. A lot has happened since then, but critics often point to the limited visible impact and the lack of public awareness. So the question is: what is hindering the success of open data, and what are the real challenges that need to be worked on? Read on to learn about Dutch developments and some of our perspectives on moving ahead.
In 2014 we were surprised by the strong uptake of both open data policy and pragmatic experiments, mostly by large central government organisations. For instance, in the Netherlands the ‘Court of Audit’ (ARK), the Parliament (Tweede Kamer), the ‘National Road Registry’ (RDW), the ‘Statistics Bureau’ (CBS), the ‘Education Office’ (DUO) and the ‘Ordnance Survey’ (Kadaster) were very active on the topic of open data.
Abroad we have seen a similar thing happening: the World Bank expanded its data policy, France (finally) joined the OGP, and formerly paid-for company registration data opened up in Belgium and Bulgaria. So what is going wrong?
1. Local governments still struggle
Local governments still seem to struggle with open data. They fail to implement proper technologies for opening up data, they can’t identify relevant data and, as a result, they don’t see (enough) relevant applications of their data. At the start of 2015 we found 24 municipalities in the Netherlands experimenting with open data (logged on: 26-01-2015), but very few show impact or have continued their open data programmes.
2. Paid data: a big problem
Although the high-impact datasets, such as weather, mobility, company and land survey data, have been identified internationally, many countries have problems opening data sources that are tied to government bodies’ revenues. For instance, in the Netherlands both company data (KvK) and ordnance data (Kadaster) are still behind paywalls. The Dutch government is certainly not the only one: other examples range from Canadian securities filings to Hong Kong company registries.
3. No community, developer engagement or incentives for re-use
Only a few governments see communicating with developers and technical entrepreneurs about their open data as part of their open data strategy. Not communicating with these audiences is a big missed opportunity, and plenty of examples show that it can be done. The Dutch Road Registry (RDW) does this successfully, and in a very lightweight way, by hosting a community mailing list. The Ordnance Survey (Kadaster) engages various community representatives through its geo user groups.
The 12,204 APIs listed on ProgrammableWeb attest to the opportunities that openness can bring for private organisations. However, the interest in open data and big data is no longer limited to enterprises and startups. SMEs are starting to explore the possibilities of data for their organisations. For instance, in the Netherlands seven brick-and-mortar multinationals together organised the Dutch Open Hackathon.
4. CSR not yet there
Corporates are right to align their business needs with the possibilities that data can bring. However, even as CSR becomes more important and is increasingly expected of public companies, few corporates see the possibilities for CSR transparency in releasing data on products, processes or supply chains.
5. No structured approach
A structured approach to openness still seems to be limited to companies like Amazon, Google, Facebook and Microsoft. They have standard rules on data (at Google, for instance, you should always be able to get it) and APIs are part of their product designs.
Their business managers strategise on engaging third parties towards their business goals through APIs. They have evangelists, regular events and incentive programmes for making openness a success. Other companies can benefit from adopting such proven strategies.
Education and Research
As the research field of open data matures, the topic moves from technical designs (e.g. Linked Open Data standards, big data analytics) to new applied areas of research in economics and the social sciences. Of course, open data can also be applied to research itself.
6. Open Access as a missed opportunity
Although Open Access is a growing trend, most of the buzz is limited to free-access journals and open licensing of academic publications. We think there is a massive opportunity in opening up research data and bundling it together with the publications. In the Netherlands researchers can publish their data in ‘DANS’ (Data Archiving and Networked Services), but this is data behind a log-in, often with abusive use of licensing.
But we think “Open Research” is more than Open Access. New possibilities for valuable contributions have emerged from the academic field, beyond the traditional academic publications. For instance, GEOPS made an up-to-date data visualisation of all real-time public transport data.
This new approach is of course not limited to data visualisations! Researchers can open up data not only for themselves, but also for others. They can bundle and refine data in public APIs, and publish the software they wrote as open source. Or they can contribute add-ons and code to public code projects.
7. Linked Open Data considered harmful
The academic standards for data publication are very high, and the complexity involved delays and hinders the publication of open data. Yet according to research, Linked Open Data (LOD) services are offline for 1.5 days per month on average! This is simply unacceptable for commercial re-use.
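To put that downtime figure in perspective, here is a minimal sketch of the arithmetic in Python. It assumes a 30-day month, and the 99.9% target used for comparison is our own illustrative assumption, not a figure from the research mentioned above.

```python
DAYS_IN_MONTH = 30.0  # assumption: an average month of 30 days


def availability(downtime_days: float, days_in_month: float = DAYS_IN_MONTH) -> float:
    """Fraction of the month during which a service is reachable."""
    return 1.0 - downtime_days / days_in_month


def allowed_downtime_minutes(target: float, days_in_month: float = DAYS_IN_MONTH) -> float:
    """Minutes of downtime per month permitted by an availability target."""
    return days_in_month * 24 * 60 * (1.0 - target)


# 1.5 days offline per month, the average reported for LOD services.
lod_availability = availability(1.5)
lod_downtime_minutes = 1.5 * 24 * 60

# Illustrative comparison target (an assumption): 99.9% availability.
target_minutes = allowed_downtime_minutes(0.999)

print(f"LOD availability: {lod_availability:.1%}")
print(f"LOD downtime: {lod_downtime_minutes:.0f} minutes per month")
print(f"Downtime allowed at 99.9%: {target_minutes:.1f} minutes per month")
```

Under these assumptions, 1.5 days offline per month corresponds to roughly 95% availability, or 2,160 minutes of downtime, against about 43 minutes for a service that promises 99.9%.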
In addition, few developers are knowledgeable about Linked Data standards like RDF, SPARQL and OWL. Most ‘real world’ interfaces (APIs) run on very basic (but highly effective) standards like REST and JSON. And yes! In this context JSON-LD and LOD fragments are very much appreciated steps towards making Linked Data standards a little easier.
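To illustrate why JSON-LD lowers the barrier, the sketch below takes a plain JSON record and adds the two JSON-LD keywords `@context` and `@type`. The record itself, its field values and the helper name `to_json_ld` are hypothetical examples of ours; the vocabulary is schema.org's `Dataset` type. The point is that a developer who only knows JSON can keep reading the same keys, while Linked Data tooling can interpret the extra keywords.

```python
import json

# Hypothetical record, as a basic REST/JSON API might return it.
plain_record = {
    "name": "Example road dataset",
    "description": "Illustrative open dataset used for this sketch.",
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def to_json_ld(record: dict) -> dict:
    """Wrap a plain record with JSON-LD keywords, using schema.org terms."""
    return {"@context": "https://schema.org", "@type": "Dataset", **record}


linked_record = to_json_ld(plain_record)

# Serialise and parse with the standard JSON library only:
# no RDF or SPARQL knowledge is needed to consume the document.
payload = json.dumps(linked_record, indent=2)
parsed = json.loads(payload)

print(payload)
print(parsed["name"])
```

A consumer that ignores the keys starting with `@` sees exactly the original record, which is what makes this a gentler step than asking developers to learn RDF first.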
8. Too little research on the impact of openness
Although several economic impact studies have been done, few if any were done by academics (rather than consultants), and the studies often take a macroeconomic perspective. That is not necessarily bad, but these studies don’t assess impact at the micro level.
The University of Wageningen studied the impact of open data at the micro level, looking at the release of the ‘basic topographic map’ of the Netherlands. By tracking investments made by companies that apply this open data, the researchers were able to estimate its impact at no less than 9 million euros a year.
We believe this approach is very suitable for evaluating specific open data sets, and more academic studies (and hence development of the instrument) in this area would be very welcome.
Open Data Ecosystem
In the open data ecosystem various sustainable instruments and tools have proven successful, are used regularly and are being improved upon. Examples are the Open Culture Data Masterclasses (2012, 2013, 2014, 2015), the Global Open Data Index and new tactics like ‘bulk data unlocking’. OpenCorporates and OpenStreetMap continue to grow in data volume, quality and impact. Besides that, a steady flow of mobility apps has emerged, including Routeradar, Roudle, Filepret and Fileindex. They are all based on national road maps and real-time traffic data.
9. Technological barriers for governments
Basic technological components like data stores and data catalogues are still not easily accessible to smaller governments. Their (often local) IT suppliers have little to offer, US-based cloud products can put EU governments in a pickle, and global vendors often charge beyond the means of smaller players.
10. Local data doesn’t scale
There are also few examples of local government data creating substantial value for local communities. There is too little evidence on this matter, yet both experts and civil servants remain confident that local governments hold treasures of data. Most people have very little contact with their local government. One might wonder how big the actual demand and market opportunity for more of such information is.
Still, governments keep relying on archaic means of communication: local newspapers and direct mail. These are communication channels that provide relevant data in a not very accessible form. We would argue that this kind of local data is probably of most interest to the public.
What do you think?
These are the challenges on our bucket list for this year. What do you think? Let us know on Twitter (@OpenStateEU)!