- We need to distinguish between claims to data ownership, and claims to be a stakeholder in a dataset;
- Ownership is a relevant concept for a limited range of datasets;
- Openness can be a positive strategy, empowering farmers vis-a-vis large corporate interests;
- Openness is not universally good: it can also be used as a ‘data grab’ strategy;
- We need to think critically about the configurations of openness we are promoting;
- Commons- and cooperative-based strategies for managing data and open data are a key area for further exploration.
Open or owned data?
Following the publication of a discussion paper by the ODI for the Global Open Data for Agriculture and Nutrition initiative, putting forward a case for how open data can help improve agriculture, food and nutrition, debate has been growing about how open data should be approached in the context of smallholder agriculture.
Respondents to the paper have pointed to the way in which, in situations of unequal power, and in complex global markets, greater accessibility of data can have substantial downsides for farmers. For example, commodity speculation based on open weather data can drive up food prices, or open data on soil profiles can be used to extract greater margins from farmers when selling fertilizers. A number of responses to the ODI paper have noted that much of the information that feeds into emerging models of data-driven agriculture is coming from small-scale farmers themselves: whether gathered through statistical collection by governments, or hoovered up by providers of farming technology, it is all aggregated into big datasets that are practically inaccessible to local communities and farmers.
This has led to some focussing in response on the concept of data ownership: asserting that more emphasis should be placed on community ownership of the data generated at a local level. Equally, it has led to the argument that “opening data without enabling effective, equitable use can be considered a form of piracy”, making direct allusions to the biopiracy debate and the consequent responses to such concerns in the form of interventions such as the International Treaty on Plant Genetic Resources.
There are valid concerns here. Efforts to open up data must be interrogated to understand which actors stand to benefit, and to identify whether the configuration of openness sought is one that will promote the outcomes claimed. However, claims of data ownership and data sovereignty need to be taken as a starting point for designing better configurations of openness, rather than as a blocking counter-claim to ideas of open data.
Community ownership and openness
My thinking on this topic is shaped, albeit not to a set conclusion, by a debate that took place last year at a Berkman Center Fellows Hour, based on a presentation by Pushpa Kumar Lakshmanan on the Nagoya Protocol, which sets out a framework for community ownership and control over genetic resources.
The debate raised the tension between the rights of communities to gain benefits from the resources and knowledge that they have stewarded, potentially over centuries, and an open knowledge approach that argues social progress is better served when knowledge is freely shared.
It also raised important questions of how communities can be demarcated (a long-standing and challenging issue in the philosophy of community rights) – and whether drawing a boundary to protect a community from external exploitation risks leaving internal patterns of power and exploitation within the community unexplored. For example, does community ownership of data really just lead to certain elites in the community controlling it?
Ultimately, the debate taps into a conflict between those who see the greatest risk as being the exploitation of local communities by powerful economic actors, and those who see the greater risk as a conservative hoarding of knowledge in local communities in ways that inhibit important collective progress.
Exploring ownership claims
It is useful to note that much of the work on the Nagoya Protocol that Pushpa described was centred on controlling borders to regulate the physical transfer of plant genetic material. Thinking about rights over intangible data raises a whole new set of issues: ownership cannot just be filtered through a lens of possession and physical control.
Much data is relational. That is to say that it represents a relationship between two parties, or represents objects that may stand in ownership relationships with different parties. For example, in his response to the GODAN paper, Ajit Maru reports how “John Deere now considers its tractors and other equipment as legally ‘software’ and not a machine… [and] claims [this] gives them the right to use data generated as ‘feedback’ from their machinery”. Yet this data about a tractor’s operation is also data about the farmer’s land, crops and work. The same kinds of ‘trade data for service’ concerns that have long been discussed with reference to social media websites are becoming an increasing part of the agriculture world. The concern here is with a kind of corporate data-grab, in which firms extract data, asserting their absolute ownership over something which is primarily generated by the farmer, and which is at best a co-production of farmer and firm.
It is in response to this kind of situation that grassroots data ownership claims are made.
These ownership claims can vary in strength. For example:
- The first claim runs ‘this is my data’: I should have ultimate control over how it is used, and the ability to treat it as a personally held asset;
- The second runs ‘I have a stake in this data’: as a consequence, I should have access to it, and a say in how it is used.
Which claim is relevant depends very much on the nature of the data. For example, we might allow ownership claims over data about the self (personal data), and over data about the direct property of an individual. For datasets that are more clearly relational, or collectively owned (for example, local statistics collected by agricultural extension workers, or weather data funded by taxation), the stakeholding claim is the more relevant.
It is important at this point to note that not all (perhaps even not many) concerns about the potential misuse of data can be dealt with effectively through a property right regime. Uses of data to abuse privacy, or to speculate and manipulate markets may be much better dealt with by regulations and prohibitions on those activities, rather than attempts to restrict the flow of data through assertions of data ownership.
Openness as a strategy
Once we know whether we are dealing with ownership claims or stakeholding claims in data, we can start thinking about different strategic configurations of openness that take into account power relationships, and that seek to balance protection against exploitation with the benefits that can come from collaboration and sharing.
For example, each farmer on their own has limited power vis-a-vis a high-tech tractor maker like John Deere. Even if they can assert a right to access their own data, John Deere will most likely retain the power to aggregate data from thousands of farmers, maintaining an inequality of access to data vis-a-vis the farmer. If the farmer seeks to deny John Deere the right to aggregate their data with that of others, chances are that (a) they will be unsuccessful, as making an absolute ownership claim here is difficult – using the tractor was a choice after all; and (b) they will potentially inhibit useful research and use of data that could improve cropping (even if some of the other uses of the data may run counter to the farmer’s interest). Some have suggested that creating a market in the data, where the data aggregator would pay the farmers for the ability to use their data, offers an alternative path here: but it is not clear that the price would compensate the farmer adequately, or lead to an efficient re-use of data.
However, in this setting openness potentially offers an alternative strategy. If farmers argue that they will only give data to John Deere if John Deere makes the aggregated data open, then they have the chance to challenge the asymmetry of power that otherwise develops. A range of actors and intermediaries can then use this data to provide services in the interests of the farmers. Both the technology provider, and the farmer, get access to the data in which they are both stakeholders.
This strategy (“I’ll give you data only if you make the aggregate set of data you gather open”) may require collective action from farmers. This may be the kind of arrangement GODAN can play a role in brokering, particularly as it may also turn out to be in the interest of the firm. Information economics has demonstrated how firms often under-share information which, if open, could lead to an expansion of the overall market and better equilibria in which, rather than a zero-sum game, there are benefits to be shared amongst market actors.
There will, however, be cases in which the power imbalances between data providers and those who could exploit the data are too large. For example, the above discussion assumes intermediaries will emerge who can help make effective use of aggregated data in the interests of farmers. Sometimes (a) the greatest use will need to be based on analysis of disaggregated data, which cannot be released openly; and (b) data providers need to find ways to work together to make use of data. In these cases, there may be a lot to learn from the history of commons and co-operative structures in the agricultural realm.
Co-operative and commons based strategies
Many discussions of openness conflate the concept of openness, and the concept of the commons. Yet there is an important distinction. Put crudely:
- Open = anyone is free to use/re-use a resource;
- Commons = mutual rights and responsibilities towards the resource;
In the context of digital works, Creative Commons provide a suite of licenses for content, some of which are ‘open’ (they place no responsibilities on users of a resource, but grant broad rights), and others of which adopt a more regulated commons approach, placing certain obligations on re-users of a document, photo or dataset, such as the responsibility to attribute the source, and share any derivative work under the same terms.
Creative Commons draws upon imagery from the physical commons. These commons were often in the form of land over which farmers held certain rights to graze cattle, or of fisheries in which each fisher took shared responsibility for avoiding overfishing. Such commons are, in practice, highly regulated spaces – but ones that seek to pursue an approach based on sharing and stakeholding in resources, rather than absolute ownership claims. As we think about data resources in agriculture, reflecting more on learning from the commons is likely to prove fruitful. Of course, data, unlike land, is not finite in the same ways, nor does it have the same properties of excludability and rivalrousness.
In thinking about how to manage data commons, we might look towards another feature prevalent in agricultural production: that of the cooperative. The core idea of a data cooperative is that data can be held in trust by a body collectively owned by those who contribute the data. Such data cooperatives could help manage the boundary between data that is made open at some suitable level of aggregation, and data that is analysed and used to generate products of use to those contributing the data.
With Open Data Services Co-operative I’ve just started to dig more into learning about the cooperative movement: co-founding a workers’ cooperative that supports open data projects. However, we’ve also been thinking about how data cooperatives might work – and I’m certain there is scope for a lot more work in this area, helping deal with some of the critical questions that have come up for open data from the GODAN discussion paper.