Data Linkage: Ethical and Social Concerns

Libby Bishop, Manager, Producer Relations: Research Ethics, at the UK Data Service updates us on the National Centre for Research Methods (NCRM) event Data Linkage: Ethical and Social Concerns.

Not even a strike on the London Tube could deter the hard-core dataphiles from attending the National Centre for Research Methods (NCRM) event Data Linkage: Ethical and Social Concerns on 9 July. Of course, the opportunity for a view from the 17th floor of The Shard on a clear day more than offset any inconvenience.

We kicked off with Mandy Chessell (who definitely had the best job title: Master Inventor at IBM). She is advising corporations who are linking data, want to do more, and are struggling to define boundaries around what is technically possible, what is legally permitted, and what should be done ethically. She identified multiple major challenges: ethics is “slippery”, technical requirements are complex, and linkage must be automated to be efficient, but machines are not capable of the human judgement needed. For example, it is not possible to classify a variable “diagnosis” as sensitive; sensitivity might change with the value of the variable. “When diagnosis=flu, the data are not sensitive, but when diagnosis=AIDS, they are.” That distinction is difficult to program.
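To make the diagnosis example concrete, here is a minimal Python sketch of a value-dependent sensitivity check. It is an illustration only, not anything presented at the event: the `SENSITIVE_VALUES` table and the `is_sensitive` function are hypothetical names, and the single flagged value is taken from the quote above. The sketch also shows why the problem is hard: a lookup table only knows the values someone has already thought to list, so the human judgement is moved into the table rather than removed.

```python
# Hypothetical policy table: which values of which variables count as sensitive.
# The single entry mirrors the example quoted in the talk; a real policy would
# need human review and would be far larger.
SENSITIVE_VALUES = {
    "diagnosis": {"aids"},
}


def is_sensitive(variable: str, value: str) -> bool:
    """Return True if this (variable, value) pair is flagged in the policy table."""
    flagged = SENSITIVE_VALUES.get(variable.strip().lower(), set())
    return value.strip().lower() in flagged


if __name__ == "__main__":
    for value in ("flu", "AIDS"):
        print(f"diagnosis={value}: sensitive={is_sensitive('diagnosis', value)}")
```

Any value missing from the table is silently treated as not sensitive, which is exactly the kind of gap that automated linkage cannot close on its own.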

Subsequent speakers addressed key questions. Peter Elias (University of Warwick) made a strong case supporting data linkage, in part to complement surveys, which are plagued with response rates that have fallen from 95% to nearer 40% in some cases. Moreover, our “crown jewels”—the UK longitudinal surveys—are expensive to run, with a cloud over future funding given the current environment.

Rob Procter (University of Warwick) explored how big data, linkage and analytics are challenging core assumptions of much social science research: data are not identifiable, the number of researchers is small, and uses can be specified in advance. Rob pointed out limitations of current procedures for consent and anonymisation. These issues are rapidly moving into the foreground. At the UK Data Service, we have been working on the ethical implications of new forms of data. I looked at core challenges to consent and anonymisation here and our Big Data Reading Group will be discussing “Big Data’s End Run around Anonymity and Consent”, a chapter by Barocas and Nissenbaum in Lane et al. (eds), Privacy, Big Data, and the Public Good.

Rob asked the audience a brilliant question: how many people work for Facebook? Guesses ranged from a few hundred to over 10,000. In fact, the company employs about 10,000 people. For comparison, IBM has very roughly 400,000 employees. Rob’s answer? About one billion. Of course, all Facebook users “work for” Facebook by handing over their data, for free, to the company. It was a memorable way of making the point: if the service is free, you are the product.

The session closed with a panel, which raised several insightful issues. It might be useful to distinguish linked data from joined-up data. “Joined-up” suggests the data will be used to enhance services for those contributing their data. “Linked data”, more often, seems to imply other purposes, and in some cases, identifying any public benefit from such uses is a challenge. This thread developed into a more general discussion about the idea of “public benefit”. We also debated the concept of “personal data stores”, with some concerns raised that there are risks in treating data as an individual commodity to be exchanged by market mechanisms. The day ended on an optimistic note: we identified the need for some kind of data ombudspersons to negotiate access and vet usage, especially in cases with personal or sensitive data collected without consent, and where people cannot opt out (e.g., tax records). A very similar idea of independent “information fiduciaries” has also been proposed by Bruce Schneier in Data and Goliath, another excellent volume on data and privacy.

If you missed this event, but want to know more about Big Data, take a look at the offerings at the Big Data and Analytics Summer School at the University of Essex.