Consumer Data Research Centre

Real Case Studies of Failures to Achieve Data Protection Principles

On 25 May 2018 the General Data Protection Regulation (GDPR) comes into force, and all processing of personal data taking place on or after that date must comply with it. Below is part 5 of a report authored by Veale Wasbrough Vizards (VWV) in partnership with the CDRC on how the GDPR will affect social science research.

AOL

In 2006, for the purposes of facilitating research, AOL released a list of 20 million web search queries made by over 650,000 AOL users. Each query was released alongside a number representing the user who had entered it.

By analysing all of the search queries made by the user to whom AOL had assigned the number 4417749, the New York Times was able to work out that user's identity.

Among the serious acts of negligence in this case study were the failures to consider fully the implications of:

  • multiple entries relating to the same individual being available within the database; and
  • the fact that data within the dataset could be corroborated against publicly available data.
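The re-identification risk described above arises from aggregation: each query is harmless in isolation, but all queries grouped under one pseudonymous ID form a single profile that can be corroborated against public records. A minimal sketch of that grouping step, using invented example queries of the kind reported in the AOL case (not the actual released data):

```python
from collections import defaultdict

# Invented example log entries, not the actual AOL dataset. Each row pairs
# a pseudonymous user number with a raw query string.
query_log = [
    (4417749, "landscapers in lilburn ga"),
    (4417749, "homes sold in shadow lake subdivision"),
    (4417749, "dog that urinates on everything"),
    (99999, "weather tomorrow"),
]

# Group every query under its pseudonymous ID. The resulting per-user
# profile combines quasi-identifiers (place names, interests, surnames)
# that together can narrow the user's identity, even though the number
# itself reveals nothing.
profiles = defaultdict(list)
for user, query in query_log:
    profiles[user].append(query)

print(profiles[4417749])
```

The point of the sketch is that "anonymisation" by pseudonym alone leaves the grouping structure intact, and it is that structure, not any single record, that enabled the New York Times to identify user 4417749.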


Netflix

In 2006 Netflix released a database of more than 100 million ratings, on a scale of 1 to 5, of over 18,000 films, given by almost half a million users. The data was supposedly “anonymised” in accordance with an internal privacy policy, with all customer-identifying information removed except ratings and dates. Noise was added to slightly increase or decrease some ratings.

It was found, however, that the data could be de-anonymised by matching ratings against the publicly available ratings on the Internet Movie Database (IMDb).

This case study highlights the importance of bearing in mind that third parties may compare research datasets against other publicly available datasets. Enhanced computer processing power is making this easier for third parties to achieve.
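The attack described above is a linkage attack: records in the "anonymised" dataset are matched to a public dataset on shared quasi-identifiers, tolerating the added noise. A toy sketch under invented data (the names, films, and tolerance value are illustrative assumptions, not the actual Netflix/IMDb attack parameters):

```python
# Toy linkage attack: match pseudonymous rating profiles against a public
# dataset on (film, rating) pairs, allowing for small added noise.
# All data below is invented for illustration.

anonymised = {  # pseudonym -> list of (film, rating)
    "user_1387": [("Film A", 5), ("Film B", 2), ("Film C", 4)],
    "user_2091": [("Film A", 1), ("Film D", 3)],
}

public = {  # publicly visible identity -> list of (film, rating)
    "alice@example.com": [("Film A", 5), ("Film B", 3), ("Film C", 4)],
}

def ratings_match(noisy, public_rating, tolerance=1):
    """Treat two ratings as matching if they differ by at most `tolerance`,
    absorbing the small noise added before release."""
    return abs(noisy - public_rating) <= tolerance

def link(anonymised, public, min_overlap=2):
    """Return (pseudonym, identity) pairs whose profiles overlap on at
    least `min_overlap` films with matching ratings."""
    links = []
    for pseudonym, ratings in anonymised.items():
        for identity, pub_ratings in public.items():
            pub = dict(pub_ratings)
            overlap = sum(
                1 for film, rating in ratings
                if film in pub and ratings_match(rating, pub[film])
            )
            if overlap >= min_overlap:
                links.append((pseudonym, identity))
    return links

print(link(anonymised, public))
```

Here "user_1387" links to the public profile despite the noisy Film B rating, which is why adding small perturbations to individual values does not, on its own, prevent re-identification when the overall pattern of records survives.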

The Royal Free and Google DeepMind

In 2017 the Royal Free NHS Foundation Trust was found by the ICO to have failed to comply with data protection law when it provided the personal data of around 1.6 million patients to Google DeepMind as part of a trial to test an alert, diagnosis and detection system for acute kidney injury.

Along with a number of other failures on the part of the Trust, an ICO investigation found that patients were not sufficiently informed that their health data would be used for the trial. Following the ICO’s investigation the Trust was required to take a number of steps including:

  • establishing a lawful basis for the Google DeepMind project;
  • setting out how the Trust’s duty of confidence to patients in any future trial will be met;
  • carrying out a privacy impact assessment; and
  • commissioning an audit of the trial involving Google DeepMind.

Elizabeth Denham the Information Commissioner commented on this case:

“[t]here’s no doubt the huge potential that creative use of data could have on patient care and clinical improvements, but the price of innovation does not need to be the erosion of fundamental privacy rights.

“Our investigation found a number of shortcomings in the way patient records were shared for this trial. Patients would not have reasonably expected their information to have been used in this way, and the Trust could and should have been far more transparent with patients as to what was happening.”