The Information Commissioner’s Office (“ICO”) recently published the outcome of its consultations on generative AI. In short, it has retained its positions on purpose limitation, data accuracy and controllership, and updated its positions on legitimate interest and individual rights.
This type of guidance is useful, particularly as the UK currently lacks legislation that directly addresses AI. This article outlines the ICO’s conclusions.
ICO consultation findings:
1. The lawful basis for web scraping to train generative AI tools
Legitimate interest is the only realistic lawful basis for web scraping personal data to train generative AI tools, because the alternatives – such as consent or performance of a contract – are impractical at this scale. Even so, developers must still satisfy the three-stage test:
- Purpose – specific and clear interest
- Necessity – web scraping must be necessary, with no realistic alternative source of the data
- Balance – technical safeguards and innovative transparency mechanisms.
2. Purpose limitation in the generative AI lifecycle
The purposes of training and of deploying a generative AI tool are distinct, and each must be explicit and specific. Developers must also consider whether their original purpose for collecting the data is wide enough to allow its reuse for training generative AI models.
3. Accuracy of training data and tool outputs
Developers should ensure training data is accurate, factual and up to date. The ICO noted the practical difficulties of ensuring consistent accuracy and the lack of transparency needed to verify it. It expects generative AI stakeholders to develop novel solutions so that tools are used in a way that reflects their actual level of accuracy.
4. Engineering individual rights into generative AI tools
Developers must address the data protection principles when building generative AI tools. They should implement appropriate safeguards and enable individuals to exercise their information rights. While the UK GDPR permits processing which does not require identification of individuals, developers should not rely on this too heavily and must be able to justify any such reliance.
5. Allocating controllership across the generative AI supply chain
A developer of a generative AI tool is not always the data controller, and the person deploying the tool is not always the data processor. It is important to identify who exercises control and influence over the processing; the contract between the parties is not necessarily definitive in this regard, especially if the deployer is unable to influence the arrangements.
What now?
The ICO is taking a practical position here. It recognises that training these tools will almost inevitably involve processing personal data. Its conclusions also mirror the recent opinion of the European Data Protection Board. While Brexit means the UK can take a different approach, it is comforting that the UK remains broadly aligned with the EU. This matters because the EU AI Act will likely apply to the UK by stealth.
Developers will likely be comforted that they can rely on legitimate interest to use personal data to train these tools. However, they must still adhere to the safeguards:
- Fulfil the three-stage test to demonstrate legitimate interest
- Show the data is accurate
- Identify the controller where possible
- Enable individuals’ rights.
Those deploying generative AI should also be alive to these issues. It is not enough simply to accept the terms and deploy the tool; deployers must consider the data protection implications for themselves. Assuming the developer has already addressed the issues is no defence to a breach.