The UK’s copyright laws will hobble its AI ambitions

Without exemptions for commercial as well as academic data mining, the government’s AI for Science Strategy will fall flat, says Benjamin White

Published on December 11, 2025
Last updated December 11, 2025
[Image: a robot's hands shackled, illustrating restrictions on AI use. Source: Yurii Karvatskyi/Getty Images]

Artificial intelligence played a crucial role in enabling the safe and rapid development and roll-out of Covid-19 vaccines. Using scientific journal articles, published clinical trials, databases and other resources, scientists were able to use machine learning models to quickly prioritise fruitful research pathways, bypassing the traditional bottlenecks of vaccine development and enhancing the precision of antigen selection.

In the US, where Pfizer-BioNTech and Moderna developed their mRNA vaccines, the relevant data laws support such use of information by commercial entities and universities alike. But under UK copyright law, AstraZeneca scientists – who co-developed the Oxford-AstraZeneca viral vector vaccine – would only have been allowed to “mine” the tens if not hundreds of thousands of potentially relevant papers to which they had legal access by sitting in front of their computers with pen and paper and scrolling through them all manually.

As a for-profit organisation, had it wanted to use AI to automate the process, it would have had to seek fresh permission from every publisher. This is because machine learning involves computers automatically making copies – and, in doing so, triggering copyright liability. AstraZeneca would even have been required to seek permission from social media platforms to use AI to identify trends in vaccine scepticism or areas of high infection, in order to prioritise vaccine roll-out and counter vaccine scepticism.

Confusingly, commercial web-crawling for indexing purposes is explicitly legal in the UK – so Google’s search business (a form of AI, of course) is lawful – even if the legality of training its Gemini AI on copyrighted material is much more contested. Similarly, scraping and crawling of websites by academics and NHS doctors for non-commercial research purposes is legal. But they would be required to seek permission for everything used if they were to share with a commercial entity any data they derived, whether for AI training or any other purpose. Any licensing professional will tell you that, at any scale and at any speed, this is clearly an impossible task.


As reflected in the government’s newly announced AI for Science Strategy, AI is built on three pillars: people and skills, technical infrastructure and data. However, technical infrastructure is clearly the government’s priority. Even the data provision in the strategy tries to sidestep copyright issues by concentrating on new datasets.

The story is similar with the US-UK “Tech Prosperity Deal”, announced in September. This will see US multinationals invest tens of billions of pounds in the UK’s AI infrastructure. But if this comes to pass, one may wonder what data will actually be processed in the UK. By definition, Big Data needs to be Big – which is why companies like Stability AI, as well as not-for-profits like Common Crawl, do their data processing in the US, where the law allows mining at scale. You can have the best physical infrastructure possible, but unless the law makes it possible for as much data as possible to flow through it, its value will be very limited indeed.


The situation is particularly absurd because the goal of copyright is to protect original expression. International copyright treaties explicitly leave facts, trends, relationships and other data in the public domain. Yet because these facts and data are stored within copyrighted “wrappers” (such as books and articles), permission to mine them is needed under UK law – even though no one is looking to reproduce or compete with original materials (and if they did, copyright of course already offers publishers recourse).

The negative implications of all this are obvious for the speed and efficacy of research – and, therefore, for the health of the UK economy, given the hopes the industrial strategy pins on commercialising research breakthroughs.

The government consultation on AI and copyright, launched a year ago, refers frequently to large language models (LLMs). However, this market has already been sewn up by the Americans and the Chinese; the world isn’t waiting for BritGPT.

Where the UK does have a potential role in AI markets is in the “second tier” – the application of models in economic sectors in which we are already strong. We should be creating a vibrant Big Data ecology for biomedicine, agriculture, green industries, finance and more.


The release of DeepSeek is potentially a game changer for UK start-ups, scale-ups, universities and SMEs. While some may be suspicious of its Chinese origin, it is an “open weight” LLM: its trained parameters, or “weights”, are made publicly available, allowing others to use them freely and build out their own projects. For a government desperate for growth, combining open weights with more flexible data laws is an obvious area for strategic investment, in support of our universities, the NHS and our research-intensive industries.

Only by having best-in-class data laws will we be able to compete on a level playing field with the US and East Asia. This is a strategic no-brainer.

The US should not be our only model. We should also look at what the Japanese, South Korean and Singaporean governments did nearly a decade ago, introducing proportionate and targeted flexibilities into their copyright laws to support AI and data-driven innovation. We need a comprehensive science-focused strategy that offers meaningful support across all three pillars of a successful AI strategy – but that still feels some way off.

To get there, it is vital that leaders from universities and research organisations focus on informing policy in this area. Whenever the government presents these issues as affecting only the creative industries on the one hand and Big Tech on the other (as was the case, for example, in the recent copyright and AI consultation, whose 33 pages didn’t mention the word science once), the research sector needs to speak up and point out how blinkered this approach is.


Flourishing science matters a great deal to a modern research-intensive economy like that of the UK. It is vital that policymakers be made to understand that investing in physical infrastructure alone is not enough. It’s time to focus on what flows through the infrastructure. We need flexible and modern data and copyright laws, too.

Benjamin White is a researcher at Bournemouth University’s Centre for Intellectual Property Policy and Management. He was previously head of intellectual property at the British Library.
