Unbundling the IP Labyrinth of Generative AIs: Lessons from Britain’s AI Bill

The exponential rise of Artificial Intelligence (“AI”) technology has not come without costs for society. One of the latest offshoots of AI development is the class of generative AI tools. These tools are defined as Large Language Models (“LLMs”) that produce text or other content, such as images or audio, on the basis of the data used to train them.

However, LLMs raise various concerns, particularly under intellectual property law. While the IP ownership of AI-generated creations is one such concern, perhaps a more important question is: what about the IP-protected data used to train these LLMs? Generative AIs are trained using large amounts of data, including huge archives of images and texts. When responding to a prompt, these AI systems identify patterns and associations among these data, which they use to build rules and eventually produce conclusions and forecasts. The data utilised by the LLMs can be IP-protected, and thus such usage can be considered an infringement of the rights of their owners.

The United Kingdom (“UK”) has recently seen the introduction of a private member’s bill concerning the regulation of AI in the country. The bill is still in its formative stage and will need to be supplemented by regulations. It nevertheless contains some important provisions that can help resolve the difficulty surrounding the training of generative AIs.

In this piece, I will explore the problem of violation of IP rights, such as copyright and trademark, in the training of generative AIs, drawing on various recent developments. Then, I will critically appraise the UK AI bill to assess its efficacy in solving these problems. Finally, I argue that the Indian IP law regime can learn from the UK to come up with a comprehensive framework to regulate AI and its IP infringements.

Generative AI, Litigation and the IP Problem

Generative AIs are trained using huge amounts of data. Since these data remain undisclosed, there is a looming concern that such usage might infringe the intellectual property rights of the owners of these data. For instance, Prosecraft was a software tool used to conduct linguistic analysis of literature. Its website provided analysis of various books based on their word count, vividness, use of the passive voice, and total number of adverbs. The owner was forced to shut down the website after complaints from numerous authors. However, by the time the website shut down, it had already analysed thousands of novels without the permission of their authors and had thus violated their copyrights. Another concern that persisted was whether the owner of the website planned to delete the collected data.

Similarly, three artists brought a class action lawsuit in Andersen v. Stability AI et al., a case filed in 2023. The artists claimed that the generative AI platforms were using their original works, without obtaining permission, to train their AI in the artists’ styles. This enabled users to create works that might not be sufficiently transformative of the artists’ existing, protected works and would thus be considered unauthorized derivative works. While the issue of transformative use of copyrighted work remains unsettled, numerous suits have been filed against generative AI developers for violation of copyright and trademark rights.

A more recent example concerning the violation of IP rights by generative AIs can be seen in the lawsuit filed against OpenAI and Microsoft for the use of the works of non-fiction authors to train their AI models.

Even if we assume that the courts adjudicate these issues in favour of the rightsholders, the problem facing the IP framework across jurisdictions is that judicial remedies are ex-post reliefs. These are reliefs granted once the harm has already accrued to the owners of IP rights. Consider the case of the National Emergency Library established by the Internet Archive during the COVID-19 pandemic. While the court ruled against the establishment of the said library, the harm had already been done to the authors whose books were lent to many people simultaneously. Moreover, in a digital economy driven by data, it is much easier to unlawfully retain copyrighted materials even after a legal determination against the violator. This has been shown above using the example of Prosecraft, whose gathered data continued to worry authors even after the website was taken down.

Thus, in light of these problems of ex-post remedy and accrual of data, it becomes imperative to have a legal framework regulating the training of generative AIs. We shall now look at such a framework proposed in the UK Parliament in the form of the AI Bill, 2024.

Evaluating the UK AI Bill

The UK Artificial Intelligence (Regulation) Bill has been proposed to regulate artificial intelligence and for connected purposes. It defines AI broadly so as to include present and even future developments in the field. Interestingly, the bill specifically states that AI includes generative AI, which is defined as deep or large language models able to generate text and other content based on the data on which they were trained. Thus, there seems to be a clear intent to regulate generative AI tools.

The bill proposes the establishment of an AI Authority (“AA”) which has various functions including collaboration with other regulators to align the approach towards AI, accrediting independent AI auditors, and undertaking gap analysis of regulatory responsibilities. While section 2 provides for various principles that should guide the AA and businesses developing or using AI, the principle of transparency seems to undergird the entire substructure of the bill.

Section 5 of the proposed bill deals with the IP issues presented by AI. For the training of generative AIs, the section stipulates that any person involved in training AI must supply to the AA a record of all third-party data and intellectual property used in that training. Moreover, persons training AI are also required to assure the AA that the data has been used with the informed consent of the parties concerned and that they have complied with all applicable IP and copyright obligations. Logically, future cases arising under the bill, if it becomes an Act, would turn on the assurance given to the AA.

The bill also provides for the internal regulation of businesses developing or deploying AI through the appointment of AI officers, who shall ensure the safe, ethical, unbiased, and non-discriminatory use of AI. Seemingly to account for unwitting algorithmic discrimination, the bill stipulates that AI models should be designed to prevent unlawful discrimination arising out of input data.

However, at its present stage, the bill does not prescribe any consequences or penalties for violations of its provisions. These are to be specified later through the AI regulations.

The scheme of the UK AI Bill, as highlighted above, seeks to deal with the rising influence of AI through principles such as transparency, goodwill, fairness, and accountability. It seeks to build a collaborative structure of AI regulation that imposes responsibilities upon the stakeholders while also ensuring growth. The bill directly deals with the problems highlighted in the previous sections. It addresses the issue of ex-post relief by laying down principles to be followed in the training of AIs, and it addresses the problem of data accumulation by providing for transparency in the system. It seems to attack the root concerns relating to generative AI and IP rights. While such a regulatory framework might appear to impede technological development, I submit that such regulation is necessary for ensuring a robust AI regime consonant with the IP entitlements of people.

Conclusion: What Can India Learn?

The problems concerning the violation of intellectual property are felt across jurisdictions. India, too, does not have a dedicated AI enactment to deal with AI-related issues such as those at the intersection of generative AI and IP law. Even though the use of data to train AI models can be litigated against if it is found to violate copyright, the absence of an ex-ante transparency regime and of an overarching body overseeing AI development, together with the time and money involved in litigation, makes it difficult to ensure a robust regulatory framework dealing with issues relating to AI, including IP law.

In this context, the UK Bill can be seen as an important development that India can also use to reassess its existing legal framework and to accommodate the novel characteristics of AI. Such an enactment can help provide a defined framework of regulation that also reduces the costs borne owing to the ex-post nature of judicial remedies. Moreover, the establishment of an AI body can go a long way in dealing with the issues presented by AI development.

About Suyash Pandey
Suyash Pandey is a third-year B.A., LL.B. (Hons.) student at National Law School of India University, Bangalore. His areas of interest include IP law, competition law, technology law, fintech laws, and dispute resolution.
