With the expansion of Generative AI technology across platforms and applications, developers are battling a slew of copyright litigation. For instance, a class action suit was recently filed against Google’s conversational generative AI chatbot, Bard, alleging that Google scraped the web (including copyrighted content) to train the chatbot. Even the popular LLM chatbot ChatGPT has received flak from the media and authors and has witnessed several class action suits, the most prominent of which was filed by the Authors Guild, a group of prominent authors in the US.
In response, companies have started rolling out Copyright Indemnification Policies to protect their users from potential copyright lawsuits arising out of the use of their Generative AI technology. A copyright indemnity policy, sometimes referred to as copyright infringement insurance, is a form of protection that shields individuals or entities from litigation arising from claims of copyright infringement. In this blog, I examine the copyright indemnity policies of three major tech giants and analyse their legal effectiveness in protecting users against potential copyright litigation.
What are the copyright concerns with Generative AI?
The copyright concerns with generative AI have been discussed in two previous blog posts (here and here). Briefly, generative AI raises copyright infringement concerns in several areas. For instance, the outputs of generative AI, or the AI itself, may infringe the licences under which copyrighted material is made available. Whether AI-generated works qualify for copyright protection in the first place is another area of contention. Copyright laws protect only works of human authorship. Some contend that because AI-generated products are not the result of human creativity, they do not satisfy this condition; others contend that authorship should vest in those who developed the AI models that produce these works.
Whether using copyrighted data to train AI models violates copyright is another concern. Training data is the set of data used to teach an AI model how to carry out a task, and training data for generative AI frequently consists of copyrighted media, including text, photos, and music. While some contend that using copyrighted data to train AI models is fair use, others contend that it is infringement.
Lastly, there is the concern that generative AI might be used to produce unauthorised derivative works. A derivative work is a work based on an already-existing copyrighted work; examples include a musical adaptation of a play or a translation of a book. Generative AI can be used to produce derivative works without the copyright owner’s consent, and the resulting AI-generated work may be seen as infringing someone else’s copyright.
Adobe’s IP Indemnification Policy for Firefly
In June 2023, Adobe, the maker of Acrobat and Photoshop, became one of the first companies across the globe to introduce an Intellectual Property indemnification policy for its new generative AI product, Adobe Firefly. Firefly, part of Adobe Creative Cloud, is a generative AI-based image creator. The product is developed specifically for commercial use, giving enterprises a tool to generate professional-standard pictures with ease. As such, Adobe has taken due precautions to ensure the “commercial safety“ of Firefly. Illustratively, Firefly has been trained on licensed content such as Adobe Stock (an Adobe product with its own set of stock images), public domain material whose copyright has expired, and other non-copyrighted or openly licensed material. Adobe is also a founding collaborator of the Content Authenticity Initiative (CAI), an initiative involving multiple media and technology companies, academia, and NGOs, which seeks to provide an open industry standard for the provenance and authenticity of content.
Moreover, Adobe is part of the Coalition for Content Provenance and Authenticity (C2PA), which has established an open technical standard allowing publishers, creators, and consumers to trace the origins of different kinds of media. This includes an option for creators to indicate whether generative AI was used, via a Content Credential. Content Credentials are a type of “tamper-evident” metadata designed to facilitate greater transparency and identification in the creation of images via tools such as generative AI. To use Content Credentials, creators attach this metadata to their generated works, adding information such as the name or identity of the creator, the editing process, and other related details. Therefore, unlike OpenAI, Adobe has been cautious in curating its training datasets to guard against claims of copyright infringement. On top of this, Adobe introduced another notable feature to protect its users against copyright claims – its IP indemnity policy. Firefly’s IP indemnity covers copyright-related legal proceedings, though it is, of course, subject to certain terms and conditions of usage. This move by Adobe appears to be a preventive measure in response to the flak received by other image-generation AI tools like DALL-E, Stable Diffusion, and DreamUp.
Microsoft’s Copilot Copyright Commitment
Following Adobe, Microsoft also released an indemnification policy, the Copilot Copyright Commitment. Microsoft’s resolve to address consumers’ worries about copyright infringement concerning its AI assistants is a significant step. The commitment expands Microsoft’s existing IP indemnity coverage to include copyright claims related to its suite of Copilots, which spans GitHub, Dynamics 365, Power Platform, and other products. Interestingly, it also covers the output that these AI helpers produce. The guarantee is intended to shield commercial users from monetary losses in the event that they are accused of copyright infringement. The enforceability of these promises, however, has come under scrutiny from certain experts, at least until they are fully incorporated into Microsoft’s commercial agreements and terms of service. This is significant because, at present, the indemnity “policy” is merely a commitment statement published on Microsoft’s website – it has not been formally included in any of its user agreements. Even if we were to treat Microsoft’s commitment as demonstrative of its legal intention, there are further complications. The commitment requires users to use content filters and safety systems, which reflects the complexity of navigating the ambiguities and grey areas of copyright infringement concerning generative AI. Therefore, even at its best, Microsoft’s policy does not seem to guarantee assured protection to its users.
Google’s Indemnity Policy for Generative AI Users
Recently, Google joined the fray by introducing a comprehensive indemnity policy for users of generative AI on its Google Cloud and Workspace platforms. Google’s approach is thorough, addressing the issue in a two-pronged manner. The policy specifically applies to software on the Vertex AI development platform and the Duet AI system, which primarily focuses on text and image generation in Google Workspace and Cloud programs. Significantly, Google’s indemnification does not extend to cases where users intentionally violate the rights of others, for instance by using copyrighted material for commercial purposes (where no fair dealing provision applies). This distinction is in place to ensure the responsible use of generative AI technology.
Significantly, Google’s indemnity policy does not cover its popular generative AI chatbot, Google Bard; it extends only to Vertex AI and Duet AI, which reach far fewer users. Google Bard is Google’s own generative AI chatbot built on a Large Language Model (LLM), akin to OpenAI’s ChatGPT. The important respect in which it differs from some other generative AI tools, such as Adobe Firefly, is the content or datasets used for its training.
While Adobe Firefly confines itself to non-copyrighted material, Google Bard is alleged to have been trained on sources from across the internet, irrespective of copyright. This has become a major concern for Google, as evidenced by the recent slew of litigation against Bard. Google’s policy therefore does not seem to provide substantial and effective protection to its users, which probably explains why it has not received much attention even a month after its introduction. Notably, even OpenAI has not announced any indemnity policy for ChatGPT, which raises copyright concerns rather similar to those surrounding Google Bard.
In the quickly changing field of generative AI, copyright indemnity policies are essential for giving users legal protection. In response to these concerns, Microsoft, Adobe, and Google have all adapted their policies to address particular facets of the use of generative AI. Even though these policies are a big step forward, users still need to be aware and cautious, understanding the extent and limits of the indemnity offered. These policies will surely develop in step with the advancement of AI, providing ever greater legal certainty for companies utilizing this game-changing technology.
Prachi Mathur is an undergraduate student at the National Law School of India University (NLSIU), Bangalore. She is interested in technology law, intellectual property law, law and economics, evidence law, and criminal law, among others. She likes to unpack the workings of the law through her blog posts.