COPYRIGHT INFRINGEMENT BY GENERATIVE AI SYSTEMS

A Legal Analysis with Reference to ANI v OpenAI

ABSTRACT

The advent of generative artificial intelligence systems has posed significant difficulties for copyright law across jurisdictions. These systems are trained on massive datasets of copyrighted material, often without the rightsholder’s permission, raising serious concerns about unauthorised reproduction, commercial exploitation, and the limits of statutory exceptions. In India, these concerns have crystallised in ANI Media Pvt. Ltd. v. OpenAI Inc., a case pending before the Delhi High Court over the alleged use of copyrighted news content to train a large language model. This paper examines whether such AI training practices amount to copyright infringement under the Copyright Act, 1957. By analysing the technological functioning of generative AI, the Indian copyright framework, and judicial interpretations of fair dealing, the paper argues that large-scale commercial AI training without authorisation is incompatible with existing Indian copyright doctrine. The paper further highlights regulatory gaps and proposes policy pathways to balance innovation with the protection of creators’ rights.

Keywords: Generative AI, Copyright Infringement, Fair Dealing, ANI v OpenAI, Indian Copyright Law

INTRODUCTION

The rapid advancement of generative artificial intelligence has drastically changed how content is generated and distributed. Systems such as large language models generate text after being trained on enormous datasets that include books, articles and news reports, the majority of which are protected by copyright law. Although these technologies offer increased efficiency and creative potential, they also exacerbate longstanding tensions between technological innovation and the protection of authors’ rights.

This tension has acquired judicial prominence in India through the case of ANI Media Pvt. Ltd. v. OpenAI Inc.[1], in which a major Indian news agency alleges that its copyrighted journalistic content was used without authorisation to train an AI system. The case raises fundamental questions regarding whether AI training constitutes reproduction or adaptation under the Copyright Act, 1957, and whether such use may be justified under the statutory doctrine of fair dealing.

This paper contends that Indian copyright law, which has traditionally emphasised authors’ rights and narrowly defined exceptions, does not readily accommodate extensive commercial AI training. The paper examines, through doctrinal analysis and policy evaluation, whether current legislation sufficiently addresses the challenges presented by generative AI.

  1. Understanding the Use of Copyrighted Works by Generative AI

Generative AI platforms operate by ingesting large volumes of data during a training phase. This data is often sourced through automated scraping of publicly accessible websites, including news platforms, blogs, and digital libraries. During training, copyrighted works are copied, stored, and converted into machine-readable tokens from which the system learns the statistical patterns it later uses to generate responses.

Although AI developers contend that models do not retain expressive content in a human-readable form, the training process necessarily involves reproducing and storing content. Further, in the output phase, AI-generated responses may closely resemble original works or act as substitutes for them. Under Indian copyright law, both phases raise legal concerns, as reproduction need not be perceptible to human users to amount to infringement.
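The conversion of text into machine-readable tokens described above can be illustrated with a toy sketch. It assumes a simple whitespace tokenizer and a hypothetical one-sentence "article"; production systems use subword tokenizers and far larger pipelines, and nothing here reflects any particular developer's actual method. The sketch only shows that a tokenised copy of a text, although not human-readable, can still map back to the original wording. It concerns the tokenised training data, not what a trained model retains.

```python
# Toy illustration only: a whitespace tokenizer, not a real subword tokenizer.
# All names and the sample sentence are hypothetical.

def build_vocabulary(texts):
    """Assign an integer ID to every distinct word, in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in text.split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(text, vocab):
    """Convert text into a list of integer token IDs."""
    return [vocab[word] for word in text.split()]


def decode(token_ids, vocab):
    """Map token IDs back to words using the inverted vocabulary."""
    id_to_word = {token_id: word for word, token_id in vocab.items()}
    return " ".join(id_to_word[token_id] for token_id in token_ids)


article = "The court issued notice in the matter"
vocab = build_vocabulary([article])
token_ids = encode(article, vocab)    # a list of integers, not readable prose
restored = decode(token_ids, vocab)   # the original wording, recovered from the IDs
```

In this sketch the stored form is a sequence of numbers, yet the original sentence is recoverable from it, which is one way of picturing why a copy need not be perceptible to a human reader to count as a copy.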

  2. Copyright Framework under Indian Law and the Limits of Fair Dealing

The Copyright Act, 1957 provides a rights-based structure that confers upon authors a suite of exclusive economic rights, as outlined in Section 14[2]. These rights encompass the prerogatives to reproduce their works, issue copies of them, adapt them, and communicate them to the public. They constitute the fundamental elements of copyright protection in India and are intended to guarantee that authors maintain authority over the commercial utilisation of their creative work. Section 51[3] of the Act reinforces this framework by stipulating that copyright infringement occurs when any person, without a license or authorisation from the copyright owner, exercises these exclusive rights or, for profit, permits a place to be used for the infringing communication of the work to the public.

Indian courts have consistently embraced a broad interpretation of Sections 14[4] and 51[5] to protect the economic interests of creators. At the same time, they have maintained a cautious and restrictive stance regarding copyright exceptions. Unlike jurisdictions such as the United States, which recognise an open-ended doctrine of fair use, Indian copyright law takes a closed-list approach under Section 52[6]. This section enumerates the instances in which a copyrighted work may be used without permission, including private or personal use (including research), criticism or review, and the reporting of current events. By contrast, the U.S. Supreme Court in Campbell v. Acuff-Rose Music Inc.[7] held that fair use is a flexible, case-specific doctrine capable of accommodating new forms of expression, even where the use is commercial in nature. The case concerned a parody of the song “Oh, Pretty Woman”; the Court emphasised the concept of “transformative use” and clarified that no single factor, including commercial intent, is determinative of fair use. Indian courts, however, have not adopted a comparable doctrine of transformative use, which underscores the fundamentally different statutory and judicial approaches to copyright exceptions in India.

Judicial treatment of Section 52 demonstrates that these exceptions are limited in scope and should not be expanded by interpretation. In Civic Chandran v. Ammini Amma[8], the Kerala High Court reaffirmed that the research exception under fair dealing must be bona fide and non-commercial, and that it cannot be used to justify profit-oriented or market-substituting purposes. Similarly, in Super Cassettes Industries Ltd. v. Entertainment Network (India) Ltd.[9], the Supreme Court warned against an overly broad interpretation of statutory exceptions that would erode the exclusive rights granted to the copyright owner. Likewise, in Eastern Book Company v. D.B. Modak[10], the Supreme Court emphasised that unlawful reproduction of expressive content, even in digital form, can constitute infringement.

This narrow approach is particularly relevant in the context of generative AI training. Sections 52(1)(b) and (c), which permit the transient or incidental storage of works in the course of electronic transmission and access, have been strictly interpreted to cover intermediaries involved in purely technical activities, such as caching by internet service providers. They do not apply to companies that permanently store and process copyrighted works as part of their commercial business strategy. As a result, extending the defence of fair dealing to large-scale, commercial AI training would require a judicial enlargement of Section 52 beyond its text and purpose, something Indian courts have traditionally resisted.[11]

  3. Case Study: ANI Media Pvt. Ltd. v. OpenAI Inc.

The case of ANI Media Pvt. Ltd. v. OpenAI Inc.[12] is the first substantial judicial confrontation in India between a traditional copyright holder and a developer of generative artificial intelligence systems. ANI, one of India’s leading news agencies, claims that its copyrighted news stories were deliberately collected and used by OpenAI to train a large language model without authorisation, license, or compensation. According to ANI, this unlawful use directly violates its exclusive rights under Sections 14 and 51 of the Copyright Act, 1957, specifically the rights of reproduction, storage, adaptation, and communication to the public.

ANI’s claims are based not on isolated instances of copying, but on the large-scale and systematic absorption of its journalistic content for the purpose of developing a commercial AI product. It contends that its news reports require significant editorial expertise, labour, and financial investment, and that allowing AI developers to freely exploit such information for training purposes would undermine the economic incentive that copyright law intends to safeguard. ANI also claims that AI-generated outputs have the potential to replace original news reporting, which would hurt its readership, licensing business, and overall market value.

OpenAI, on the other hand, contests the allegations of infringement, claiming that its training method does not entail the retention or reproduction of expressive content in any copyright-relevant manner. It claims that publicly available data is used solely to extract non-expressive statistical patterns and linguistic relationships, and that the model does not retain or recall copyrighted materials verbatim. On this premise, OpenAI invokes the doctrine of fair dealing under Section 52 of the Copyright Act, characterising AI training as a form of research or technical processing that does not interfere with the normal exploitation of the copyrighted work.

The Delhi High Court issued notice in the matter and appointed amici curiae to assist the court on complex issues such as generative AI, copyright infringement, and territorial jurisdiction over overseas AI corporations that provide services in India. The court’s focus on these considerations demonstrates an appreciation of the doctrinal and technological novelty involved, particularly in determining whether existing copyright principles can be appropriately applied to AI training techniques.[13]

The importance of ANI v. OpenAI goes beyond the parties’ immediate interests. It is India’s first significant judicial engagement with the legitimacy of generative AI training under copyright law, and it raises broader questions about the balance between innovation and creators’ rights in the digital economy. The case’s outcome is expected to influence future copyright disputes involving AI systems, as well as legislative or regulatory measures aimed at defining the permissible bounds of AI training in India.

  4. Comparative Perspectives

The United States litigation in Bartz v. Anthropic PBC[14] provides a relevant comparison to the Indian dispute in ANI Media Pvt. Ltd. v. OpenAI Inc. In that dispute, a group of authors claims that Anthropic unlawfully obtained and used copyrighted books to train its generative AI model, “Claude,” without consent or compensation. The plaintiffs claim that the large-scale ingestion of entire copyrighted works for training purposes is a blatant infringement of the reproduction right, regardless of whether the AI model produces verbatim copies.

Anthropic has defended its practices by invoking the US fair use doctrine, claiming that AI training is a highly transformative activity that extracts abstract linguistic patterns rather than exploiting the expressive value of the underlying works. It also contends that such use does not effectively replace the original works in the market. The case highlights the flexibility of section 107 of the U.S. Copyright Act, which enables courts to weigh competing interests and to accommodate specific forms of AI training within existing copyright doctrine.

A similar line of reasoning emerged in Kadrey v. Meta Platforms, Inc.,[15] in which authors challenged Meta’s alleged use of copyrighted books to train its large language models. Meta has maintained that training with copyrighted material is fair use because it is transformative and does not result in the dissemination of protected expression. The plaintiffs, however, argue that wholesale copying of complete works for commercial AI development infringes authors’ economic rights and displaces legitimate licensing markets.

Bartz v. Anthropic and Kadrey v. Meta demonstrate that even in a fair use-oriented jurisdiction, courts disagree on whether commercial AI training should be granted blanket protection. From an Indian perspective, these cases highlight the limited significance of US precedents. Indian copyright law does not recognise an open-ended fair use approach or a separate defence for transformative use. As a result, arguments that may receive doctrinal support in US courts would need specific legislative sanction before they could be accepted within Indian copyright law.

  5. Policy Gaps, Regulatory Challenges, and the Way Forward

Legislative ambiguity over the use of copyrighted works as training inputs for generative AI has created numerous legal and regulatory difficulties in India. The lack of clarity regarding how much of each copyright-protected work is accessed by generative AI systems makes it difficult for copyright owners to determine whether their works have been used as training data. Copyright owners also face barriers arising from the confidentiality of AI development processes and from the fact that generative AI systems can collect large amounts of data from multiple countries. AI developers, for their part, are working in an ambiguous legal environment, since there is no clear guidance on authorised uses of copyrighted works, licensing obligations, or compliance requirements under the Copyright Act, 1957. This growing uncertainty risks both undermining creators’ rights and impeding technological progress.

These concerns are explicitly addressed in the DPIIT Working Paper on Generative AI and Copyright[16], which recognises the tension between safeguarding human creativity and facilitating the development of robust AI systems. The Working Paper warns against uncritically adopting foreign models, such as blanket fair use or text-and-data-mining exceptions, because such approaches may undermine the economic incentives that copyright law seeks to preserve, especially in a country with a large and diverse creative economy. Instead, it underlines the importance of a calibrated framework that enables access to data for AI development while also providing rights holders with fair remuneration and oversight.

Legislative action would be a better solution than the ad hoc judicial expansion of existing exceptions. Possible measures include structured licensing of AI training data, the creation of collective management systems (or the extension of existing collective management) covering AI developers who create or maintain training datasets, and statutory rights for copyright owners in respect of the use of their material to train AI. Requiring AI developers to disclose the types of data they use for training, and subjecting that use to monitoring, may also create accountability between AI developers and copyright owners, build trust in technology companies, and hold those companies responsible for compensating copyright owners. Such a framework would enable India to promote AI innovation while protecting the core principles of copyright law.

CONCLUSION

Generative AI technology has made possible the large-scale use of copyrighted material without prior permission, which presents many challenges to the existing copyright regime. As noted above, AI training implicates the basic exclusive rights protected by the Copyright Act, 1957; moreover, the narrowly framed fair dealing provisions of the Act limit the ability of individuals and businesses to sell or otherwise commercially benefit from the widespread use of content created by others without obtaining a license.

ANI Media Pvt. Ltd. v. OpenAI Inc. illustrates the legal tension between copyright and generative AI technology, and its outcome will offer insight into how courts and policymakers will ultimately address the convergence of the two. Since fair use does not exist as an open-ended doctrine in India, comparative trends in other countries do not provide easy pathways for India. Clear legislative action is therefore necessary to strike a balance between technological innovation and the protection of creators’ rights, allowing generative AI to advance without creating legal instability or infringing copyright.


[1] ANI Media Pvt Ltd v OpenAI Inc CS (COMM) 1028/2024 (Delhi HC pending).

[2] Copyright Act 1957, s 14.

[3] Copyright Act 1957, s 51.

[4] Copyright Act 1957, s 14.

[5] Copyright Act 1957, s 51.

[6] Copyright Act 1957, s 52.

[7] Campbell v Acuff-Rose Music Inc 510 US 569 (1994).

[8] Civic Chandran v Ammini Amma 1996 PTC (16) 329 (Ker HC).

[9] Super Cassettes Industries Ltd v Entertainment Network (India) Ltd (2008) 5 SCC 488.

[10] MySpace Inc v Super Cassettes Industries Ltd 2017 (236) DLT 478 (DB).

[11] SpicyIP, ‘Generative AI, Copyright, and the Limits of Fair Dealing in India’ (SpicyIP Blog, 2024).

[12] ANI Media Pvt Ltd v OpenAI Inc CS (COMM) 1028/2024 (Delhi HC pending).

[13] Hariharan KK, ‘Protecting Creators’ Rights in the Era of AI: Defending ANI in the Case of ANI v OpenAI’ (2025) 5(4) Indian Journal of Integrated Research in Law 113.

[14] ‘Bartz v Anthropic: All You Need to Know About the Largest Copyright Settlement in History’ (The Leaflet, 18 November 2025).

[15] Kadrey v Meta Platforms, Inc (N.D. Cal, 2023).

[16] DPIIT, Working Paper on Generative AI and Copyright: One Nation One License One Payment (Government of India, December 2025).

Authored by: Ms. Katyaini Vemparala, Jindal Global Law School
