AI voice cloning through the lens of copyright laws: challenges on the rights of singers

June 9, 2024

Introduction

AI and deepfake technologies are steadily on the rise, and the music industry has begun to adopt them as well. These tools are used to replicate the voices of singers, living or dead, and of other famous figures. This is a complex scenario that should be analyzed meticulously, as it has legal, technological and, not least, ethical elements. The legal status of a voice once it has been AI-generated, and the rights available to vocalists, remain a grey area. Ethically, the perpetual preservation of late singers’ voices, with proper consent from family members and fair compensation, is seen as a positive development. However, there are significant concerns about breaching the identity or likeness of musicians. This could lead to a future where trained singers of repute are replaced by AI-generated voices that are flawless, affordable, and quickly produced. This article undertakes a legal analysis of voice cloning in India and under the prevailing international regime, to sensitize all concerned to the need for copyright to protect not just facial images but also voices.

Background

AI-generated voice models can be classified into two types:

User-Input Based: These are created with predetermined inputs, using the voices of well-known personalities and singers for training. They involve human (user) involvement, as data is fetched for training the AI software. For example, in 2021, a South Korean startup recreated singer Kim Kwang Seok’s voice. Broadcaster SBS paid his estate a lump-sum fee to use his voice, with the content partially released on YouTube.
Autonomously Generated: These require minimal human instruction and are created without user direction. A notable example is the AI-generated podcast featuring Steve Jobs’ voice, which was created using a text-to-speech system trained on numerous recordings of him.

Protection of artists’ voices under copyright laws

An artist’s voice is not inherently protected under copyright law. However, the voice of the singer embodied in a sound recording, including an AI-generated cloned work, can be protected, as fixation is a sine qua non of copyright protection.

Landmark case Midler v. Ford Motor Co ^[i]

In this case, Bette Midler’s unique voice was emulated by a “soundalike” in a Ford commercial, causing public confusion. The Ninth Circuit ruled against her copyright claim, stating that a voice is not “fixed” and thus not copyrightable. Although the court acknowledged her voice as distinctive, it emphasized that copyright law only protects works fixed in a tangible medium. Midler’s victory was based on the violation of her right of publicity, highlighting the challenges of protecting voices from misuse through copyright claims.

In Experience Hendrix LLC v. Purple Haze Records Ltd,^[ii] it was held that a performer can safeguard their voice through the performance, but the voice in isolation is generally not protected.

Furthermore, Section 57 of the Copyright Act, 1957 deals with authors’ special rights, known as “moral rights”. In the case of Genda Phool, a song remade from an old Bengali folk song sung by Ratan Kahar, the remake was accused of disregarding the performance and moral rights of the singer.

Previously, singers did not own copyright or receive royalties. Following the landmark 2012 amendment to the Copyright Act, 1957, the Act recognizes “Performer’s Rights” under Section 38, granting singers royalties and requiring their consent for public use of their recordings for 50 years after the year of performance.

Is AI-Generated Music Work Copyrightable?

AI-generated music works satisfy the two main prerequisites of copyright protection:

Originality: Only the vocals are mechanically generated; the user has to perform all the other necessary functions, such as lyric writing, audio editing, mixing and mastering.
Fixation: Prima facie, the AI-cloned work is fixed in tangible form. In Stern Electronics Inc v. Kaufman,^[iii] the court ruled that the audiovisual display in a video game meets the fixation requirement, even though it varies each time the game is played. The elements of the video game are considered fixed due to their pre-programmed nature stored in memory. This case exemplifies how courts have applied the fixation requirement to new technologies, offering insights into how this element might be interpreted for AI models.

However, the challenge arises with authorship, as most jurisdictions require human authorship. Three potential candidates for claiming authorship are:

The voice cloning software developer.
The user who trains the model using the data.
The artist whose voice has been cloned using the audio recording to train the model.

Joint authorship claims are also debated, with arguments for and against each model. Zarya of the Dawn, a graphic novel by Ms. Kashtanova, faced copyright challenges because its images were created using the AI software Midjourney. The US Copyright Office recognized her authorship of the text and arrangement but denied copyright for the AI-generated images, citing a lack of human creation.

AI requires initial data for voice training. If the software developer uses existing recordings without a license from the music labels, that use may amount to infringement.

In that case, can the AI developer claim fair use for processing the data (i.e. the collection of songs)?

Generally, the fair use doctrine can be claimed by the software developer, but its application is highly fact-specific. In the US, the doctrine is codified in 17 U.S. Code § 107, which sets out the four-factor test for claiming fair use; § 107(3) explicitly highlights the amount and substantiality of the portion used in relation to the copyrighted work as a whole. Hence, the use of a voice for training the software needs to be meticulously analyzed based on the facts and circumstances of each scenario. Another important aspect is whether the work generated using AI-based technology is a derivative work. While music that merely follows an existing style is not usually considered a derivative work and is permitted to foster creativity and growth, AI-generated work presents a debatable and complex issue.^[iv]

Recently, Sony Music, a big player in the industry, sent letters to Google, OpenAI, and Microsoft to inquire whether its songs were used for voice cloning, and expressed its openness to negotiating licensing terms with them. Failing that, it would probably pursue legal recourse for infringement.

No room for performers’ rights

Performers’ rights aim to grant performers exclusive control over the use and distribution of their performances. According to the WIPO Performances and Phonograms Treaty, “performers” are defined as “actors, singers, musicians, dancers, and other persons who act, sing, deliver, declaim, play in, interpret, or otherwise perform literary or artistic works or expressions of folklore”.

In the context of AI voice-cloned works, traditional notions of performance do not apply. AI software replicates the voices of established singers without copying a specific performance, making it almost impossible for the singer to claim infringement of traditional performance rights. In India, the copyright statute does not provide any kind of legal protection to the voice of a singer. Section 2(q) of the Copyright Act, 1957 defines a “performance”, in relation to performers’ rights, as a presentation made live by one or more performers, which may be visual or acoustic in nature. Consequently, voice owners possess neither performance rights nor the right to receive royalties, as the performance is not rendered by the individual.

However, an argument based on violation of moral rights, or special rights, can be put forth. An AI-generated work may modify, distort or mutilate the singer’s vocals, prejudicially affecting the artist’s reputation. For example, using the voice for songs with obscene, racist, hate-spreading or anti-national lyrics might deceive listeners into believing that the singer originally performed the song.

Appropriation of identity and personality rights

Personality rights protect an individual’s name, likeness or other unique aspects of their persona. However, in most jurisdictions, the right of publicity does not extend posthumously. Only twenty-six states in the U.S. recognize a post-mortem right of publicity, with Florida, for example, granting this right for forty years after death.

In India, the Madras High Court observed, in a case involving the renowned actor Rajinikanth, that no specific statute in India defines personality rights. Celebrities such as Amitabh Bachchan and Anil Kapoor have initiated legal action in response to emerging deepfake technologies, such as generative AI and machine learning, that could lead to online misuse of a celebrity’s persona. In the landmark case Anil Kapoor v. Simply Life India & Ors,^[v] the Delhi High Court noted that features identified by the public as markers of a celebrity’s personality are loosely referred to as “personality rights”. However, it is well established that most jurisdictions, including India, do not recognize post-mortem rights of publicity. This was affirmed in Krishna Kishore Singh v. Sarla A Saraogi & Ors,^[vi] where the court held that rights to privacy, publicity, and personality are not inheritable and cease upon the individual’s death. Moreover, there are still no specific precedents for protecting singers’ voices under copyright laws.

The main ethical considerations in this regard are consent, deception and transparency. Each of these elements presents unique challenges and opportunities for the music industry and beyond.^[vii]

Consent is a critical ethical issue in AI voice cloning. A notable case involved Anthony Bourdain, a celebrity chef, whose voice was reproduced posthumously in a documentary. Fans and audiences came to know of this only after the makers revealed it. But Bourdain’s wife, Ottavia, claimed she had not consented to the use of his voice.^[viii]

New global laws and initiatives addressing AI voice cloning

ELVIS Act: Tennessee’s ELVIS Act marks a significant step by explicitly recognizing a person’s voice as a protected property right for the first time. This includes both their actual voice and any simulations thereof. Violations of the ELVIS Act can lead to civil actions or criminal charges. The Ensuring Likeness Voice and Image Security (ELVIS) Act aims to protect podcasters and voice actors, at all levels of fame, from the unscrupulous use of their voices.

No AI FRAUD Act: At the federal level, the No AI FRAUD Act was introduced in January 2024 and is intended to protect the voice and visual likeness of all individuals from unauthorized recreation by generative AI technologies.

In the UK, compositions created by AI may be protected under copyright law if they are “computer-generated”, that is, generated by computer in circumstances such that there is no human author of the work, as outlined in the Copyright, Designs and Patents Act 1988 (CDPA).

European Union’s AI Act: The EU AI Act is crucial for the music industry, as it acknowledges the use of copyrighted content in the training of AI software. Voice cloning has lately led to a surge of copyright infringement suits against AI software developers. A salient feature of this law is its treatment of “general-purpose AI” models, which keeps a check on the authenticity and credibility of the content used for training. It mandates the authorization of the rights holder concerned.

Recommendations and suggestions

There is a pressing need to extend copyright law to cover AI-generated voice works. This approach should adhere to ethical considerations. The suggested framework is as follows:

Ensure that the original singer’s contribution is acknowledged in all uses of the AI-generated voice. This includes credits in music tracks, albums, promotional materials, and digital platforms.
Promptly establish licensing standards that outline the terms of use of the original singer’s voice. These would include details on how the voice can be used, any limitations, and the duration of the license.
Emphasize obtaining explicit “consent” directly from the person whose voice is used, or from the estate in the case of posthumous work.
Implement a royalty payment system that accounts for factors such as the commercial success of the AI-generated content and its commercial use.
Mandate the insertion of a “disclaimer” whenever an AI-generated voice is used, ensuring transparency for the audience or users.

Conclusion

The proposed framework should focus on balancing the interests of various stakeholders, such as AI developers and end users. Control over one’s identity and responsible use of AI should be the motto, without hampering creativity and innovation. The developments discussed above indicate growing recognition of the need for legal frameworks to govern the use of AI in creative industries.

The irresponsible and unethical use of AI technology in the music industry is dangerous and calls for serious reflection. The common law right of publicity is inadequate, as it primarily grants protection against commercial uses of an individual’s identity and does not extend to non-commercial uses. Additionally, personality rights are not recognized posthumously, which fetters the ability of the family members of deceased singers to bring a legal action. Cultural and artistic integrity should be considered, especially in posthumous releases, to ensure that they align with the artist’s known values, style and principles. Hence, voice protection should enter the copyright regime in the near future.

[i] Midler v. Ford Motor Co., 849 F.2d 460 (9th Cir. 1988). https://indiancaselaws.wordpress.com/2020/07/08/midler-v-ford-motor-co-others/

[ii] Experience Hendrix, L.L.C. v. Purple Haze Records Ltd., [2003] EWHC 1315 (Ch) (Eng.). https://www.5rb.com/wp-content/uploads/2013/10/Experience-Hendrix-v-Purple-Haze-ChD-24-Feb-2005.pdf

[iii] Stern Elecs., Inc. v. Kaufman, 523 F. Supp. 635 (E.D.N.Y. 1981).

[iv] AI and Deepfake Voice Cloning: Innovation, Copyright and Artists’ Rights. https://www.cigionline.org/static/documents/DPH-paper-Josan.pdf

[v] Anil Kapoor v. Simply Life India & Ors, CS (COMM) 652/2023 (Del. HC Sept. 20, 2023). https://indiankanoon.org/doc/113724486/

[vi] Krishna Kishore Singh v. Sarla A. Saraogi, CS (COMM) 187/2021 (Del. HC July 11, 2023).

[vii] Prachi Pat, AI Voice Enters the Copyright Regime: Proposal of a Three-part Framework, 34 Fordham Intell. Prop. Media & Ent. L.J. (2024).

[viii] Bryn Wells-Edwards, What’s in a Voice? The Legal Implications of Voice Cloning, 64 Ariz. L. Rev. 1213 (2022).

Adv. Achyuth B Nandan

Author

Adv. Achyuth B Nandan is a dedicated advocate with a passion for law and a specialized focus on Intellectual Property Rights. Currently pursuing postgraduate studies in Intellectual Property Law at the prestigious Indian Institute of Technology Kharagpur (IIT KGP), Achyuth is committed to advancing knowledge and expertise in this dynamic field.

With a keen interest in Geographical Indications, Traditional Knowledge, and Biodiversity Research, Achyuth is particularly enthusiastic about protecting the unique cultural and natural heritage of various communities. His academic pursuits are complemented by active involvement in research projects related to the legal aspects of biodiversity conservation and the preservation of traditional knowledge systems.