Lyric Lockdown: Music Publishers’ Challenge to Training AI

Josie Croce, Contributing Member 2023-2024

Intellectual Property and Computer Law Journal

I. Introduction

The development of AI models is coming into conflict with copyright law, as copyright holders allege that AI companies are infringing their rights by using copyrighted works to train their models. A similar conflict arose with the development of internet search engines, when copyright holders resisted permitting search engines to use and display their works. In response, courts allowed search engines to continue their unauthorized use of copyrighted works under the fair use doctrine, which “permits courts to avoid rigid application of the copyright statute when, on occasion, it would stifle the very creativity which that law is designed to foster.”[1]

Major music publishers Concord Music Group Inc., Universal Music Corp., and ABKCO Music, Inc. filed suit in October of 2023 against Anthropic PBC for infringement of their copyrighted song lyrics.[2] This suit follows others filed by creative copyright holders against AI companies: in January of 2023, three visual artists filed suit together against AI companies Stability AI, Midjourney, and DeviantArt;[3] in September of 2023, a group of authors filed suit against AI company OpenAI.[4] If the music publishers get the outcome they hope for, AI companies will be required to obtain permission and pay licensing fees to use copyrighted works to train their AI models.[5] These cases will test the limits of copyright protections against the training of artificial intelligence.

II. Background of Search Engine Litigation

Search engines operate by combing through the web, indexing the information found on the web pages and permitting users to search the index.[6] In doing so, these search engines necessarily utilize and display copyrighted material to organize and produce results for the user.

This unauthorized use is lawful under the fair use exception, a statutory defense to infringement which “promotes freedom of expression by permitting the unlicensed use of copyright-protected works in certain circumstances.”[7] Section 107 of the Copyright Act explains that certain uses, such as criticism, news reporting, teaching, and research, are examples of activities that may qualify as fair use.[8] To determine whether an unlicensed use of a copyrighted work falls under the fair use exception, § 107 calls for the consideration of four factors: (1) the purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes; (2) the nature of the copyrighted work; (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and (4) the effect of the use upon the potential market for or value of the copyrighted work.[9]

A few prominent cases (Kelly v. Arriba Soft Corp., Perfect 10, Inc. v. Amazon.com, Inc., Bill Graham Archives v. Dorling Kindersley, Ltd., and Authors Guild v. Google, Inc.) demonstrate how courts have considered the fair use defense, including when it is invoked by search engines. These decisions provide context for, and identify defects in, the music publishers’ complaint.

Kelly v. Arriba Soft Corp. (9th Cir. 2003)

In Kelly v. Arriba Soft Corp., the 9th Circuit considered whether the display of small pictures (‘thumbnails’) of the plaintiff photographer’s images by the defendant’s internet search engine was permissible under the fair use doctrine.[10] The defendant’s search engine ‘crawled’ the web for images to index and permitted users to type a search term. Users were then provided a list of results as thumbnails, which could direct them to the full-sized image and the link to the original website.[11]

The court held that the defendant’s use of thumbnails of the plaintiff’s photographs was fair use, emphasizing that the first and fourth factors weighed in the defendant’s favor.[12] The purpose and character (first factor) of the defendant’s use was entirely different (“transformative”) from that of the plaintiff’s.[13] The plaintiff’s works “were intended to inform and engage the viewer in an aesthetic experience,” while the defendant’s thumbnails were “unrelated to any aesthetic purpose” and were intended to “help index and improve access.”[14] Relevant to the first factor was the commercial purpose of the use, but the court noted that the more transformative the work, the less important commercialism is.[15] The effect of the defendant’s use on the market for the photographs (fourth factor) was insignificant; the low-quality thumbnails could not be an adequate substitute for the full-resolution images, and the plaintiff’s ability to sell or license his images was unaffected.[16]

Perfect 10, Inc. v. Amazon.com, Inc. (9th Cir. 2007)

In a similar 9th Circuit case involving a defendant’s use of thumbnails of the plaintiff’s photographs, the court held the use was fair even though the defendant’s thumbnails permitted users to download free images of the photographs on their phones rather than buy the plaintiff’s reduced-size images.[17] The court concluded that the “significantly transformative nature of [the defendant]’s search engine, particularly in light of its public benefit, outweighs [the defendant]’s superseding and commercial uses of thumbnails.”[18] Considering the fourth factor, the court held that market harm could not be presumed because the potential harm to the plaintiff’s market was hypothetical and because of the highly transformative nature of the use.[19]

Bill Graham Archives v. Dorling Kindersley (2nd Cir. 2006)

This case held that the defendant’s reproduction of plaintiff’s copyrighted artistic images, originally used by the band “The Grateful Dead” for posters and tickets, in a cultural history book of the band was fair.[20] The parties initially tried to reach an agreement to license the images, but the agreement fell through and the book was published anyway.[21] Despite the defendant’s failure to pay licensing fees, the court reasoned that a more transformative use is less likely to cause market harm.[22] The court concluded that the defendant’s use does not harm the market for the plaintiff’s sale of its copyrighted work and that there is no “market harm based on [the plaintiff’s] hypothetical loss of license revenue from [the defendant’s] transformative market” and held that the fourth factor weighed in favor of fair use.[23]

Authors Guild v. Google (2nd Cir. 2015)

Here, the court held that the defendant’s scanning of more than twenty million books without payment of licensing fees, in order to create an index and help users find books, was a fair use.[24] Following a line of reasoning similar to that of the Perfect 10 decision, the court concluded that “we see no reason in this case why Google’s overall profit motivation should prevail as a reason for denying fair use over its highly convincing transformative purpose, together with the absence of significant substitutive competition, as reasons for granting fair use.”[25]

III. Landscape of Copyright Infringement Cases Against AI Companies

In January of 2023, a group of visual artists filed suit against three AI companies: Stability AI, Midjourney, and DeviantArt.[26] In their complaint, the artists allege that Stability AI used the artists’ copyrighted visual works without permission to train its AI model “Stable Diffusion” and supplied this model to Midjourney and DeviantArt.[27] Further, the artists allege that because their copyrighted images were used to train the model, the images generated by the model are derivative works.[28]

In ruling on the defendants’ motions to dismiss, the judge trimmed down the suit to include only one of the artists and her registered copyrights and required an amendment to clarify the claims against Midjourney and DeviantArt, but permitted the case to proceed to determine whether copying occurred in the training of Stable Diffusion or occurs when Stable Diffusion is run.[29]

In September of 2023, a group of well-respected authors filed suit against the company OpenAI over its model, “ChatGPT.”[30] As in the visual artists’ suit, the authors claim that OpenAI infringed on their copyrighted works when it used the authors’ books to train ChatGPT, and that ChatGPT unlawfully produces derivative works when it generates summaries of the authors’ books.[31] Mary Bly, an author in this suit, explained that “this lawsuit is important because it establishes a line in the sand…If you’re going to train things in the future on my books, you need to license them.”[32]

IV. Music Publishers’ Complaint

In the present case, the plaintiff music publishers allege that Anthropic’s AI model, referred to as “Claude,” infringes their copyrights in multiple ways.[33] First and foremost, the publishers take issue with the way that Claude is trained: “by scraping and ingesting massive amounts of text from the internet…[including] the lyrics to innumerable musical compositions for which Publishers own or control the copyrights.”[34] Because Anthropic lacks the publishers’ permission to use their works to train Claude, the publishers claim that this use is infringement.

Second, arising out of this ‘ingestion’ of publishers’ song lyrics, the publishers allege that when Claude is prompted to provide the lyrics of a song, it can generate identical copies of the lyrics. The publishers explain that music lyric websites pay licensing fees to the publishers to copy and distribute these song lyrics, so Anthropic’s failure to license publishers’ song lyrics deprives publishers of those fees and of their control over the copying and distribution of their copyrighted works.

Beyond the ‘ingestion’ and copying of publishers’ song lyrics, the publishers claim that Claude copies portions of the publishers’ song lyrics when prompted to do tasks like “write a song about a certain topic, provide chord progressions for a given musical composition, or write poetry or short fiction in the style of a certain artist or songwriter.”[35]

In sum, the publishers allege that Anthropic infringes on copyrighted song lyrics when Claude is trained using these lyrics and when Claude generates portions of or entire copies of these lyrics. Specifically, Anthropic is accused of infringing on publishers’ exclusive rights under 17 U.S.C. § 106 to reproduce their works, prepare derivative works, distribute copies, and display the works publicly. As a result of this infringement, the publishers argue that Anthropic is enriched while publishers and their songwriters are deprived of licensing fees. The publishers go on to assert that, “If left unchecked, Anthropic will continue to infringe Publishers’ rights and cause damage on a broad scale to Publishers and the songwriters they represent, supplanting the fruits of human ingenuity and creativity with automated infringements that simulate genuine expressive works.”[36]

V. Potential Outcomes

It is unclear whether courts will extend the fair use defense to AI companies’ unlicensed use of copyrighted works like books, art, and musical compositions to train their models. On the one hand, courts have favored the defense of fair use where the character of the defendant’s use was truly transformative, serving a purpose totally unrelated to that of the plaintiff’s original work. Here, the defendant’s use in creating AI chatbots is quite different from the plaintiffs’ use of the musical compositions, which is primarily commercial licensing.[37]

Further, courts have favored the defense even when the plaintiff could have granted a license to the defendant, because a more transformative use implies a lesser likelihood that the plaintiff will be harmed in the marketplace. Here, the defendant may be able to invoke the fair use defense, despite the possibility that it could have paid licensing fees for the plaintiffs’ musical compositions, if its use is so transformative that it outweighs the use’s commercial purpose.

On the other hand, in comments submitted to the Copyright Office, the American Society of Composers, Authors and Publishers (“ASCAP”) argues that “to the extent the AI industry exploits and benefits financially from the creativity and labor of human artists, writers, and other creators, it must compensate these creators fairly for the use of their works.”[38] Representing nearly one million songwriter, composer, and publisher members, ASCAP is responsible for licensing and enforcing public performance rights of its members’ musical compositions.[39] ASCAP emphasizes the long-standing practice of collective licensing in the music industry (such as when the industry shifted to digital music consumption on platforms like iTunes and Spotify) and argues that AI should be no exception.[40]

Underlying the allegations against AI companies is the fear that authentic human creativity by artists, authors, and musicians will be trampled by the technological superpower of AI. Creators are calling on copyright law, which aims to protect and encourage original human creation, to stop what they believe is unchecked exploitation by AI companies using original copyrighted work to train AI models. It is unquestionable that the use of these copyrighted works is essential to the training and creation of AI models: OpenAI’s CEO conceded to Congress that if copyrighted works were not used, it would “lead to significant reductions in model quality.”[41] Although the creators assert that this use is deeply unfair, it remains unclear whether the courts will find that the distinctive and innovative purposes of AI models overcome the creators’ claims.


[1] Dr. Seuss Enters., L.P. v. Penguin Books USA, Inc., 109 F.3d 1394, 1399 (9th Cir. 1997) (internal quotation marks omitted).

[2] Blake Brittain, Music publishers sue AI company Anthropic over song lyrics, Reuters (October 19, 2023, 2:01 PM), https://www.reuters.com/legal/music-publishers-sue-ai-company-anthropic-over-song-lyrics-2023-10-18/.

[3] Min Chen, Artists and Illustrators Are Suing Three A.I. Art Generators for Scraping and ‘Collaging’ Their Work Without Consent, Artnet News (January 24, 2023), https://news.artnet.com/news/class-action-lawsuit-ai-generators-deviantart-midjourney-stable-diffusion-2246770.

[4] Alexandra Alter and Elizabeth A. Harris, Franzen, Grisham and Other Prominent Authors Sue OpenAI, The New York Times (September 20, 2023), https://www.nytimes.com/2023/09/20/books/authors-openai-lawsuit-chatgpt-copyright.html.

[5] Comments of the American Society of Composers, Authors and Publishers on Artificial Intelligence and Copyright, Docket No. 2023-6.

[6] In-depth guide to how Google Search Works, Google Search Central, https://developers.google.com/search/docs/fundamentals/how-search-works.

[7] U.S. Copyright Office Fair Use Index, https://www.copyright.gov/fair-use/ (last updated November 2023).

[8] 17 U.S.C. § 107.

[9] 17 U.S.C. § 107.

[10] Kelly v. Arriba Soft Corp., 336 F.3d 811 (9th Cir. 2003).

[11] Id. at 815-16.

[12] Id. at 818-22.

[13] Id. at 818.

[14] Id.

[15] Id.

[16] Id. at 821-22.

[17] Perfect 10, Inc. v. Amazon.com, Inc., 508 F.3d 1146, 1165-66 (9th Cir. 2007).

[18] Id. at 1166.

[19] Id. at 1168.

[20] Bill Graham Archives v. Dorling Kindersley Ltd., 448 F.3d 605 (2d Cir. 2006).

[21] Id. at 607.

[22] Id. at 615.

[23] Id. at 615.

[24] Authors Guild v. Google, Inc., 804 F.3d 202 (2d Cir. 2015).

[25] Id. at 219.

[26] Complaint & Demand for Jury Trial, Andersen v. Stability AI Ltd., No. 3:23-cv-00201 (N.D. Cal. Jan. 13, 2023), ECF No. 1.

[27] Id.

[28] Id.

[29] Order on Motions to Dismiss and Strike, Andersen v. Stability AI Ltd., No. 3:23-cv-00201 (N.D. Cal. Oct. 30, 2023), ECF No. 117.

[30] Alter and Harris, supra note 4.

[31] Id.

[32] Id.

[33] Complaint, Concord Music Group, Inc. v. Anthropic PBC, No. 3:23-cv-01092 (M.D. Tenn.).

[34] Id.

[35] Id.

[36] Id.

[37] According to the complaint, publishers’ use of the musical compositions includes “promot[ing] [composers and lyricists’] copyrights, protect[ing] their copyrights, and ensur[ing] that they receive proper remuneration for their creative efforts, through commercial licensing of their copyrighted works, including song lyrics.” Id.

[38] Comments of the American Society of Composers, Authors and Publishers on Artificial Intelligence and Copyright, Docket No. 2023-6, at 4.

[39] Id. at 1-2.

[40] Id. at 4-5.

[41] Oversight of A.I.: Rules for Artificial Intelligence: Hearing Before the S. Judiciary Comm. Subcomm. on Privacy, Tech. and the Law, 118th Cong. (2023) (testimony of OpenAI CEO Sam Altman), available at https://techpolicy.press/transcript-us-senate-judiciary-hearing-on-oversight-of-a-i/ (last accessed Sept. 19, 2023).
