ICYMI: AI-Related Copyright Issues Heat Up at End of June

That heat wave we recently experienced in DC was brutal—it seems like it was all anyone was talking about (besides the so-called One Big, Beautiful Bill). But the weather was not the only thing that was scorchingly hot. Over the past week or so, cases and other issues related to the impact of AI on copyright have been equally hot.

Over the last two weeks of June alone, there were two big court decisions in AI-related copyright cases, big developments in two other cases, three new class action suits filed, and a few other significant developments. There has been so much activity at the intersection of copyright and AI that we thought we would compile the biggest AI-related copyright news from this period.

Two AI Training Cases Decided

Bartz v. Anthropic: On June 23, the district court for the Northern District of California issued an order on summary judgment in the Bartz v. Anthropic AI case, finding that “the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act” but also that “[c]reating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.” The order assesses fair use as applied to three separate acts: (1) the use of the works to train a generative model, (2) the conversion of purchased print copies to digital, and (3) the downloading of pirated copies of books to build a “library.” The order finds that the first two qualify as fair use, while the third does not. It concludes that “[w]e will have a trial on the pirated copies used to create Anthropic’s central library and the resulting damages, actual or statutory (including for willfulness).” The Copyright Alliance published a blog post about the decision.

Kadrey v. Meta: On June 25, the district court for the Northern District of California issued an order on the summary judgment motions in the Kadrey v. Meta AI case, finding that the use of the books at issue was “highly transformative” under the first fair use factor and that, based mostly on a lack of “meaningful evidence on the effect of training LLMs like Llama with their books on the market for those books” and, to a lesser extent, on the transformative nature of the use, the fourth fair use factor weighed in favor of Meta. However, the court notes that “[i]n cases involving uses like Meta’s, it seems like the plaintiffs will often win, at least where those cases have better-developed records on the market effects of the defendant’s use. No matter how transformative LLM training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books. And some cases might present even stronger arguments against fair use.” The Copyright Alliance published a blog post about the decision.

Three New Class Actions Filed

Justice v. Udio and Justice v. Suno: On June 16, country music artist Anthony Justice (aka “Tony Justice”) filed lawsuits against AI music generators Uncharted Labs (aka Udio) and Suno, alleging that the companies used his sound recordings without authorization to train their models and that the models reproduce exact or near-exact replicas of his songs. The complaints, filed in the Southern District of New York against Udio and in the District of Massachusetts against Suno, argue that the unauthorized use of the sound recordings does not qualify as fair use, citing the U.S. Copyright Office’s recent report on AI training for support. Each proposed class action includes one cause of action for violation of the reproduction right and one for violation of the right to prepare derivative works.

Bird v. Microsoft: On June 24, a group of authors including Kai Bird, Jia Tolentino, and Daniel Okrent filed a lawsuit against Microsoft over the use of pirated copies of the plaintiffs’ books to train its AI model, Megatron. The complaint alleges that Microsoft copied a dataset of nearly 200,000 pirated books to train Megatron, enabling the AI model “to generate a wide range of expression that mimics the syntax, voice, and themes of the copyrighted works on which it was trained.” Additional information is available on Reuters’ website.

Developments in Two Other Cases

Thomson Reuters v. Ross: On June 17, the Court of Appeals for the Third Circuit granted a petition by legal research service Ross Intelligence (Ross) to appeal two questions: (1) whether Westlaw headnotes are protected by copyright and (2) whether Ross’ copying of those headnotes constituted fair use. In 2020, Thomson Reuters sued Ross, a competing legal research service, for copyright infringement, alleging that Ross obtained copyrighted legal content from a Westlaw subscriber to develop its own competing product based on machine learning. In February of this year, the district court for the District of Delaware granted Thomson Reuters’ motion for summary judgment, finding that the Westlaw headnotes were copyrightable and rejecting Ross’ fair use defense.

Getty v. Stability AI (UK Case): On June 25, Getty Images dropped its primary copyright infringement claims related to AI training and outputs in its ongoing UK lawsuit against Stability AI. The case continues on Getty’s claims of secondary copyright infringement as well as its trademark infringement claims. In its closing arguments, Getty said it dropped the claims because of weak evidence (of infringing activity occurring in the UK as opposed to the U.S.) and a lack of knowledgeable witnesses from Stability AI, and that the move was a strategic one allowing Getty and the court to focus on what it believes are stronger, more winnable allegations. (The events in this case do not affect Getty Images’ U.S. lawsuit against Stability AI, which remains active.)

Important New Study

A study conducted by a team of computer scientists and legal scholars from Stanford, Cornell, and West Virginia University, titled Extracting Memorized Pieces of (Copyrighted) Books from Open-Weight Language Models, shows that open-weight large language models (LLMs) can replicate substantial parts of ingested copyrighted works in their generated output—evidence that LLMs have “memorized” content and that copies reside within the model parameters. Specifically, the study showed that a version of the Llama model “memorizes” some books, such as Harry Potter, 1984, and The Hobbit, almost entirely. The study also found that the Llama model’s memorization of certain books increased from the first version to subsequent versions, suggesting that, despite the potential legal liability, Meta took few steps to prevent memorization in later versions of the model. More information about the study can be found in Understanding AI’s article.


If you aren’t already a member of the Copyright Alliance, you can join today by completing our Individual Creator Members membership form! Members gain access to monthly newsletters, educational webinars, and so much more — all for free!
