Fair use and AI has been a popular topic recently given the flurry of developments in generative AI systems. As discussed in the first installment of this blog, it is impractical to make broad predictions about how courts might rule on fair use and AI. However, after a high-level analysis of the four fair use factors, it is helpful to look at a few influential fair use cases for guidance on how courts might analyze those factors when copyrighted works are used as training materials for AI. We picked the cases most frequently referenced when discussing fair use of training materials for AI.
Authors Guild v. Google
This is the case most AI developers reference when they say AI use of copyrighted training materials qualifies as a fair use. It is important to revisit the court’s analysis, because the principles from this case as applied to the typical AI-training materials fact pattern, particularly on the court’s transformative fair use reasoning, do not appear to be so clear-cut in favor of a fair use finding.
While the Second Circuit held that Google’s copying of books to create a searchable database in Authors Guild v. Google qualified as fair use, its decision was limited significantly to the specific facts of that case. Those facts included not only the actions Google took to secure the reproductions of the books but also the fact that Google was using the books to create a searchable database that would provide information about them. Instead of using copyrighted works to create a new product that could usurp the market for the underlying work, the court found that Google used the books to shed light on information about each book, for example, pinpointing where and how many times the word “whale” is used in Moby Dick. The court noted that Google also took significant steps to secure the copies of books it used in its database, such as only showing “snippets” of works to highlight a search term and implementing anti-hacking measures. Due to these security measures, the court concluded that there was little risk that Google’s actions could serve as a substitute for the copied works.
Moreover, the court found that Google’s reproduction of copyrighted works did not create significant market harm for copyright owners. The Second Circuit held that, although snippet view would surely cause some decrease in sales, demand satisfied by snippet view would be for the work’s factual elements, not its creative, protected elements.
Unlike the use in Authors Guild v. Google, generative AI training does not provide factual information about the copyrighted works. Instead, most generative AI systems reproduce and draw on the expressive elements of the copyrighted works as part of a process that results in works that would often act as market substitutes for the training materials, to say nothing of the harm caused to copyright owners who already offer licenses for AI training. A market exists for copyright owners to license their works for use in AI training datasets, and a court granting a fair use exception would destroy that market.
American Geophysical Union v. Texaco
American Geophysical Union v. Texaco may be a particularly important case in considering AI as an emerging technology. In Texaco, a commercial research company created an internal policy of photocopying and disseminating journal articles to hundreds of scientists. On appeal, the Second Circuit held that, considering the existing licensing market for photocopying, the company’s practice harmed the publisher’s right to derive value from its copyrighted work. Although the licensing market in Texaco was relatively new and still developing, the court held that its existence weighed against fair use because the researchers had an alternative to infringement: simply licensing their photocopies. Although reproduction of the work was internal to the company and article copies were not distributed to the public, the Second Circuit focused on the fact that the Copyright Act grants the exclusive right of reproduction to the copyright holder. Texaco affirms the importance of this right, standing alone, even when the right of distribution has not been implicated. The research company also argued that its use was transformative because photocopying converted articles into a “useful format,” and because the copying aided in scientific research. The court disagreed, holding instead that a fair use exception cannot apply to a process, only to a work of authorship, and that the defendant could not “gain fair use insulation . . . simply because such copying is done by a company doing research.” The fact that the company’s research was done for commercial purposes further supported the denial of a fair use exemption.
The general process of training generative AI on copyrighted materials shares several characteristics with the defendant’s photocopying in Texaco. The technological process of generative AI training seems to wholly reproduce a copyrighted work. Furthermore, like the burgeoning photocopying license market in Texaco, a licensing market exists here: copyright owners offer AI training licenses. As noted in the discussion of Authors Guild v. Google above, allowing a fair use exception for AI training would effectively destroy this licensing market.
Perfect 10 v. Amazon.com
Perfect 10 v. Amazon may become a battleground for what constitutes a transformative means to an end rather than a non-transformative use. In this case, Perfect 10 sued Google (and others) for its use of thumbnail versions of Perfect 10’s copyrighted images on its search platform. The Ninth Circuit held that the defendants’ use was transformative because the thumbnails functioned as a “pointer,” providing social benefit by merely pointing to where a consumer could find the full images. The court’s decision was also influenced by the fact that, because the image copies Google made were so small, they could not act as substitutes for Perfect 10’s copyrighted works (and were also significantly smaller than the smallest resolution copies Perfect 10 licensed). It should be noted that Google’s use in this case did not merely repackage copyrighted works to recapture the artistic value they provided. Instead, the court found that Google created an entirely novel value by providing information about the works copied, which is not the case with AI-generated art and the training of AI systems on copyrighted works.
Fox News v. TVEyes
Fox News v. TVEyes further demonstrates the tension between the first and fourth factors of the fair use analysis. In this case, defendant TVEyes offered a subscription service allowing consumers access to television programming to find, download, and share specific content like certain dialogue or other features of that programming. While the Second Circuit held that TVEyes’ service of increasing technological efficiency of viewing and sharing relevant clips was somewhat transformative, this transformative purpose could not outweigh the great harm to Fox by usurping its licensing market for clips. The court emphasized that transformativeness requires more than a “repackage” of a copyrighted work by “altering the [copyrighted work] with new expression, meaning or message.” Although TVEyes “modestly” transformed the work, the court held that the fourth factor outweighed the first because TVEyes “undercut Fox’s ability to profit from licensing searchable access to its copyrighted content to third parties.” The court correctly repositioned the importance of potential market harm and displacement of a copyright owner’s market within the fair use analysis.
Generative AI trained on copyrighted works may not fare much better than the TVEyes program under a similar analysis. Many developers argue that training generative AI creates transformative value from copyrighted works. However, like TVEyes, these developers undercut copyright owners’ ability to license their works for AI development, a market that had already existed before the recent AI boom. The Second Circuit strongly emphasized the importance of the fourth factor in its analysis of TVEyes’ program, and this factor, particularly in light of existing markets, must be given its proper weight in the fair use analysis of a case involving the use of copyrighted works for generative AI training.
Future cases on generative AI training will be highly fact dependent. Even so, the principles of the fair use cases discussed in this blog, particularly the weight courts have given to existing licensing markets and the limits they have placed on transformative use, may provide insight into how courts will rule in this area.