Former Copyright Office GC Warns Against Blanket Assertions That AI Ingestion of Copyrighted Works ‘Is Fair Use’

by Jon Baumgarten

On May 17, Sy Damle, a former General Counsel (GC) at the U.S. Copyright Office, was one of five witnesses who testified before the House Judiciary Committee’s IP Subcommittee hearing, titled “Artificial Intelligence and Intellectual Property: Part I—Interoperability of AI and Copyright Law.” During the hearing, Mr. Damle stated, among other things, that AI ingestion of copyrighted work was categorically a fair use. Another former Copyright Office General Counsel, Jon Baumgarten, was watching the hearing and was so troubled by this assertion that he was compelled to write to the Subcommittee to express his views on why he disagrees with Mr. Damle. Mr. Baumgarten’s goal was to provide Subcommittee members with the views of not just one former general counsel, but two. To provide others with the opportunity to read Mr. Baumgarten’s views, and to more clearly understand the different issues at stake regarding the ingestion of copyrighted works by AI systems, we have posted the contents of his letter below.

Re: Hearing on Artificial Intelligence and Intellectual Property: Part I — Interoperability of AI and Copyright Law of the Subcommittee on Courts, Intellectual Property, and The Internet of The House of Representatives Committee on the Judiciary (May 17, 2023)

Dear Chairman Issa and Ranking Member Johnson:

I feel compelled to respond (speaking solely on my own behalf) to what is—or could be easily taken to be—assurance during the May 17 hearing by a well-respected copyright lawyer and former General Counsel of the U.S. Copyright Office, Sy Damle, that outside of some unspecified cases of machine memorization or close reproduction that might occasionally “go too far,” the input side of ingestion and processing by generative AI is almost categorically privileged as “fair use”.[1]

I am also a (now-retired) former General Counsel of the Copyright Office (1976-1979). During my tenure, and in other roles, I participated in the formulation of the 1976 Copyright Act, was responsible for all Copyright Office regulations initially implementing the law, liaised with Congressional Committees, and represented the government in world copyright affairs. Based upon my lengthy copyright and policy experience, I could not disagree more with Mr. Damle’s categorical treatment of fair use. At best, the assertion is over-generalized, oversimplified and unduly conclusory. As both the majority and dissenting opinions in the Supreme Court’s decision in Warhol Foundation v. Goldsmith emphatically reminded us, once again, just last week, the question of fair use is subject to detailed analysis of various factors in each case. Furthermore, Mr. Damle’s blanket assertion that input for generative AI “is fair use” may well be simply wrong. See, e.g., Coffman, Does the Use of Copyrighted Works to Train Artificial Intelligence Qualify as Fair Use? (April 11, 2023).[2]

Mr. Damle’s and other statements made during the hearing remind me of the posture taken by many laypersons and legal experts alike following the introduction of photocopying in the 1960s and its rapid spread into business and education. (See David Owen’s aptly titled Copies in Seconds: The Biggest Communications Breakthrough Since Gutenberg (Simon & Schuster 2004).) The concerns of authors, scientific and textbook publishers, and others were largely dismissed, with many treating photocopying as “clearly” and “certainly” fair use because:

“everyone is photocopying all the time”;

photocopying had become both customary and essential in schools, offices, and non-commercial and commercial research;

copyright restrictions would thwart education, learning, and science;

reprography was a dramatically beneficial new technology that should not be inhibited;

seeking permission from thousands or more dispersed authors with diverse interests was impossible; and

a need to obtain photocopying permissions would place this country at a serious competitive disadvantage with research and development elsewhere.

Yet, after thorough fair use analysis in a number of leading cases of continuing precedential importance, those emphatic assertions of clarity and certainty were definitively proven to be wrong![3]

In light of the understandable concerns of several Subcommittee Members at the hearing that copyright might constrain domestic development of AI, I must emphasize that the judicial determination that not all photocopying was fair use did not diminish or inhibit reprography. Instead, it led to a regime of voluntary collective licensing that facilitates copying, enhances access to knowledge, and compensates authors and copyright owners. This system is proving its worth in new markets and technologies and is mirrored in purpose and effect, though not in all details, in many countries abroad. See generally, Copyright Clearance Center, Making Copyright Work (CCC 2023); and WIPO/IFRRO, Collective Management in Reprography, available at https://www.wipo.int/edocs/pubdocs/en/copyright/924/wipo__pub_924.pdf.[4]

During the hearing, Mr. Damle warned that “a collective licensing regime for AI training would eliminate fair use in this area, replacing it with a rigid assumption that AI training is infringing.” (from the exchange with Rep. Lieu, Tr. pg. 23; emphasis added). However, even assuming for the moment, as a matter of existing law or future policy, some categorical fair or privileged use, that is not the case. Collective licensing regimes can reasonably account for such use in different ways: in negotiating rates, adjusting projections of copying, defining the scope of the license, providing or accepting exceptions, or otherwise. (As one example in the realm of reprographic rights, I understand that the Copyright Clearance Center accepts that libraries will not pay for copies made under congressionally endorsed fair-use-type guidelines for inter-library copying of articles. And in some countries, I understand that collective licenses excuse copying below certain percentages of a book. But it should also be recalled that drawing fine or other lines or classes of copying does not always serve the interest of users in simplicity of access and clearance and avoidance of burdensome record keeping.)

In any case, there is another side to the story of collective licensing, quite different from Mr. Damle’s, that should be heard by Congress. As recognized in many countries and in many contexts worldwide, collective licensing not only assures compensation (including monetization of micro-transactions and flow of funds across borders) to individual creators for the widespread use of their creative works, it also offers great advantages to users of copyrighted works, including:

avoiding the otherwise unavoidable risk of liability inherent in making decisions that involve ambiguous legal standards on a case-by-case basis;

enabling cogent business planning as reproductive and other user technologies change format (e.g., from analog to first generation digital and beyond);

facilitating simple clearances, authorizations, or licenses from innumerable, geographically dispersed authors and rightsholders; and

through cooperative arrangements with and among entities in other countries, resolving complex issues of varying copyright laws and territorially fragmented copyright ownership that would otherwise multiply user risks and confound licensing logistics in a digitally borderless, networked world.

Further, from a governmental and public interest perspective, collective licensing can reduce potential friction among nations party to copyright and trade agreements that arises from differing national copyright principles and traditions, and can help establish common registries and information exchanges that facilitate national and international commerce in rights and copyrighted works. And, taking into account existing and emerging licensing arrangements for text and data mining and machine learning, collective licensing need not preclude private bilateral licenses should users or copyright owners prefer them.

This is just a skeletal outline of a full story that Congress should hear, rather than dismiss collective licensing as an overreach or otherwise.

Thank you for considering these comments.

Respectfully,

Jon Baumgarten

[1] See “. . . the training of AI models will generally fall within the established bounds of fair use.” (S. Damle introductory statement, Tr. pg. 6); and “Just on the training side, if that’s all that happens, then under a long line of cases that I’ve laid out in my written testimony, that is fair use.” (S. Damle exchange with Rep. Lieu, Tr. pg. 23); and: Rep. Lieu: “to train the model, you need to actually download the Taylor Swift songs?” S. Damle: “That’s correct. That’s correct.” Rep. Lieu: “And that — do you view that as fair use?” S. Damle: “That would be fair use.” (Tr. pg. 23). Witness Callison-Burch agreed: “I believe, like Sy [Damle] that pre-training these systems squarely falls within fair use . . .” (Tr. pg. 8).

[2] https://copyrightalliance.org/copyrighted-works-training-ai-fair-use/#:~:text=While%20some%20AI%2Drelated%20uses,disregards%20the%20rights%20of%20creators.

[3] E.g., American Geophysical Union v. Texaco, Inc., 802 F. Supp. 1 (S.D.N.Y. 1992), aff’d, 60 F.3d 913 (2d Cir. 1994); Princeton University Press v. Michigan Document Services, Inc., 99 F.3d 1381 (6th Cir. 1996) (en banc). Disclosure: I was counsel to plaintiff publishers in these and other actions. However, this statement is submitted solely on my own behalf.

[4] Please note that “reprography” and “reprographic rights” today go beyond photocopying to include microform, database storage, digital copying, and sometimes networked transmission. Like the appetite of generative AI, collective licensing of reprographic rights among countries can include more than text, e.g., photographs, illustrations, art works and sheet music. And collective licensing itself extends to more than reprographic rights, including, depending upon country, public performance of music, retransmission of audio-visual works, public performance of sound recordings, “mechanical rights” in music, and more. I have focused on photocopying in the U.S. only for illustrative, comparative, and historical purposes.
