AI and Copyright: AI Policies Must Respect Creators and their Creativities

The exponential development of Artificial Intelligence (AI) systems represents a profound achievement of the digital age that brings with it tremendous opportunities. In fact, many in the creative community are already using or plan to use AI for the creation of a wide range of works and are developing and pushing AI capabilities to explore new creative horizons. But as with many advances in technology, these new opportunities come with challenges and often raise difficult legal questions.

Artists, authors, and many other types of creators are increasingly concerned (and rightly so) about the tendency of some to ignore or discount issues relating to copyright in the AI context. Though the application of copyright law to AI may be tricky at times, it is essential that these issues not be ignored or given short shrift in the AI discussion just because AI is the shiny new toy. Rather than sweep important copyright issues under the rug for fear of slowing AI’s progress, policymakers, lawmakers, stakeholders, and the public must respect the rights of creators and copyright owners and recognize and appreciate the underlying goals and purposes of our copyright system.

The Relationship Between AI and Copyright Law

Well before the recent explosion in AI development, the creative community was already using AI technologies and innovating in the space. On the output side of AI, creators and copyright holders use or actively develop AI technologies as part of their larger creative process. For example, video game developers use AI systems and technologies to provide new and improved gaming experiences, such as when players interact with in-game, non-player characters. Television producers and filmmakers also often incorporate computer-generated imagery (CGI), created with the help of AI programs, into their works. In the music industry, artists and producers are employing AI tools for everything from beat creation to voice modulation.

On the input side of AI, the primary way the creative community drives AI innovation and development is by creating and disseminating copyrighted works that are used to train AI systems. At best, the creative community collaborates with and innovates as a part of the AI community by developing and improving AI technologies, licensing works for use as training material, or using AI as a tool to generate or make new works. At worst, copyrighted works are used by AI technologies without authorization or licenses—sometimes for the purpose of creating works that serve as direct market substitutes for the ingested works—undermining the rights of artists, creators, and copyright owners and their abilities to protect, license, and enforce their copyrights.

Sadly, it is the latter situation that is becoming more widespread in the AI world, where the works of countless creators and copyright owners are being used without permission or compensation. This is especially true, and especially harmful, with commercial AI uses.

Using Copyrighted Works as AI Inputs

An AI’s creative output is only as good as the corpus of creative works it ingests. For example, if a human prompts an AI machine to generate an image that accurately imitates the work of a famous visual artist, like Jean-Michel Basquiat, the AI machine must analyze and copy the expressions that are unique to Basquiat’s works. But what is sometimes lost on AI system developers and users is that the underlying works the AI draws from are, more often than not, created by a human creator. That human creator depends on the rights and protections granted to them by copyright law to commercialize and control their works, including the ability to license their works for use as AI input (or to stop others from using their works without authorization).

Creators and copyright owners also contribute to the development of AI technologies by priming copyrighted works for optimal AI application and development through activities like semantic enrichment, metadata tagging, content normalization, and data cleanup. AI training data sets, databases, or collections of works that are curated and prepared by copyright owners also offer additional benefits, such as secured licensing and permissions from third parties, which reduce privacy and infringement risks for AI developers and users. It is copyright law that incentivizes and protects the investments creators and copyright owners make when creating and preparing these kinds of works for AI.

Text-and-data mining (TDM) is the process through which AI machines develop their unique algorithms and capabilities by analyzing, reproducing, and otherwise using valuable data and expressions often contained in copyrighted materials. TDM uses input materials to develop trends, algorithms, and methods that can generate output works. During TDM, an AI machine might very well be analyzing numbers, statistics, and other non-copyrightable pieces of information. But AI machines that generate output like images, videos, or songs risk displacing or substituting for the very copyrighted works that are used to train the AI. During TDM of copyrighted works, the AI machines are often culling the expressive, copyrightable value contained in the ingested works, as in the Basquiat example above.

Many creators and copyright owners currently offer TDM licenses so their works can be used to train AI systems. Copyright law enables creators and copyright owners to create works that are fuel for AI development, and there must be respect and recognition of the laws and rights that protect their ability to license (or not license) their works. Particularly where AI machine outputs serve as replacements or substitutes in the markets for the ingested works, any artificial disturbance or heavy-handed approaches to manipulate the existing market for TDM licenses for copyrighted works, without being supported by evidence, results in creators and copyright owners subsidizing AI development. When works are used without authorization, licensing markets—the fundamental ability of a copyright owner to control and commercialize their works—are effectively destroyed.

No Broad Exceptions or Justifications Exist for AI Use of Copyrighted Works

The disturbing reality is that many AI companies do not license copyrighted works to train AI machines. Nor are they transparent about the sourcing of their training data sets. Some companies simply scrape existing copyrighted content from the internet including images, text, and software code to use as training data sets. Other companies engage in a practice called “data laundering” where they fund or use data sets created by academic or research institutions for initially noncommercial purposes to train commercial AI machines.

In the discussion surrounding the use of copyrighted works to train AI systems, some stakeholders wrongly justify these practices by arguing that the fair use exception permits such methods wholesale or could justify broad copyright exceptions for AI use. That view is inaccurate, especially where a TDM license is available, the use is commercial, or the resulting AI-generated work harms the actual or potential market for the ingested work. Fair use is so fact-specific that it is an unreliable basis on which to build any broad AI exceptions, or for general claims that these AI practices are excused from what would otherwise be blatant copyright infringement.

AI and Copyright Laws Around the World

For many years now, lawmakers and policymakers in a number of countries, including the United States, have been carefully examining the intersection of copyright law and AI and the implications of this rapidly evolving technology. Even as a global leader in AI technologies, the United States has not deemed it necessary to enact or recommend any new exceptions to copyright law for AI purposes. And for good reason: licensing to support AI applications is robust and, absent contrary evidence, prematurely upending the rights of creators and copyright owners carries the potential for significant harm.

The United States is not alone in its treatment of the AI licensing market and its relation to copyright law. Very few countries have considered AI regulations and policies with respect to copyright law; those that have include Hong Kong, South Korea, Australia, and Canada. Significantly, each of these countries has either declined to take action or postponed decision making as premature. To varying degrees, only the European Union, Japan, Singapore, and the United Kingdom have AI policies and regulations within their copyright laws.

One example of a country with problematic AI-copyright regulations is Singapore, which overbroadly permits unauthorized TDM of copyrighted works, including pirated works, for any purpose, with no ability for rightsholders to opt out or contract around the exception. Policies such as Singapore’s severely undermine the fundamental ability of creators and rightsholders to be compensated for the use of their copyrighted works, discouraging them from creating the works and depriving them of the fruits of their labor. Unfortunately, the United Kingdom is also considering following this troubling precedent, with a proposed exception for TDM of copyrighted works for both noncommercial and commercial uses, with no ability for creators and copyright owners to contract around the exception. Needless to say, many in the U.K. creative community have decried the proposal, and we can only hope that the U.K. will reverse course to avoid undermining a critical part of its economy.

As AI technologies continue to progress at a rapid pace and make incredible advancements, AI stakeholders, courts, policymakers, and the public should keep in mind several key principles when analyzing the intersection between AI and copyright.

  • When formulating new AI laws and policies, it is essential that the rights of creators and copyright owners be respected.
  • Long-standing copyright laws and policies must not be cast aside in favor of new laws or policies obligating creators to essentially subsidize AI technologies.
  • Education is paramount in the AI space. Those leading AI projects should be aware of the legal implications of using copyrighted works as input material, and of those that arise from AI-generated output.

Along these lines, we at the Copyright Alliance recently published our position paper on AI and copyright law issues, which outlines the above key principles and also sets out other detailed positions of the copyright community on AI. It is critical to AI innovation that the creative community’s contributions to and impact on AI are acknowledged and that the foundations of copyright law that made AI possible in the first place are preserved and respected.

If you aren’t already a member of the Copyright Alliance, you can join today by completing our Individual Creator Members membership form! Members gain access to monthly newsletters, educational webinars, and so much more — all for free!
