AI Licensing Isn’t the Enemy of AI Innovation, It’s the Backbone

As copyright lawsuits against AI companies continue to mount, there’s a common thread that runs through most of the more than 100 cases that have been filed: fair use. At a high level, AI companies argue in these cases that training their AI systems on copyrighted works is a permissible fair use because their use is “transformative.” While that argument may sound appealing in theory, it is unlikely to hold up when all four fair use factors are applied by the courts, especially the fourth fair use factor: “the effect of the use upon the potential market for or value of the copyrighted work,” which is often referred to simply as “market harm.” The fourth fair use factor presents a fundamental problem for any AI company that has ingested copyrighted works without permission, especially those that have sourced the copyrighted works from known pirate websites, often referred to as shadow libraries.

The Fourth Factor—the Big Kahuna of Fair Use Factors

Courts must consider all four factors in a fair use analysis, but the fourth factor has long been recognized by the courts as the most important. The Supreme Court made that explicit decades ago in Harper & Row v. Nation Enterprises (1985). It recently reaffirmed that notion in Warhol v. Goldsmith when the Court tightened the connection between the first and fourth factors, emphasizing that when a new work’s purpose is the same or highly similar to the original, it will most likely directly substitute for the original in the market. The reason is straightforward: copyright exists to ensure that creators can earn a living from their work. If a use undermines that ability, it undercuts the very purpose of the law.

When AI companies scrape and ingest copyrighted books, articles, music, works of visual art, and other creative works without authorization, their actions affect real markets. AI systems can generate outputs that substitute for and compete with the very works they were trained on without permission. Most importantly, these companies bypass a rapidly developing market for AI training licenses, a market that creators and publishers have every right to participate in.

The AI Training Market Is Not a Hypothetical Market

One of the most persistent misconceptions in this debate is that the market for AI training data is speculative or theoretical. It is not. It exists today, and it is growing quickly. AI companies, large and small, are already entering into licensing agreements with publishers, news organizations, record labels, movie studios, and other rightsholders. These deals are not hypothetical—they are substantial, wide-ranging, and increasingly central to how responsible AI development is taking place. Those deals are very important, not just from a business perspective but also from a legal one.

Courts have made clear that fair use analysis must account not just for existing markets (i.e., “actual markets”) but for emerging markets and those likely to develop. The fact that a market is new or emerging does not make it irrelevant for fair use purposes. In fact, the opposite is true. Emerging markets are worthy of the highest level of protection because protecting them strengthens the incentive structure of copyright law, working towards the Constitutional mandate to promote the progress of creativity and the arts.

The protection of emerging markets is something our legal system takes very seriously, whether it’s the protection of markets for copyrighted works or technologies and services built upon them. Decades ago, Congress saw the need to protect emerging markets being created by small internet companies and passed several laws (like Section 230 and the Digital Millennium Copyright Act) to help protect those emerging markets and companies so they could flourish. Those companies have now grown into behemoths like Google and Meta. But the need to protect emerging markets flows in both directions; it’s not just a free pass for big AI companies. It’s also imperative that the emerging markets of the copyright community be likewise protected.

Allowing some companies to take copyrighted works for free while others pay for them would undermine markets for copyrighted works entirely. It would discourage licensing, distort competition, and ultimately deprive creators of a legitimate and growing source of income. And holding that AI training is fair use would be even worse, because it would completely eviscerate the emerging AI training licensing market.

In addition to licensing deals between copyright owners and large, established tech companies and AI developers, there are a growing number of smaller AI companies and startups like Bria.ai, Moonvalley, ProRata, KLAY Vision, and others that have led the way in striking licensing deals with copyright owners and developing ethical AI models. These companies demonstrate not only that permission-based training is possible, but that it also supports free markets and allows both copyright owners and AI developers to thrive. But when big AI companies train on copyrighted works without permission or payment, they put these smaller AI companies at a competitive disadvantage and threaten to drive them out of business.

“You Keep Using That Word (‘Transformative Use’). I Do Not Think It Means What You Think It Means”

AI companies often argue that their use is “transformative” and that this should control any fair use analysis. But even if their uses are transformative (which is not true in most cases), transformative use is just one part of the fair use analysis. In fact, transformative use is only a part of the first fair use factor; it is not a stand-alone factor, and it is outweighed by the fourth factor.

A use that causes significant market harm cannot be excused simply by labeling it “transformative.” The Supreme Court in Warhol and many other courts have warned against that kind of overreach. If “transformative use” becomes a blanket justification, it risks swallowing the exclusive rights that copyright law is designed to protect.

In fact, more evidence has come to light in lawsuits establishing that AI models can and do reproduce copyrighted material, sometimes verbatim. That raises serious questions about whether these AI models are simply storing and reusing the training material in compressed form—acts that are clearly not transformative. Either way, the legal conclusion is the same: when a use competes with or displaces the original market, it cannot and should not qualify as a fair use.

Licensing Is Not a Burden—It’s a Solution

What often gets lost in this conversation is that licensing is not just a legal requirement—it is a practical and mutually beneficial solution. For creators and publishers, licensing provides compensation for the use of their work. For AI developers, it provides access to high-quality, reliable, professionally produced content. That is exactly the kind of material that leads to better, more trustworthy AI systems.

Companies that are licensing copyrighted works understand this. They are not simply complying with the law; they are improving their products. By contrast, unlicensed scraping produces datasets that are inconsistent, unverified, and legally risky, especially when the unlicensed scraping is from pirate websites. Such scraping is not just problematic from a copyright perspective; it forms a weaker foundation for building advanced AI systems and exacerbates the harms already caused by online piracy.

The Stakes Are High

The issues at stake here go to the heart of whether copyright law will continue to function as intended in the digital age. All creative industries, whether it’s news, books, music, motion pictures, visual arts, or other creative fields, depend on sustainable economic models. Those models rely on the ability to license and monetize content. That includes new and emerging markets like AI training.

If courts allow unlicensed uses that undermine those markets by holding AI training to be categorically fair use, the result will send a thunderbolt through the copyright community that will have devastating effects not just for creators but also for every person who enjoys and uses creative works. A holding of fair use would mean fewer resources to create and distribute creative and informational works, less investment in new authors and new bands, and reduced support for high-quality copyrighted works of all types across the board.

This isn’t just about harm to the copyright community and the public. There is also a basic question of fairness that must be considered. Many AI companies are already doing the right thing—negotiating licenses, compensating creators, and respecting the law. They should not be placed at a competitive disadvantage compared to the AI behemoths who chose to use their billions to alter the law and common sense.

Conclusion

The fourth fair use factor provides a well-established framework for addressing these issues. The market for AI training data exists. It is valuable. And unlicensed use is a direct threat to that market, especially when the unlicensed use emanates from pirate websites. Courts should apply the law as written and as consistently interpreted: when a use harms an actual or potential market, it is not fair use.

AI innovation and copyright protection are not in conflict. They can operate on the same playing field and both be successful. Licensing is how that balance is achieved. It is the path forward for a sustainable and responsible AI ecosystem.