Fair use is a legal doctrine in United States copyright law that permits limited use of copyrighted material without requiring permission from the rights holder. Codified in Section 107 of the Copyright Act (17 U.S.C. § 107), it functions as an affirmative defense: the user admits copying occurred but argues the use was legally permissible.
The Four Factors
Courts evaluate fair use claims using four factors, weighed together:
- Purpose and character of the use: commercial vs. nonprofit/educational, and whether the use is transformative (creates new meaning, expression, or purpose rather than substituting for the original)
- Nature of the copyrighted work: factual works receive less protection than creative works, and use of a published work is more likely to qualify as fair use than use of an unpublished one
- Amount and substantiality: how much of the original was used relative to the whole, and whether the "heart" of the work was taken
- Effect on the market: whether the use competes with or diminishes the market value of the original
No single factor is dispositive. Transformativeness (Factor 1) has become increasingly dominant in modern case law.
Fair Use and AI Training
The question of whether using copyrighted works to train Large Language Models (LLMs) constitutes fair use is the defining Intellectual Property (IP) question of the current AI era. The core argument for fair use: training is transformative because the model learns statistical patterns rather than copying text for redistribution. The core argument against: the scale of ingestion and the commercial value derived from it undermine the transformative claim.
In Bartz v. Anthropic, Judge William Alsup issued a split ruling that drew a critical line: training on legally acquired books was "quintessentially transformative" and protected as fair use, but downloading and retaining pirated copies to build a central library was not. The distinction established that provenance (how training data is acquired) is itself a legal boundary, separate from the downstream use of the model.
As a single district court decision, the ruling does not settle more broadly whether AI training on lawfully acquired copyrighted works requires licensing. That question remains open across multiple cases involving OpenAI, Stability AI, and others.
Fair use was designed for a world where copying was discrete and identifiable — a critic quoting a paragraph, a teacher photocopying a chapter. AI training operates at a scale that stretches the doctrine past its original architecture. The Bartz ruling was elegant in drawing the line at piracy rather than at training itself, but it left the harder question untouched: what happens when the acquisition is legal but the scale is unprecedented? That's the question the next wave of cases will answer.
