The dispute arises from a May 13th federal court directive requiring OpenAI to “preserve and segregate all output log data that would otherwise be deleted on a going forward basis until further order of the Court.” This order could have massive implications for how user data is stored, especially for services like ChatGPT, where users regularly delete chat histories.
OpenAI’s Response To The Lawsuit
The legal battle began in December 2023, when The New York Times filed a lawsuit against OpenAI and Microsoft, accusing them of unlawfully using its copyrighted content to train AI models, including ChatGPT and Microsoft’s Bing Chat.
The Times argues that the alleged unauthorized use of its content not only infringes on its intellectual property rights but also threatens the traditional journalism business model. The preservation order reflects a related concern: evidence of potential copyright violations could be lost as users continue to delete their conversations with AI tools.
At the center of the lawsuit is the legal interpretation of “fair use”, a longstanding doctrine that allows limited use of copyrighted material without permission under certain conditions, such as commentary, criticism, or education.
The New York Times argues that OpenAI’s models reproduce near-verbatim excerpts from its articles and, at times, circumvent paywalls by generating AI-written summaries. If proven true, these actions could severely undermine the publication’s content monetization strategies and brand value.
In contrast, OpenAI maintains that its use of publicly available data falls within the bounds of fair use, asserting that the content is transformed through training rather than directly reproduced.
Both parties claim to be acting in defense of larger principles.
The company also criticized The New York Times for what it described as “cherry-picking data” in its legal arguments, claiming that the examples cited do not represent typical model behavior.
As generative AI becomes more embedded in daily life, legal challenges surrounding data use, privacy, and copyright are becoming increasingly common. Courts are now pivotal in deciding how AI companies can train their models, and what kind of data is fair game.
The lawsuit from The New York Times is far from an isolated case. In April 2025, Ziff Davis, owner of outlets like PCMag and Mashable, also filed suit against OpenAI, alleging the misuse of its media content in AI training.
Just this week, Reddit filed a lawsuit against Anthropic, another AI company, accusing it of scraping Reddit data without proper licensing. Anthropic is also under fire from music publishers and authors who claim it used protected works without authorization.
Taken together, these lawsuits signal a broader industry reckoning as creators, tech firms, and courts wrestle over how intellectual property law should apply in the age of generative AI.
Why Is The New York Times Suing OpenAI?
The Times alleges that OpenAI used its copyrighted content to train ChatGPT without permission, which could harm its business and violate intellectual property laws.

Why Must OpenAI Preserve User Data?
The court has instructed OpenAI to preserve all user data, including deleted chats, to prevent potential evidence from being lost during the lawsuit.

How Has OpenAI Responded To The Order?
OpenAI is currently appealing the decision, arguing that the order is an overreach and threatens user privacy.

What Is Fair Use, And Why Does It Matter Here?
Fair use allows for limited use of copyrighted material without permission for purposes like commentary, news reporting, or education. The case will hinge on whether training AI models qualifies under this doctrine.

Are Other AI Companies Facing Similar Lawsuits?
Yes. Companies like Anthropic and OpenAI are facing lawsuits from media outlets, music publishers, and authors over the unauthorized use of content for AI training.