
AI in ASIA
NYT vs OpenAI copyright lawsuit is like Hollywood's early fight against VCRs

Federal judge orders OpenAI to surrender 20 million ChatGPT user logs in landmark NYT copyright case that mirrors Hollywood's VCR battle

Intelligence Desk • 4 min read

AI Snapshot

The TL;DR: what matters, fast.

Federal judge orders OpenAI to release 20 million ChatGPT user logs in NYT lawsuit

Case parallels Hollywood's 1970s fight against VCR technology over fair use

Outcome could reshape AI training practices and copyright compliance globally

The copyright battle between The New York Times and OpenAI has escalated dramatically, with a federal judge ordering the AI company to hand over 20 million ChatGPT user logs. This landmark ruling could reshape how artificial intelligence companies handle training data and user privacy across Asia and beyond.

Magistrate Judge Ona Wang's November 7, 2025 order comes despite OpenAI's fierce objections on privacy grounds. The company's Chief Information Security Officer, Dane Stuckey, characterised OpenAI's response as "fighting the New York Times' invasion of user privacy", highlighting the tension between legal discovery and user protection.

Microsoft, OpenAI's primary backer, has drawn parallels between this lawsuit and Hollywood's initial resistance to VCR technology in the 1970s. The comparison isn't mere hyperbole: both cases centre on whether new technology that uses existing content constitutes fair use or copyright infringement.


The lawsuit, filed on December 27, 2023, seeks billions in damages without specifying an exact amount. Judge Sidney Stein's April 4, 2025 decision to deny OpenAI's dismissal bids means the case will proceed to trial, potentially setting precedent for AI training practices globally.

"The order would force OpenAI to disregard legal, contractual, regulatory, and ethical commitments to hundreds of millions of people, businesses, educational institutions, and governments around the world," OpenAI argued in its court filing objecting to the preservation order.

The implications extend far beyond America's borders. As Asian countries develop their own large language models, the outcome could influence how South Korea's AI giants and other regional developers approach training data acquisition and copyright compliance.

By The Numbers

  • ChatGPT serves nearly 800 million weekly users worldwide
  • OpenAI must produce 20 million ChatGPT user logs as ordered by the court
  • Over 400 million users' conversation logs must be retained under the preservation order
  • The lawsuit seeks billions of dollars in damages, filed December 27, 2023
  • Judge denied OpenAI's dismissal bids on April 4, 2025

OpenAI's Impossible Training Dilemma

OpenAI has openly acknowledged the challenge at the heart of this case: it's "impossible" to train cutting-edge AI models without using copyrighted materials. In a filing to the UK House of Lords, the company explained that copyright covers virtually every form of human expression, from blog posts to government documents.

This admission has profound implications for the AI industry. If training on copyrighted content becomes legally untenable, it could fundamentally alter the development trajectory of large language models. The challenge is particularly acute in Asia, where copyright complexities vary dramatically between jurisdictions.

  Legal Milestone           Date               Impact
  NYT lawsuit filed         December 27, 2023  Billions in damages sought
  Dismissal bids denied     April 4, 2025      Core claims proceed to trial
  Preservation order issued May 13, 2025       400+ million user logs retained
  Discovery ruling          November 7, 2025   20 million logs must be produced

Asia's AI Industry Watches Nervously

While the case unfolds in New York's federal court, Asian AI companies are paying close attention. The precedent could influence how regional players approach content licensing and training data acquisition. Companies developing local language models face similar challenges with copyrighted materials in their training datasets.


The music industry has already shown how copyright battles can reshape AI development. Sony Music Group's aggressive stance against unauthorised AI training has forced companies to reconsider their data sourcing strategies. Similar dynamics are emerging across creative industries in Asia.

OpenAI has attempted to address concerns through licensing deals with major publishers, including agreements with Axel Springer and ongoing talks with CNN, Fox Corp, and Time. However, the patchwork approach may not satisfy legal challenges or provide comprehensive solutions for the industry.

The VCR Analogy and Fair Use Defence

Microsoft's comparison to Hollywood's VCR resistance carries significant legal weight. In the landmark Sony Corp. of America v. Universal City Studios case, the Supreme Court ruled that VCR technology constituted fair use despite enabling copyright infringement. The decision hinged on the technology's capacity for substantial non-infringing uses.

The AI training debate mirrors this precedent. Microsoft argues that using copyrighted content to train language models doesn't supplant the market for original works but rather teaches models about language patterns and structure. This distinction could prove crucial as courts evaluate fair use claims.

The current legal landscape remains complex. Recent developments in AI copyright battles across creative industries suggest courts are taking a case-by-case approach rather than establishing broad precedents immediately.

  • Fair use defences rely on proving the training process transforms original works rather than simply reproducing them
  • Market substitution remains a key concern, with publishers arguing AI-generated content could replace their articles
  • The scale of training data usage far exceeds previous copyright disputes, creating novel legal questions
  • International variations in copyright law complicate global AI development strategies
  • Licensing agreements may provide clearer legal frameworks but raise questions about market concentration

What does this lawsuit mean for other AI companies?

The outcome could establish precedent for how courts evaluate AI training practices, potentially requiring comprehensive licensing deals or forcing companies to develop alternative training methods using only public domain or licensed content.

How might this affect AI development in Asia?

Asian AI companies may need to reassess their training data strategies, particularly for local language models. The precedent could influence regional copyright interpretations and licensing requirements across different jurisdictions.

Why is OpenAI fighting the user log disclosure requirement?

OpenAI argues that revealing millions of user conversations violates privacy commitments and could expose confidential business information, potentially undermining user trust and competitive positioning in the market.

Could this case kill large language model development?

While unlikely to stop development entirely, the case could significantly increase costs through licensing requirements and force companies toward more restrictive training approaches, potentially slowing innovation and raising barriers for smaller players.

What happens if OpenAI loses the case?

A loss could result in billions in damages and force industry-wide changes to training practices. However, the case's complexity suggests appeals would likely extend the legal process for several more years.

The AIinASIA View: This case represents a defining moment for AI development globally. While we support fair compensation for content creators, the precedent risks stifling innovation if courts don't carefully balance creator rights with technological advancement. The real test lies in developing frameworks that protect intellectual property while enabling the beneficial uses of AI that society increasingly depends upon. Asian markets, with their diverse copyright regimes and growing AI sectors, need nuanced approaches rather than blanket restrictions that could handicap regional innovation.

The OpenAI copyright lawsuit extends beyond immediate legal implications to fundamental questions about how society balances innovation with intellectual property rights. As courts navigate these uncharted waters, the decisions will shape not just how AI companies operate, but how creative industries adapt to technological change.

The stakes couldn't be higher for the global AI industry. As companies worldwide watch this legal battle unfold, they're simultaneously preparing for a future where training data acquisition may require entirely new approaches. The question isn't just whether OpenAI will prevail, but whether the industry can find sustainable paths forward that respect both innovation and creator rights.

What's your take on balancing AI innovation with copyright protection? Should training on copyrighted content constitute fair use, or do publishers deserve compensation for every use of their material? Drop your take in the comments below.



This is a developing story

We're tracking this across Asia-Pacific and may update with new developments, follow-ups and regional context.



Latest Comments (3)

Hye-jin Choi (@hyejinc) · 3 June 2024

The VCR analogy Microsoft is pushing... interesting, but I'm not sure it fully fits. In Korea, our discussions around AI policy and copyright lean more towards how to actively encourage development while protecting content creators, not just defensively dismissing concerns. "Teaching models language" still has to contend with commercial use implications. It's a different scale of impact than copying a movie.

Sakura Nakamura (@sakuran) · 15 April 2024

it's interesting how microsoft frames this as a "doomsday futurology" argument by the times. here in japan, news outlets are definitely worried about AI's impact on their business models. but is microsoft really saying there's no potential for market supplantation at all, even for highly specialized news content?

Lakshmi Reddy (@lakshmi.r) · 8 April 2024

The comparison to the VCR case is interesting, but I feel like it misses a crucial point for us in NLP research. The VCR was about consumption of existing media, whereas LLMs are about generation of new content based on learned patterns. For Indic languages especially, where data scarcity is a real issue, the "teaches the models language" argument needs more nuance. Are we just talking about syntax and grammar, or also cultural context and factual knowledge? The ethical implications of how that "teaching" data is sourced, particularly for underrepresented languages, are critical, far beyond just market supplantation.
