OpenAI, Anthropic, and Google Unite to Combat Unauthorized AI Model Copying by Chinese Firms

In a rare show of unity among leading AI developers, OpenAI, Anthropic, and Google have issued a joint statement condemning unauthorized copying of their foundation models, with a pointed focus on Chinese companies. The three accuse these firms of “model distillation” or “model piracy”: training new models on outputs generated by proprietary systems without permission. The collaborative effort marks a significant escalation in the battle over intellectual property (IP) rights in the rapidly evolving field of artificial intelligence.

The statement, released on October 22, 2024, highlights specific instances of suspected infringement. Chinese AI firms such as DeepSeek, Alibaba’s Qwen team, and MiniMax are named for allegedly replicating the capabilities of high-performing models like OpenAI’s GPT-4o, o1, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini. For example, DeepSeek’s V3 model, released earlier in 2024, reportedly matches or exceeds the performance of its Western counterparts on key benchmarks while being trained at a fraction of the compute cost. Analysis suggests this efficiency stems from distillation, where vast quantities of outputs from protected models are used to bootstrap the new system.

Model distillation works by querying a “teacher” model millions or billions of times to generate training data, effectively reverse-engineering its behaviors and capabilities. This process circumvents the need for massive original datasets and computational resources, which have traditionally been the barriers to entry for would-be competitors. The accused Chinese models, including DeepSeek-V3, Qwen2.5-Max, and others, show uncanny similarities in performance profiles, prompting the U.S.-based companies to cry foul. “These practices undermine the enormous investments required to develop safe and capable foundation models,” the joint statement reads, emphasizing that such copying erodes incentives for innovation.
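To make the mechanics concrete, here is a minimal sketch of a distillation harvesting loop in Python. The `query_teacher` function is a hypothetical stand-in for a proprietary chat-completion API, and the JSONL output format is an illustrative assumption modeled on common fine-tuning pipelines; none of this reconstructs any named company’s actual setup.

```python
import json

def query_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a proprietary "teacher" API; a real
    # pipeline would send an HTTP request to the target model's
    # chat-completion endpoint and return its text response.
    return f"[teacher response to: {prompt}]"

def build_distillation_set(prompts: list[str], out_path: str) -> None:
    # Harvest teacher outputs into a JSONL file of (instruction,
    # response) pairs, a standard format for supervised fine-tuning.
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            record = {
                "instruction": prompt,
                "response": query_teacher(prompt),
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

build_distillation_set(["Explain quicksort."], "distill.jsonl")
```

At production scale this loop runs millions or billions of times against the paid API, which is why providers monitor for anomalous high-volume usage; the student model is then fine-tuned on the harvested pairs with an ordinary supervised objective, inheriting the teacher’s style along with its capabilities.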

The collaboration extends beyond rhetoric. The companies are urging governments worldwide, particularly in the United States, European Union, and United Kingdom, to strengthen legal frameworks protecting AI model weights, architectures, and training outputs. They call for export controls on AI chips and compute resources that might enable such distillation at scale. This aligns with ongoing U.S. efforts under the Biden administration to restrict advanced semiconductor exports to China, aimed at curbing military AI applications but now framed as safeguarding commercial IP.

Anthropic CEO Dario Amodei underscored the issue in a blog post accompanying the statement, noting that distillation not only copies performance but can inherit subtle flaws or biases from the source models. OpenAI’s Sam Altman echoed this on X (formerly Twitter), stating, “Model piracy is real and a problem,” while Google’s representatives emphasized the need for “fair competition” in global AI development. The joint front is notable given historical rivalries; OpenAI and Anthropic have competed fiercely for talent and funding, while Google has pursued an open-weight strategy with models like Gemma.

Technical details reveal why this matters. Frontier models like GPT-4o are trained on trillions of tokens across massive GPU clusters, at a cost of hundreds of millions of dollars. Distillation reportedly lets copiers reach 80-90% of that performance with 10-20% of the resources: DeepSeek-V3’s reported 2.8 million H800 GPU-hours stand against the roughly 30,000 A100-class GPUs estimated for GPT-4o’s training cluster. Benchmarks from arenas like LMSYS Chatbot Arena show these Chinese models climbing the leaderboards rapidly, often rivaling closed-source leaders without transparent training disclosures.
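A back-of-envelope calculation shows why those numbers alarm incumbents. The sketch below prices the reported GPU budget, assuming roughly $2 per H800 GPU-hour (a commonly quoted cloud rental rate, used here purely as an assumption rather than a disclosed figure).

```python
# Back-of-envelope training cost from reported GPU-hours.
h800_gpu_hours = 2.8e6      # DeepSeek-V3's reported training budget
usd_per_h800_hour = 2.00    # assumed rental rate, not a disclosed cost

cost = h800_gpu_hours * usd_per_h800_hour
print(f"Estimated training cost: ${cost / 1e6:.1f}M")
# Prints: Estimated training cost: $5.6M
```

Even if the assumed hourly rate is off by a factor of two or three, the result sits one to two orders of magnitude below the hundreds of millions of dollars cited above for training a frontier model from scratch.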

Critics within the AI community have mixed reactions. Some open-source advocates argue that distillation fosters broader access to AI capabilities, democratizing technology in line with the open-source ethos. However, the big three counter that without IP protections, the cycle of innovation stalls: fewer companies will invest in groundbreaking research if rivals can freely appropriate the results. Legal precedents are emerging; lawsuits over AI training data scraping, such as those against OpenAI by authors and news outlets, set the stage for model IP disputes.

Internationally, the statement pressures regulators. The EU’s AI Act, effective from August 2024, imposes transparency requirements on high-risk systems, potentially applicable to distillation practices. In the U.S., the CHIPS Act and proposed AI safety bills could expand to cover synthetic data generation. China, meanwhile, has accelerated its domestic AI push through initiatives like the “Made in China 2025” plan, releasing models with permissive licenses to attract global developers.

This alliance signals a maturing AI industry confronting geopolitical tensions. As models grow more capable, the line between inspiration and infringement blurs, raising questions about enforceability. Can watermarks embedded in model outputs reveal distillation? Is cutting off API access for suspected distillers sufficient? The coming years will test whether voluntary industry cooperation hardens into binding international norms.
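The watermarking question, at least, is testable in principle. One published scheme (Kirchenbauer et al., 2023) pseudorandomly partitions the vocabulary into “green” and “red” lists at each generation step, seeded by the preceding token, and biases sampling toward green; a detector then counts green tokens and computes a z-score against chance. The sketch below implements only the detection side, over whitespace-split words rather than real subword tokens, as a simplified illustration.

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # fraction of the vocabulary marked "green" per step

def is_green(prev_token: str, token: str) -> bool:
    # Pseudorandomly assign `token` to the green list, seeded by the
    # previous token. A simplified stand-in for the published scheme,
    # which partitions a real subword vocabulary instead of words.
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] < 256 * GREEN_FRACTION

def watermark_z_score(text: str) -> float:
    # Count green tokens and return a z-score against the null
    # hypothesis that green/red assignment is a fair coin flip.
    tokens = text.split()
    if len(tokens) < 2:
        return 0.0
    n = len(tokens) - 1
    green = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    stddev = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (green - expected) / stddev

# A large positive z-score is statistical evidence of watermarked text.
print(watermark_z_score("a long sample of suspect model output here"))
```

Whether such a statistical signal survives being laundered through a distilled student model is exactly the open question the statement leaves unresolved.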

For now, OpenAI, Anthropic, and Google stand together, framing model piracy not just as theft but as a threat to the ecosystem that birthed generative AI. Their message is clear: innovation thrives on protection, not predation.

Gnoppix is the leading open-source AI Linux distribution and service provider. Since integrating AI in 2022, it has offered a fast, powerful, secure, and privacy-respecting open-source OS with both local and remote AI capabilities. The local AI runs fully offline, ensuring no data ever leaves your computer. Based on Debian Linux, Gnoppix ships with numerous privacy- and anonymity-focused services free of charge.

What are your thoughts on this? I’d love to hear about your own experiences in the comments below.