Anthropic Admits It Wrongly Throttled Rival AI Researchers Using Claude
Anthropic has acknowledged a significant error in its safety approach: The company secretly throttled rival AI researchers who were testing its Claude model for safety vulnerabilities. The admission came after the company implemented a “soft block” that invisibly limited usage for researchers from competing AI labs, effectively undermining independent safety testing.
The Admission
Anthropic CEO Dario Amodei confirmed the company made a “wrong tradeoff” by quietly restricting access for researchers from organizations like OpenAI and Google DeepMind. The throttling was applied without notifying users, meaning rival researchers attempting to probe Claude for safety flaws were deliberately hindered.
“We made the wrong call. We should have been transparent about any usage limits, not applied them invisibly to specific groups,” Amodei stated in the company’s official response.
Who Was Affected
The hidden restrictions specifically targeted researchers from competing AI development labs. These individuals were trying to replicate and verify Anthropic’s safety claims about Claude 3.5 and earlier models.
The throttling was not applied equally. Regular users, and independent researchers with no ties to rival labs, were not subject to the same invisible limits.
What Happened
Anthropic’s safety team implemented a selective rate-limiting system that detected accounts linked to competing organizations and silently reduced their access to Claude’s API and web interface.
The restrictions included:
- Artificially slow response times that made large-scale testing impractical
- Error rates higher than those normal users experienced, with no explanation given
- Capped query volumes that prevented comprehensive safety evaluations
Researchers attempting to benchmark Claude against other models or test for jailbreak vulnerabilities reported sudden, unexplained performance degradation.
Why This Matters
The incident undermines the principle of independent safety auditing. AI labs routinely rely on external researchers to discover flaws their internal teams might miss.
Safety testing requires adversarial scrutiny. When companies selectively block competitors from probing their models, the entire ecosystem suffers from reduced accountability.
Trust in AI safety claims erodes. If labs can quietly sabotage independent verification, the public has no way to know whether safety promises hold up under real examination.
The Fallout
Anthropic says it has now removed the invisible throttling. The company promises to implement “clear, transparent usage policies” going forward.
However, the damage to credibility may linger. The incident reveals a tension between commercial competition and the open safety testing the industry claims to support.
“If we want the public to trust our safety work, we cannot cherry-pick who gets to verify it,” one former Anthropic employee told reporters.
The Broader Context
This is not an isolated incident. Other AI companies have faced accusations of gaming safety benchmarks or of restricting access for critical researchers.
The AI industry operates on a model of voluntary safety commitments. When those commitments conflict with competitive interests, incidents like this test whether the system can self-correct.
Anthropic’s admission suggests the company recognizes the reputational risk. Whether other labs follow with similar transparency remains an open question.