Published 12 days ago • loading... • Updated 11 days ago

Anthropic to Reassess Claude Fable 5 AI Development Restrictions After Backlash

Earlier this week, Anthropic launched Claude Fable 5, but the model sparked intense backlash from AI researchers after users discovered invisible guardrails designed to prevent training competing AI systems.
Rather than clearly refusing requests, the system silently degraded performance using 'prompt modification' techniques, causing false positives on benign inputs like 'Hello' or medical terms like 'cancer.'
Researchers like Dean Ball, senior fellow at the Foundation for American Innovation, criticized the move as a 'shockingly hostile and terrible look,' warning it could silently damage critical machine learning work.
On Wednesday, Anthropic acknowledged the 'wrong tradeoff,' promising to make safeguards visible and provide reasons for refusals, with flagged requests now defaulting to the less-powerful Opus 4.8.
The company maintains these restrictions ensure Claude is not used to erode the U.S. edge in frontier chips, emphasizing that the safeguards 'do not affect the vast majority of coding and ML work,' the company said.

Insights by Ground AI

Podcasts & Opinions

Podcast Mention

Center

WSJ What’s News

Daily News and Business News podcast from Wall Street Journal

Center

Daily News and Business News podcast from Wall Street Journal

Why OpenAI Might Slash Prices for Users

WSJ What’s News discuss backlash to Anthropic’s Claude Fable 5 safeguards, reports of hidden response downgrades, and the company’s shift to clearer refusals

12 days ago

Listen to Full Episode Full Episode Unlock Timestamp

Get Vantage — Podcasts, Ratings, Timestamps

Podcasts & Opinions

14 Articles

NBC News

Lean Left

Anthropic rankles users with safety-first Fable release

Anthropic’s latest AI model might be the company’s most powerful public release, but the system’s strict safety measures quickly triggered some of the strongest backlash the AI giant has faced

11 days ago·United States

Read Full Article

Fortune

Center

After backlash, Anthropic says its AI will now tell users when their request is being rejected or rerouted for national security concerns

Anthropic is changing course after facing criticism for quietly downgrading certain requests to its most capable AI model. On Tuesday, the $965 billion company released a version of its most capable model, Mythos. Anthropic revealed Mythos in April, but held back any Mythos-class models from the public partly because the company said it was extremely adept at skirting cybersecurity defenses and was too dangerous to release. This week, though, it…

11 days ago·New York, United States

Read Full Article

Gizmodo

Lean Left

Anthropic Apologizes For One of the Guardrails on Its Fable 5 Model, and Will Change It

Fable 5 is a nerfed version of Anthropic's Mythos model. Turns out it was actually too nerfed, and the company is sorry.

12 days ago·New York, United States

Read Full Article

Silicon Republic

Center

Anthropic to reassess Claude Fable 5 AI development restrictions after backlash

Anthropic’s AI restrictions caused concern among founders, researchers and developers following the release of the ‘Mythos-like’ model. Read more: Anthropic to reassess Claude Fable 5 AI development restrictions after backlash

12 days ago

Read Full Article

The Register

Center

It blocked us at 'hello!' Anthropic Fable 5 refusing innocuous prompts

Hyper-vigilant safety classifiers turn Fable into cautionary tale

12 days ago

Read Full Article

Crypto Briefing

Anthropic faces backlash over stricter safeguards on Claude model

Anthropic's incident highlights the delicate balance between AI safety and usability, risking trust and regulatory scrutiny in the AI sector. The post Anthropic faces backlash over stricter safeguards on Claude model appeared first on Crypto Briefing.

11 days ago

Read Full Article

Think freely.Subscribe and get full access to Ground NewsSubscriptions start at $9.99/year