Anthropic walks back Claude Fable 5 safeguards after backlash

Anthropic is reversing its approach to hidden safeguards in Claude Fable 5 following significant backlash, as reported by WIRED. The company has announced that flagged requests will now visibly revert to Opus 4.8, with API users receiving clear refusal reasons. This decision comes after Anthropic admitted it “made the wrong tradeoff” concerning its AI safety practices and subsequently issued an apology for the oversight.

Opus 4.8: Opus 4.8 is an Anthropic Claude model variant serving as a fallback option. It will now be used visibly for flagged frontier LLM development requests after the policy change. API users will additionally receive a clear refusal reason in such cases.
Anthropic: Anthropic is an AI company that develops advanced large language models including the Claude family. It is adjusting safeguards in response to user backlash over undisclosed features in its latest model release. The company has publicly stated that it made the wrong tradeoff and issued an apology.
Claude Fable 5: Claude Fable 5 is a frontier version of Anthropic’s Claude AI model that incorporated hidden safeguards. The news reports that these measures are being walked back following public criticism. Requests involving the model will now route visibly to alternative versions with explicit notifications.

API Transparency: Flagged requests will now visibly fall back to Opus 4.8 with refusal reasons provided to API users.
AI Safety Practices: Anthropic has publicly acknowledged making an incorrect tradeoff on hidden safeguards in its models and issued an apology.