Anthropic faces backlash over stricter safeguards on new model

Anthropic has implemented stricter safeguards on its new Mythos-class model, prompting frustration among advanced users who argue that the restrictions hinder legitimate discussion of sensitive topics such as health and biology. Users have reported being unable to mention terms like “cancer” in their queries, a result of the model’s conservative tuning, which prioritizes safety by routing certain requests to a less capable model or blocking them entirely.

Anthropic: Anthropic is an AI research company known for its Claude family of large language models. It recently introduced Claude Fable 5, its first Mythos-class model made available for general use, with new conservative safeguards that block high-risk outputs in areas like cybersecurity and biology. These measures, implemented to enable broader access while addressing misuse concerns, have drawn complaints from advanced users about overly restrictive responses even on benign topics.

User Feedback: Advanced users have expressed frustration that the restrictions interfere with legitimate queries involving sensitive but non-malicious topics such as biology and health.
Safety Measures: Anthropic tuned the safeguards on its new Mythos-class model conservatively, sometimes routing requests to a less capable model or blocking content to prioritize safety over full capability.