Anthropic Wants You to Know Its New AI Model Is Definitely Not Too Dangerous to Release

Claude Sonnet 5 delivers impressive agentic capabilities at a relatively low cost. It’s also really bad at cybersecurity—probably for the reason you’d expect.

AI developers today face a dual challenge: build state-of-the-art models that deliver big benefits at the lowest possible cost, and do so in a way that won’t attract the ire of the federal government. Anthropic—which knows that ire better than any other company in Silicon Valley—has tried to thread that two-eyed needle with its latest model, Claude Sonnet 5.

Released on Tuesday, the new model is designed to balance agentic capability with frugality. Its performance across a suite of benchmarks is comparable to the more powerful Opus 4.8, but with a smaller price tag: When accessed through Claude Code, Sonnet 5 costs $2 per million input tokens and $10 per million output tokens—less than half the price of Opus 4.8. 

Sonnet 5 “can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models,” Anthropic wrote in its announcement. Sonnet 5 is now the default model on Claude’s free and Pro tiers, and also available to Max, Team, and Enterprise subscribers.

It arrives at a time when tech developers have been facing mounting pressure to provide customers with cheaper AI tools. That’s largely been driven by the proliferation of so-called AI agents throughout the business world, which can autonomously handle complex tasks over relatively long time horizons. They therefore tend to gobble up many more tokens—the basic unit measuring AI usage—than more limited systems, like a chatbot trained only to, say, field customer service questions. Both Anthropic and OpenAI have reportedly been considering big price cuts in order to attract new users, and keep current ones.

Dumbed-down cybersecurity capabilities

Anthropic’s new announcement was also notable, however, for what it says Sonnet 5 can’t do.

Specifically, the company wrote that Sonnet 5 “shows substantially poorer performance” on cybersecurity-related tasks than Opus 4.8 and Mythos 5, the latter being one of the two models—along with Fable—which Anthropic took offline earlier this month following an opaque order from the federal government. When an AI developer underscores what a new model can’t do, it’s typically for safety reasons (as in, Our model won’t respond to requests to generate realistic images of real people, or provide recipes for bioweapons). That’s also the case with Anthropic’s new model announcement—the company has gone to great lengths to position itself as the leading voice of safety in the AI industry—but it’s more than likely for political reasons, too.

Concerns around cybersecurity have very much been at the heart of Anthropic’s latest snafu with the federal government. That’s the official line from the Trump administration, at least, though plenty of others have floated the idea that ideological differences and personality clashes between the two parties have also played a role. Anthropic’s Mythos model, which was first unveiled in April, was said to be so good at finding cybersecurity vulnerabilities in software that the company opted for a phased release among trusted partners. One of those was the National Security Agency (NSA), whose supposedly iron-clad cybersecurity systems were no match for Mythos. Crucially, however, the model didn’t bypass the NSA’s security systems; it just identified flaws in them.

Fable 5 was released to the public with safety guardrails so stringent that many users found the model to be almost unusable. But after Amazon CEO Andy Jassy led the government to believe that the model could be subjected to a jailbreak (i.e., prompted to bypass its own security guardrails), the government deemed it a national security risk.

Anthropic seems intent on avoiding another altercation with the federal government following the release of its newest model. “We did not deliberately train Sonnet 5 on cybersecurity tasks,” the company wrote in its announcement. The company added that although Sonnet 5 had shown “partial success” in developing a working cybersecurity exploit targeting Mozilla’s Firefox browser, that was “likely due to improvements in general intelligence rather than specific training.”
