Skip to content
Artificial Intelligence

Anthropic Lets Claude Mythos Spread Its Glasswings

It's dangerous, but not TOO dangerous to give to more organizations.
By

Reading time 3 minutes

Comments (0)

Anthropic, fresh off becoming the most valued frontier AI lab in the game and announcing plans for its initial public offering, has expanded access to Claude Mythos Preview, its model that would supposedly destroy the very concept of cybersecurity as we know it if released in the wild. According to the company, it is expanding access to the model via its Project Glasswing initiative, making it available to 150 new organizations across 15 countries.

Per Anthropic, its new partner organizations include those in the power, water, healthcare, communications, and hardware industries. Many are vendors that “maintain codebases that are relied upon by lots of other organizations around the world, including governments.” They also are potential targets for “catastrophic” cyberattacks. By Anthropic’s math, a major cyberattack against any of them could impact more than 100 million people and have ramifications for national and global security.

Their access to Claude Mythos Preview will be the same as past partners: They’ll get to use the model in a limited capacity to test and identify security vulnerabilities so they can be patched and addressed before being exploited by hackers or other models for malicious purposes.

Mythos remains a bit of a mystery to those who aren’t part of the in-group (makes its name fitting!), but we have gotten a couple of peeks at the model and how it’s being used. Cloudflare, for instance, shared that Mythos Preview was particularly adept at exploit chain construction, which is basically spotting how several bugs can be used to create a series of attacks that do more damage than a single exploited flaw.

But Anthropic also revealed that Mythos isn’t necessarily ready for primetime, which might also be part of why it’s actually keeping it to such a small and controlled base of users. It found that the model’s organic safeguards (i.e., what requests it will say “no” to) were inconsistent and could change after seemingly unrelated changes. It also found that while Mythos Preview is good at spotting vulnerabilities, it’s not nearly as equipped to patch them. The company found that letting the model write its own patches would typically result in breaking another part of the codebase in the process.

Cloudflare also noted that other models found a lot of the same bugs as Mythos—an observation that has been made elsewhere. A security company called Aisle tested several small, open-source models and was able to find the same vulnerabilities that Anthropic highlighted when it announced Mythos—vulnerabilities that went unnoticed by humans for decades.

That’s not to say that Mythos Preview isn’t a step up from other models, but it does make Project Glasswing and the slow rollout feel more like a marketing gimmick than a true safety precaution. Some cybersecurity experts suggested to the New York Times that keeping the model under lock and key doesn’t actually solve the problem of widespread security vulnerabilities—it just gives a handful of players a head start.

It also has the added effect (and benefit to Anthropic) of making it extremely hard to evaluate Mythos Preview if you aren’t one of the few organizations with access. It’s a modern version of “security through obscurity,” a method that is widely derided in modern cybersecurity practices. But it seems Anthropic’s plan for now is to see just how high it can build the walls of its black box.

Share this story

Sign up for our newsletters

Subscribe and interact with our community, get up to date with our customised Newsletters and much more.