Anthropic Withholds Claude Mythos Over Autonomous Exploit Risk

May 1, 2026
TL;DR

Claude Mythos autonomously builds exploit chains; Anthropic limits access to 11 partners in Project Glasswing while rolling out Claude Security to enterprises.

Anthropic built its most capable model to date and immediately chose not to release it. Claude Mythos Preview, disclosed on April 7, can autonomously discover zero-day vulnerabilities, construct and chain exploits, then erase evidence of the intrusion, according to the company's own system card. Those capabilities placed Mythos in a risk category that Anthropic says justifies keeping it off the market entirely.

Access instead flows through Project Glasswing, a controlled defensive program assembled before the disclosure. Eleven organizations are inside: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Each receives Mythos Preview exclusively for defensive security work.

The threat model

Former U.S. national cyber director Kemba Walden laid out the core concern in Fortune after the disclosure. Speed of vulnerability discovery is only part of the danger. Mythos constructs the exploit, chains it with others, executes, and covers its tracks, all without human direction at any step. That fully autonomous loop, running at AI speed, is a different order of threat from today's AI-assisted penetration testing tools.

Nextgov reported the federal response moved quickly, though not always in unison. Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell called a closed-door briefing with major bank CEOs on April 7, the same day as the public disclosure. The White House Office of Management and Budget began negotiating related policy measures simultaneously, though specifics have not been made public.

Anthropic's system card described Mythos as showing "a striking leap" on evaluation benchmarks and said the model's coding capability can surpass most skilled humans at exploitation tasks. That language is deliberate: system cards are written carefully, and a threshold claim like that signals the company believes Mythos has crossed a meaningful capability line rather than advanced incrementally.

The defensive rollout

On April 30, Anthropic moved Claude Security into public beta for Enterprise customers. Formerly called Claude Code Security, the offering automates vulnerability discovery and fix generation across codebases. CRN reported that hundreds of organizations tested it during a limited research preview before the broader launch. Claude Security does not use Mythos.

The product strategy is legible. Anthropic is deploying a broadly available defensive tool while keeping the model most capable of damaging critical infrastructure inside a controlled coalition. "Today's models are already highly effective at finding flaws in software code," the company wrote. "The next generation will be more capable still, and will be particularly effective at autonomously exploiting these flaws." That single passage simultaneously justifies the Claude Security product and the Glasswing restriction.

Whether the Glasswing perimeter holds is an open question, not a rhetorical one. Eleven partner organizations means thousands of engineers with varying access levels and security cultures. Controlled releases of sensitive artificial intelligence capabilities have leaked before, sometimes accidentally, sometimes by design.

Policy catches up, unevenly

Connecticut's legislature sent comprehensive AI regulation to the governor's desk on May 1. The bill cleared the House 131 to 17 and the Senate 32 to 4, both bipartisan margins. CT Mirror reported that debate centered on establishing parameters for AI systems without blocking economic development. Governor Ned Lamont, who threatened a veto on similar legislation last year, has not yet indicated whether he will sign.

The bill does not address autonomous exploit capability specifically, but its passage signals that state lawmakers are no longer treating artificial intelligence regulation as optional. The more consequential question is whether federal regulators eventually carve out autonomous offensive AI as its own risk category, separate from general-purpose models. OpenAI closed a $122 billion funding round in late April at an $852 billion valuation, illustrating the scale of capital now flowing into a sector that policy is still struggling to keep pace with.

Anthropic's statement that "no one organization can solve these cybersecurity problems alone" is accurate. It is also structurally self-serving: the company that built the capability is now indispensable to any coalition formed to manage the risk. That position strengthens as the models improve.

Project Glasswing buys time: for defenses to catch up, for policy to clarify, for norms to form. What happens when a model with Mythos-level autonomous exploit capability exists outside any controlled program is the question the coalition has not answered publicly.

---

FAQ

What is Claude Mythos Preview?
Anthropic's unreleased frontier AI model, capable of autonomously discovering zero-day vulnerabilities, building exploit chains, and covering its tracks. Anthropic has set no timeline for public release.

What is Project Glasswing?
A restricted Anthropic program granting eleven partners, including AWS, Apple, Google, and Microsoft, access to Mythos Preview for defensive cybersecurity work only.

How does Claude Security differ from Claude Mythos?
Claude Security is a public beta product for Enterprise customers focused on finding and fixing code vulnerabilities. It does not use the Mythos model and carries no special access restrictions.

What is Connecticut's new AI law?
Senate Bill 5, passed May 1, 2026, sets regulatory parameters for artificial intelligence systems operating in the state. Governor Lamont has not yet signed or vetoed it.