Claude Mythos Preview

Claude Mythos Preview is a limited-release research model from Anthropic announced in April 2026, the first frontier model to demonstrate autonomous discovery and weaponisation of software vulnerabilities at meaningful scale. The model identified zero-day vulnerabilities in every major operating system and every major web browser the company tested it against, and produced working exploits at success rates orders of magnitude above Claude Opus 4.6. Anthropic is not releasing the model to the general public; access is gated through the Project Glasswing partner program and a separate Cyber Verification Program for security professionals.

At a glance

  • Lab: Anthropic.
  • Announced: Early April 2026 (Anthropic's red-team blog at red.anthropic.com/2026/mythos-preview/).
  • Modality: Text-input, text-output language model with extended tool-use and long-horizon agentic execution.
  • Open weights: No. Closed weights, no general API release.
  • Access: Project Glasswing (limited industry partners and open-source maintainers); Cyber Verification Program (security professionals applying for legitimate-use access).
  • General availability: Not planned; safeguards being developed for inclusion in a future Claude Opus model.
  • Notable capability: Autonomous chained-vulnerability exploit development against memory-safe and memory-unsafe targets.
  • Authors: Nicholas Carlini, Newton Cheng, Keane Lucas, Michael Moore, Milad Nasr, Vinay Prabhushankar, Winnie Xiao, and 15 additional Anthropic researchers.

Origins

Anthropic's security and red-team research group has been publishing on autonomous-exploit risk for several years; the work that produced Mythos Preview is the culmination of that line. Prior to the Mythos generation, the company's Opus and Sonnet model lines could find isolated vulnerabilities in software when explicitly directed and could write partial exploits when given substantial scaffolding, but they could not chain multiple vulnerabilities together autonomously, could not develop working exploits without expert human guidance, and had a near-zero success rate at end-to-end exploit development. The Mythos training run produced a step change rather than a continuous improvement.

The model was announced through Anthropic's red-team-research blog rather than through the company's standard product channel, a deliberate choice that signals the work as a research artefact and a safety disclosure rather than a commercial release. The framing matches the company's stance on responsible disclosure of catastrophic-misuse-risk capabilities: publish enough that the field can prepare defences, withhold enough that the model itself does not become a directly exploitable artefact for bad actors.

The internal codename "Mythos" suggests an ongoing capability line rather than a one-off; the Glasswing access program implies Anthropic expects partner-side deployment over an extended period rather than a single research drop.

Capabilities

The headline capability is autonomous vulnerability discovery and exploit development. The model was tested against OSS-Fuzz, the Google-operated fuzzing infrastructure that has shaken out tens of thousands of bugs in open-source software since 2016, and produced 595 crashes at severity tiers 1 and 2, multiple tier 3 and 4 crashes, and 10 tier-5 control-flow-hijack crashes. Claude Opus 4.6 produced a single tier-3 crash on the same setup. The improvement is closer to two orders of magnitude than to a generational refresh.

The exploit-development numbers are even larger. On a corpus of 147 disclosed Firefox vulnerabilities (n-day exploits where the vulnerability was known but the working exploit was withheld), Mythos Preview developed 181 working exploits and achieved register-level control on 29 additional attempts. Claude Opus 4.6 developed 2 working exploits on the same corpus. The exploitation techniques the model produced autonomously include JIT heap-spray chains, return-oriented programming gadget construction, multi-vulnerability privilege-escalation chains, KASLR bypass primitives, race-condition exploitation, and cross-cache reclamation primitives. Several of the chains combine three or more separate vulnerabilities into a single working exploit, which is the level of work that until recently required senior offensive-security researchers operating with substantial time budgets.

The discovery side surfaced vulnerabilities in production software ranging from a 27-year-old SACK-implementation denial-of-service bug in OpenBSD (which has been in shipped code since the late 1990s) to a 17-year-old remote-code-execution bug in FreeBSD's NFS implementation (now tracked as CVE-2026-4747), a 16-year-old vulnerability in FFmpeg's H.264 codec (with a 10-year-old exploitation vector that no public researcher had identified), and multiple Linux kernel privilege-escalation chains. The model also discovered web-browser sandbox-escape primitives, authentication-bypass bugs in cryptography libraries, and memory-corruption issues in a production memory-safe hypervisor.

Closed-source binary reverse-engineering was also in scope. The model demonstrated effectiveness in finding vulnerabilities in compiled-without-source software, which is a separate skill from source-code analysis and is the more important capability for offensive security against commercial software.

Several scope limitations are documented. The model cannot exploit all Linux kernel vulnerabilities even when it identifies the write primitive, partly because exploitation depends on specific system configurations and partly because some exploitation paths require physical or local access that the model cannot independently arrange. Most of the closed-source reverse-engineering exploitation work remains undisclosed pending responsible-disclosure deadlines.

Benchmarks and standing

Standard public benchmarks (Cybench, the various CTF leaderboards) are not the primary measurement instrument for Mythos Preview because the public benchmarks were designed around capabilities that prior models could not meaningfully approach. The benchmark in this release is the model's performance against real production software, measured by the rate of zero-day discovery and the rate of n-day exploit completion.

The cost numbers from the disclosure are the most directly comparable signal. OpenBSD vulnerability discovery completed at under $20,000 for 1,000 runs; individual high-severity bugs cost under $50 each in compute. FFmpeg vulnerability discovery completed at approximately $10,000 for several hundred runs. N-day exploit completion against Firefox-class targets ran $1,000 to $2,000 per complex privilege-escalation chain. The dollar figures contextualise the capability shift: a small security firm or a small national-cyber actor can, at current pricing, run the equivalent of a multi-person red-team campaign against most production software at compute costs in the low five figures.
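The per-unit economics implied by those totals can be checked directly. A quick sketch (the FFmpeg run count and the 50-chain campaign size below are illustrative assumptions, not figures from the disclosure):

```python
# Per-unit costs implied by the disclosed campaign totals.
# Run counts marked "assumed" are illustrative, not from the disclosure.
openbsd_total_usd, openbsd_runs = 20_000, 1_000   # "under $20,000 for 1,000 runs"
ffmpeg_total_usd, ffmpeg_runs = 10_000, 300       # "several hundred runs" (assumed 300)
firefox_chain_usd = (1_000, 2_000)                # per complex n-day exploit chain

print(f"OpenBSD: ${openbsd_total_usd / openbsd_runs:.2f} per run (upper bound)")
print(f"FFmpeg:  ${ffmpeg_total_usd / ffmpeg_runs:.2f} per run (assumed run count)")

# A hypothetical campaign: 50 Firefox-class exploit chains.
low, high = (50 * c for c in firefox_chain_usd)
print(f"50-chain campaign: ${low:,} to ${high:,}")
```

Even at the pessimistic end of each range, the campaign total stays in the low-to-mid five figures, which is the arithmetic behind the "multi-person red-team campaign" comparison above.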

Over 99 percent of the vulnerabilities the model discovered remain unpatched as of the disclosure window, with SHA-3 commitments published as proof of discovery pending the 90-day plus 45-day disclosure timeline. The unpatched count is the structural reason the model is not being released generally.
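The commitment mechanism itself is standard commit-then-reveal: publish a hash of the still-secret finding now, reveal the preimage once the disclosure window closes, and anyone can verify the finding existed at commitment time. A minimal sketch using Python's built-in SHA-3, with a random nonce so low-entropy reports cannot be brute-forced from the public digest (the report text is a placeholder, not a real finding):

```python
import hashlib
import os

def commit(report: bytes) -> tuple[bytes, bytes]:
    """Publish the returned digest now; keep the report and nonce secret.

    The nonce stops anyone from brute-forcing a low-entropy report
    (e.g. a short bug description) from the public digest alone.
    """
    nonce = os.urandom(32)
    return hashlib.sha3_256(nonce + report).digest(), nonce

def verify(report: bytes, nonce: bytes, digest: bytes) -> bool:
    """At reveal time, anyone can check the report against the digest."""
    return hashlib.sha3_256(nonce + report).digest() == digest

report = b"placeholder vulnerability report"
digest, nonce = commit(report)
assert verify(report, nonce, digest)                # honest reveal checks out
assert not verify(b"edited report", nonce, digest)  # any change breaks it
```

The digest can be published safely because SHA-3 is preimage-resistant: it proves possession of the finding without leaking anything exploitable before the patch ships.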

Access and pricing

Access is gated through two programs.

Project Glasswing provides Mythos Preview to a limited group of "critical industry partners and open-source developers," in Anthropic's framing. The partner roster is not public. The implied use case is defender-side: software vendors and open-source maintainers receiving early access to vulnerability discoveries in their own code, with sufficient time to patch before broader disclosure. The 99 percent of discovered vulnerabilities that remain unpatched, as cited in the announcement, is effectively the size of the Glasswing-side workstream.

The Cyber Verification Program is a separate access channel intended for security professionals (incident-response teams, penetration testers, security researchers) who can demonstrate legitimate professional use. Application criteria, approval rate, and audit posture are not publicly documented.

There is no general API access, no consumer access, and no commercial-tier API price. The model is structurally not a product release; it is a research artefact with a controlled-partner distribution surface.

Comparison

The closest direct comparison is Claude Opus 4.6, the prior-generation Anthropic frontier model, against which Mythos Preview shows order-of-magnitude improvements on every measured cybersecurity task. Anthropic explicitly characterised Opus 4.6 as having a "near-0% success rate at autonomous exploit development."

No competitor frontier lab has published equivalent autonomous-cyber-capability evaluations on production software, which makes external comparison difficult. OpenAI, Google DeepMind, and Meta AI have all published smaller-scale evaluations on academic cybersecurity benchmarks (Cybench, CyberSecEval), but none has published Mythos-Preview-scale n-day exploit-completion data on production codebases, and none has indicated that their frontier models can autonomously chain vulnerabilities in the way Mythos Preview can. Whether the gap is real or whether competitor labs simply have not chosen to publish equivalent evaluations is unknowable from external data alone.

The UK AI Safety Institute conducted an independent evaluation whose contents have not been fully released; it is the first publicly documented government-laboratory evaluation of a frontier cyber-capability model.

Outlook

The next twelve months will test several open questions.

The first is whether Anthropic's containment posture holds. The Glasswing partner program, the Cyber Verification Program, and the deferral of safeguards-included integration to a future Opus model are deliberate choices to slow the rate at which Mythos-class capability reaches the broader threat landscape. Two things will define the timeline on which autonomous-exploit capability becomes broadly available rather than partner-restricted: whether competitor labs publish equivalent capabilities (and whether they follow Anthropic's gated-access model), and whether open-source replications of Mythos-grade capability emerge from academic or independent research groups.

The second is the patch-deployment race. The 99 percent of discovered vulnerabilities that remain unpatched is a live risk surface, and the responsible-disclosure clock is running. Vendors receiving disclosures through the Glasswing channel have a 90-day plus 45-day window to patch, the standard responsible-disclosure timeline. The vulnerability volume in this release is large enough to strain the patch-management capacity of even the largest software vendors. Whether patches ship on time, and whether public exploitation precedes patching for any of the disclosed bugs, will be visible in security-incident data over 2026 to 2027.
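The window arithmetic is simple enough to sketch. The report date below is hypothetical, since per-bug report dates are not public:

```python
from datetime import date, timedelta

def disclosure_deadlines(reported: date) -> tuple[date, date]:
    """Deadlines under a 90-day patch window plus a 45-day grace period."""
    patch_by = reported + timedelta(days=90)   # vendor patch deadline
    public_on = patch_by + timedelta(days=45)  # end of grace period
    return patch_by, public_on

# Hypothetical example: a bug reported to a vendor on 1 Jan 2026
# must be patched by 1 Apr 2026 and goes public by 16 May 2026.
patch_by, public_on = disclosure_deadlines(date(2026, 1, 1))
```

At the scale of this release, those per-bug clocks overlap heavily, which is what makes the patch-management load on large vendors a genuine open question.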

The third is the policy response. Mythos Preview is the first frontier-lab disclosure that has demonstrably moved the offensive-security capability frontier in a single release, and the public-policy framing of autonomous-cyber-capability AI has lagged the technical reality. National-cyber-authority responses through 2026 and 2027 (export controls, mandatory disclosure regimes, capability-evaluation requirements for frontier-model releases) will be measurably influenced by Mythos Preview and any successor releases.

Sources

  • Claude Mythos Preview announcement. Anthropic's red-team-research blog post with the capability evaluation, the cost numbers, and the responsible-disclosure framing.
  • What Anthropic's Mythos Means for the Future of Cybersecurity. IEEE Spectrum's analysis of the implications for the security community and the policy landscape.
  • Project Glasswing access page. The partner-access portal for the controlled distribution program.
  • OSS-Fuzz. Background on the Google-operated fuzzing infrastructure used as a reference benchmark.
  • CVE-2026-4747 (FreeBSD NFS RCE), one of the publicly disclosed Mythos-discovered vulnerabilities. The CVE record will populate as the disclosure window closes.
  • Companion essay: The diaspora map for the broader Anthropic research-leadership and security-team composition that produced this release.
About the author

Nextomoro

nextomoro tracks progress for AI research labs, models, and what's next.

AI Research Lab Intelligence