Anthropic's Project Glasswing: AI Just Found 10,000 Vulnerabilities in Critical Software

Analysis of Anthropic's Project Glasswing first-month results using Claude Mythos Preview, and what it means for enterprise vulnerability management and patching velocity.

By Keith Rose

Executive Summary

On May 22, 2026, Anthropic published the first results from Project Glasswing, an initiative launched last month to secure critical infrastructure software using the company’s Claude Mythos Preview model. The headline number: more than ten thousand high- or critical-severity vulnerabilities found across approximately 50 partner organizations in the first month alone. The bottleneck in software security has shifted. It is no longer about finding bugs. It is about patching them fast enough.

What Project Glasswing Is

Glasswing is Anthropic’s collaborative program pairing its cybersecurity-focused model (Mythos Preview) with organizations that maintain systemically important software. Partners include Cloudflare, Mozilla, Palo Alto Networks, Microsoft, Oracle, and major financial institutions. The goal is defensive: find vulnerabilities in critical code before offensive actors with comparable AI capabilities do the same.

Anthropic is also independently scanning more than 1,000 open-source projects that underpin much of the internet.

The Numbers

Partner findings (first month):

  • 10,000+ high- or critical-severity vulnerabilities found collectively
  • Most partners report bug-finding rates increased by more than 10x
  • Cloudflare: 2,000 bugs (400 high/critical) across critical-path systems
  • One partner bank used Mythos Preview to detect and block a $1.5M fraudulent wire transfer

Open-source scanning:

  • 1,000+ projects scanned
  • 6,202 estimated high- or critical-severity vulnerabilities
  • 1,752 independently assessed by six security research firms
  • 90.6% true-positive rate after human triage
  • 62.4% confirmed as high or critical severity

At current rates (roughly 62.4% of the 6,202 estimated findings confirmed as high or critical), even with no new findings, Mythos Preview is on track to surface roughly 3,900 valid high- or critical-severity vulnerabilities in open-source code alone.
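
The roughly 3,900 projection is consistent with applying the 62.4% confirmation rate to the full 6,202 estimate. The short check below reproduces that arithmetic. It assumes the 62.4% is measured against all 1,752 assessed findings and carries over to the findings not yet assessed; the source figures do not state the denominator explicitly, so treat this as a sanity check rather than Anthropic's own method.

```python
# Figures quoted in the open-source scanning list above.
estimated_findings = 6202          # estimated high/critical findings
assessed = 1752                    # findings reviewed by outside research firms
confirmed_high_crit_rate = 0.624   # share confirmed as high or critical

# Assumption: the 62.4% rate applies to all assessed findings and
# holds for the findings that have not been assessed yet.
projected_valid_high_crit = round(estimated_findings * confirmed_high_crit_rate)
confirmed_so_far = round(assessed * confirmed_high_crit_rate)

print(projected_valid_high_crit)   # 3870, i.e. "roughly 3,900"
print(confirmed_so_far)            # 1093 confirmed among the assessed sample
```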

External Validation

The UK AI Security Institute reports Mythos Preview is the first model to solve both of their multi-step cyber range simulations end to end. Mozilla found and fixed 271 vulnerabilities in Firefox 150 using Mythos Preview, compared to far fewer in Firefox 148 with Claude Opus 4.6. Independent security platform XBOW calls it a “significant step up over all existing models” on web exploit benchmarks.

Academic benchmarks ExploitBench and ExploitGym also rank Mythos Preview as the strongest performer for automated exploit development.

The Real Problem: Patch Velocity

Anthropic’s post makes a point that infrastructure engineers should internalize: “Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it is limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI.”

Evidence is already showing up in vendor release cycles:

  • Palo Alto Networks shipped over five times as many patches as usual in their latest release
  • Microsoft says patch volume will “continue trending larger for some time”
  • Oracle is fixing vulnerabilities multiple times faster than before

The 90-day coordinated disclosure window is starting to look like a bottleneck. When AI can find thousands of bugs in weeks, the pipeline for verification, patch development, and deployment becomes the constraining factor.
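
A minimal sketch of why the pipeline becomes the constraint: when findings arrive faster than a team can verify and patch them, the open backlog grows every week. The function name and the weekly numbers below are hypothetical and chosen only for illustration; they are not taken from any Glasswing partner.

```python
def backlog_over_time(weekly_findings, weekly_patch_capacity, weeks):
    """Return the open-finding backlog at the end of each week."""
    backlog = 0
    history = []
    for _ in range(weeks):
        backlog += weekly_findings                      # new findings arrive
        backlog -= min(backlog, weekly_patch_capacity)  # team fixes what it can
        history.append(backlog)
    return history


# Hypothetical numbers: 100 findings per week against capacity for 60 fixes.
print(backlog_over_time(weekly_findings=100, weekly_patch_capacity=60, weeks=4))
# [40, 80, 120, 160]
```

The backlog only stops growing once patch capacity meets or exceeds the arrival rate, which is why discovery speed alone does not improve security posture.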

What This Means for Enterprise Defenders

1. Vulnerability management pipelines need scaling

If you are running a standard vulnerability management (VM) program with quarterly scanning and monthly patching SLAs, that model is about to break. The incoming volume of findings from AI-assisted tools will overwhelm traditional triage queues.

Consider:

  • Automating verification with isolated test environments
  • Prioritizing by exploitability rather than just CVSS score
  • Integrating AI-generated findings directly into CI/CD gating
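
As one way to picture prioritizing by exploitability rather than CVSS alone, the sketch below ranks findings by a score that scales CVSS with two exploitability signals. The `Finding` fields, the `priority` function, and the weights are all hypothetical; they are not a standard scoring scheme, and a real program would tune or replace them.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    finding_id: str
    cvss: float              # base score, 0.0 to 10.0
    exploit_verified: bool   # a proof of concept ran in an isolated test environment
    internet_facing: bool    # affected component is reachable from outside


def priority(f: Finding) -> float:
    """Hypothetical score: CVSS scaled by exploitability signals."""
    score = f.cvss
    if f.exploit_verified:
        score *= 2.0         # illustrative weight, not a standard
    if f.internet_facing:
        score *= 1.5         # illustrative weight, not a standard
    return score


def triage(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority, reverse=True)


queue = triage([
    Finding("F-1", cvss=9.8, exploit_verified=False, internet_facing=False),
    Finding("F-2", cvss=7.5, exploit_verified=True, internet_facing=True),
    Finding("F-3", cvss=8.1, exploit_verified=True, internet_facing=False),
])
print([f.finding_id for f in queue])  # ['F-2', 'F-3', 'F-1']
```

Note that the finding with the highest CVSS score (F-1) lands last, because nothing indicates it is actually exploitable in this environment.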

2. Open-source dependency risk is growing

With 6,200+ estimated high/critical findings in open-source projects alone, organizations need better SBOM visibility and faster patching cycles for third-party dependencies. Tools that can ingest AI-generated vulnerability reports and auto-generate patches (or at least test cases) will become essential.
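
A rough sketch of what SBOM visibility buys you: given a component inventory and a feed of fixed versions, flagging affected dependencies is a simple lookup. The data shapes, package names, and helper functions below are hypothetical and heavily simplified; real SBOM formats carry far more detail, and real version schemes are not always plain dotted numbers.

```python
# Hypothetical, simplified SBOM: one entry per third-party component.
sbom = [
    {"name": "examplelib", "version": "1.2.0"},
    {"name": "othertool", "version": "3.4.1"},
]

# Hypothetical advisory feed: package name -> first fixed version.
advisories = {"examplelib": "1.2.3"}


def parse_version(v: str) -> tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.2.3'."""
    return tuple(int(part) for part in v.split("."))


def affected_components(sbom, advisories):
    """Return (name, installed, fixed_in) for components older than the fix."""
    hits = []
    for component in sbom:
        fixed_in = advisories.get(component["name"])
        if fixed_in and parse_version(component["version"]) < parse_version(fixed_in):
            hits.append((component["name"], component["version"], fixed_in))
    return hits


print(affected_components(sbom, advisories))
# [('examplelib', '1.2.0', '1.2.3')]
```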

3. The offensive capability gap is narrowing

Glasswing is framed as a defensive initiative, but the same model capabilities will inevitably become available to attackers. The 90-day disclosure window that protects users also gives any attacker who independently finds the same bug up to 90 days to exploit unpatched systems. Organizations need to move toward continuous patching and blue-green deployment models that allow zero-downtime security updates.

A Note on the wolfSSL Example

Anthropic highlighted one confirmed finding: a vulnerability in wolfSSL, a cryptography library used by billions of devices. This matters because wolfSSL is exactly the kind of code that is supposed to be heavily audited already. If AI can find bugs in mature, security-critical crypto libraries, it can find bugs in your code too.

What to Expect Next

Anthropic says it intends to continue scanning open-source code and expanding the partner program. Full technical details of specific findings are being withheld until patches are widely deployed, following standard 90-day coordinated disclosure. The company is also considering how to release Mythos-class models more broadly, which raises questions about access controls and responsible deployment.

Conclusion

Project Glasswing is not just a research exercise. It is an early signal that AI-augmented vulnerability discovery is moving from experimental to operational at scale. For security teams, the implications are immediate: your patching pipeline is now your most important defensive control. Finding bugs is cheap. Fixing them fast is hard.
