Mythos Model Uncovers Over 10,000 Software Flaws

Introduction to Mythos Model

Anthropic, a leading AI company, has announced that its month-old Project Glasswing initiative has uncovered more than 10,000 high- or critical-severity software vulnerabilities across systemically important code. This finding has significant implications for the cybersecurity industry, as it shifts the central problem from discovering flaws to verifying and patching them.

Key Findings and Partnerships

The findings, drawn from partner reports and independent evaluations, demonstrate the capabilities of the Mythos model. Several partners reported that their rates of bug discovery had increased more than tenfold. For example, Cloudflare identified 2,000 bugs across its critical-path systems, including 400 rated high or critical, with a false-positive rate the company considered better than that of human testers.

At one unnamed partner bank, the model was credited with helping detect and prevent a fraudulent $1.5 million wire transfer. The transfer was initiated after a customer’s email account was compromised, and the attackers followed up with spoofed phone calls. External evaluations cited in the update, including those from the United Kingdom’s AI Security Institute and Mozilla, were consistent with the results Anthropic released.

Independent Evaluations and Benchmarks

The United Kingdom’s AI Security Institute found that Mythos Preview was the first model to solve both of its cyber ranges — simulations of multistep cyberattacks — from end to end. Mozilla said it found and fixed 271 vulnerabilities in Firefox 150 while testing the model, more than 10 times the number found in Firefox 148 using an earlier Anthropic model.

AI-powered security platform XBOW called the model a significant step up over existing systems on its web exploit benchmark. Anthropic also used Mythos to scan more than 1,000 open-source projects, flagging 23,019 potential vulnerabilities, 6,202 of them estimated as high or critical.

Validation and Verification of Findings

Of 1,752 high- or critical-rated findings reviewed by six independent security research firms or by Anthropic itself, over 90% were confirmed as valid, and over 62% were confirmed to be high or critical. However, the company noted that while the model is good at finding vulnerabilities, there are still not enough people to fix every issue it surfaces.

“The bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them,” the report states. Open-source maintainers have also been contending with a wave of low-quality, AI-generated bug reports, and Anthropic said it tries to reproduce and assess each issue before reporting it.

Future Plans and Releases

Anthropic said it has not released Mythos-class models publicly because no company, including itself, has developed safeguards to prevent serious misuse. In the interim, it has released Claude Security in public beta for enterprise customers. The company said the tool, which uses the publicly available Claude Opus 4.7, has been used to patch more than 2,100 vulnerabilities in three weeks.

The company said it plans to expand Project Glasswing with additional partners, including U.S. and allied governments, before any broader release of the underlying model. “Glasswing helps the most systemically important cyber defenders gain an asymmetric advantage. However, there is an urgent need for as many organizations as possible to shore up their cyber defenses,” the report states.

“We hope that our generally available models, and the new tools, resources, and research we’re providing to accompany them, will support those organizations to improve their cybersecurity posture.”

More than 10,000 high- or critical-severity software vulnerabilities identified
Several partners reported bug discovery rates rising more than tenfold
Model helped detect and prevent a fraudulent $1.5 million wire transfer
Independent evaluations confirmed the model's effectiveness
Anthropic plans to expand Project Glasswing with additional partners

Source: CyberScoop

