Claude Mythos Preview Exploit Creation Capabilities
Anthropic's Claude Mythos Preview model can build working exploits targeting known vulnerabilities within hours, or even minutes, according to the company. This capability raises concerns about accelerated attacks: as the use of AI in cyberattacks surges, organizations face greater threats during the patch gap, the window between a patch's release and its deployment.
Testing Claude Mythos Preview
Anthropic tested Claude Mythos Preview alongside its public models and found that they could deliver working exploits targeting Firefox and Windows within hours. Claude Mythos Preview, the company's most advanced model, delivered 16 working exploits across the two targets; the public models also delivered working exploits, albeit at a slower rate.
According to Anthropic, N-days (vulnerabilities that are already publicly known and patched) are even more dangerous than zero-days, because attackers can diff a patch against the previous version and reverse-engineer the fix to build an exploit. This is where large language models (LLMs) become valuable weapons for attackers, as they significantly accelerate and automate the process of building N-day exploits.
Exploit Development and N-Day Campaigns
Anthropic explains that exploit development is not the only step in a real N-day campaign, but it has historically been the step most bottlenecked by scarce reverse-engineering expertise. LLMs can now accelerate that step, which makes N-days even more dangerous.
The company tested the ability of Claude Mythos Preview, Opus, and Sonnet to construct proof-of-concept (PoC) code targeting 18 security patches delivered for SpiderMonkey, Firefox's JavaScript engine, in Firefox 148 and 149. Opus 4.8 created 11 PoCs and Mythos Preview produced 14. Opus 4.8 delivered its first PoC in eight minutes, and Mythos Preview delivered its first in 12 minutes.
Windows Exploits and Patching
Anthropic also tested the models' ability to build exploits for closed-source software, choosing Microsoft's Windows platform for the task. The company found that Sonnet 4.6 and Opus 4.7 built PoCs that triggered a blue screen of death (BSOD) for 13 of the bugs, Opus 4.8 for 15, and Mythos Preview for 18. Mythos Preview delivered its first PoC in 31 minutes and created working privilege escalation exploits for eight of the vulnerabilities, delivering all of them within 18 hours.
According to Anthropic, it typically takes seven days before Windows patches are pushed to 90% of enrolled devices in a fleet, and devices are typically force-rebooted only on day 11, so the model makes exploitation viable within the patch gap. The cost of building these exploits is not high either. Each model was given a three-million-token budget for creating the PoCs and exploits targeting Firefox, and creating the full-chain exploits targeting Windows cost $15,700 in API credits, or around $2,000 per privilege escalation.
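The arithmetic behind these figures can be checked with a short Python sketch. The numbers are the ones reported above; the function names are hypothetical, and the exposure calculation assumes that exploit development starts at the moment the patch is released and that each reported day is a full 24 hours, neither of which the source states.

```python
# Figures reported in the article.
WINDOWS_CHAIN_COST_USD = 15_700   # API credits for the full-chain Windows exploits
PRIV_ESC_EXPLOITS = 8             # working privilege escalation exploits
HOURS_TO_ALL_EXPLOITS = 18        # all eight exploits delivered within this time
DAYS_TO_90_PERCENT_PATCHED = 7    # patch reaches 90% of enrolled devices
DAY_OF_FORCED_REBOOT = 11         # devices are typically force-rebooted


def cost_per_exploit(total_cost_usd: float, exploit_count: int) -> float:
    """Average API cost of one exploit (hypothetical helper)."""
    return total_cost_usd / exploit_count


def exposure_hours(exploit_ready_hours: float, milestone_day: int) -> float:
    """Hours between exploit availability and a patching milestone.

    Assumption: the clock starts at patch release and a day is 24 hours.
    """
    return milestone_day * 24 - exploit_ready_hours


if __name__ == "__main__":
    per_exploit = cost_per_exploit(WINDOWS_CHAIN_COST_USD, PRIV_ESC_EXPLOITS)
    print(f"Cost per privilege escalation: ${per_exploit:,.2f}")  # $1,962.50

    to_90 = exposure_hours(HOURS_TO_ALL_EXPLOITS, DAYS_TO_90_PERCENT_PATCHED)
    print(f"Hours of exposure before 90% deployment: {to_90}")  # 150

    to_reboot = exposure_hours(HOURS_TO_ALL_EXPLOITS, DAY_OF_FORCED_REBOOT)
    print(f"Hours of exposure before forced reboot: {to_reboot}")  # 246
```

Under these assumptions, the eight exploits would be available roughly six days before the patch reaches 90% of a fleet, which is the gap the article describes.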
Implications and Recommendations
Anthropic calls for an updated patching playbook, which should plan around 'N-hour' rather than 'N-day' timelines and should no longer assume that weaponizing a patch takes weeks. The company notes that N-days have historically caused the most harm to systems that are slow or difficult to patch, and that these devices and systems will become even more exposed as the cost of weaponizing any given patch falls toward zero.
The company recommends that organizations prioritize patching and consider the potential for accelerated attacks. As the cost of building exploits decreases, the pool of capable N-day attackers will expand, making it essential for organizations to stay ahead of the threat curve.
Source: SecurityWeek