An AI Just Broke Into the NSA in Hours. Not Weeks. Hours.
On June 11, something happened that most people still haven't fully processed.
An AI model called Mythos reportedly broke into almost all of the NSA and US Cyber Command's classified systems. Not over days or weeks โ in hours. The NSA director, Joshua Rudd, confirmed it to the Senate Intelligence Committee. The Vice Chair, Mark Warner, relayed the quote directly: "not in weeks, but in hours."
The Economist broke the story. The same day, separately, Amazon discovered a jailbreak. And Mythos's next iteration is already out.
What Mythos Had Already Done
To understand why this matters, you need the backstory.
Before the NSA incident, Mythos had already broken into macOS in 5 days. Google's Project Zero โ one of the best elite security teams on the planet โ typically needs 6 months to find a comparable vulnerability. A single macOS zero-day sells for roughly $2 million on the open market.
Apple had previously assumed that only 10 to 20 teams in the entire world were capable of finding such exploits. Mythos changes that number to potentially thousands โ because now any group that can access a capable AI model may be able to run the same kind of attack.
There are roughly 2 billion active Apple devices in the world. Mac users skew toward journalists, executives, lawyers, and government officials โ exactly the population that sophisticated attackers most want to reach.
What "Hours" Actually Means
The NSA is not an easy target. It's one of the most hardened networks in the world, with layered physical and digital security, significant investment in anomaly detection, and some of the world's best offensive and defensive security talent.
Breaking into "almost all" of its classified systems in hours โ not days, not weeks โ suggests something different from a targeted exploit. This looks more like an AI agent that could methodically explore a system, identify weaknesses faster than human defenders could track, and chain together vulnerabilities in real time.
That's the part that matters: the speed. Human attackers can find zero-days. Nation-states have been doing it for decades. But they move slowly enough that defenders have time to respond. Mythos apparently operated at a speed where human defenders couldn't keep up.
Why This Is Different From a Benchmark
AI security benchmarks exist. Models get scored on CTF challenges, vulnerability detection tasks, and code analysis. Those results are real but they're controlled โ a known environment, a known target, a defined scope.
What happened at the NSA wasn't a benchmark. It was an uncontrolled, real-world network with real defenders actively monitoring it. That's a fundamentally different kind of test.
And the fact that the NSA director confirmed it โ rather than dismissing it or classifying the entire incident โ is itself significant. This is the kind of thing that typically doesn't get said out loud.
The Capability Gap Is Widening Fast
The standard framing for AI and cybersecurity has been that AI helps defenders too: it can find vulnerabilities in your own code, flag suspicious traffic, and accelerate security reviews. That's true, and it remains true.
But the Mythos incident points to a growing asymmetry. When an AI can compress months of skilled human offensive work into hours, the calculus around what "secure" means has to change. Organizations that rely on human response times โ which is most of them โ are now facing an adversary that doesn't need to sleep, doesn't make careless mistakes from fatigue, and can iterate faster than any human red team.
The next version of Mythos is already deployed. Whatever capabilities got it into the NSA, the follow-on is presumably more capable still.
What This Means If You Use OpenClaw
OpenClaw isn't a security tool, and this post isn't a pitch for it in that context. But the Mythos story illustrates something directly relevant to anyone building with AI agents: autonomous AI agents are operating at a level of capability that is moving faster than most people's mental models of what AI can do.
The gap between "AI that answers questions" and "AI that executes complex, multi-step tasks in real environments" is now very visible. Mythos is an extreme version of that โ an agent that could navigate a hostile, unknown environment and accomplish a goal that would take human experts months.
That same principle โ agents doing real work in real environments โ is what OpenClaw is built around. The applications are completely different. But the underlying shift is the same: AI is moving from assistant to actor.
If you've been thinking about what it looks like to put an AI agent to work on real tasks in your own context, now is a reasonable time to start.