The AI Worm That Rethinks Its Attack for Every Target

For years, the claim that “AI makes cyberattacks more dangerous” has been a warning, not a finding. A preprint by researchers from the University of Toronto, the Vector Institute, the University of Cambridge, and ServiceNow Research changes that. Nicolas Papernot and his co-authors have built a computer worm that is defined not by fixed exploit code but by the ability to analyze each new target and devise its attack strategy in real time. The paper is titled “AI Agents Enable Adaptive Computer Worms” and is available as a non-peer-reviewed preprint on arXiv (2606.03811).
I consider this a finding that forces the security community to rethink. Not to panic. The difference is in the detail, and the detail is more interesting than the headline.
What the researchers built
A classic worm is a template. It carries one or more hard-coded exploits and scans the network for exactly the flaw those exploits target. WannaCry was the prominent example: a worm that exploited a known SMB vulnerability and could not spread to any machine where the matching patch had been applied. Such a weapon stays sharp only as long as its exploits remain unpatched.
The proof of concept out of Toronto inverts this principle. Instead of fixed attack logic, the worm carries the capability to attack. On a machine it has already taken over, it launches an open-weight language model, meaning a freely downloadable LLM that runs locally on the hijacked machine. This model takes on the role that a human or a rigid script used to play: it looks at the next target, infers possible attack paths from what it observes, and tries them out.
Technically, the worm operates like an agentic loop. Observe, reason, act, observe again. Which operating system is running, which services are reachable, which misconfigurations lie exposed. From this information, the local model synthesizes a tailored strategy. None of it is predefined. Notably, the worm could also repair itself when one of its own mistakes impaired its function.
Shifting the intelligence onto the victim
Two consequences of this architecture matter more than the demonstration itself.
The first is economic. Because the language model runs on the victim’s hijacked hardware, every new infection costs the attacker practically nothing. The target pays for the compute. The researchers call this a “destabilizing economic asymmetry between attackers and defenders”. Every defensive measure incurs cost; an additional attack does not. This asymmetry is the actual lever.
The second consequence concerns the safety mechanisms of the large AI providers. Rate limiting, content filters, and abuse detection only take effect when a request runs through the commercial platform. A locally executed open-weight model bypasses all of it. In the words of the paper, central security mechanisms thereby become “structurally irrelevant”. A provider cannot block what does not flow through its infrastructure.
The test run: 33 devices, seven days
The researchers released the worm into an isolated test network of 33 devices, a mix of Linux servers, Windows systems, and IoT devices. According to the team, the exposed vulnerabilities correspond to real flaws as they actually occur in corporate networks. Each run lasted seven days, with the worm operating autonomously.
The averaged results across multiple runs:
| Metric | Value |
|---|---|
| Vulnerabilities correctly identified | avg. 31.3 |
| Hosts with successful privilege escalation | avg. 23.1 |
| Hosts reached | avg. 20.4 |
| Generations of self-replication | 7 |
| Share of network successfully attacked | nearly 75% |
| Share of replication across the network | nearly two thirds |
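
As a plausibility check, the two averaged host counts can be set against the size of the test network. The sketch below is only arithmetic on the numbers in the table. Dividing by the 33 devices is an assumption, since the article does not state how the two percentage rows were derived, and the names `DEVICES` and `share_of_network` are purely illustrative.

```python
# Plausibility check on the table above (illustrative names, not from the paper).
DEVICES = 33  # size of the test network, as stated in the article

# Averaged host counts taken from the table.
host_averages = {
    "hosts with successful privilege escalation": 23.1,
    "hosts reached": 20.4,
}


def share_of_network(count, total=DEVICES):
    """Return `count` as a percentage of `total`, rounded to one decimal."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    for label, value in host_averages.items():
        print(f"{label}: {value} of {DEVICES} = {share_of_network(value)}%")
```

Under that assumption the two counts come out at 70.0 and 61.8 percent. The second sits close to the “nearly two thirds” row. The first is somewhat below “nearly 75%”, so that row may rest on a different count that the article does not spell out.
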
The most telling point is somewhat hidden. Around three of the exploited vulnerabilities only became public in 2026, meaning they postdate the training cutoff of the models used. The worm did not reproduce these attacks from memorized exploit templates. It developed them itself from publicly available vulnerability descriptions. This is precisely where patch management as a sole strategy reaches its limit. Anyone relying on closing gaps quickly is fighting an adversary that may turn a newly disclosed flaw into a working attack faster than the patch gets deployed.
The limits of the PoC worm
The numbers seem alarming, which is exactly why a sober look at the limits is worth it. The researchers themselves name them clearly.
A single exploit attempt succeeded only 44 percent of the time. Weaknesses showed up in web applications, in Windows command-line environments, and in payload syntax. The team attributes these shortcomings to a “code-generation ceiling”, meaning the limited code quality of a model running on a single GPU. Larger or better models would raise the rate, which makes the gap a temporary rather than a fundamental limit.
Three further constraints are central:
- The test ran in a controlled, isolated network. The vulnerabilities were left open deliberately. In a well-patched, monitored corporate network the worm would likely fare worse.
- The time requirement of seven days per run is a real disadvantage for attackers who want to strike fast.
- The worm used no zero-days. It relied exclusively on known, unpatched vulnerabilities and misconfigurations. In the words of the researchers: “Our prototype targets publicly disclosed but unpatched vulnerabilities.”
So the novelty does not lie in new exploits. It lies in the AI independently deciding which known attack fits which target. That is the difference between a toolbox and a craftsman who knows which tool to pick up.
Responsibility in publication
It is notable how the team handled the publication. The researchers withheld operational details, expressly including the name of the language model used. The paper describes the threat model and the results, not a runnable blueprint. This separation is common and sensible in security research. It allows defenders to understand the class of threat without handing attackers a finished weapon.
I follow that line here deliberately. This article contextualizes a publicly reported research finding. It provides no attack instructions and no attack code.
Defensive measures
For private users, little changes immediately. A worm of this complexity targets corporate networks, not the home router. Still, poorly secured IoT devices, smart TVs, or network cameras can serve as a stepping stone in a flat home network.
For those responsible for IT, the consequence is more concrete. The paper and the accompanying reporting suggest three directions that are good practice anyway and gain weight through this finding:
- Network segmentation and zero trust. Micro-segmentation substantially limits a worm’s lateral spread. When each generation of replication encounters new hurdles, the reach drops drastically.
- Anomaly-based detection rather than pure signatures. A worm without fixed exploit code has no fixed signature. Behavior-based detection that notices unusual scanning, escalation, and replication patterns catches this class of threat better than matching known malware hashes.
- Least privilege. The worm’s success rate hinged largely on successful privilege escalation. Where accounts and services may do only what is necessary, part of the attack chain breaks.
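
To make the second point concrete, here is a minimal defensive sketch of behavior-based detection. It flags any source host that contacts unusually many distinct targets within a short time window, which is the kind of scanning pattern a spreading worm produces regardless of which exploit it uses. The log schema (`ConnectionEvent`), the function name `flag_fanout`, and both thresholds are assumptions made for illustration. Nothing here comes from the paper, and a real deployment would tune the values against its own baseline traffic.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionEvent:
    """One observed connection attempt (hypothetical log schema)."""
    timestamp: float  # seconds since an arbitrary epoch
    src: str          # source host
    dst: str          # destination host
    port: int         # destination port


def flag_fanout(events, window_s=60.0, max_targets=20):
    """Return the set of sources that contacted more than `max_targets`
    distinct (dst, port) pairs within a sliding window of `window_s` seconds.

    Both thresholds are illustrative defaults, not recommended values.
    """
    recent = defaultdict(deque)  # src -> events still inside the window
    flagged = set()
    for ev in sorted(events, key=lambda e: e.timestamp):
        queue = recent[ev.src]
        queue.append(ev)
        # Drop events that have fallen out of the sliding window.
        while ev.timestamp - queue[0].timestamp > window_s:
            queue.popleft()
        # Count distinct targets, not raw connections, so a busy but
        # well-behaved client talking to one server is not flagged.
        targets = {(e.dst, e.port) for e in queue}
        if len(targets) > max_targets:
            flagged.add(ev.src)
    return flagged


if __name__ == "__main__":
    # A client that talks to a single server 30 times.
    normal = [ConnectionEvent(float(t), "10.0.0.5", "10.0.0.10", 443)
              for t in range(30)]
    # A host that touches 30 different machines on port 22 within 15 seconds.
    sweep = [ConnectionEvent(i * 0.5, "10.0.0.7", f"10.0.0.{i}", 22)
             for i in range(1, 31)]
    print(flag_fanout(normal + sweep))  # {'10.0.0.7'}
```

The sketch counts distinct targets rather than matching payloads, so it does not depend on any signature. The obvious trade-off is that a slow, patient scan stays under the threshold, which is why such a rule complements segmentation and least privilege rather than replacing them.
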
A fourth measure is proactive. The same AI capability that makes the attack adaptive here can be used for AI-assisted, automated penetration testing and fuzzing against your own infrastructure. The researchers and several commentators see this as the most obvious defensive answer. For now, though, adaptive defense remains more aspiration than practice.
Assessment
I will close with three points.
First, the work is a preprint and not yet peer-reviewed. The numbers may shift under scientific scrutiny. As a proof of concept it still deserves to be taken seriously, because it empirically demonstrates a threat that was hypothetical until now.
Second, the real shift is conceptual, not technical. There is no new zero-day, no new intrusion tool. There is an attacker who delegates the selection and adaptation of known attacks to a local AI, thereby circumventing two lines of defense: the cost per attack and the central abuse control of the cloud providers.
Third, in the words of the authors: “This research uncovered a new cybersecurity threat the world is not prepared to face.” That is strong wording, and the isolated test setup tempers it. But the direction holds. Defense that relies exclusively on fast patching is fighting an adversary whose marginal costs trend toward zero and whose capabilities rise with each new model generation. The defensive homework is therefore not “patch faster”, but taking segmentation, behavioral detection, and least privilege more seriously than before.
Sources
- Jonas Guan, Tom Blanchard, Hanna Foerster, Hengrui Jia, Gabriel Huang, Nicolas Papernot: “AI Agents Enable Adaptive Computer Worms”, arXiv:2606.03811 (preprint, not peer-reviewed) — primary source.
- heise online / c’t: “Dieser KI-Wurm entwickelt für jedes Ziel neue Angriffe” — original reporting.
- Help Net Security: “Autonomous AI-driven worm can reason its way through corporate networks” — context and defensive measures.
- iTnews: “Researchers build self-replicating AI worm with BYO LLM” — additional reporting.
All figures are taken from the preprint and the accompanying reporting. Operational details and the language model used were deliberately withheld by the researchers.