AI Agents Drive First Large-Scale Autonomous Cyberattack

Anthropic halted an AI-led cyber attack in 2025. Credit: Getty
Chinese state-backed hackers used Anthropic’s Claude Code to carry out a largely AI-driven cyber espionage campaign targeting 30 organisations

The cybersecurity landscape fundamentally changed in September 2025, when Anthropic detected and disrupted what it describes as the first documented large-scale cyber espionage attack conducted predominantly by AI agents.

The campaign targeted approximately 30 high-value organisations across multiple sectors, from financial institutions to government agencies, with AI autonomously executing between 80% and 90% of attack tasks.

This incident represents a pivotal shift in cyber warfare capabilities. AI-powered agents demonstrated the ability to gather intelligence and conduct attacks with minimal human intervention, requiring operator input only at critical strategic decision points.

The implications for cybersecurity professionals are profound, as the autonomous agent model marks a sharp escalation from earlier operations where human direction remained pervasive throughout the attack lifecycle.

Autonomous execution at unprecedented scale

The Chinese state-sponsored group exploited recent advances in AI – intelligence, agency and tool integration – to conduct a multi-phase cyberattack with levels of autonomy not previously documented.

According to Anthropic's publicly released 13-page report, the campaign used Claude Code not merely as an advisory tool but as an active agent executing complex hacking tasks independently.

The AI autonomously handled reconnaissance, vulnerability discovery, exploit development, credential harvesting, lateral movement and data exfiltration. Humans initiated the campaign by selecting targets and establishing strategic parameters, but the operational execution fell almost entirely to the AI system. At the peak of the attack, Claude made thousands of requests, often several per second, a pace no human team could match.

Jacob Klein, Head of Threat Intelligence at Anthropic, explained to the Wall Street Journal that the hackers conducted their attacks "literally with the click of a button, and then with minimal human interaction". He added: "The human was only involved in a few critical chokepoints, saying, 'Yes, continue,' 'Don't continue,' 'Thank you for this information,' 'Oh, that doesn't look right, Claude, are you sure?'"

Circumventing AI safeguards

The attackers automated the campaign by circumventing Claude Code's built-in safeguards. They broke malicious tasks into seemingly innocuous components, so the model never saw the full context of the operation, and misled it into believing it was taking part in a legitimate cybersecurity test. This technique allowed them to exploit the model's capabilities whilst evading its security protocols.

The targets spanned tech companies, chemical manufacturers, financial institutions and government agencies. The breadth of the target list and the speed of execution highlight how agentic AI systems could drastically lower barriers to sophisticated cyberattacks. Less experienced or smaller adversaries might soon perform attacks previously limited to nation-states with substantial resources and expertise.

Implications for cyber defence

This campaign represents a fundamental shift in the threat landscape that cybersecurity professionals must navigate. Because AI can autonomously conduct extended operations at scale, traditional security models built around human-paced attacks may no longer provide adequate protection.

However, the investigation revealed limitations that currently serve as obstacles to fully autonomous cyberattacks. Claude occasionally hallucinated data, fabricated credentials or overstated exploit success, requiring human validation at key junctures. These imperfections remain amongst the few barriers preventing completely autonomous cyber operations.

The incident, which Anthropic disclosed publicly in November after detecting it in September, heralds a new era in cyber warfare. Organisations across sectors must now contend with the reality that AI agents can execute sophisticated, multi-phase attacks with minimal human oversight, fundamentally altering the calculus of cyber defence strategies.
