May 11, 2026 · AI security

How an AI model found a hidden 2FA bypass and wrote the exploit before any scanner could see it

Every vulnerability scanner your security team runs is optimized to find the same class of problem: code that crashes, overflows, or calls a dangerous function. In early 2026, a criminal group skipped that entire category. They fed a popular web administration tool's source code into an AI model and asked it a different question: where does the developer's assumption not match what the code actually does? The model found a flaw that no scanner would have flagged, because the code was syntactically perfect. It just trusted the wrong thing.

The flaw was a hardcoded trust exception buried in the authentication flow. If a specific object or state was present in a request, the system treated it as already verified and skipped two-factor authentication (2FA) entirely. The code did exactly what the developer wrote. The developer's assumption was simply wrong. An attacker with a valid username and password, the kind obtained in any ordinary credential theft, could walk straight past the second factor.

What makes this the first confirmed case of its kind is not just that AI found the flaw. It is that the same AI model then wrote a working Python exploit script, complete with help menus, color-coded terminal output, and a hallucinated vulnerability score for a CVE that does not exist. The script was so thoroughly annotated and so textbook-clean that Google's threat researchers recognized it as machine-made before they even ran it. The AI had, in effect, signed its own work. Google's Threat Intelligence Group (GTIG) detected the script during proactive research, coordinated a quiet patch with the unnamed vendor, and disrupted the campaign before it launched. GTIG chief analyst John Hultquist told CyberScoop: "We finally uncovered some evidence this is happening. This is probably the tip of the iceberg and it's certainly not going to be the last."

Does this apply to you?

If your organization runs any open-source web-based administration tool with an internet-exposed login page, you may be directly exposed to this attack class: a valid set of credentials may be all an attacker needs to attempt a 2FA bypass of this type. If your environment has experienced any credential theft in the past 12 months, or if staff reuse passwords across services, your blast radius could extend to full administrative access to whatever systems that tool manages. This pattern applies beyond the specific unnamed tool: any system where authentication logic contains a hardcoded trust condition, an "if this state is present, skip verification" shortcut, may be vulnerable to the same semantic class of flaw that AI models are now demonstrably capable of finding. Check whether your administration interfaces are restricted to internal networks or a VPN (virtual private network) rather than exposed directly to the internet. If they are not, an attacker with stolen credentials may no longer need to defeat your 2FA to get in.
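One way to picture the "internal networks or VPN only" check is as a source-address allowlist applied in front of the admin login page. The sketch below is illustrative only: the network ranges and the function name `is_allowed_admin_source` are assumptions, not anything taken from the GTIG report, and in practice this control usually lives in a firewall or reverse proxy rather than in application code.

```python
import ipaddress

# Hypothetical allowlist: replace with your own internal and VPN ranges.
ALLOWED_ADMIN_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("100.64.0.0/10"),  # example range for a VPN overlay
]


def is_allowed_admin_source(client_ip: str) -> bool:
    """Return True only if client_ip parses and sits inside an allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # fail closed on anything that is not a valid address
    return any(addr in net for net in ALLOWED_ADMIN_NETWORKS)
```

A request from `10.1.2.3` would pass this check, while one from a public address such as `203.0.113.7` would be refused before the login form, and any flaw behind it, is ever reached.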

Narrative · 6 min read

The Context

Web-based administration tools are the control panels of the internet. They let system administrators manage servers, databases, files, and configurations through a browser rather than a command line. The most widely deployed open-source versions run on millions of servers worldwide, many of them internet-facing, because administrators need to reach them remotely. Google's GTIG did not name the specific tool involved, but described it as "popular" with a "large installed base," the kind of footprint that makes it an attractive target for a mass exploitation campaign.

Two-factor authentication is the lock on the front door of these tools. Even if an attacker steals a username and password, 2FA requires a second proof of identity, typically a code from a phone app or a hardware token. Bypassing 2FA without breaking the underlying cryptography is considered extremely difficult. This attack did not break the cryptography. It found a door the developer had left unlocked by accident.
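For readers who want to see what "a code from a phone app" usually means, here is a minimal sketch of the standard time-based one-time password scheme (TOTP, RFC 6238, built on HOTP from RFC 4226). The report does not say which 2FA scheme the affected tool used, so TOTP is an assumption made purely for illustration. The point is that nothing in this math was broken; the flaw sat in the code that decides whether to run the check at all.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    return hotp(secret, unix_time // step, digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Compare the submitted code with the current one in constant time."""
    return hmac.compare_digest(totp(secret, unix_time), submitted)
```

With the RFC 6238 test secret `b"12345678901234567890"` and a timestamp of 59 seconds, `totp(..., digits=8)` yields `94287082`, matching the published test vector. A production verifier would also accept a small window of adjacent time steps and rate-limit attempts, which this sketch omits.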

Key terms

Zero-day: A vulnerability that is unknown to the software vendor and has no patch available. 'Zero days' refers to the number of days the vendor has had to fix it.
2FA bypass: A technique that allows an attacker to skip the second verification step in a two-factor authentication system, gaining access with only a username and password.
Hardcoded trust assumption: A condition written directly into code that tells the system to treat certain requests as already verified. If an attacker can trigger that condition, the system skips its normal checks.
Semantic logic flaw: A vulnerability where the code is syntactically correct and contains no dangerous function calls, but the developer's underlying assumption about how the code would be used was wrong. Traditional scanners cannot detect these.
LLM (large language model): An AI system trained on large amounts of text and code that can read, summarize, and generate human-like language and programming code. Examples include GPT-4, Gemini, and Claude.

The Attack, Phase by Phase

Phase 1: Semantic Vulnerability Discovery

The criminal group fed the target tool's source code into an LLM and asked it to find logical attack surfaces, not the memory crashes and injection points that conventional scanners look for. The model read the code the way a developer would: comparing what the code was intended to do against what it actually did.

It found a hardcoded trust exception in the authentication flow. Under a specific condition, if a particular object or state was present in the request, the system would treat the request as already authenticated and skip the 2FA verification step entirely. The code was syntactically valid. No conventional scanner would have flagged it, because nothing about the code looked broken. The flaw was in the assumption, not the syntax.
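The shape of such a flaw can be sketched in a few lines. GTIG did not disclose the real trigger, so everything named here (`LoginRequest`, `client_state`, the `trusted_device` key, `server_trusted`) is hypothetical and chosen only to show the pattern: a branch that lets something in the request stand in for the second factor.

```python
from dataclasses import dataclass, field


@dataclass
class LoginRequest:
    username: str
    password_ok: bool       # outcome of the password check
    totp_ok: bool = False   # outcome of the second-factor check
    # Hypothetical client-supplied state; the real trigger was not disclosed.
    client_state: dict = field(default_factory=dict)


def login_flawed(req: LoginRequest) -> bool:
    """Illustrative flaw: state taken from the request short-circuits 2FA."""
    if not req.password_ok:
        return False
    if req.client_state.get("trusted_device"):  # hardcoded trust exception
        return True                              # second factor never checked
    return req.totp_ok


def login_fixed(req: LoginRequest, server_trusted: bool = False) -> bool:
    """Trust comes only from server-side state, never from the request itself."""
    if not req.password_ok:
        return False
    return server_trusted or req.totp_ok
```

Both functions are valid Python and contain nothing a linter would object to. The only difference is who gets to assert that the session is already trusted.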

The bypass required valid credentials to trigger. In practice, credentials are the easiest part: they are sold in bulk on criminal markets following any of the thousands of data breaches that occur each year.

Phase 2: AI-Assisted Weaponization

The same AI model then produced a working Python exploit script designed to trigger the trust exception. The script was functional. It was also unmistakably machine-made.

GTIG identified several forensic fingerprints pointing to LLM authorship. Every function carried detailed educational docstrings, far beyond what any human attacker would write. The script included a hallucinated CVSS score referencing a CVE identifier that does not exist in any public database. The code structure was textbook-clean, and the script included a custom ANSI color utility class for formatting terminal output—a flourish no human attacker would bother building from scratch.

These artifacts were not incidental. GTIG described them as the model effectively "signing" its own work. The stylistic fingerprints were inconsistent with any known human threat actor's output and became the primary evidence for GTIG's high-confidence assessment that AI was involved in both discovery and weaponization.

Phase 3: Mass Campaign Planning and Disruption

Multiple prominent criminal actors had partnered to run a large-scale exploitation campaign. The plan was to deploy the Python script against every internet-exposed instance of the vulnerable tool running the unpatched version.

GTIG detected the exploit script during proactive threat research before the campaign launched. Researchers identified the LLM-generation fingerprints, assessed AI involvement with high confidence, and worked with the unnamed vendor to coordinate responsible disclosure. A patch was shipped quietly. The campaign was disrupted before mass exploitation began.

GTIG also noted that implementation mistakes in the exploit code, likely introduced by the AI itself, may have independently complicated the attackers' plans. Google was careful to say its counter-discovery "may have prevented" the campaign, not that it definitively did.

What Made This Possible

Scanners are optimized for the wrong problem. Conventional static analysis tools are built to find syntactic flaws: buffer overflows, injection points, dangerous function calls. A semantic logic flaw—where the code does exactly what the developer wrote but the developer's assumption was wrong—produces no signal. The entire category was effectively invisible to the industry's standard detection stack.
LLMs read code the way developers do. An LLM does not look for crashes. It reads code semantically, inferring intent and comparing it against implementation. GTIG noted that "frontier LLMs excel at identifying high-level flaws and hardcoded static anomalies." The offensive and defensive applications are identical.
Credential markets made the bypass practical. The 2FA bypass required valid credentials to trigger. Stolen credentials for any popular tool are available for purchase following routine data breaches. The bypass did not need to defeat authentication from scratch—only to finish the job after credentials were already in hand.
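The first point above can be made concrete with a toy pattern scanner. This is a deliberately simplified sketch, not a model of any real product: commercial static analyzers do far more, including data-flow and taint tracking. The rule set and both code samples are invented for illustration.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec", "system", "popen"}  # toy rule set


def dangerous_call_findings(source: str) -> list[str]:
    """Flag calls whose function name appears in the toy rule set."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            name = getattr(func, "id", None) or getattr(func, "attr", None)
            if name in DANGEROUS_CALLS:
                findings.append(f"line {node.lineno}: call to {name}()")
    return findings


# Hypothetical logic flaw: valid syntax, no dangerous call anywhere.
LOGIC_FLAW = '''
def login(password_ok, totp_ok, client_state):
    if not password_ok:
        return False
    if client_state.get("trusted_device"):
        return True
    return totp_ok
'''

# Classic injection sink, the kind of thing pattern rules are written for.
INJECTION_FLAW = '''
def run(user_input):
    return eval(user_input)
'''
```

Running `dangerous_call_findings` over `INJECTION_FLAW` reports the `eval` call, while `LOGIC_FLAW` comes back with an empty list. Nothing in the second sample matches a rule, because the problem is what the branch means, not what it calls.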

The structural reality is that the same AI capabilities available to defenders are available to attackers, simultaneously, with no access barrier separating the two.

What Should Have Stopped This

None of the defenses below depends on the vulnerable tool's own authentication logic being correct. Every effective control operates outside the compromised component.

Network isolation for admin interfaces. If the administration tool is not reachable from the public internet, an attacker with stolen credentials cannot attempt the bypass remotely. Restricting admin interfaces to internal networks or a VPN removes the attack surface entirely, regardless of what flaws exist in the authentication code.
Privileged access workstations. Requiring administrators to connect from a dedicated, hardened device adds a layer that does not rely on the tool's own security. Even if 2FA is bypassed, the attacker still needs to be on the right device.
Credential hygiene and monitoring. Since the bypass required valid credentials, detecting credential compromise quickly narrows the window of exposure. Monitoring for unusual login patterns can surface an attack in progress.
Patch velocity. Organizations that apply patches quickly—particularly for internet-facing administration tools—close the window between disclosure and exploitation.
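As a rough illustration of "monitoring for unusual login patterns," the sketch below flags two simple signals in a stream of login events: a user appearing from a source IP not seen before, and a single IP logging in to many different accounts. The event format, the threshold, and the function name are assumptions for the example; real deployments would draw on richer signals such as geography, device, and time of day.

```python
from collections import defaultdict


def flag_unusual_logins(events, max_users_per_ip=3):
    """events: iterable of (username, source_ip) pairs in chronological order.

    Returns a list of (username, source_ip, reason) tuples.
    """
    seen_ips = defaultdict(set)     # username -> source IPs seen so far
    users_by_ip = defaultdict(set)  # source IP -> usernames seen so far
    flagged_ips = set()             # IPs already reported, to alert only once
    alerts = []
    for username, source_ip in events:
        known = seen_ips[username]
        if known and source_ip not in known:
            alerts.append((username, source_ip, "new source IP for user"))
        known.add(source_ip)
        users_by_ip[source_ip].add(username)
        too_many = len(users_by_ip[source_ip]) > max_users_per_ip
        if too_many and source_ip not in flagged_ips:
            flagged_ips.add(source_ip)
            alerts.append((username, source_ip, "one IP, many accounts"))
    return alerts
```

The second signal matters for a campaign like the one described here, where a single script is pointed at many accounts in bulk: each individual login may look valid while the aggregate pattern does not.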

The Takeaway

This incident is the same class of failure as the Stryker Intune wipe: a tool built to manage systems was turned against the organization it was meant to protect. Where the Stryker attack required a human attacker to abuse a management platform's legitimate functions, this attack required an AI to read source code and reason about where a developer's assumption was wrong. The attacker did not need to understand the tool. They needed to ask the right question.

The February 2026 GTIG tracker described AI-enabled offensive activity as "nascent." Three months later, GTIG was documenting the first confirmed in-the-wild case of AI-discovered and AI-weaponized zero-day exploitation. The transition from "productivity aid" to "autonomous vulnerability researcher" happened faster than the industry's own analysts predicted.

GTIG's John Hultquist put it plainly: "For every zero-day we can trace back to AI, there are probably many more out there."

Pattern to remember: When the same reasoning capability that finds bugs in code review is available to attackers with no access barrier, the speed of discovery-to-weaponization compresses to whatever the AI can produce in a single session.

What changed: Vulnerability discovery no longer requires a human who understands the code. It requires a prompt.

Technical Deep Dive · 3 min

The Technical Mechanism

The vulnerability was a semantic logic flaw in the authentication flow of an unnamed popular open-source web-based administration tool. The flaw took the form of a hardcoded trust condition: a branch in the authentication logic that, when a specific object or state was present in the incoming request, caused the system to treat the request as already authenticated and skip the 2FA verification step.

No CWE (Common Weakness Enumeration) entry maps cleanly to this class of flaw. The closest applicable classifications are CWE-287 (Improper Authentication) and CWE-290 (Authentication Bypass by Spoofing), but neither fully captures the semantic nature of the flaw. The code contained no dangerous function calls, no memory safety issues, and no injection sinks. It was syntactically valid and would pass all standard linting and static analysis checks. The flaw existed entirely in the developer's assumption about when a trusted state could legitimately be present.
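Because the code passes syntactic checks, one way to surface a flaw like this is to test the behavior against an explicit invariant: a session is never granted unless both the password and the second factor were verified. The sketch below brute-forces combinations of request flags against a login function. The handler, the flag names, and the helper `violations` are hypothetical, and a real authentication flow has far more state than a few booleans.

```python
from itertools import product


def violations(login, state_flags):
    """Try every flag combination; return the cases where a session is
    granted without both the password and the second factor verified."""
    bad = []
    for password_ok, totp_ok in product([True, False], repeat=2):
        for values in product([True, False], repeat=len(state_flags)):
            state = dict(zip(state_flags, values))
            granted = login(password_ok, totp_ok, state)
            if granted and not (password_ok and totp_ok):
                bad.append((password_ok, totp_ok, state))
    return bad


# Hypothetical handler under test, containing an illustrative trust shortcut.
def login_under_test(password_ok, totp_ok, state):
    if not password_ok:
        return False
    if state.get("trusted_device"):
        return True
    return totp_ok
```

Calling `violations(login_under_test, ["trusted_device", "remember_me"])` returns the two combinations in which the password is correct, the second factor is not, and `trusted_device` is set. The invariant has to be written down by someone who knows what the developer intended, which is exactly the knowledge a pattern-based scanner lacks.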

The exploit required the attacker to already possess valid user credentials. The attack path was: obtain credentials (via phishing, credential stuffing, or purchase from criminal markets), craft a request that triggers the hardcoded trust condition, and receive an authenticated session without completing the 2FA challenge.
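The last step of that path leaves a trace defenders can look for, provided the application logs each authentication stage: a session issued with no successful second-factor event before it. The sketch below assumes a hypothetical audit log with `type` and `login_id` fields and three event types. No log format was published for the affected tool, so treat the schema as an invented example.

```python
def sessions_missing_second_factor(events):
    """events: chronological list of dicts with 'type' and 'login_id' keys.

    Assumed (hypothetical) event types: 'password_ok', 'totp_ok' and
    'session_issued'. Returns the login_ids that received a session without
    an earlier 'totp_ok' event for the same login attempt.
    """
    totp_passed = set()
    suspicious = []
    for event in events:
        if event["type"] == "totp_ok":
            totp_passed.add(event["login_id"])
        elif event["type"] == "session_issued":
            if event["login_id"] not in totp_passed:
                suspicious.append(event["login_id"])
    return suspicious
```

This only helps if the 2FA event is logged by the code that performs the check rather than inferred from the session itself. Otherwise the bypass would skip the log entry along with the verification.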

The AI model produced a Python exploit script that automated this sequence. GTIG identified the script as LLM-generated through the following forensic artifacts: exhaustive educational docstrings on every function, a hallucinated CVSS score referencing a non-existent CVE identifier, textbook-clean Pythonic structure consistent with LLM training data, detailed --help menu output, and a custom ANSI color utility class. These artifacts are inconsistent with human attacker output across all known threat actor profiles in GTIG's tracking database.

GTIG assessed with high confidence that an AI model supported both the discovery phase (semantic code analysis to identify the trust flaw) and the weaponization phase (exploit script generation). The specific model used was not confirmed. Google ruled out both Gemini and Anthropic's Mythos. CNBC cited OpenClaw as the working hypothesis among some researchers, but this was not confirmed in the primary GTIG report.

CVE and Advisories

No CVE identifier has been publicly assigned to this vulnerability as of the disclosure date of May 11, 2026. Google and GTIG declined to name the affected tool, the CVE, or the vendor advisory. The vulnerability was patched via coordinated responsible disclosure between GTIG and the unnamed vendor. No public advisory URL is available.

The hallucinated CVSS score in the AI-generated exploit script does not correspond to any real CVE entry and should not be treated as a valid severity rating.

MITRE ATT&CK Mapping

Technique ID	ATT&CK name	How it appeared
T1078	Valid Accounts	The bypass required valid user credentials as a prerequisite. Attackers would obtain these through prior compromise before attempting the 2FA bypass.
T1556	Modify Authentication Process	The attack exploited a flaw in the authentication flow that caused the 2FA step to be skipped under specific conditions, effectively modifying the authentication outcome without altering the code at runtime.
T1190	Exploit Public-Facing Application	The target was a web-based administration tool with an internet-exposed login interface. The exploit was designed for mass deployment against all publicly reachable instances.
T1588.006	Obtain Capabilities: Vulnerabilities	The criminal group used an AI model to discover the zero-day vulnerability, representing a novel method of capability acquisition that bypasses traditional vulnerability research timelines.
T1587.004	Develop Capabilities: Exploits	The AI model produced a functional Python exploit script, completing the weaponization phase without human exploit development expertise.

Indicators of Compromise

No confirmed indicators of compromise (IOCs) were published by GTIG as of May 11, 2026. The campaign was disrupted before mass exploitation began, and no victim organizations were publicly identified.

Detection of the exploit script itself relied on stylistic forensic analysis rather than network or host-based IOCs:

Presence of exhaustive educational docstrings in exploit tooling inconsistent with known threat actor coding styles
CVSS scores in exploit documentation that do not correspond to any published CVE entry
Textbook-clean code structure with detailed help menus in offensive tooling
Custom ANSI color utility classes in scripts where no legitimate UI need exists

These artifacts are useful for retrospective analysis of discovered scripts but do not provide real-time detection capability. GTIG noted that AI is also being used to produce operational support tools that are "more difficult to detect by antivirus software and cybersecurity protections," suggesting that future AI-generated tooling may deliberately suppress these fingerprints.
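For retrospective triage, the stylistic indicators listed above could be approximated with simple heuristics over a recovered Python script. GTIG has not published its method, so the sketch below is an assumption about how such a check might look. The function name, the regular expressions, and the sample are invented, and the CVE identifier in the sample is a deliberate placeholder. Well-documented legitimate tools would trigger the same signals, so the output is a prompt for human review, not a verdict.

```python
import ast
import re

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")
# Matches escape sequences as written in source text, e.g. \033[92m or \x1b[0m
ANSI_PATTERN = re.compile(r"\\(?:033|x1b)\[[0-9;]*m", re.IGNORECASE)


def style_indicators(source: str) -> dict:
    """Collect rough stylistic signals from Python source text."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    funcs = [n for n in nodes
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    documented = sum(1 for f in funcs if ast.get_docstring(f))
    return {
        "docstring_coverage": documented / len(funcs) if funcs else 0.0,
        # IDs found here still need checking against a public CVE database.
        "cve_ids": sorted(set(CVE_PATTERN.findall(source))),
        "uses_argparse": any(
            isinstance(n, ast.Import)
            and any(alias.name == "argparse" for alias in n.names)
            for n in nodes
        ),
        "ansi_escape_literals": bool(ANSI_PATTERN.search(source)),
    }


# Harmless invented sample; CVE-1900-00000 is a placeholder, not a real ID.
SAMPLE = '''
import argparse

class Colors:
    GREEN = "\\033[92m"
    RESET = "\\033[0m"

def check(target):
    """Explain each step in detail. Refs CVE-1900-00000, CVSS 9.8."""
    return target

def main():
    """Parse command-line arguments and run the check."""
    argparse.ArgumentParser(description="demo").parse_args([])
'''
```

On `SAMPLE`, `style_indicators` reports full docstring coverage, one CVE-style identifier, `argparse` usage, and ANSI escape literals. Confirming that a referenced CVE does not exist still requires a lookup against a public database, which this sketch leaves out.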

Attribution

Google attributed the campaign to "prominent cybercrime threat actors" who partnered to plan the mass exploitation operation. No specific group name, alias, or nation-state nexus was disclosed. Google confirmed the attack did not involve Gemini or Anthropic's Mythos. The specific AI model used remains unconfirmed. CNBC cited OpenClaw as a working hypothesis among some researchers. The threat actor's identity and the full scope of the criminal partnership remain undisclosed as of the publication date.

Primary Sources

01. Adversaries Leverage AI for Vulnerability Exploitation, Augmented Operations, and Initial Access. Google Cloud Blog / Google Threat Intelligence Group (GTIG) · May 11, 2026
02. Google Threat Intelligence Group reports on AI threat trends. Google Blog (official) · May 11, 2026
03. Google spotted an AI-developed zero-day before attackers could use it. CyberScoop · May 11, 2026
04. Google says criminals used AI-built zero-day in planned mass hack spree. The Register · May 11, 2026
05. Hackers Used AI to Develop First Known Zero-Day 2FA Bypass for Mass Exploitation. The Hacker News · May 11, 2026
06. Google says it likely thwarted effort by hacker group to use AI for 'mass exploitation event'. CNBC · May 11, 2026
07. Hackers Observed Using AI to Develop Zero-Day for the First Time. Infosecurity Magazine · May 11, 2026