CyberBytes Daily

Trending cyberattacks, explained simply.

How an AI model found a hidden 2FA bypass and wrote the exploit before any scanner could see it

Every vulnerability scanner your security team runs is optimized to find the same class of problem: code that crashes, overflows, or calls a dangerous function. In early 2026, a criminal group skipped that entire category. They fed a popular web administration tool's source code into an AI model and asked it a different question: where does the developer's assumption not match what the code actually does? The model found a flaw that no scanner would have flagged, because the code was syntactically perfect. It just trusted the wrong thing.

The flaw was a hardcoded trust exception buried in the authentication flow. If a specific object or state was present in a request, the system treated it as already verified and skipped two-factor authentication (2FA) entirely. The code did exactly what the developer wrote. The developer's assumption was simply wrong. An attacker with a valid username and password, the kind obtained in any ordinary credential theft, could walk straight past the second factor.

What makes this the first confirmed case of its kind is not just that AI found the flaw. It is that the same AI model then wrote a working Python exploit script, complete with help menus, color-coded terminal output, and a hallucinated vulnerability score for a CVE that does not exist. The script was so thoroughly annotated and so textbook-clean that Google's threat researchers recognized it as machine-made before they even ran it. The AI had, in effect, signed its own work. Google's Threat Intelligence Group (GTIG) detected the script during proactive research, coordinated a quiet patch with the unnamed vendor, and disrupted the campaign before it launched. GTIG chief analyst John Hultquist told CyberScoop: "We finally uncovered some evidence this is happening. This is probably the tip of the iceberg and it's certainly not going to be the last."

Narrative · 6 min read

The Context

Web-based administration tools are the control panels of the internet. They let system administrators manage servers, databases, files, and configurations through a browser rather than a command line. The most widely deployed open-source versions run on millions of servers worldwide, many of them internet-facing, because administrators need to reach them remotely. Google's GTIG did not name the specific tool involved, but described it as "popular" with a "large installed base," the kind of footprint that makes it an attractive target for a mass exploitation campaign.

Two-factor authentication is the lock on the front door of these tools. Even if an attacker steals a username and password, 2FA requires a second proof of identity, typically a code from a phone app or a hardware token. Bypassing 2FA without breaking the underlying cryptography is considered extremely difficult. This attack did not break the cryptography. It found a door the developer had left unlocked by accident.
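To make the "second proof" concrete, here is a minimal sketch of how an authenticator-app code is typically computed and checked, following the public HOTP/TOTP standards (RFC 4226 and RFC 6238). Nothing here comes from the affected tool, which GTIG did not name; the function names are illustrative assumptions.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    return hotp(secret, unix_time // step, digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, unix_time), submitted)
```

The math is sound as long as `verify_totp` is actually called. A logic flaw of the kind described in this article leaves the cryptography untouched and simply routes the request around the check.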

The Attack, Phase by Phase

Phase 1: Semantic Vulnerability Discovery

The criminal group fed the target tool's source code into an LLM and asked it to find logical attack surfaces, not the memory crashes and injection points that conventional scanners look for. The model read the code the way a developer would: comparing what the code was intended to do against what it actually did.

It found a hardcoded trust exception in the authentication flow. Under a specific condition, if a particular object or state was present in the request, the system would treat the request as already authenticated and skip the 2FA verification step entirely. The code was syntactically valid. No conventional scanner would have flagged it, because nothing about the code looked broken. The flaw was in the assumption, not the syntax.

The bypass required valid credentials to trigger. In practice, credentials are the easiest part: they are sold in bulk on criminal markets following any of the thousands of data breaches that occur each year.

Figure: AI-assisted code analysis

  1. Source code acquired: public repo or docs fed to LLM
  2. Semantic analysis: LLM compares intent vs. implementation
  3. Trust flaw identified: hardcoded condition skips 2FA check

Conventional scanners saw nothing. The code was syntactically valid.

Phase 2: AI-Assisted Weaponization

The same AI model then produced a working Python exploit script designed to trigger the trust exception. The script was functional. It was also unmistakably machine-made.

GTIG identified several forensic fingerprints pointing to LLM authorship. Every function carried detailed educational docstrings, far more documentation than attackers typically write. The script included a hallucinated CVSS score referencing a CVE identifier that does not exist in any public database. The code structure was textbook-clean, and the script included a custom ANSI color utility class for formatting terminal output, a flourish human attackers rarely bother to build from scratch.

These artifacts were not incidental. GTIG described them as the model effectively "signing" its own work. The stylistic fingerprints were inconsistent with any known human threat actor's output and became the primary evidence for GTIG's high-confidence assessment that AI was involved in both discovery and weaponization.

Figure: AI-generated exploit production

  1. Exploit script generated: Python script targets the trust flaw
  2. LLM fingerprints embedded: docstrings, hallucinated CVSS, color class
  3. Script validated functional: bypass confirmed on vulnerable version

The AI signed its own work. The fingerprints that gave it away were also what made it detectable.

Phase 3: Mass Campaign Planning and Disruption

Multiple prominent criminal actors had partnered to run a large-scale exploitation campaign. The plan was to deploy the Python script against every internet-exposed instance of the vulnerable tool running the unpatched version.

GTIG detected the exploit script during proactive threat research before the campaign launched. Researchers identified the LLM-generation fingerprints, assessed AI involvement with high confidence, and worked with the unnamed vendor to coordinate responsible disclosure. A patch was shipped quietly. The campaign was disrupted before mass exploitation began.

GTIG also noted that implementation mistakes in the exploit code, likely introduced by the AI itself, may have independently complicated the attackers' plans. Google was careful to say its counter-discovery "may have prevented" the campaign, not that it definitively did.

Figure: Campaign planning and disruption

  1. Criminal actors partner: multiple groups plan mass exploitation
  2. Mass deployment planned: script targeted at all exposed instances
  3. GTIG detects exploit script: proactive research finds LLM fingerprints
  4. Patch coordinated and shipped: vendor notified, quiet fix deployed

The campaign was disrupted before a single confirmed victim. That may not happen next time.

What Made This Possible

  1. Scanners are optimized for the wrong problem. Conventional static analysis tools are built to find flaws that leave a recognizable signature in the code: buffer overflows, injection points, dangerous function calls. A semantic logic flaw, where the code does exactly what the developer wrote but the developer's assumption was wrong, produces no such signal. The entire category was effectively invisible to the industry's standard detection stack.

  2. LLMs read code the way developers do. An LLM does not look for crashes. It reads code semantically, inferring intent and comparing it against implementation. GTIG noted that "frontier LLMs excel at identifying high-level flaws and hardcoded static anomalies." The offensive and defensive applications are identical.

  3. Credential markets made the bypass practical. The 2FA bypass required valid credentials to trigger. Stolen credentials for any popular tool are available for purchase following routine data breaches. The bypass did not need to defeat authentication from scratch—only to finish the job after credentials were already in hand.
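The first point can be illustrated with a toy pattern-based scanner. This is a minimal sketch, not any real product's rule engine: the denylist, the two code snippets, and the `trusted_state` field are all hypothetical, chosen only to show why a signature-driven check stays silent on a logic flaw.

```python
import ast

# Hypothetical denylist; real scanners ship far larger rule sets.
DANGEROUS_CALLS = {"eval", "exec", "system", "popen"}


def flag_dangerous_calls(source: str) -> list[str]:
    """Return the names of denylisted functions called anywhere in source."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain calls like eval(x) are Name nodes; os.system(x) is an Attribute.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in DANGEROUS_CALLS:
                findings.append(name)
    return findings


INJECTION_SNIPPET = """
def run(user_input):
    return eval(user_input)
"""

# Hypothetical logic flaw: a request field is trusted as proof of prior 2FA.
LOGIC_FLAW_SNIPPET = """
def login(request, password_ok, code_ok):
    if request.get("trusted_state"):
        return password_ok
    return password_ok and code_ok
"""

injection_findings = flag_dangerous_calls(INJECTION_SNIPPET)  # flags eval
logic_findings = flag_dangerous_calls(LOGIC_FLAW_SNIPPET)     # flags nothing
```

The second snippet contains nothing a signature can match. Spotting the problem requires knowing that `trusted_state` is supplied by the client, which is a question about intent rather than syntax.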

The structural reality is that the same AI capabilities available to defenders are available to attackers, simultaneously, with no access barrier separating the two.

What Should Have Stopped This

None of the defenses below depends on the vulnerable tool's own authentication logic being correct. Every effective control operates outside the compromised component.

  • Network isolation for admin interfaces. If the administration tool is not reachable from the public internet, an attacker with stolen credentials cannot attempt the bypass remotely. Restricting admin interfaces to internal networks or a VPN removes the attack surface entirely, regardless of what flaws exist in the authentication code.
  • Privileged access workstations. Requiring administrators to connect from a dedicated, hardened device adds a layer that does not rely on the tool's own security. Even if 2FA is bypassed, the attacker still needs to be on the right device.
  • Credential hygiene and monitoring. Since the bypass required valid credentials, detecting credential compromise quickly narrows the window of exposure. Monitoring for unusual login patterns can surface an attack in progress.
  • Patch velocity. Organizations that apply patches quickly—particularly for internet-facing administration tools—close the window between disclosure and exploitation.
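As one example of a control that sits outside the tool, the network isolation rule can be sketched as a source-address allowlist. This is an illustration under stated assumptions: the CIDR ranges and the function name are hypothetical, and in practice the check belongs in a firewall, VPN gateway, or reverse proxy rather than in application code.

```python
import ipaddress

# Hypothetical internal and VPN ranges; substitute the organization's own.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.50.0/24"),
]


def admin_access_allowed(client_ip: str) -> bool:
    """True only if the client address falls inside an allowlisted network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed address: deny by default
    return any(address in network for network in ALLOWED_NETWORKS)
```

The address passed in should be the peer address of the connection itself. A client-supplied header such as `X-Forwarded-For` would recreate the same mistake this article describes: trusting a value the attacker controls.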

The Takeaway

This incident is the same class of failure as the Stryker Intune wipe: a tool built to manage systems was turned against the organization it was meant to protect. Where the Stryker attack required a human attacker to abuse a management platform's legitimate functions, this attack required an AI to read source code and reason about where a developer's assumption was wrong. The attacker did not need to understand the tool. They needed to ask the right question.

The February 2026 GTIG tracker described AI-enabled offensive activity as "nascent." Three months later, GTIG was documenting the first confirmed in-the-wild case of AI-discovered and AI-weaponized zero-day exploitation. The transition from "productivity aid" to "autonomous vulnerability researcher" happened faster than the industry's own analysts predicted.

GTIG's John Hultquist put it plainly: "For every zero-day we can trace back to AI, there are probably many more out there."

Pattern to remember: When the same reasoning capability that finds bugs in code review is available to attackers with no access barrier, the speed of discovery-to-weaponization compresses to whatever the AI can produce in a single session.

What changed: Vulnerability discovery no longer requires a human who understands the code. It requires a prompt.

Technical Deep Dive · 3 min

The Technical Mechanism

The vulnerability was a semantic logic flaw in the authentication flow of an unnamed popular open-source web-based administration tool. The flaw took the form of a hardcoded trust condition: a branch in the authentication logic that, when a specific object or state was present in the incoming request, caused the system to treat the request as already authenticated and skip the 2FA verification step.

This class of flaw has no standard CWE (Common Weakness Enumeration) classification that maps cleanly to it. The closest applicable classifications are CWE-287 (Improper Authentication) and CWE-290 (Authentication Bypass by Spoofing), but neither fully captures the semantic nature of the flaw. The code contained no dangerous function calls, no memory safety issues, and no injection sinks. It was syntactically valid and would pass all standard linting and static analysis checks. The flaw existed entirely in the developer's assumption about when a trusted state could legitimately be present.
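GTIG did not publish the vulnerable code, so the following is only a hypothetical sketch of the general pattern: a login routine that accepts a client-supplied marker as proof that 2FA already happened, next to a corrected version. Every name here (`Request`, `already_verified`, the toy credential stores) is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical login request; every field is attacker-controlled."""
    username: str
    password: str
    otp_code: str = ""
    params: dict = field(default_factory=dict)


# Toy credential and one-time-code stores, for illustration only.
PASSWORDS = {"admin": "correct-horse"}
CURRENT_CODES = {"admin": "123456"}


def password_ok(req: Request) -> bool:
    return PASSWORDS.get(req.username) == req.password


def second_factor_ok(req: Request) -> bool:
    return CURRENT_CODES.get(req.username) == req.otp_code


def login_flawed(req: Request) -> bool:
    """Grants a session; trusts a client-supplied marker as proof of prior 2FA."""
    if not password_ok(req):
        return False
    if req.params.get("already_verified"):  # hardcoded trust exception
        return True
    return second_factor_ok(req)


def login_fixed(req: Request) -> bool:
    """Grants a session only when both factors are verified server-side."""
    return password_ok(req) and second_factor_ok(req)


# Valid password, no one-time code, but the trusted marker is set.
bypass_attempt = Request("admin", "correct-horse", params={"already_verified": "1"})
```

Both functions lint cleanly and call nothing dangerous. The difference only appears when a security invariant is tested directly: `login_flawed(bypass_attempt)` grants a session without a second factor, while `login_fixed(bypass_attempt)` does not.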

The exploit required the attacker to already possess valid user credentials. The attack path had three steps: obtain credentials (via phishing, credential stuffing, or purchase from criminal markets), craft a request that triggers the hardcoded trust condition, and receive an authenticated session without completing the 2FA challenge.
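That sequence suggests a simple audit defenders can run: look for sessions that were granted with no successful 2FA event beforehand. The sketch below assumes a hypothetical, chronologically ordered event log with `session` and `event` fields; real log schemas will differ, and the check is only useful if the events are recorded somewhere the flawed code path cannot skip.

```python
from typing import Iterable


def sessions_missing_second_factor(events: Iterable[dict]) -> list[str]:
    """Return session IDs granted without a prior successful 2FA event.

    Assumes events arrive in chronological order and use a hypothetical
    schema: {"session": str, "event": str}.
    """
    passed_2fa = set()
    suspicious = []
    for entry in events:
        if entry["event"] == "2fa_passed":
            passed_2fa.add(entry["session"])
        elif entry["event"] == "session_granted" and entry["session"] not in passed_2fa:
            suspicious.append(entry["session"])
    return suspicious


sample_log = [
    {"session": "a1", "event": "password_ok"},
    {"session": "a1", "event": "2fa_passed"},
    {"session": "a1", "event": "session_granted"},
    {"session": "b2", "event": "password_ok"},
    {"session": "b2", "event": "session_granted"},  # no 2FA event in between
]
flagged = sessions_missing_second_factor(sample_log)
```

In the sample log, session `b2` goes straight from password check to session grant, so it is the only one flagged.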

The AI model produced a Python exploit script that automated this sequence. GTIG identified the script as LLM-generated through the following forensic artifacts: exhaustive educational docstrings on every function, a hallucinated CVSS score referencing a non-existent CVE identifier, textbook-clean Pythonic structure consistent with LLM training data, detailed --help menu output, and a custom ANSI color utility class. These artifacts are inconsistent with human attacker output across all known threat actor profiles in GTIG's tracking database.

GTIG assessed with high confidence that an AI model supported both the discovery phase (semantic code analysis to identify the trust flaw) and the weaponization phase (exploit script generation). The specific model used was not confirmed. Google ruled out both Gemini and Anthropic's Mythos. CNBC cited OpenClaw as the working hypothesis among some researchers, but this was not confirmed in the primary GTIG report.

Figure: Technical attack path

  1. Valid credentials obtained: phishing, stuffing, or criminal market
  2. Crafted request sent: triggers hardcoded trust condition
  3. 2FA check skipped: system treats request as pre-authed
  4. Admin session granted: full access to administration interface

The exploit required no cryptographic attack. It required triggering a condition the developer had already written.

CVE and Advisories

No CVE identifier has been publicly assigned to this vulnerability as of the disclosure date of May 11, 2026. Google and GTIG declined to identify the affected tool or any associated CVE or vendor advisory. The vulnerability was patched via coordinated responsible disclosure between GTIG and the unnamed vendor. No public advisory URL is available.

The hallucinated CVSS score in the AI-generated exploit script does not correspond to any real CVE entry and should not be treated as a valid severity rating.

MITRE ATT&CK Mapping

| Technique ID | ATT&CK name | How it appeared |
|---|---|---|
| T1078 | Valid Accounts | The bypass required valid user credentials as a prerequisite. Attackers would obtain these through prior compromise before attempting the 2FA bypass. |
| T1556 | Modify Authentication Process | The attack exploited a flaw in the authentication flow that caused the 2FA step to be skipped under specific conditions, effectively modifying the authentication outcome without altering the code at runtime. |
| T1190 | Exploit Public-Facing Application | The target was a web-based administration tool with an internet-exposed login interface. The exploit was designed for mass deployment against all publicly reachable instances. |
| T1588.006 | Obtain Capabilities: Vulnerabilities | The criminal group used an AI model to discover the zero-day vulnerability, representing a novel method of capability acquisition that bypasses traditional vulnerability research timelines. |
| T1587.004 | Develop Capabilities: Exploits | The AI model produced a functional Python exploit script, completing the weaponization phase without human exploit development expertise. |

Indicators of Compromise

No confirmed indicators of compromise (IOCs) were published by GTIG as of May 11, 2026. The campaign was disrupted before mass exploitation began, and no victim organizations were publicly identified.

Detection of the exploit script itself relied on stylistic forensic analysis rather than network or host-based IOCs:

  • Presence of exhaustive educational docstrings in exploit tooling inconsistent with known threat actor coding styles
  • CVSS scores in exploit documentation that do not correspond to any published CVE entry
  • Textbook-clean code structure with detailed help menus in offensive tooling
  • Custom ANSI color utility classes in scripts where no legitimate UI need exists

These artifacts are useful for retrospective analysis of discovered scripts but do not provide real-time detection capability. GTIG noted that AI is also being used to produce operational support tools that are "more difficult to detect by antivirus software and cybersecurity protections," suggesting that future AI-generated tooling may deliberately suppress these fingerprints.
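For retrospective triage of a recovered Python script, the stylistic signals listed above could be collected with something like the sketch below. It is a heuristic under stated assumptions, not GTIG's method: the function name, the signal names, and the `known_cves` lookup set are hypothetical, and the CVE identifier in the sample is a made-up placeholder.

```python
import ast
import re

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")
ANSI_PATTERN = re.compile(r"\\(?:033|x1b)\[")  # escape sequences as written in source


def style_fingerprints(source: str, known_cves: frozenset = frozenset()) -> dict:
    """Collect stylistic signals from Python source; heuristic triage only."""
    tree = ast.parse(source)
    functions = [n for n in ast.walk(tree)
                 if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    documented = [f for f in functions if ast.get_docstring(f)]
    classes = [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    return {
        "docstring_ratio": len(documented) / len(functions) if functions else 0.0,
        "unknown_cves": sorted(set(CVE_PATTERN.findall(source)) - known_cves),
        "ansi_color_classes": [c for c in classes if "color" in c.lower()],
        "has_ansi_escapes": bool(ANSI_PATTERN.search(source)),
    }


# Placeholder sample; CVE-2099-00000 is invented for this illustration.
SAMPLE = r'''
class Colors:
    RED = "\033[91m"

def banner():
    """Print the banner. References CVE-2099-00000 (CVSS 9.8)."""
    print(Colors.RED + "ready")
'''

report = style_fingerprints(SAMPLE)
```

None of these signals is proof on its own. Plenty of human-written tools have docstrings and colored output, so a report like this is best treated as a prompt for closer manual review.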

Attribution

Google attributed the campaign to "prominent cybercrime threat actors" who partnered to plan the mass exploitation operation. No specific group name, alias, or nation-state nexus was disclosed. Google confirmed the attack did not involve Gemini or Anthropic's Mythos. The specific AI model used remains unconfirmed. CNBC cited OpenClaw as a working hypothesis among some researchers. The threat actor's identity and the full scope of the criminal partnership remain undisclosed as of the publication date.


Primary Sources