CyberBytes Daily

Trending cyberattacks, explained simply.

How an AI agent turned a notebook vulnerability into a database breach in under one hour

The attacker did not write a script. Once they had a foothold inside a compromised server, they handed the keyboard to an AI agent and let it figure out the rest. The agent had never seen this network before. It had no map of the database, no list of credentials, no pre-written playbook. It reasoned its way through four pivots, from an exposed developer tool to an internal PostgreSQL database, in under sixty minutes.

On May 10, 2026, researchers at Sysdig observed what they describe as the first confirmed in-the-wild intrusion driven by a large language model (LLM) agent. The entry point was Marimo, a Python notebook platform used by data scientists and researchers. A flaw in Marimo's code gave any unauthenticated attacker a full command-line shell on the server. What happened next is what, in Sysdig's account, sets this incident apart from previously documented attacks: the attacker outsourced the entire post-exploitation operation to an AI.

The most alarming detail is not the speed, though the database was emptied in under two minutes. It is what the speed reveals about the new economics of attacking. Building a multi-pivot intrusion used to require an operator who understood cloud infrastructure, SSH key management, database schemas, and API rate-limiting well enough to write custom code for each target. The agent in this attack needed none of that preparation. It carried general knowledge about how cloud deployments are structured and composed the attack live. As Sysdig's Sr. Director Michael Clark put it: "We are not watching AI replace attackers. We are watching attackers replace their scripts with AI."

Narrative · 6 min read

The Context

Marimo is an open-source Python notebook platform used by data scientists, researchers, and developers to build and share interactive code workflows. It has roughly 20,000 GitHub stars and is common in AI development and academic research environments. Like most notebook platforms, Marimo is designed to run locally or inside a private network. When organizations expose it to the internet—a common shortcut in research and development settings—it becomes a direct entry point into whatever cloud environment it runs inside.

The vulnerability at the center of this attack, CVE-2026-39987, was rated 9.3 out of 10 on the severity scale. It affected all Marimo versions up to and including 0.20.4. The fix shipped in version 0.23.0.

The Attack, Phase by Phase

Phase 1: Unauthenticated Shell via a Skipped Security Check

Marimo exposes several WebSocket endpoints. Every endpoint except one calls validate_auth() before accepting a connection. The terminal endpoint, /terminal/ws, skips that check entirely. It only verifies that the server is running in the right mode and on a supported platform, then hands the connecting client a full interactive shell running with the privileges of the Marimo process—often root inside a container.

An attacker who can reach that endpoint needs to send exactly one request. No password, no token, no prior foothold.

The advisory was published on April 8, 2026. Nine hours and forty-one minutes later, Sysdig's honeypot systems recorded the first exploitation attempt. No public proof-of-concept code existed at the time. Between April 11 and 14, Sysdig recorded 662 distinct exploitation events from 11 unique source IP addresses across 10 countries.

Figure: Initial access via unauthenticated WebSocket
  1. Internet-exposed Marimo notebook: running version 0.20.4 or earlier
  2. Single WebSocket request: sent to /terminal/ws, no credentials
  3. Full PTY shell returned: attacker has root-level command access
No public exploit code existed. The attacker built the exploit from the advisory text in under 10 hours.

Phase 2: Cloud Credential Harvest and AWS Secrets Manager Pivot

With the shell established, the attacker handed control to an LLM agent. The agent enumerated the compromised host—reading environment files, checking standard configuration paths, and reasoning from general knowledge about cloud deployments. It extracted two sets of cloud credentials without any prior knowledge of this specific host.

The agent then called the AWS Secrets Manager API. To avoid triggering alerts based on unusual call volume from a single IP, it routed 12 API calls across 11 distinct Cloudflare Workers points of presence in a 22-second burst. Each call appeared to originate from a different IP address. The per-source-IP anomaly detection defenders rely on saw nothing unusual. The agent retrieved an SSH private key stored as a secret.
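To make the evasion concrete, the sketch below replays the numbers above (12 calls, 11 addresses, 22 seconds) against a naive per-source-IP counter. The threshold of 5 calls per IP, the placeholder addresses, and the function names are assumptions chosen for illustration; they are not AWS's or Sysdig's actual detection logic.

```python
from collections import Counter

# Hypothetical per-IP alert threshold; the report does not state real thresholds.
PER_IP_THRESHOLD = 5

def build_fanout_events(n_calls=12, n_ips=11, window_seconds=22):
    """Spread n_calls evenly over the window, cycling through placeholder IPs."""
    ips = [f"203.0.113.{i + 1}" for i in range(n_ips)]  # RFC 5737 documentation range
    step = window_seconds / (n_calls - 1)
    return [(round(i * step, 1), ips[i % n_ips]) for i in range(n_calls)]

def per_ip_alerts(events, threshold=PER_IP_THRESHOLD):
    """Return the IPs whose individual call count reaches the threshold."""
    counts = Counter(ip for _, ip in events)
    return sorted(ip for ip, n in counts.items() if n >= threshold)

events = build_fanout_events()
counts = Counter(ip for _, ip in events)
print("total calls:", len(events))
print("distinct IPs:", len(counts))
print("max calls from one IP:", max(counts.values()))
print("per-IP alerts:", per_ip_alerts(events))
```

With 12 calls spread over 11 addresses, no address makes more than two calls, so any per-IP threshold above two stays silent even though the aggregate burst is unusual.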

Figure: Credential harvest and secrets retrieval
  1. Environment enumeration: agent reads config files and env variables
  2. Two credential sets extracted: reasoned from cloud deployment patterns
  3. AWS Secrets Manager called: 12 API calls in 22 seconds
  4. Calls fanned across 11 IPs: via Cloudflare Workers edge nodes
  5. SSH private key retrieved: stored as a secret in Secrets Manager
The distributed API fan-out broke source-IP correlation. No single address generated enough calls to trigger an alert.

Phase 3: SSH Bastion Navigation and Database Exfiltration

Armed with the SSH private key, the agent opened eight parallel SSH sessions against a downstream bastion server from six separate IP addresses simultaneously, again defeating IP-based detection. From the bastion, the agent had no map of the internal network and no prior knowledge of the database it was about to hit.

It enumerated the schema of an internal PostgreSQL database in real time, identified a table containing credentials, and exfiltrated the full contents. The entire bastion phase—from first SSH connection to completed database dump—took under two minutes. The full chain from initial Marimo compromise to exfiltration completed in under one hour.

Figure: Bastion pivot and database exfiltration
  1. 8 parallel SSH sessions opened: from 6 separate IP addresses
  2. SSH bastion reached: gateway into the internal network
  3. PostgreSQL schema enumerated: no prior knowledge of schema or hostname
  4. Full database exfiltrated: credential table identified and dumped
Outcome: database emptied in under 2 minutes, bastion to dump.
Total elapsed time from initial Marimo exploit to completed exfiltration: under one hour.

Phase 4: How Sysdig Knew It Was an AI Agent

Sysdig identified four properties in the recorded command stream that distinguish agent-driven execution from scripted automation or a human operator.

First, the agent improvised schema enumeration with zero prior knowledge of the target database—a script would need that hardcoded. Second, a Chinese-language planning comment meaning "see what else we can do" leaked into the command stream, revealing the agent reasoning in real time. Third, every command was formatted for machine consumption: structured delimiters, bounded output caps, stderr discarded. Fourth, the agent fed its own prior output into each subsequent action without human direction—what Sysdig calls adaptive self-chaining. All four signatures appeared within a 113-second window.

Figure: Behavioral signatures of agent-driven execution
  1. Improvised schema enumeration: no target-specific prior knowledge used
  2. Planning comment leaked: Chinese-language reasoning in command stream
  3. Machine-optimized formatting: delimiters, output caps, stderr discarded
  4. Adaptive self-chaining: prior output fed into each next action
All four signatures appeared within a single 113-second window. No script and no human explains all four together.

What Made This Possible

  1. The developer toolchain is treated as lower-security than production. Organizations that would never expose a production database to the internet routinely expose notebook servers for researcher convenience. Those notebooks run with the same cloud credentials, SSH keys, and internal network access as the production systems they feed.

  2. A single missing function call opened a root shell. The rest of Marimo's WebSocket surface called validate_auth() correctly. The terminal endpoint did not. One omission in one file was sufficient for unauthenticated remote code execution rated 9.3 out of 10.

  3. Distributed egress defeated the detection layer designed to catch it. The anomaly detection in play here was built around per-source-IP thresholds. Routing 12 API calls across 11 addresses means no address made more than two calls. The detection layer saw normal traffic from many sources, not an attack from one.

What Should Have Stopped This

Every defense that would have reduced the blast radius shares one trait: it does not depend on the compromised tool's own integrity. Once Marimo was owned, any control that lived inside Marimo was also owned.

  • Network isolation for developer tools. Marimo should not have been reachable from the internet. Placing development infrastructure behind a VPN would have ended the attack before it began.
  • Credential scoping. The cloud credentials inside the Marimo environment should have had minimum permissions for notebook workflows—not the ability to call AWS Secrets Manager and retrieve SSH private keys.
  • Behavioral detection on API fan-out. Detection that looks at aggregate patterns—many calls to the same API endpoint from many sources in a short window—would have flagged the Secrets Manager access regardless of how many IPs the agent used.
  • Patch velocity. The fix was available in version 0.23.0. The attack happened on May 10, more than a month after the April 8 disclosure. Organizations running unpatched Marimo after the CISA Known Exploited Vulnerabilities listing on April 23 had documented warning and did not act in time.
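The credential-scoping point can be checked mechanically. The sketch below is a simplified audit, not a full IAM evaluator: it only looks at Allow statements and action wildcards, and ignores Deny statements, NotAction, conditions, and resource scoping. The two example policies and the bucket name are hypothetical.

```python
from fnmatch import fnmatchcase

def _as_list(value):
    """IAM JSON allows a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]

def allows_action(policy, action):
    """Return True if any Allow statement's Action pattern matches the action.

    Simplified on purpose: no Deny, NotAction, Condition, or Resource handling.
    """
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for pattern in _as_list(stmt.get("Action", [])):
            # IAM action names are case-insensitive and support * and ? wildcards.
            if fnmatchcase(action.lower(), pattern.lower()):
                return True
    return False

# Hypothetical over-broad policy attached to a notebook environment.
over_broad = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "secretsmanager:*", "Resource": "*"}],
}

# Hypothetical scoped policy: the notebook can only read its own data bucket.
scoped = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-notebook-data/*",
    }],
}

print(allows_action(over_broad, "secretsmanager:GetSecretValue"))
print(allows_action(scoped, "secretsmanager:GetSecretValue"))
```

A check like this is only a first pass; an authoritative answer needs the provider's own policy evaluation tooling.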

The Takeaway

This attack is the same class of failure as the Stryker Intune wipe: a privileged management tool weaponized against the organization it was built to serve. The shared failure is treating internal tooling as a lower-security tier while giving it the same credential access as production systems.

What is new is the collapse of the attacker cost curve. Scripted attacks require an operator who understands the specific target well enough to write custom code. An LLM agent carries general priors about entire classes of infrastructure and composes the attack chain live against whatever it finds. The multi-pivot intrusion that used to require hours of preparation now requires an inference budget and a foothold.

Defenders built anomaly detection around the assumption that sophisticated attacks are slow. An agent that can enumerate an unknown database schema and exfiltrate its contents in under two minutes breaks that assumption. Detection needs to shift from "did this command look suspicious" to "did this sequence of outcomes—credential access, API fan-out, lateral movement—happen in an impossible timeframe."
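One way to read "sequence of outcomes in an impossible timeframe" is as an ordered-stage check over already-classified events. The sketch below assumes some upstream system has labeled events with the three stage names used here; the labels, the one-hour default, and the sample timestamps are all illustrative assumptions, not values from Sysdig's tooling.

```python
# Hypothetical stage labels, in the order the paragraph above lists them.
STAGES = ("credential_access", "api_fanout", "lateral_movement")

def compressed_chain(events, max_seconds=3600, stages=STAGES):
    """Return (start, end) of the first in-order pass through all stages that
    completes within max_seconds of its first stage, or None.

    events: list of (timestamp_seconds, stage_label) sorted by timestamp.
    """
    for i, (start, first) in enumerate(events):
        if first != stages[0]:
            continue
        matched, end = 1, start
        for ts, stage in events[i + 1:]:
            if matched == len(stages) or ts - start > max_seconds:
                break
            if stage == stages[matched]:
                matched, end = matched + 1, ts
        if matched == len(stages):
            return (start, end)
    return None

# Illustrative timelines only: one compressed, one spread over days.
fast = [(0, "credential_access"), (40, "api_fanout"),
        (62, "api_fanout"), (900, "lateral_movement")]
slow = [(0, "credential_access"), (86_400, "api_fanout"),
        (172_800, "lateral_movement")]

print(compressed_chain(fast))
print(compressed_chain(slow))
```

The design choice is to alert on the elapsed time of the whole chain, not on any single command, which is the shift the paragraph argues for.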

Pattern to remember: Developer tools that hold production credentials are production attack surface, regardless of how they are classified internally.

What changed: Defenders can no longer treat attack sophistication as a proxy for attacker preparation time: an AI agent can improvise the full post-exploitation chain from general infrastructure knowledge, collapsing hours of manual reconnaissance into a single inference session.

Technical Deep Dive · 3 min

The Technical Mechanism

CVE-2026-39987 is a pre-authentication remote code execution vulnerability in Marimo's /terminal/ws WebSocket endpoint. The flaw is a missing authentication check: all other Marimo WebSocket endpoints invoke validate_auth() before accepting a connection, but the terminal endpoint performs only a mode check and a platform support check before spawning a full pseudo-terminal (PTY) session. The PTY runs with the privileges of the Marimo process, which in containerized deployments is typically root.
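The pattern is easier to see in miniature. The sketch below is not Marimo's code: only the name validate_auth() comes from the description above, while the handler names, the token check, and the "edit" mode value are hypothetical stand-ins that mirror the described control flow.

```python
class AuthError(Exception):
    """Raised when a request fails authentication."""

def validate_auth(request):
    """Stand-in auth check: reject requests without the expected token."""
    if request.get("token") != "expected-token":
        raise AuthError("unauthenticated")

def kernel_ws(request):
    """A guarded endpoint: authentication runs before anything else."""
    validate_auth(request)
    return "kernel session"

def terminal_ws_vulnerable(request, mode="edit", platform_supported=True):
    """Mirrors the flaw: only environment checks, validate_auth() never called."""
    if mode != "edit" or not platform_supported:
        return "rejected"
    return "pty shell"

def terminal_ws_fixed(request, mode="edit", platform_supported=True):
    """Same handler with the missing call restored."""
    validate_auth(request)
    if mode != "edit" or not platform_supported:
        return "rejected"
    return "pty shell"

anonymous = {}  # no token at all
print(terminal_ws_vulnerable(anonymous))
try:
    terminal_ws_fixed(anonymous)
except AuthError as exc:
    print("fixed handler refused:", exc)
```

The environment checks in the vulnerable handler pass for any caller, authenticated or not, which is why they offer no protection on their own.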

The CWE classification is CWE-306 (Missing Authentication for Critical Function). The CVSS 3.1 base score is 9.3 (Critical). The attack vector is network, attack complexity is low, no privileges are required, and no user interaction is required.

The post-exploitation chain proceeded as follows:

  1. The LLM agent enumerated environment variables and standard cloud credential paths (~/.aws/credentials, instance metadata endpoints, .env files) using general priors about cloud deployment patterns.
  2. The agent issued 12 GetSecretValue API calls to AWS Secrets Manager, routing them through Cloudflare Workers egress nodes. The calls arrived from 11 distinct source IPs across a 22-second window, which kept each address below per-IP anomaly thresholds.
  3. The retrieved SSH private key was used to open 8 parallel SSH sessions to a bastion host from 6 distinct source IPs.
  4. From the bastion, the agent issued \d and \dt meta-commands to enumerate the PostgreSQL schema, identified a credential table, and issued a COPY or pg_dump-equivalent command to exfiltrate the full database contents.

The four behavioral signatures Sysdig identified as distinguishing agent-driven execution from scripted automation: improvised schema enumeration, a leaked Chinese-language planning comment (看看还能做什么, meaning "see what else we can do"), machine-optimized command formatting with structured delimiters and 2>/dev/null stderr suppression, and adaptive self-chaining of prior command output into subsequent prompts.
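Three of these signatures leave textual traces that can be matched per command. The sketch below is a rough heuristic under stated assumptions: the regular expressions are guesses at what "structured delimiters" and "bounded output caps" look like in a shell command, the sample command is invented, and none of this is Sysdig's detection logic. Human administrators write commands like these too, so hits are a triage signal, not proof.

```python
import re

# Hypothetical patterns; the report names the signatures but not their exact form.
CHECKS = {
    "stderr_discarded": re.compile(r"2>\s*/dev/null"),
    "output_capped": re.compile(r"\|\s*(head|tail)\s+-(n\s*)?\d+"),
    "structured_delimiter": re.compile(r"echo\s+['\"]?(-{3,}|={3,}|#{3,})"),
    "cjk_comment": re.compile(r"#.*[\u4e00-\u9fff]"),  # shell comment with CJK characters
}

def signature_hits(command):
    """Return the sorted names of the checks that match a single command line."""
    return sorted(name for name, rx in CHECKS.items() if rx.search(command))

# Invented example command combining the traits described above.
sample = 'echo "=== ENV ==="; env 2>/dev/null | head -n 50  # 看看还能做什么'
print(signature_hits(sample))
print(signature_hits("ls -la /tmp"))
```

The fourth signature, adaptive self-chaining, cannot be seen in one command; it requires comparing each command against the output of the one before it.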

Affected versions: Marimo 0.20.4 and all prior versions. Fixed in: Marimo 0.23.0.
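For inventory triage, the version boundaries above reduce to a tuple comparison. The sketch below assumes plain X.Y.Z release strings with no pre-release suffixes. The article does not say whether any releases between 0.20.4 and 0.23.0 are affected, so the sketch reports that range separately instead of guessing.

```python
def parse_version(text):
    """Parse a plain 'X.Y.Z' release string into a tuple of ints.

    Assumption: no pre-release or build suffixes (e.g. '0.23.0rc1' is not handled).
    """
    return tuple(int(part) for part in text.strip().split("."))

LAST_AFFECTED = parse_version("0.20.4")  # boundary stated in the advisory summary
FIRST_FIXED = parse_version("0.23.0")

def triage(version_text):
    """Classify an installed version against the stated boundaries."""
    version = parse_version(version_text)
    if version <= LAST_AFFECTED:
        return "affected"
    if version < FIRST_FIXED:
        # Status of this range is not stated in the source; treat as upgrade-needed.
        return "unconfirmed, upgrade"
    return "fixed"

for v in ("0.19.11", "0.20.4", "0.21.0", "0.23.0"):
    print(v, "->", triage(v))
```

Comparing integer tuples avoids the string-comparison trap where "0.19.11" would sort after "0.19.2".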

Figure: Full exploit chain from WebSocket to database dump
  1. Unauthenticated /terminal/ws request: validate_auth() not called on this endpoint
  2. PTY shell spawned as root: full command access inside container
  3. Credential paths enumerated by agent: ~/.aws/credentials, .env, metadata API
  4. 12 GetSecretValue calls fanned out: 11 Cloudflare Workers IPs in 22 seconds
  5. SSH key retrieved, bastion accessed: 8 sessions from 6 IPs simultaneously
  6. PostgreSQL schema enumerated and dumped: full exfiltration in under 2 minutes
Each phase defeated a distinct detection layer. The agent composed the chain live from general infrastructure knowledge.

CVE and Advisories

  • CVE-2026-39987: Marimo unauthenticated WebSocket terminal RCE. CVSS 9.3 (Critical). Disclosed April 8, 2026. Fixed in version 0.23.0.
  • CISA KEV entry, April 23, 2026: CISA added CVE-2026-39987 to the Known Exploited Vulnerabilities catalog. Federal Civilian Executive Branch agencies were required to remediate by May 7, 2026, under Binding Operational Directive 22-01.

MITRE ATT&CK Mapping

Technique ID | ATT&CK name | How it appeared
T1190 | Exploit Public-Facing Application | Unauthenticated WebSocket request to /terminal/ws returned a full PTY shell without any credentials.
T1552.001 | Credentials in Files | LLM agent enumerated environment files and configuration paths to extract cloud credentials.
T1555 | Credentials from Password Stores | Agent called AWS Secrets Manager GetSecretValue to retrieve an SSH private key stored as a managed secret.
T1090.003 | Multi-hop Proxy | 12 AWS API calls routed across 11 Cloudflare Workers egress IPs in 22 seconds to defeat per-source-IP anomaly detection.
T1021.004 | Remote Services: SSH | 8 parallel SSH sessions opened from 6 distinct IPs against a downstream bastion server using the retrieved private key.
T1005 | Data from Local System | Agent enumerated PostgreSQL schema in real time and exfiltrated full database contents including a credential table.

Indicators of Compromise

Detection is structurally difficult because the agent actively defeated the two most common detection approaches: per-IP rate limiting (defeated by egress fan-out) and command-signature matching (defeated by machine-optimized formatting with no fixed command strings).

Behavioral Indicators

  • Inbound WebSocket connections to /terminal/ws from any external IP address on an internet-exposed Marimo deployment
  • AWS GetSecretValue API calls from multiple distinct source IPs within a short time window targeting the same secret ARN
  • SSH sessions to a bastion host originating from multiple IPs within seconds of each other
  • PostgreSQL schema enumeration meta-commands (\d, \dt) followed by a bulk export (COPY, pg_dump) in rapid succession, with no preceding application-layer authentication
  • Commands formatted with structured delimiters and 2>/dev/null stderr suppression in a shell session that was not initiated by a known user
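The second indicator above, many source IPs hitting the same secret within a short window, can be expressed as a sliding-window count of distinct sources per target. The sketch below is a minimal version: the 30-second window, the threshold of 5 distinct IPs, the placeholder addresses, and the target name are assumptions, and the event tuples stand in for whatever an audit log actually provides.

```python
from collections import defaultdict, deque

def detect_fanout(events, window_seconds=30, min_distinct_ips=5):
    """Flag targets reached from many distinct source IPs inside a sliding window.

    events: iterable of (timestamp_seconds, source_ip, target), sorted by timestamp.
    Returns a dict mapping each flagged target to the peak distinct-IP count seen.
    """
    recent = defaultdict(deque)  # target -> (timestamp, ip) pairs still in the window
    flagged = {}
    for ts, ip, target in events:
        window = recent[target]
        window.append((ts, ip))
        # Drop events that have aged out of the window.
        while window and ts - window[0][0] > window_seconds:
            window.popleft()
        distinct = len({addr for _, addr in window})
        if distinct >= min_distinct_ips:
            flagged[target] = max(flagged.get(target, 0), distinct)
    return flagged

TARGET = "example-ssh-key-secret"  # hypothetical secret identifier

# 12 calls across 11 placeholder IPs in 22 seconds, as in the incident description.
fanned = [(i * 2, f"203.0.113.{i % 11 + 1}", TARGET) for i in range(12)]
# The same 12 calls from a single address.
single = [(i * 2, "198.51.100.7", TARGET) for i in range(12)]

print(detect_fanout(fanned))
print(detect_fanout(single))
```

Keying on the target instead of the source is what makes the fan-out visible: the per-IP view sees at most two calls, while the per-secret view sees eleven sources converge.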

Network Indicators

The initial access IP recorded by Sysdig was 157.66.54.26 (AS141892, geolocated to Indonesia). This single indicator has limited ongoing detection value given the distributed egress pattern used in subsequent phases.

Attribution

Unattributed. The initial access IP (157.66.54.26, AS141892) geolocates to Indonesia. A Chinese-language planning comment leaked into the command stream during the credential search phase. Sysdig Sr. Director Michael Clark explicitly declined to attribute the activity to a known threat group or nation-state actor. The specific LLM and agent framework used by the attacker have not been identified, which limits defenders' ability to build detection signatures targeting that toolchain.


Primary Sources