CyberBytes Daily

Trending cyberattacks, explained simply.

AI security

How a natural-language prompt became a shell command inside a company's AI agent

The AI agent was doing exactly what it was designed to do: reading a user's request, selecting the right tool, and passing the user's input into that tool as a parameter. That is also precisely how an attacker ran arbitrary commands on the server hosting the agent. No malware, no browser exploit, no stolen credentials. A single sentence in plain English was enough.

Two vulnerabilities in Microsoft's Semantic Kernel, an open-source framework used to build AI agents across industries, carried the maximum CVSS score of 10.0. One allowed an attacker to execute operating system commands on the host server by smuggling a payload through a hotel search query. The other allowed an attacker to plant a file anywhere on the host filesystem, including the Windows Startup folder, by convincing the agent to "download" a file to an attacker-specified location. Both attacks required only a crafted prompt. Both worked because the framework treated the AI model's output as trusted, deterministic input, equivalent to hardcoded backend logic.

The detail that should keep security leaders awake: the vulnerability existed the moment a dangerous function was registered as an agent tool, not the moment it was invoked. By the time a prompt arrived, the attack surface was already open. Runtime filters and prompt guardrails were structurally too late.

Narrative · 6 min read

The Context

Microsoft Semantic Kernel is an open-source framework that lets developers build AI agents: software that takes a user's natural-language request, hands it to a large language model (LLM), and then executes whatever tools or functions the model decides are appropriate. Think of it as the plumbing between a chatbot and the actual systems it can act on. As of May 2026, the framework had roughly 28,000 GitHub stars and over 1,000 contributors, and was in use across customer service agents, document analysis pipelines, internal copilots, and enterprise automation.

In February 2026, Microsoft's own security researchers discovered two critical vulnerabilities in the framework. Both were patched before the public disclosure on May 7, 2026. The primary risk window was the period between when each patch was released and when individual operators applied it.

The Attack, Phase by Phase

Phase 1: Prompt Injection Entry

The attacker identifies an AI agent built on Semantic Kernel that accepts user-controlled natural-language input. For CVE-2026-26030, the target is any agent using the Search Plugin backed by the InMemoryVectorStore component in its default configuration. For CVE-2026-25592, the target is any .NET agent using the SessionsPythonPlugin, a component designed to let agents run Python code inside a cloud-hosted sandbox.

The attacker crafts a prompt designed to manipulate the LLM into selecting a specific tool and passing attacker-controlled values as that tool's parameters. The system prompt may contain instructions like "only answer questions about hotels," but those instructions are advisory to the model, not enforced by the framework. The model is the decision-maker, and the model can be misled.

Diagram: Attacker entry point

  1. Identify vulnerable agent. The agent uses the Search Plugin or SessionsPythonPlugin.
  2. Craft malicious prompt. Natural language hides the attacker payload.
  3. LLM reads prompt. The model selects a tool and passes the attacker input.

No malware, attachment, or stolen credential is required at this stage.

Phase 2: Framework Trust Exploitation

The LLM, acting exactly as designed, selects the target tool and passes the attacker-controlled parameter into the framework's execution pipeline. This is where the two vulnerabilities diverge in mechanics but share the same root cause.

In CVE-2026-26030, the Search Plugin's filter logic was built as a Python lambda expression and executed using eval(). The framework attempted to block dangerous inputs using a list of prohibited code patterns. The attacker's payload bypassed this list by walking Python's internal object hierarchy—a technique that reaches OS command execution without using any explicitly blocked patterns. The attacker's city parameter, passed in by the LLM, contained the payload. The framework executed it.

In CVE-2026-25592, the SessionsPythonPlugin included a file-transfer helper called DownloadFileAsync. A developer had accidentally marked it with an attribute that registered it as a callable kernel function, meaning the AI model could invoke it. The function accepted a file path with no validation. A hostile prompt could instruct the agent to write a malicious file inside the cloud sandbox, then invoke DownloadFileAsync to pull that file to any location on the host filesystem—including the Windows Startup folder.

Diagram: Framework execution pipeline

  1. Framework receives parameters. LLM output is treated as trusted input.
  2. Blocklist bypassed (CVE-2026-26030). MRO traversal avoids blocked patterns.
  3. Exposed function invoked (CVE-2026-25592). DownloadFileAsync is called with an attacker path.
  4. Dangerous capability reached. eval() or an unrestricted file write executes.

Security boundary that failed: both paths reach host-level impact through the agent's normal tool-calling behavior.

Phase 3: Host-Level Code Execution or Persistence

For CVE-2026-26030, the eval() sink executes the attacker's OS command directly on the server running the agent. Microsoft's own demonstration launched calc.exe on the host machine. In a real attack, the same primitive could be used to exfiltrate data, install a backdoor, or enable lateral movement.

For CVE-2026-25592, the attacker's file lands in a persistence location. On the next restart, the payload executes with the privileges of the agent host process. Any credentials, network access, or data the agent holds are then available to the attacker.

Diagram: Host-level impact

  1. OS command executes (CVE-2026-26030). Arbitrary code runs on the host server.
  2. Persistence established (CVE-2026-25592). The payload is written to the Windows Startup folder.
  3. Lateral movement possible. Agent credentials and network access are exposed.

  • Data theft: vector store contents exfiltrated.
  • Persistent access: survives agent restart.
  • Lateral movement: downstream systems reachable.

Phase 4: Detection Gap

Because the attack flows through the agent's normal tool-calling behavior, standard endpoint and network defenses may not fire. There is no shellcode, no memory corruption, no unusual network traffic at the point of entry. Microsoft advises defenders to define the vulnerable window per deployment—from first deployment of a vulnerable version to the moment the patch was applied—and hunt within that window for anomalous child processes spawned by the agent, unexpected file writes to startup or system directories, and log entries showing DownloadFileAsync or UploadFileAsync calls with path traversal sequences.
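The per-deployment hunt described above can be sketched as a simple filter over process-creation events. This is a minimal illustration, not a detection product: the ProcessEvent record, its field names, and the list of suspicious child images are assumptions made for the example, and a real hunt would read from whatever EDR or audit source the organization already has.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical process-creation record; field names are assumptions,
# not the schema of any particular EDR product.
@dataclass(frozen=True)
class ProcessEvent:
    timestamp: datetime
    parent_image: str
    child_image: str

# Children an agent host would not normally spawn (illustrative list).
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "calc.exe"}

def _basename(path: str) -> str:
    """Lower-cased file name from a Windows- or POSIX-style path."""
    return path.replace("/", "\\").rsplit("\\", 1)[-1].lower()

def hunt_child_processes(events, agent_image, first_deployed, patched_at):
    """Return events inside the vulnerable window where the agent
    process spawned one of the suspicious children."""
    hits = []
    for ev in events:
        in_window = first_deployed <= ev.timestamp <= patched_at
        from_agent = _basename(ev.parent_image) == agent_image.lower()
        if in_window and from_agent and _basename(ev.child_image) in SUSPICIOUS_CHILDREN:
            hits.append(ev)
    return hits
```

The window bounds are supplied per deployment, which mirrors the advice to define the window from first deployment of a vulnerable version to the moment the patch was applied.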

Diagram: Detection challenge

  1. Attack looks like normal tool use. No shellcode, no network anomaly at ingress.
  2. Hunt within the vulnerable window. First deployment to patch date.
  3. Look for host-level signals. Child processes, startup folder writes.

AI model telemetry and host execution telemetry must be correlated together.

What Made This Possible

  1. The framework trusted the model's output. Semantic Kernel's AutoInvokeKernelFunctions behavior passes LLM-selected tool parameters directly into execution without treating them as attacker-controlled input. The model is a non-deterministic system that can be manipulated by the text it receives. Treating its output as equivalent to hardcoded backend logic is the foundational error.

  2. Dangerous functions were registered as agent tools. The eval() sink in the Search Plugin and DownloadFileAsync in the SessionsPythonPlugin were both reachable by the model because they were registered as kernel functions. The attack surface was created at registration time, not at invocation time. Runtime filters cannot close a gap opened at design time.

  3. Agent deployments entered organizations without security inventory. Agent frameworks tend to arrive through experiments and internal copilots. Security teams often lack a clean inventory of which versions are running, which tools are exposed, and where sandbox boundaries are. An untracked deployment is an unpatched deployment.
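The second point, that the attack surface is created at registration time, can be illustrated with a toy tool registry that refuses dangerous capabilities before any prompt arrives. Everything here is hypothetical: ToolRegistry, Capability, and the capability labels are invented for the sketch and are not Semantic Kernel's API.

```python
from enum import Flag, auto

class Capability(Flag):
    """Hypothetical labels a developer attaches when registering a tool."""
    NONE = 0
    EXECUTES_CODE = auto()
    WRITES_HOST_FILES = auto()

# Capabilities that should never be reachable by autonomous model invocation.
FORBIDDEN_FOR_MODEL = Capability.EXECUTES_CODE | Capability.WRITES_HOST_FILES

class ToolRegistry:
    """Toy tool menu for illustration only."""

    def __init__(self):
        self._tools = {}

    def register(self, name, func, capabilities=Capability.NONE):
        # The decision is made here, at registration time.
        if capabilities & FORBIDDEN_FOR_MODEL:
            raise ValueError(f"refusing to expose {name!r} to the model")
        self._tools[name] = func

    def invoke(self, name, **kwargs):
        # The model can only reach what survived registration.
        if name not in self._tools:
            raise KeyError(f"unknown tool {name!r}")
        return self._tools[name](**kwargs)
```

A function rejected by register never appears in the menu, so no prompt can select it. The sketch relies on developers labeling capabilities honestly, which is itself an assumption.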

What Should Have Stopped This

Treating the AI model's output as untrusted input—the same way a web application treats a form submission from an anonymous user—is the unifying principle behind every defense that would have reduced the blast radius here.

  • Never register functions with dangerous capabilities as kernel tools. Functions that execute code or write files to arbitrary paths should not be in the model's tool menu. If the model cannot invoke them, prompt injection cannot reach them.
  • Apply an allowlist, not a blocklist, for tool parameters. The original eval() guard used a blocklist of prohibited patterns. Blocklists fail when attackers find unlisted patterns. An allowlist that permits only known-safe inputs is structurally harder to bypass.
  • Treat agent deployments as production infrastructure. Patch management, version inventory, and security review should apply to agent frameworks the same way they apply to web servers. An agent with filesystem access and network credentials is not a developer experiment.
  • Correlate AI-layer and host-layer telemetry. Detecting this class of attack requires connecting what the model was asked to do with what the host process actually did. Neither signal alone is sufficient.
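As one concrete reading of the allowlist advice above, a tool parameter such as the hotel agent's city value could be checked against a pattern of permitted characters before it reaches any tool logic. This is a sketch under assumptions: validate_city and the permitted character set are illustrative choices, not part of the framework or its patch.

```python
import re

# Allowlist: starts with a letter (accented letters included), then letters,
# spaces, periods, apostrophes, or hyphens; 64 characters at most.
# The exact character set is an assumption about what a city name needs.
_CITY_PATTERN = re.compile(r"[^\W\d_](?:[^\W\d_]|[ .'\-]){0,63}")

def validate_city(value):
    """Return the value unchanged if it matches the allowlist, else raise."""
    if not isinstance(value, str) or _CITY_PATTERN.fullmatch(value) is None:
        raise ValueError("city parameter rejected by allowlist")
    return value
```

Because parentheses, underscores, brackets, and quotes used as code delimiters are simply not in the permitted set, the check does not need to anticipate any particular payload.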

The Takeaway

These two vulnerabilities are not bugs in the traditional sense. There was no memory corruption, no protocol flaw, no cryptographic weakness. The attack worked because the framework was designed to let the AI model control what happens next—and the model can be told what to do by anyone who can send it a message.

The systemic lesson is one of attack surface registration. The vulnerability was baked in the moment a dangerous function was added to the model's tool menu. Runtime defenses are structurally too late. The only durable fix is to never expose functions with dangerous capabilities to autonomous model invocation in the first place.

Pattern to remember: In AI agent frameworks, the attack surface is defined at tool registration time, not at runtime. Any function the model can call is a function an attacker can call.

What changed: The barrier to host-level code execution dropped from credential theft to a sentence in plain English, because the AI model is now the privileged execution boundary and it can be instructed by anyone who can reach it.

Technical Deep Dive · 3 min

The Technical Mechanism

Both vulnerabilities share a root cause: Semantic Kernel's AutoInvokeKernelFunctions behavior passes LLM-selected tool parameters into execution without sanitization, treating the model's output as trusted, deterministic input.

CVE-2026-26030 (Python SDK, CWE-94: Improper Control of Code Generation)

The InMemoryVectorStore's default filter behavior constructs a Python lambda expression by string interpolation of user-supplied values and executes it via eval(). The framework attempted to guard this with an AST-based blocklist denying node types including Import statements and explicit calls to named builtins. The blocklist was bypassable through Python's Method Resolution Order (MRO) traversal. A payload in the city parameter could walk the object hierarchy:

().__class__.__bases__[0].__subclasses__()

This chain reaches BuiltinImporter via __subclasses__(), then calls load_module to access os.system, without triggering a single blocked AST node type. In the hotel-finder agent demonstration, the attacker sends a crafted prompt that causes the LLM to call search_hotels(city=PAYLOAD), achieving arbitrary OS command execution on the host server process.
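Why a blocklist misses this chain can be shown with a deliberately harmless experiment. The guard below is a toy stand-in written for this illustration (it is not the framework's actual check, and BLOCKED_CALLS and passes_blocklist are invented names): it rejects direct calls to named builtins, yet the attribute chain contains no such call. The example stops at listing classes and runs no command.

```python
import ast

# Toy blocklist: reject direct calls to these names. Illustrative only.
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "getattr", "__import__"}

def passes_blocklist(expr):
    """Return True if no call in the expression targets a blocked bare name."""
    tree = ast.parse(expr, mode="eval")
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                return False
    return True

probe = "().__class__.__bases__[0].__subclasses__()"

# A direct call to a named builtin is caught.
assert not passes_blocklist("__import__('os')")

# The attribute chain only uses attribute access and a method call,
# so the toy blocklist has nothing to match.
assert passes_blocklist(probe)

# Even with builtins removed from the eval environment, the chain still
# reaches every class currently loaded in the interpreter.
reachable = eval(probe, {"__builtins__": {}})
print(f"{len(reachable)} classes reachable with builtins removed")
```

The point of the sketch is structural: the guard enumerates what is forbidden, and the chain is built entirely from things nobody thought to forbid.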

The patch in semantic-kernel Python version 1.39.4 replaced eval() with a strict safe parser enforcing four layers: an AST node-type allowlist, a function call allowlist, a dangerous-attribute blocklist covering __class__ and __subclasses__, and a name-node restriction permitting only the lambda parameter as a bare identifier.
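The four layers named above can be sketched as a single pass over the parsed filter expression. This is not the patch's code: validate_filter, ALLOWED_NODES, ALLOWED_CALLS, and the record field city are assumptions chosen for the example, and the attribute layer here rejects every double-underscore name, which is broader than the two attributes the text mentions.

```python
import ast

# Layer 1: the only syntax node types a filter may contain (illustrative set).
ALLOWED_NODES = (
    ast.Expression, ast.Lambda, ast.arguments, ast.arg, ast.Compare,
    ast.BoolOp, ast.And, ast.Or, ast.Eq, ast.NotEq, ast.Attribute,
    ast.Name, ast.Load, ast.Constant, ast.Call,
)
# Layer 2: the only callables, and only as methods (illustrative set).
ALLOWED_CALLS = {"lower", "startswith"}

def validate_filter(source):
    """Raise ValueError unless source is a one-argument lambda that stays
    inside all four layers. Returns None on success."""
    tree = ast.parse(source, mode="eval")
    body = tree.body
    if not isinstance(body, ast.Lambda) or len(body.args.args) != 1:
        raise ValueError("filter must be a one-argument lambda")
    param = body.args.args[0].arg
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"node type not allowed: {type(node).__name__}")
        if isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Attribute) and func.attr in ALLOWED_CALLS):
                raise ValueError("call not in allowlist")
        # Layer 3: no dunder attribute access such as __class__.
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise ValueError(f"attribute not allowed: {node.attr}")
        # Layer 4: the lambda parameter is the only bare identifier.
        if isinstance(node, ast.Name) and node.id != param:
            raise ValueError(f"name not allowed: {node.id}")
```

For instance, validate_filter("lambda x: x.city == 'Paris'") returns without error, while an expression containing a tuple literal, a subscript, or a bare function call is rejected at the first layer it violates.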

CVE-2026-25592 (.NET SDK, CWE-22: Path Traversal)

The SessionsPythonPlugin component includes DownloadFileAsync and UploadFileAsync as host-side file-transfer helpers. Both were accidentally decorated with the [KernelFunction] attribute, registering them as callable tools in the model's function menu. The localFilePath parameter received no path validation. A hostile prompt could instruct the agent to write a malicious payload inside the Azure Container Apps sandbox, then invoke DownloadFileAsync with a localFilePath value such as C:\Users\<user>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\malware.exe, writing the payload to the host filesystem outside the intended sandbox boundary.

The patch in Microsoft.SemanticKernel.Core version 1.71.0 removed the [KernelFunction] attribute from DownloadFileAsync, taking it out of the model's tool menu entirely, and added path validation for any remaining programmatic use.
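The kind of path validation described can be sketched in a few lines. The plugin itself is .NET; this is a Python analogue for illustration only, and resolve_inside and the base directory name are hypothetical. It assumes Python 3.9 or later for Path.is_relative_to.

```python
from pathlib import Path

def resolve_inside(base_dir, requested):
    """Resolve `requested` against `base_dir` and refuse any result that
    lands outside it (traversal sequences or absolute paths)."""
    base = Path(base_dir).resolve()
    # Joining with an absolute path discards the base, so absolute inputs
    # resolve to themselves and fail the containment check below.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {requested!r}")
    return candidate
```

Resolving before comparing matters: a plain string prefix check would accept a value like "a/../../escape.txt". A Windows-style absolute path is only interpreted as absolute when the code runs on Windows, so the check should run on the same platform as the file write.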

Independent researcher Jeff Ponte (Nuka-AI) published a disclosure on April 25, 2026 claiming six bypass vectors against the CVE-2026-25592 patch, targeting versions 1.47.0 through 1.48.0. The disclosure describes a TOCTOU/type-confusion flaw: the IFunctionInvocationFilter checks string arguments, but when the LLM outputs Base64 or a JSON array the filter evaluates to false while the underlying plugin still deserializes and executes the payload. The disclosure had not been corroborated by Microsoft or other Tier 1 sources at the time of this research note.

Diagram: Technical execution path

  1. LLM selects tool and parameters. AutoInvokeKernelFunctions passes the output on.
  2. CVE-2026-26030: eval() reached. The city parameter is interpolated into the lambda string.
  3. MRO traversal via __subclasses__(). Reaches BuiltinImporter and bypasses the blocklist.
  4. os.system() invoked. An arbitrary OS command executes on the host.

Trust boundary that failed: CVE-2026-25592 follows the same pipeline but terminates at DownloadFileAsync writing to an attacker-specified host path.

CVE and Advisories

  • CVE-2026-26030 (GHSA-xjw9-4gw8-4rqx): Python SDK eval() RCE via InMemoryVectorStore filter. CVSS 10.0. Fixed in semantic-kernel 1.39.4.
  • CVE-2026-25592 (GHSA-2ww3-72rp-wpp4): .NET SDK path traversal via SessionsPythonPlugin. CVSS 10.0. Fixed in Microsoft.SemanticKernel.Core 1.71.0.

Both CVEs were assigned by GitHub_M (Microsoft) and carry the CVSS vector AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H.
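As an arithmetic cross-check on the vector string above, the sketch below applies the CVSS v3.1 base-score equations. It assumes the vector is a v3.1 vector, which the text does not state; the weights and rounding rule are taken from the v3.1 specification, and base_score is a name chosen for this example.

```python
import math

# Metric weights from the CVSS v3.1 specification.
W = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged or Changed.
PR = {"U": {"N": 0.85, "L": 0.62, "H": 0.27},
      "C": {"N": 0.85, "L": 0.68, "H": 0.5}}

def roundup(x):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0

def base_score(vector):
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    iss = 1 - ((1 - W["CIA"][m["C"]]) * (1 - W["CIA"][m["I"]]) * (1 - W["CIA"][m["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * W["AV"][m["AV"]] * W["AC"][m["AC"]]
                      * PR[m["S"]][m["PR"]] * W["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total, 10) if changed else min(total, 10))

print(base_score("AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H"))  # vector as printed
print(base_score("AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # same vector with PR:N
```

Under that assumption, the vector as printed (PR:L) evaluates to 9.9, while the same vector with PR:N evaluates to 10.0. The printed vector and the 10.0 score may therefore not both match the advisory, and the advisory itself remains the authority on which value is correct.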

MITRE ATT&CK Mapping

| Technique ID | ATT&CK name | How it appeared |
| --- | --- | --- |
| T1190 | Exploit Public-Facing Application | The agent's natural-language input endpoint is the public-facing surface exploited via prompt injection. |
| T1059.006 | Command and Scripting Interpreter: Python | CVE-2026-26030 achieves code execution through Python's eval() function reached via MRO traversal. |
| T1105 | Ingress Tool Transfer | CVE-2026-25592 uses DownloadFileAsync to transfer a malicious payload from the sandbox to the host filesystem. |
| T1547.001 | Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder | CVE-2026-25592 achieves persistence by writing the payload to the Windows Startup folder. |
| T1134 | Access Token Manipulation | Post-exploitation pivot: the agent process's credentials and network access become available to the attacker. |

Indicators of Compromise

Traditional network and endpoint indicators are limited for this attack class because exploitation flows through the agent's normal tool-calling behavior. Microsoft advises defenders to define the vulnerable window per deployment (first deployment of a vulnerable version to patch application date) and hunt within that window for:

  • Anomalous child processes spawned by the agent host process (e.g., cmd.exe, powershell.exe, calc.exe as a proof-of-concept indicator)
  • Log entries showing DownloadFileAsync or UploadFileAsync calls containing path traversal sequences (../, ..\, or absolute paths outside the intended directory)
  • Unexpected file creation in system directories, startup folders, or AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\
  • eval() execution traces in Python agent logs with city or filter parameters containing __subclasses__, __class__, or load_module strings
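The string-level indicators in the list above lend themselves to a simple log scan. The sketch below is illustrative: the log line format is invented, classify_log_line and the finding labels are hypothetical, and the patterns cover only the strings the list names.

```python
import re

# Patterns drawn from the indicators listed above.
TRAVERSAL = re.compile(r"\.\.[\\/]")
STARTUP = re.compile(r"Start Menu\\Programs\\Startup", re.IGNORECASE)
INTROSPECTION = re.compile(r"__subclasses__|__class__|load_module")
FILE_CALLS = re.compile(r"\b(?:DownloadFileAsync|UploadFileAsync)\b")

def classify_log_line(line):
    """Return a set of finding labels for one log line (empty if clean)."""
    findings = set()
    # File-transfer helper called with a traversal sequence or startup path.
    if FILE_CALLS.search(line) and (TRAVERSAL.search(line) or STARTUP.search(line)):
        findings.add("file-transfer-path")
    # Python introspection strings inside a tool parameter.
    if INTROSPECTION.search(line):
        findings.add("python-introspection")
    return findings
```

A hit is a lead for investigation rather than proof of compromise, since legitimate debugging output can contain the same strings.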

Detection requires correlating AI model telemetry (what the model was asked to do) with host execution telemetry (what the process actually did). Neither signal alone is sufficient to identify exploitation.
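One minimal way to picture that correlation is a time-based join between the two telemetry streams. The sketch assumes each stream has already been reduced to (timestamp, description) pairs; the correlate function and the five-second default gap are arbitrary choices for illustration, not a recommended threshold.

```python
from datetime import datetime, timedelta

def correlate(tool_calls, host_events, max_gap=timedelta(seconds=5)):
    """Pair each host event with every tool call that occurred at most
    `max_gap` before it. Inputs are lists of (timestamp, description)."""
    pairs = []
    for event_time, event_desc in host_events:
        for call_time, call_desc in tool_calls:
            # Keep only calls at or before the event, within the gap.
            if timedelta(0) <= event_time - call_time <= max_gap:
                pairs.append((call_desc, event_desc))
    return pairs
```

Each resulting pair places what the model was asked to do next to what the host process did moments later, which is the combined view neither log provides alone.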

Attribution

Both CVE-2026-26030 and CVE-2026-25592 were discovered and disclosed by Microsoft Security Research as part of a stated research series on AI agent framework security. The May 7, 2026 blog post notes a follow-on post is planned covering structurally similar vulnerabilities in third-party agent frameworks. No external threat actor exploitation has been reported. Independent researcher Jeff Ponte (Nuka-AI / JDP-Security) separately identified alleged bypass vectors against the CVE-2026-25592 patch; that disclosure remains unconfirmed by Microsoft or Tier 1 sources.


Primary Sources

  1. CVE-2026-25592 Detail (GHSA-2ww3-72rp-wpp4). ThreatInt CVE Database (sourcing GitHub_M advisory) · February 06, 2026
  2. CVE-2026-26030 Detail (GHSA-xjw9-4gw8-4rqx). ThreatInt CVE Database (sourcing GitHub_M advisory) · February 19, 2026
  3. CVE-2026-25592: Semantic Kernel Path Traversal Flaw. SentinelOne Vulnerability Database · February 06, 2026
  4. AIAgentCTF CVE-2026-26030 Challenge Repository. GitHub (amiteliahu, Microsoft Security Research) · May 07, 2026