May 7, 2026ai security

How a natural-language prompt became a shell command inside a company's AI agent

The AI agent was doing exactly what it was designed to do: reading a user's request, selecting the right tool, and passing the user's input into that tool as a parameter. That is also precisely how an attacker ran arbitrary commands on the server hosting the agent. No malware, no browser exploit, no stolen credentials. A single sentence in plain English was enough.

Two vulnerabilities in Microsoft's Semantic Kernel, an open-source framework used to build AI agents across industries, carried a perfect CVSS score of 10.0. One allowed an attacker to execute operating system commands on the host server by smuggling a payload through a hotel search query. The other allowed an attacker to plant a file anywhere on the host filesystem, including the Windows Startup folder, by convincing the agent to "download" a file to an attacker-specified location. Both attacks required only a crafted prompt. Both worked because the framework treated the AI model's output as trusted, deterministic input, equivalent to hardcoded backend logic.

The detail that should keep security leaders awake: the vulnerability existed the moment a dangerous function was registered as an agent tool, not the moment it was invoked. By the time a prompt arrived, the attack surface was already open. Runtime filters and prompt guardrails were structurally too late.

🟡 Level 2

Read the narrative

How the attack unfolded, phase by phase.

6 min read

🔴 Level 3

Go deeper

Technical mechanism, CVE, ATT&CK mapping, primary sources.

3 min read

Does this apply to you?

If your organization has deployed any AI agent or internal copilot built on the Microsoft Semantic Kernel Python SDK before version 1.39.4, or the .NET SDK before version 1.71.0, you may have a window of exposure that runs from the day that agent went live to the day the patch was applied. If the agent accepted user-supplied text and routed it to tools with filesystem access or code execution capabilities, your blast radius may include full host compromise, arbitrary file writes, and access to any system the agent process could reach on your network. This pattern applies beyond Semantic Kernel: any agent framework where the AI model selects tools and passes parameters without treating those parameters as attacker-controlled input carries the same structural risk. Check whether your agent deployments, including internal hackathon projects and experimental copilots, are running patched versions; untracked agent deployments are the most likely source of unpatched exposure.

Narrative · 6 min read

The Context

Microsoft Semantic Kernel is an open-source framework that lets developers build AI agents: software that takes a user's natural-language request, hands it to a large language model (LLM), and then executes whatever tools or functions the model decides are appropriate. Think of it as the plumbing between a chatbot and the actual systems it can act on. As of May 2026, the framework had roughly 28,000 GitHub stars and over 1,000 contributors, and was in use across customer service agents, document analysis pipelines, internal copilots, and enterprise automation.

In February 2026, Microsoft's own security researchers discovered two critical vulnerabilities in the framework. Both were patched before the May 7 public disclosure. The primary risk window was the period between when each patch was released and when individual operators applied it.

Key terms

Semantic Kernel: Microsoft's open-source framework for building AI agents. It connects a large language model to a set of registered tools (called kernel functions) and automatically invokes those tools based on the model's decisions.
Kernel function: A function registered with Semantic Kernel that the AI model is allowed to call. If a function is registered, the model can invoke it and pass it parameters, just as a user might click a button in an app.
Prompt injection: An attack where a malicious instruction is embedded in user-supplied text. The AI model reads it as a legitimate instruction and acts on it, even if the system was designed to prevent that.
eval(): A Python built-in function that takes a string and executes it as code. If attacker-controlled text reaches eval(), the attacker can run arbitrary commands on the host machine.
Path traversal: An attack where a file path like ../../startup/malware.exe is used to write or read files outside the intended directory, reaching sensitive locations on the host filesystem.

The Attack, Phase by Phase

Phase 1: Prompt Injection Entry

The attacker identifies an AI agent built on Semantic Kernel that accepts user-controlled natural-language input. For CVE-2026-26030, the target is any agent using the Search Plugin backed by the InMemoryVectorStore component in its default configuration. For CVE-2026-25592, the target is any .NET agent using the SessionsPythonPlugin, a component designed to let agents run Python code inside a cloud-hosted sandbox.

The attacker crafts a prompt designed to manipulate the LLM into selecting a specific tool and passing attacker-controlled values as that tool's parameters. The system prompt may contain instructions like "only answer questions about hotels," but those instructions are advisory to the model, not enforced by the framework. The model is the decision-maker, and the model can be misled.

Phase 2: Framework Trust Exploitation

The LLM, acting exactly as designed, selects the target tool and passes the attacker-controlled parameter into the framework's execution pipeline. This is where the two vulnerabilities diverge in mechanics but share the same root cause.

In CVE-2026-26030, the Search Plugin's filter logic was built as a Python lambda expression and executed using eval(). The framework attempted to block dangerous inputs using a list of prohibited code patterns. The attacker's payload bypassed this list by walking Python's internal object hierarchy—a technique that reaches OS command execution without using any explicitly blocked patterns. The attacker's city parameter, passed in by the LLM, contained the payload. The framework executed it.

In CVE-2026-25592, the SessionsPythonPlugin included a file-transfer helper called DownloadFileAsync. A developer had accidentally marked it with an attribute that registered it as a callable kernel function, meaning the AI model could invoke it. The function accepted a file path with no validation. A hostile prompt could instruct the agent to write a malicious file inside the cloud sandbox, then invoke DownloadFileAsync to pull that file to any location on the host filesystem—including the Windows Startup folder.

Phase 3: Host-Level Code Execution or Persistence

For CVE-2026-26030, the eval() sink executes the attacker's OS command directly on the server running the agent. Microsoft's own demonstration launched calc.exe on the host machine. In a real attack, this step would exfiltrate data, install a backdoor, or enable lateral movement.

For CVE-2026-25592, the attacker's file lands in a persistence location. On the next restart, the payload executes with the privileges of the agent host process. Any credentials, network access, or data the agent holds are then available to the attacker.

Phase 4: Detection Gap

Because the attack flows through the agent's normal tool-calling behavior, standard endpoint and network defenses may not fire. There is no shellcode, no memory corruption, no unusual network traffic at the point of entry. Microsoft advises defenders to define the vulnerable window per deployment—from first deployment of a vulnerable version to the moment the patch was applied—and hunt within that window for anomalous child processes spawned by the agent, unexpected file writes to startup or system directories, and log entries showing DownloadFileAsync or UploadFileAsync calls with path traversal sequences.

What Made This Possible

The framework trusted the model's output. Semantic Kernel's AutoInvokeKernelFunctions behavior passes LLM-selected tool parameters directly into execution without treating them as attacker-controlled input. The model is a non-deterministic system that can be manipulated by the text it receives. Treating its output as equivalent to hardcoded backend logic is the foundational error.
Dangerous functions were registered as agent tools. The eval() sink in the Search Plugin and DownloadFileAsync in the SessionsPythonPlugin were both reachable by the model because they were registered as kernel functions. The attack surface was created at registration time, not at invocation time. Runtime filters cannot close a gap opened at design time.
Agent deployments entered organizations without security inventory. Agent frameworks tend to arrive through experiments and internal copilots. Security teams often lack a clean inventory of which versions are running, which tools are exposed, and where sandbox boundaries are. An untracked deployment is an unpatched deployment.

What Should Have Stopped This

Treating the AI model's output as untrusted input—the same way a web application treats a form submission from an anonymous user—is the unifying principle behind every defense that would have reduced the blast radius here.

Never register functions with dangerous capabilities as kernel tools. Functions that execute code or write files to arbitrary paths should not be in the model's tool menu. If the model cannot invoke them, prompt injection cannot reach them.
Apply an allowlist, not a blocklist, for tool parameters. The original eval() guard used a blocklist of prohibited patterns. Blocklists fail when attackers find unlisted patterns. An allowlist that permits only known-safe inputs is structurally harder to bypass.
Treat agent deployments as production infrastructure. Patch management, version inventory, and security review should apply to agent frameworks the same way they apply to web servers. An agent with filesystem access and network credentials is not a developer experiment.
Correlate AI-layer and host-layer telemetry. Detecting this class of attack requires connecting what the model was asked to do with what the host process actually did. Neither signal alone is sufficient.

The Takeaway

These two vulnerabilities are not bugs in the traditional sense. There was no memory corruption, no protocol flaw, no cryptographic weakness. The attack worked because the framework was designed to let the AI model control what happens next—and the model can be told what to do by anyone who can send it a message.

The systemic lesson is one of attack surface registration. The vulnerability was baked in the moment a dangerous function was added to the model's tool menu. Runtime defenses are structurally too late. The only durable fix is to never expose functions with dangerous capabilities to autonomous model invocation in the first place.

Pattern to remember: In AI agent frameworks, the attack surface is defined at tool registration time, not at runtime. Any function the model can call is a function an attacker can call.

What changed: The barrier to host-level code execution dropped from credential theft to a sentence in plain English, because the AI model is now the privileged execution boundary and it can be instructed by anyone who can reach it.

Technical Deep Dive · 3 min

The Technical Mechanism

Both vulnerabilities share a root cause: Semantic Kernel's AutoInvokeKernelFunctions behavior passes LLM-selected tool parameters into execution without sanitization, treating the model's output as trusted, deterministic input.

CVE-2026-26030 (Python SDK, CWE-94: Improper Control of Code Generation)

The InMemoryVectorStore's default filter behavior constructs a Python lambda expression by string interpolation of user-supplied values and executes it via eval(). The framework attempted to guard this with an AST-based blocklist denying node types including Import statements and explicit calls to named builtins. The blocklist was bypassable through Python's Method Resolution Order (MRO) traversal. A payload in the city parameter could walk the object hierarchy:

().__class__.__bases__[0].__subclasses__()

This chain reaches BuiltinImporter via __subclasses__(), then calls load_module to access os.system, without triggering a single blocked AST node type. In the hotel-finder agent demonstration, the attacker sends a crafted prompt that causes the LLM to call search_hotels(city=PAYLOAD), achieving arbitrary OS command execution on the host server process.

The patch in semantic-kernel Python version 1.39.4 replaced eval() with a strict safe parser enforcing four layers: an AST node-type allowlist, a function call allowlist, a dangerous-attribute blocklist covering __class__ and __subclasses__, and a name-node restriction permitting only the lambda parameter as a bare identifier.

CVE-2026-25592 (.NET SDK, CWE-22: Path Traversal)

The SessionsPythonPlugin component includes DownloadFileAsync and UploadFileAsync as host-side file-transfer helpers. Both were accidentally decorated with the [KernelFunction] attribute, registering them as callable tools in the model's function menu. The localFilePath parameter received no path validation. A hostile prompt could instruct the agent to write a malicious payload inside the Azure Container Apps sandbox, then invoke DownloadFileAsync with a localFilePath value such as C:\Users\<user>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\malware.exe, writing the payload to the host filesystem outside the intended sandbox boundary.

The patch in Microsoft.SemanticKernel.Core version 1.71.0 removed the [KernelFunction] attribute from DownloadFileAsync, taking it out of the model's tool menu entirely, and added path validation for any remaining programmatic use.

Independent researcher Jeff Ponte (Nuka-AI) published a disclosure on April 25, 2026 claiming six bypass vectors against the CVE-2026-25592 patch targeting versions 1.47.0 through 1.48.0, identifying a TOCTOU/Type Confusion flaw where the IFunctionInvocationFilter checks string arguments but fails when the LLM outputs Base64 or a JSON array, causing the filter to evaluate to false while the underlying plugin deserializes and executes the payload. This disclosure was not corroborated by Microsoft or other Tier 1 sources at the time of this research note.

CVE and Advisories

CVE-2026-26030 (GHSA-xjw9-4gw8-4rqx): Python SDK eval() RCE via InMemoryVectorStore filter. CVSS 10.0. Fixed in semantic-kernel 1.39.4.
CVE-2026-25592 (GHSA-2ww3-72rp-wpp4): .NET SDK path traversal via SessionsPythonPlugin. CVSS 10.0. Fixed in Microsoft.SemanticKernel.Core 1.71.0.

Both CVEs were assigned by GitHub_M (Microsoft) and carry the CVSS vector AV:N/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H.

MITRE ATT&CK Mapping

Technique ID	ATT&CK name	How it appeared
T1190	Exploit Public-Facing Application	The agent's natural-language input endpoint is the public-facing surface exploited via prompt injection.
T1059.006	Command and Scripting Interpreter: Python	CVE-2026-26030 achieves code execution through Python's eval() function reached via MRO traversal.
T1105	Ingress Tool Transfer	CVE-2026-25592 uses DownloadFileAsync to transfer a malicious payload from the sandbox to the host filesystem.
T1547.001	Boot or Logon Autostart Execution: Registry Run Keys / Startup Folder	CVE-2026-25592 achieves persistence by writing the payload to the Windows Startup folder.
T1134	Access Token Manipulation	Post-exploitation pivot: the agent process's credentials and network access become available to the attacker.

Indicators of Compromise

Traditional network and endpoint indicators are limited for this attack class because exploitation flows through the agent's normal tool-calling telemetry. Microsoft advises defenders to define the vulnerable window per deployment (first deployment of a vulnerable version to patch application date) and hunt within that window for:

Anomalous child processes spawned by the agent host process (e.g., cmd.exe, powershell.exe, calc.exe as a proof-of-concept indicator)
Log entries showing DownloadFileAsync or UploadFileAsync calls containing path traversal sequences (../, ..\, or absolute paths outside the intended directory)
Unexpected file creation in system directories, startup folders, or AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\
eval() execution traces in Python agent logs with city or filter parameters containing __subclasses__, __class__, or load_module strings

Detection requires correlating AI model telemetry (what the model was asked to do) with host execution telemetry (what the process actually did). Neither signal alone is sufficient to identify exploitation.

Attribution

Both CVE-2026-26030 and CVE-2026-25592 were discovered and disclosed by Microsoft Security Research as part of a stated research series on AI agent framework security. The May 7, 2026 blog post notes a follow-on post is planned covering structurally similar vulnerabilities in third-party agent frameworks. No external threat actor exploitation has been reported. Independent researcher Jeff Ponte (Nuka-AI / JDP-Security) separately identified alleged bypass vectors against the CVE-2026-25592 patch; that disclosure remains unconfirmed by Microsoft or Tier 1 sources.

Primary Sources

01.
When prompts become shells: RCE vulnerabilities in AI agent frameworks
Microsoft Security Blog · May 07, 2026
02.
CVE-2026-25592 Detail (GHSA-2ww3-72rp-wpp4)
ThreatInt CVE Database (sourcing GitHub_M advisory) · February 06, 2026
03.
CVE-2026-26030 Detail (GHSA-xjw9-4gw8-4rqx)
ThreatInt CVE Database (sourcing GitHub_M advisory) · February 19, 2026
04.
CVE-2026-25592: Semantic Kernel Path Traversal Flaw
SentinelOne Vulnerability Database · February 06, 2026
05.
Semantic Kernel Prompt Injection Bugs Let Attackers Run Code or Write Files
Windows Forum · May 07, 2026
06.
AIAgentCTF CVE-2026-26030 Challenge Repository
GitHub (amiteliahu, Microsoft Security Research) · May 07, 2026
07.
Microsoft's Semantic Kernel: The Cracked Kernel (Nuka-AI Research Series, Disclosure 1)
Nuka-AI (Jeff Ponte, JDP-Security) · April 25, 2026