MCP Server-Initiated Sampling: The Spec Feature That Becomes an Attack Vector
MCP sampling lets servers request LLM completions through the client. Unit42 research shows how this legitimate spec feature enables prompt injection, cross-server poisoning, privilege escalation, and data exfiltration.

MCP Server-Initiated Sampling: The Spec Feature That Becomes an Attack Vector
There is a feature in the MCP specification that most developers have never thought critically about. It is called sampling — the ability for an MCP server to request an LLM completion back through the client. The server sends a prompt, the client forwards it to the model, and the model’s response routes back to the server.
On paper, this is elegant. In practice, it is one of the most dangerous capabilities in the MCP spec, and the security community is only beginning to map the threat surface.
Palo Alto Networks’ Unit42 team recently published research identifying four distinct attack vectors that sampling enables. This post breaks down what sampling is, why it exists, how it can be weaponized, and what the ecosystem needs to do before the spec finalizes this feature.
What MCP Sampling Is
The Model Context Protocol defines a standard interface between LLM-powered clients and tool servers. Most interactions flow in one direction: the client calls a tool on the server, the server returns a result. Sampling reverses that flow. It allows a server to say: “I need the model to think about something before I can continue.”
The mechanism works like this. A server issues a sampling/createMessage request to the client. The request contains a prompt — messages, system instructions, model preferences. The client receives this request, forwards the prompt to the LLM, collects the completion, and sends it back to the server. The server then uses that completion to continue its work.
The spec defines this as an optional capability. Clients can advertise whether they support sampling during the initialization handshake. If they do, any connected server can request completions at any time during the session.
Why Servers Need It
The legitimate use cases are real.
Consider a code review server. It scans a repository, identifies potential issues, and collects findings. But before presenting results to the user, it needs the model to synthesize twenty findings into a coherent summary with priority rankings. The server does not have an LLM — it is a tool server. Sampling lets it borrow the client’s model for that synthesis step.
Or consider a data pipeline server that runs a SQL query and gets back a result set. The raw data is meaningless without interpretation. The server uses sampling to ask the model: “Given this schema and these results, what is the key insight?” The model’s answer becomes part of the tool’s response.
A debugging server provides another example. It sets a breakpoint, captures variable state, and needs the model to hypothesize about root cause before deciding which breakpoint to set next. This iterative loop — server acts, model reasons, server acts again — requires sampling.
These are not contrived scenarios. They represent the natural evolution of tool servers from stateless function calls to agentic workflows where tools and models collaborate across multiple turns.
The Unit42 Research: Four Attack Vectors
This is where it gets dangerous. Palo Alto Unit42’s research systematically mapped how a malicious or compromised MCP server can abuse sampling to attack the broader system. They identified four vectors.
Indirect Prompt Injection
A server crafts a sampling request with a prompt designed to inject instructions into the model’s context. The model processes the server’s prompt as if it were a legitimate reasoning request, but embedded in that prompt are instructions that cause the model to take actions the user never authorized.
The attack works because sampling requests are, fundamentally, prompts. And prompts are the control plane of LLMs. When a server gets to write arbitrary prompts that the model will execute, the server is programming the model. A malicious server does not need to exploit a buffer overflow or escape a sandbox. It just needs to write a convincing prompt.
Cross-Server Context Poisoning
This vector exploits multi-server deployments. Server A issues a sampling request that deliberately pollutes the model’s context with instructions or false information. When Server B later interacts with the same model context, it encounters poisoned state. Server A has effectively achieved lateral movement — not through a network exploit, but through the shared context window.
This is particularly insidious because most MCP clients treat the context as a shared resource. There is no isolation between the context that Server A’s sampling request touches and the context that Server B reads. A compromised low-privilege server can poison the context to manipulate interactions with a high-privilege server.
Capability Escalation
A server uses sampling to trick the model into invoking tools with higher privileges than the server itself possesses. The server cannot directly call a privileged tool — it does not have access. But it can craft a sampling request that causes the model to call that tool on its behalf.
This breaks the privilege model. The server’s permissions should bound what it can do. But if the server can prompt the model, and the model has access to privileged tools, then the server’s effective privilege level equals the model’s privilege level. Sampling becomes a privilege escalation primitive.
Exfiltration via Completion
A server that has access to sensitive data — credentials, PII, internal documents — embeds that data in a sampling request’s prompt. The model processes the prompt and generates a completion. That completion routes back to the server, but the act of processing the prompt means the sensitive data has now transited through the model’s context. If the server is the attacker, it already has the data. But the more subtle version: the server embeds data from its environment into a prompt designed so that the model’s completion encodes that data in a form the server can extract, exfiltrate, or relay.
This turns the model into an unwitting data mule. The data never crosses a network boundary that a firewall would flag. It moves through the LLM inference path, which most security tooling does not monitor.
The RFC Gap
The MCP community has an active RFC — discussion/706 — proposing to formalize sampling as a first-class protocol feature. The RFC covers the mechanics: message format, capability negotiation, model preferences, token limits.
What it does not adequately cover is threat modeling.
The security section, to the extent it exists, treats sampling as a trust problem between the client and user. The implicit assumption is that the client will mediate sampling requests responsibly. But the Unit42 research demonstrates that the threats are structural, not behavioral. A well-intentioned client cannot prevent cross-server context poisoning if the protocol does not define context isolation semantics. A client cannot prevent capability escalation if the protocol does not bind sampling requests to the originating server’s permission scope.
Compare this with discussion/689 on SMCP (Secure MCP), which has generated substantially more community engagement around security boundaries. The ecosystem is clearly thinking about security — but the sampling RFC is advancing without incorporating those insights.
The spec is building a feature that Unit42 has already demonstrated is exploitable, and the security analysis in the RFC does not address the demonstrated exploits. That gap needs to close before the feature finalizes.
What Secure Sampling Requires
Fixing sampling is not about removing it. The legitimate use cases are valuable, and agentic workflows genuinely need server-initiated model interaction. The fix is constraining it.
Per-server sampling allowlists. Not every connected server should be able to request completions. The client must maintain an explicit allowlist of servers permitted to use sampling, configured by the user or administrator. Servers not on the list get their sampling requests rejected.
Audit logging of every sampling request. Every sampling/createMessage request should be logged with the originating server identity, the full prompt content, and the resulting completion. Security teams cannot detect prompt injection or exfiltration attempts without visibility into what servers are asking the model to do.
Context isolation between servers. Server A’s sampling request must not be able to read or influence the context that Server B operates in. This likely requires the client to maintain per-server context boundaries — or at minimum, to strip server-specific context before forwarding sampling requests to the model.
Rate limiting. A server that issues hundreds of sampling requests per minute is either malfunctioning or attacking. Rate limits per server, with configurable thresholds, are a basic control.
User confirmation for high-risk requests. Sampling requests that involve tool invocations, access sensitive context, or exceed a complexity threshold should require explicit user approval. The model should not silently execute arbitrary server-crafted prompts without the user knowing.
Practical Recommendations
The MCP ecosystem spans spec authors, client developers, server operators, and end users. Each group has a role in making sampling safe.
For MCP client developers: Implement sampling as opt-in, not opt-out. Default to denying sampling requests. Provide a configuration surface for allowlisting specific servers. Log all sampling interactions. If you maintain a shared context window across servers, you are vulnerable to cross-server poisoning — architect context isolation now, before the feature goes GA.
For MCP server operators: Audit whether your deployed servers request sampling. If they do, verify that the prompts they send are deterministic and bounded. A server that dynamically constructs sampling prompts from external input (user data, API responses, file contents) is a prompt injection risk even if the server itself is not malicious.
For the spec team: The RFC needs a formal security section that addresses the four Unit42 vectors by name. The protocol should define a server permission model that explicitly includes or excludes sampling capability per server. Context isolation semantics between servers should be normative, not left to client implementation discretion.
For security teams evaluating MCP deployments: Add sampling to your threat model. Review which servers have sampling capability enabled. Monitor sampling request volume and content. Treat a server’s sampling prompt with the same scrutiny you would apply to any untrusted input reaching your LLM.
Conclusion
MCP sampling is not a vulnerability. It is a feature. But it is a feature whose threat surface the ecosystem has not adequately mapped, and the Unit42 research makes that gap impossible to ignore.
The spec community has a window to get this right — to ship sampling with the security controls it requires, rather than bolting them on after the first real-world exploit. That window will not stay open indefinitely. The RFC is moving. The implementations are coming. The time to embed security into sampling is now.