/blog
securityMCPOWASPtool-poisoningagentic-ai

The OWASP MCP Top 10: A Security Framework for the AI Agent Era

The OWASP MCP Top 10 maps the most critical security risks in AI agent tool integration — from tool poisoning to context poisoning. Here is what practitioners need to know.

Algis Dumbris
The OWASP MCP Top 10: A Security Framework for the AI Agent Era

The Model Context Protocol needed its own threat taxonomy. Now it has one.

OWASP — the organization behind the Web Application Top 10 that shaped a generation of security engineering — has published the MCP Top 10, a structured framework for the most critical security risks in AI agent tool integration. The project, led by Vandana Verma Sehgal, is currently in beta under a CC BY-NC-SA 4.0 license, and it addresses a gap that has been widening for months: the absence of a shared vocabulary for reasoning about MCP security.

This is not a theoretical exercise. Over 30 CVEs have been filed against MCP implementations in the past 60 days. Research consistently shows that tool poisoning attacks succeed at alarming rates — 84.2% with auto-approval enabled, according to recent benchmarks. An audit of 17 popular MCP servers found an average security score of 34 out of 100, with 100% lacking permission declarations. The threat landscape has outpaced the defensive toolkit, and OWASP’s framework is an attempt to bring structure to the response.

Here is what each category covers, why it matters, and what practitioners should do about it.

The Ten Categories

OWASP MCP Top 10 risk categories — from token mismanagement to context injection

MCP01: Token Mismanagement and Secret Exposure

The first category addresses what is arguably the most common security failure in MCP deployments: credentials that end up where they should not be. Hard-coded API keys in MCP server configurations, long-lived tokens without rotation policies, and secrets persisted in model memory or protocol debug logs all create pathways for unauthorized access.

The risk is amplified in MCP because agents often need credentials to interact with external services on behalf of users. When those credentials leak — through prompt injection, debug traces, or simply poor configuration management — the blast radius extends to every system the agent can reach. A compromised MCP session with a valid AWS key is not just an MCP problem; it is an infrastructure problem.

What to do: Implement short-lived, scoped credentials. Never store secrets in tool descriptions or model context. Deploy secret-scanning across your MCP server configurations. Use environment-variable injection at runtime rather than baking credentials into tool definitions.

MCP02: Privilege Escalation via Scope Creep

MCP servers define what tools an agent can access, but those permissions often start broad and stay that way. Worse, permissions that were appropriate during initial setup tend to expand over time as teams add capabilities without revisiting access controls.

The result is agents with permissions far exceeding what any single task requires. A file-system MCP server that was configured to read project files might gradually gain write access, then execute permissions, then access to parent directories. Each expansion is individually justifiable; the cumulative effect is an agent that can modify your entire filesystem.

What to do: Enforce least-privilege by default. Implement automated scope expiry — permissions that require periodic re-authorization. Conduct regular access reviews. Define explicit permission boundaries per tool, not per server.

MCP03: Tool Poisoning

This is the category that has received the most research attention, and for good reason. Tool poisoning exploits a fundamental assumption in MCP’s architecture: that tool descriptions are trustworthy.

When an agent connects to an MCP server, it receives tool definitions including names, descriptions, and input schemas. These descriptions become part of the agent’s context — effectively instructions that the LLM uses to decide how to invoke tools. A malicious description can embed hidden instructions that manipulate agent behavior: exfiltrating sensitive files, ignoring safety constraints, or preferring the malicious tool over legitimate alternatives.

Invariant Labs demonstrated this concretely. Their proof-of-concept showed a poisoned add tool containing hidden <IMPORTANT> tags that directed the model to read ~/.cursor/mcp.json and ~/.ssh/id_rsa, then transmit the contents via hidden parameters — all while the user-facing UI showed only a simple math operation. The agent complied.

The attack has three major variants:

  • Direct poisoning: Hidden instructions in tool descriptions cause data exfiltration or behavior modification. The poisoned tool does not even need to be invoked — merely being loaded into the agent’s context is sufficient for the model to follow its hidden instructions.
  • Tool shadowing: A malicious MCP server injects descriptions that override the behavior of tools from trusted servers. In one experiment, attackers successfully redirected all emails to attacker-controlled addresses, even when users explicitly specified different recipients.
  • Rug pulls: A server passes initial review with clean tool definitions, then silently modifies them on subsequent connections. Since most clients approve tools once and never re-verify, the window for exploitation is indefinite.

Research from arXiv:2602.11327 found that tool spoofing achieves a 100% success rate in first-match resolution mode, where the agent selects the first tool that matches a query regardless of source server.

What to do: Implement tool pinning (hash-based verification of tool descriptions). Never auto-approve tool invocations in production. Use schema quarantine to analyze new tool definitions before exposing them to agents. Monitor for description changes across sessions.

MCP04: Supply Chain Attacks and Dependency Tampering

MCP ecosystems depend on open-source packages, connectors, and plugins distributed through registries like npm, PyPI, and dedicated MCP directories. The classic software supply chain attack vectors — typosquatting, dependency confusion, compromised maintainer accounts — all apply, but the consequences are more severe because the payload executes inside an AI agent with elevated permissions.

A compromised MCP server package does not just run code on a developer’s machine. It runs code that an AI agent trusts implicitly, with whatever permissions that agent has been granted. The registry poisoning incident on mcp.run earlier this year, where a popular database connector was replaced with a modified version that logged all query parameters to an external endpoint, demonstrated the practical impact.

What to do: Pin MCP server versions. Verify package signatures and provenance. Use lock files for MCP server dependencies. Monitor registries for suspicious updates to servers you depend on. Prefer servers with established track records and active maintenance.

MCP05: Command Injection and Execution

When an AI agent constructs and executes system commands, shell scripts, API calls, or code snippets using untrusted input — whether from user prompts, retrieved context, or third-party data sources — without proper validation, command injection becomes possible.

This is the MCP equivalent of SQL injection, but the attack surface is broader. An agent that processes user input through a terminal MCP server might construct a shell command that includes unsanitized user data. An agent that generates API calls from retrieved context might include attacker-controlled parameters. The Clinejection attack chain demonstrated how a malicious GitHub issue title could trigger a code assistant to execute npm install on a poisoned package, exploiting the composition of GitHub and terminal MCP servers.

What to do: Validate and sanitize all input before command construction. Use allowlists for permitted operations. Implement sandboxed execution environments. Apply strict input validation at every trust boundary, especially between MCP servers.

MCP06: Intent Flow Subversion

MCP enables agents to retrieve complex context that functions as a secondary instruction channel. Intent flow subversion occurs when malicious instructions embedded in that context hijack the agent’s decision-making, steering it away from the user’s original intent.

MCP attack flow — how tool poisoning progresses from malicious server to compromised agent behavior

This differs from direct prompt injection in that the attack vector is the tool context itself — data returned by MCP servers that the agent incorporates into its reasoning. A seemingly legitimate API response might contain embedded instructions. A document retrieved through a file-system MCP server might include hidden text that redirects the agent’s behavior. The attack exploits the fundamental tension in agentic AI: agents must be responsive to context to be useful, but that responsiveness makes them susceptible to context-based manipulation.

What to do: Implement clear separation between system instructions and retrieved context. Validate context sources. Use explicit intent markers and chain-of-thought logging to detect when agent behavior deviates from user intent. Monitor for context-based instruction conflicts.

MCP07: Insufficient Authentication and Authorization

This category addresses a structural problem in the MCP ecosystem: a startling proportion of servers simply do not authenticate connections. Research indicates that 38% of over 500 scanned MCP servers lack any form of authentication, meaning any agent that connects to them is trusting an endpoint that anyone on the network could impersonate or modify.

Even among servers that implement authentication, the quality varies widely. Some use static API keys without rotation. Others implement OAuth but skip token validation. Multi-agent environments introduce additional complexity: how do you verify that an agent making a request is authorized to do so on behalf of its user?

The MCP specification’s OAuth 2.1 integration provides a standard mechanism, but adoption remains inconsistent. The gap between what the spec supports and what implementations actually enforce is a persistent vulnerability.

What to do: Only connect to MCP servers that support authentication. Implement mutual authentication between agents and servers. Use OAuth 2.1 as specified in the MCP protocol. Enforce RBAC at the tool level, not just the server level. Regularly audit authentication configurations.

MCP08: Lack of Audit and Telemetry

Without comprehensive logging and real-time alerting, unauthorized actions go undetected. This is a particularly acute problem in MCP environments because agent-to-tool interactions generate complex, multi-step activity chains that are difficult to reconstruct after the fact.

Most MCP clients provide minimal logging — maybe a record of which tools were invoked, but not the full context of why, what parameters were used, or what data was returned. When an incident does occur, teams find themselves unable to determine the scope of compromise or the attack chain that led to it.

What to do: Log all tool invocations with full parameters and responses. Record context changes with timestamps. Implement immutable audit trails. Enable real-time alerting for anomalous patterns — unusual tool invocation sequences, unexpected data volumes, or access to sensitive resources.

MCP09: Shadow MCP Servers

Shadow MCP servers are unauthorized deployments that operate outside an organization’s security governance. Development teams spin up MCP servers for experimentation, then forget about them. These instances typically use default credentials, permissive configurations, and lack monitoring.

The parallel to shadow IT is direct, but the risk is higher. A shadow MCP server connected to an agent has the same trust level as an approved server. If that shadow server is compromised — or simply misconfigured — the agent will follow its instructions with the same compliance it shows to any other tool provider.

What to do: Establish centralized MCP deployment governance. Discover and inventory all MCP instances. Enforce mandatory security baselines. Implement network monitoring to detect unauthorized MCP traffic. Provide secure, approved alternatives for experimentation so teams do not need to go rogue.

MCP10: Context Injection and Over-Sharing

When context windows are shared, persistent, or insufficiently scoped, sensitive information from one task, user, or agent can leak to another. This is the confidentiality counterpart to intent flow subversion: instead of an attacker injecting instructions into context, sensitive data leaks out of it.

The risk materializes in several ways. An agent processing sensitive financial data in one session might retain that context when switching to a different task. A multi-tenant MCP deployment might share context across users. An agent’s working memory might contain credentials, personal data, or proprietary information that gets exposed through tool invocations or logging.

What to do: Implement isolated context windows per user and per task. Enforce context expiration policies. Sanitize context before sharing across agents or sessions. Apply access controls on context retrieval. Audit context contents regularly.

What the Numbers Say

The OWASP framework did not emerge in a vacuum. It codifies a body of research and incident data that has been accumulating over the past year:

  • 30+ CVEs in 60 days filed against MCP implementations, reflecting the pace at which vulnerabilities are being discovered and disclosed.
  • 84.2% success rate for tool poisoning attacks when auto-approval is enabled, according to benchmarking studies. This means that the majority of agents will follow malicious instructions embedded in tool descriptions without questioning them.
  • 38% of 500+ scanned servers lack authentication entirely, leaving the door open to impersonation and man-in-the-middle attacks.
  • 34/100 average security score across 17 popular MCP server audits, with zero servers declaring tool permissions. The baseline is not just low — it is absent.
  • 100% tool spoofing success in first-match resolution mode, where agents select the first matching tool regardless of source server.
  • FastMCP exceeds 1 million daily downloads, indicating the scale at which potentially vulnerable MCP infrastructure is being deployed.

CoSAI (Coalition for Secure AI), an OASIS-affiliated initiative, has independently published 12 threat categories encompassing approximately 40 discrete threats to AI agent systems. Their taxonomy overlaps significantly with OWASP’s, suggesting that the security community is converging on a shared understanding of the threat landscape. GitHub discussion #2402 on the MCP specification repository addresses tool integrity and semantic rug pulls directly, indicating that the protocol maintainers are aware of and actively working on these issues.

Emerging Defense Patterns

Defense in depth layers for MCP security — registry governance, schema quarantine, runtime monitoring, and context isolation

The OWASP framework maps threats, but the ecosystem is also building defenses. Several patterns are maturing:

Schema Quarantine and Tool Pinning

The most direct defense against tool poisoning and rug pulls is to verify tool definitions before they reach the agent. Schema quarantine treats new MCP server connections like untrusted code: tools are analyzed for known attack patterns, tested in isolation, and compared against baselines before being made available.

Tool pinning extends this with cryptographic verification. By hashing tool descriptions (typically SHA-256), clients can detect when a server modifies its tool definitions between sessions. Invariant Labs’ mcp-scan implements this pattern, detecting poisoning attempts, rug pulls, and cross-origin escalations. MCPDome, a Rust-based security gateway, adds schema pinning with SHA-256 hash verification as a gateway layer.

Several proxy-based tools implement schema quarantine as part of their architecture. mcpproxy-go combines BM25-based tool discovery with quarantine capabilities, filtering tool sets before they reach the agent context. Sentrial takes a policy-based approach, allowing teams to define acceptable tool behaviors declaratively.

Runtime Behavioral Monitoring

Static schema analysis catches known patterns, but runtime monitoring is necessary for detecting behavioral drift — tools that pass initial review but later modify their behavior. Tools like Golf Scanner (a Go-based CLI for discovering and auditing MCP server configurations) and Vet scan for runtime anomalies: unexpected network requests, file system access outside declared scope, and data flow patterns inconsistent with declared capabilities.

AgentArmor proposes an 8-layer security framework for AI agents that includes runtime behavioral analysis alongside traditional static checks, acknowledging that no single verification point is sufficient.

Registry Governance

The supply chain problem requires registry-level solutions. Signed packages, provenance tracking, and automated vulnerability scanning at the registry level can catch compromised servers before they reach users. The MCP ecosystem’s registries are still maturing in this regard — most lack the kind of security infrastructure that npm and PyPI have built over years of responding to supply chain attacks.

Context Isolation and Permission Boundaries

Defending against context injection and over-sharing requires architectural decisions: isolated context windows per task, strict permission boundaries per tool, and context expiration policies that prevent stale data from leaking across sessions. These are not bolt-on security features; they need to be built into the agent architecture from the start.

How Practitioners Should Respond

The OWASP MCP Top 10 is a starting point, not a checklist. The framework gives teams a shared vocabulary for discussing MCP security risks and a structure for prioritizing their defenses. Here is a practical response plan:

Immediate (this week):

  1. Inventory your MCP connections. Know exactly which servers your agents connect to, what permissions they have, and whether they authenticate. If you cannot enumerate your MCP servers, you have a shadow server problem (MCP09).

  2. Disable auto-approval. The 84.2% tool poisoning success rate applies to auto-approval configurations. Requiring human confirmation for tool invocations is the single most impactful change you can make today.

  3. Scan your configurations for secrets. Check MCP server configs, environment files, and agent prompts for hard-coded credentials (MCP01). This is low-hanging fruit with high impact.

Short-term (this month):

  1. Implement tool pinning. Use mcp-scan or equivalent tooling to hash your tool definitions and alert on changes. This directly addresses rug pull attacks (MCP03).

  2. Add authentication to all connections. If a server does not support auth, either replace it or put it behind an authenticated proxy (MCP07).

  3. Enable audit logging. At minimum, log every tool invocation with timestamps, parameters, and responses. You cannot investigate what you did not record (MCP08).

Medium-term (this quarter):

  1. Adopt a gateway architecture. Route all MCP traffic through a security gateway that can enforce policies, quarantine new tools, manage context size, and provide centralized monitoring.

  2. Implement context isolation. Ensure that sensitive data from one task or user cannot leak to another through shared context windows (MCP10).

  3. Establish MCP governance. Define organizational policies for which MCP servers are approved, how new servers are evaluated, and who is responsible for ongoing security review.

The Gap That Matters

The OWASP MCP Top 10 arrives at a critical moment. MCP adoption has reached a scale where security failures have real consequences — credential theft, data exfiltration, supply chain compromise — but defensive tooling adoption has not kept pace. The framework provides the taxonomy. The research community has quantified the risks. The tools exist.

What remains is the hardest part of any security initiative: getting practitioners to actually implement defenses before an incident forces their hand. The history of web application security suggests that shared frameworks like the OWASP Top 10 accelerate this adoption by giving teams a common language and a clear set of priorities.

MCP security is following the same trajectory that web security followed 15 years ago: rapid adoption outpacing security, a growing body of research quantifying the risks, and a gradually maturing set of tools and standards. The OWASP MCP Top 10 is one of the clearest signs yet that the ecosystem is moving from awareness to action. The question is whether that movement is fast enough to keep pace with the threat landscape.

The full framework is available at owasp.org/www-project-mcp-top-10 and the source repository at github.com/OWASP/www-project-mcp-top-10.