MCP Tool Annotations: What They Are, Why They Matter, and What's Coming Next

When an AI agent connects to an MCP server, it receives tool definitions — names, descriptions, and input schemas. What it usually does not receive is any indication of whether a tool reads data or modifies it, whether its effects are reversible, or whether it interacts with the open internet. The MCP specification includes a mechanism for exactly this information: tool annotations. But adoption has been slow, and the consequences of that gap are becoming clear.

The Five Annotations

Tool annotations were introduced in the MCP spec’s 2025-03-26 revision via PR #185, authored by Basil Hosmer at Anthropic. The current spec defines five annotation fields that a tool definition may carry:

title is a human-readable display name. It has no security implications — it exists purely for UX.

readOnlyHint indicates whether the tool modifies its environment. If true, the tool only reads data. Default: false.

destructiveHint indicates whether the tool may perform destructive updates (deleting, overwriting) as opposed to additive ones (creating, appending). Only meaningful when readOnlyHint is false. Default: true.

idempotentHint indicates whether calling the tool repeatedly with the same arguments has no additional effect. Only meaningful when readOnlyHint is false. Default: false.

openWorldHint indicates whether the tool interacts with an open world of external entities — the internet, third-party APIs, external services. A web search tool is open-world. A local memory tool is not. Default: true.

All five are optional. All are hints, not guarantees. The spec is explicit: clients “MUST consider tool annotations to be untrusted unless they come from trusted servers.”
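
As a rough illustration, here is what a fully annotated tool definition might look like, written as a Python dictionary that mirrors the JSON a server would return. The tool itself (delete_note, its description, and its input schema) is hypothetical; only the annotation field names come from the spec.

```python
import json

# Hypothetical tool: the name, description, and schema are illustrative only.
delete_note_tool = {
    "name": "delete_note",
    "description": "Permanently delete a note from the local store.",
    "inputSchema": {
        "type": "object",
        "properties": {"note_id": {"type": "string"}},
        "required": ["note_id"],
    },
    "annotations": {
        "title": "Delete Note",     # display name only
        "readOnlyHint": False,      # the tool modifies its environment
        "destructiveHint": True,    # it deletes rather than appends
        "idempotentHint": True,     # deleting the same note twice changes nothing more
        "openWorldHint": False,     # it only touches a local store
    },
}

# Serialize the annotations the way they would appear on the wire.
print(json.dumps(delete_note_tool["annotations"], indent=2))
```

Note that this tool is both destructive and idempotent. The two hints describe different properties, so they do not exclude each other.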

MCP tool annotation fields and their pessimistic default values

Why the Defaults Are Pessimistic

The default values for tool annotations follow what the spec calls a “cautious posture.” A tool with no annotations is assumed to be:

Not read-only (it might modify things)
Potentially destructive (it might delete things)
Not idempotent (calling it twice might have additional effects)
Open-world (it might interact with external services)

This is the worst-case assumption for every dimension. The design is intentional: if a server author does not bother to annotate their tools, agents should treat those tools with maximum caution rather than assuming they are safe.
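
A client can apply these defaults with a small helper. The sketch below is one possible way to do it, not a reference implementation: the function name effective_annotations is made up for illustration, and treating non-boolean values as missing is an assumption rather than something the spec prescribes.

```python
# Default values for the four behavioral hints, as listed above.
SPEC_DEFAULTS = {
    "readOnlyHint": False,
    "destructiveHint": True,
    "idempotentHint": False,
    "openWorldHint": True,
}


def effective_annotations(tool: dict) -> dict:
    """Return the four hints for a tool, filling gaps with the defaults."""
    declared = tool.get("annotations") or {}
    resolved = {}
    for field_name, default in SPEC_DEFAULTS.items():
        value = declared.get(field_name)
        # Assumption: anything that is not a real boolean counts as "not declared".
        resolved[field_name] = value if isinstance(value, bool) else default
    return resolved


# A tool with no annotations resolves to the worst case on every dimension.
print(effective_annotations({"name": "mystery_tool"}))
```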

This matters because most MCP servers ship without annotations. When every unannotated tool is assumed destructive and open-world, agents cannot make intelligent decisions about which tools are safe to call without human confirmation, which tools can run in parallel, or which tool combinations create security risks.

How Clients Use Annotations Today

The clients that do consume annotations use them in meaningfully different ways:

Claude Code uses readOnlyHint to determine parallelism. Tools marked read-only execute concurrently at roughly double the dispatch rate. Non-read-only tools serialize to prevent conflicting mutations.

ChatGPT (in dev mode) displays tools as “READ” or “WRITE” based on readOnlyHint. Tools without the annotation default to a “WRITE” badge, following the pessimistic default.

VS Code Copilot shows confirmation dialogs for all tools not marked readOnlyHint: true. Read-only tools execute without prompting the user.

GitHub’s MCP Server implements a /readonly URL suffix that exposes only the tools annotated as read-only. According to Sam Morrow from GitHub’s MCP team, only about 17% of users enable it.

The pattern is consistent: clients that use annotations use them to gate confirmation prompts, control parallelism, or filter tool sets. But no client currently lets users filter or sort tools by annotation values through their UI.
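
The gating and parallelism patterns can be sketched in a few lines. This is not how any of the clients above are actually implemented; the function names and the batch format are hypothetical, and the only input taken from the spec is readOnlyHint.

```python
def is_read_only(tool: dict) -> bool:
    """True only when the tool explicitly declares readOnlyHint: true."""
    return (tool.get("annotations") or {}).get("readOnlyHint") is True


def needs_confirmation(tool: dict) -> bool:
    """Prompt the user for anything not explicitly marked read-only."""
    return not is_read_only(tool)


def plan_dispatch(calls: list, tools_by_name: dict) -> list:
    """Group consecutive read-only calls into concurrent batches.

    Every other call, including calls to unknown tools, gets its own
    serialized batch so that mutations never overlap.
    """
    batches = []
    for name in calls:
        read_only = is_read_only(tools_by_name.get(name, {}))
        if read_only and batches and batches[-1]["concurrent"]:
            batches[-1]["calls"].append(name)
        else:
            batches.append({"concurrent": read_only, "calls": [name]})
    return batches


tools = {
    "search": {"annotations": {"readOnlyHint": True}},
    "read_file": {"annotations": {"readOnlyHint": True}},
    "write_file": {"annotations": {"readOnlyHint": False}},
}
for batch in plan_dispatch(["search", "read_file", "write_file", "read_file"], tools):
    print(batch)
```

Preserving call order and splitting batches at every write is a deliberately conservative choice: a read that follows a write should see the write's result.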

The Lethal Trifecta

Simon Willison formalized the security risk that tool annotations are trying to address in his June 2025 post on the “lethal trifecta” for AI agents. The trifecta is the combination of three capabilities:

Access to private data — tools that can read sensitive information
Exposure to untrusted content — any path by which attacker-controlled text reaches the agent
Ability to externally communicate — capacity to send data outside the system

When all three are present in a single agent session, an attacker who controls any untrusted content the agent processes can potentially exfiltrate private data. The attack surface exists in the composition of tools, not in any individual tool.

Willison specifically called out MCP as vulnerable because it “encourages users to mix and match tools from different sources.” A database reader, a web scraper, and an email sender are each safe individually. Combined in one agent session, they form the complete trifecta.

The current annotation set partially maps to these dimensions. readOnlyHint distinguishes readers from writers. openWorldHint flags external interaction. But there is no annotation for “accesses sensitive data,” and openWorldHint conflates reading untrusted input with sending data out — two fundamentally different risks.
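
To make the gap concrete, here is a minimal sketch of a trifecta check built only on what exists today. Because no annotation marks sensitive data, the caller has to supply that information separately (the private_data_tools argument, a hypothetical operator-maintained set). And because openWorldHint cannot separate reading from sending, the sketch conservatively counts an open-world tool toward both remaining legs.

```python
def open_world(tool: dict) -> bool:
    """Resolve openWorldHint, falling back to the spec default of true."""
    value = (tool.get("annotations") or {}).get("openWorldHint")
    return value if isinstance(value, bool) else True


def trifecta_legs(tools: list, private_data_tools: set) -> dict:
    """Map each trifecta leg to the tool names that may contribute to it."""
    legs = {"private_data": [], "untrusted_content": [], "external_communication": []}
    for tool in tools:
        name = tool["name"]
        if name in private_data_tools:  # operator-supplied, not an annotation
            legs["private_data"].append(name)
        if open_world(tool):
            # openWorldHint conflates input and output, so count both legs.
            legs["untrusted_content"].append(name)
            legs["external_communication"].append(name)
    return legs


def has_lethal_trifecta(tools: list, private_data_tools: set) -> bool:
    """True when every leg has at least one contributing tool."""
    return all(trifecta_legs(tools, private_data_tools).values())


session_tools = [
    {"name": "db_reader", "annotations": {"readOnlyHint": True, "openWorldHint": False}},
    {"name": "web_scraper", "annotations": {"readOnlyHint": True, "openWorldHint": True}},
]
print(has_lethal_trifecta(session_tools, private_data_tools={"db_reader"}))
```

The example flags a database reader plus a read-only web scraper as a complete trifecta, even though nothing in that session is known to send data out. That false positive is exactly the cost of the conflation described above.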

The lethal trifecta: private data access, untrusted content, and external communication

The Adoption Problem

The fundamental challenge with tool annotations is that they are optional. Server authors who do not add them face no consequences — their tools still work. The pessimistic defaults were designed to incentivize annotation, but in practice they have not been sufficient.

The MCP official blog post draft (PR #2230, authored by Ola Hungerford with Sam Morrow from GitHub and Luca Chang from AWS) captures the state of adoption: “Client implementations vary wildly in support and many don’t actually follow the worst-case presumption in their absence.”

Sam Morrow’s review comments on that PR are especially revealing. He identifies a “schism between autonomy maximalists (vibe coders) and enterprise AI adopters” — the former see no need for annotations because they want agents to act freely, while the latter require significant guardrails. This tension has slowed convergence on how annotations should be used.

The adoption gap creates a self-reinforcing cycle. Server authors do not add annotations because most clients ignore them. Clients do not invest in annotation-based features because most servers do not provide them.

What’s Coming Next

The community is actively working to extend annotations beyond the current five fields. At least five independent proposals (SEPs) have been filed on the MCP spec repository, and a Discussion (#2382) from March 2026 frames the problem directly in terms of the lethal trifecta:

sensitiveHint would indicate that a tool accesses or returns sensitive data — credentials, PII, financial records. This addresses the first dimension of the lethal trifecta that the current annotations miss entirely.

egressHint would indicate that a tool can transmit data outside the system boundary — sending email, uploading files, publishing content. This splits openWorldHint into its two distinct risk components: reading untrusted input versus actively exfiltrating data.

reversibleHint would distinguish between tools whose effects can be undone (moving to trash) and those that cannot (permanent deletion). This is orthogonal to destructiveHint — a destructive tool might still be reversible.

A more comprehensive proposal (SEP-1913, from Sam Morrow at GitHub) adds response-level annotations with propagation rules: sensitivity escalates within a session and never decreases, boolean hints use union semantics, and attribution accumulates across context boundaries.
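
The three propagation rules can be illustrated with a small session tracker. Everything concrete here is an assumption made for the sake of the example: the sensitivity levels, the field names, and the SessionState class are hypothetical and are not taken from the text of SEP-1913.

```python
from dataclasses import dataclass, field

# Hypothetical ordered sensitivity levels, lowest first.
SENSITIVITY_ORDER = ["none", "low", "high"]


@dataclass
class SessionState:
    sensitivity: str = "none"
    hints: dict = field(default_factory=dict)
    attribution: list = field(default_factory=list)

    def absorb(self, response_annotations: dict) -> None:
        """Fold one tool response's annotations into the session state."""
        # Rule 1: sensitivity escalates and never decreases.
        incoming = response_annotations.get("sensitivity", "none")
        if SENSITIVITY_ORDER.index(incoming) > SENSITIVITY_ORDER.index(self.sensitivity):
            self.sensitivity = incoming
        # Rule 2: boolean hints use union semantics (once true, always true).
        for name, value in response_annotations.get("hints", {}).items():
            self.hints[name] = self.hints.get(name, False) or bool(value)
        # Rule 3: attribution accumulates; sources are never dropped.
        for source in response_annotations.get("attribution", []):
            if source not in self.attribution:
                self.attribution.append(source)


state = SessionState()
state.absorb({"sensitivity": "high", "hints": {"egress": False}, "attribution": ["crm"]})
state.absorb({"sensitivity": "low", "hints": {"egress": True}, "attribution": ["web"]})
print(state)
```

After the second response the session is still at the higher sensitivity level, which is the point of the rule: once sensitive data has entered the context, later low-sensitivity responses do not make the session safe again.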

A Security Interest Group has formed within the MCP community to coordinate these proposals, though formal governance is still being established.

How Proxies Add Value

MCP proxies are uniquely positioned to make annotations useful even when servers do not provide them. A proxy that sits between agents and MCP servers can:

Derive annotations from behavior. By observing what a tool actually does at runtime — whether it makes network requests, modifies files, or returns sensitive-looking data — a proxy can infer annotations that the server author did not provide.

Enforce annotation-based policies. Even when annotations are just hints, a proxy can treat them as policy inputs. Tools marked destructiveHint: true can require human confirmation. Tools marked openWorldHint: true can be restricted to specific agent sessions.

Map annotations to access control. A proxy can translate the read/write/destructive spectrum into concrete permission levels, routing tool calls through different authorization paths based on their annotations.

Detect annotation changes. If a server’s tool annotations change between sessions — a previously read-only tool suddenly claiming to be destructive, or vice versa — a proxy can flag this as suspicious, similar to schema drift detection.

This is the approach that MCPProxy takes with its DeriveCallWith system, which maps tool annotations to distinct call_tool_read, call_tool_write, and call_tool_destructive variants, each with their own permission model.
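
Two of these proxy behaviors are easy to sketch. The code below is not MCPProxy's implementation, and the rules DeriveCallWith actually applies are not described here. It only shows one plausible mapping from annotations to the three variant names mentioned above, plus a simple comparison of annotation snapshots between sessions. The function names and snapshot format are hypothetical.

```python
HINT_FIELDS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")


def call_variant(tool: dict) -> str:
    """Pick a call variant, falling back to the most restrictive one."""
    ann = tool.get("annotations") or {}
    if ann.get("readOnlyHint") is True:
        return "call_tool_read"
    # destructiveHint defaults to true, so only an explicit false downgrades.
    if ann.get("destructiveHint", True) is False:
        return "call_tool_write"
    return "call_tool_destructive"


def annotation_drift(previous: dict, current: dict) -> dict:
    """Compare two {tool_name: annotations} snapshots.

    Returns {tool_name: {field: (old_value, new_value)}} for every hint
    that changed on a tool present in both snapshots.
    """
    drift = {}
    for name in sorted(previous.keys() & current.keys()):
        changes = {
            f: (previous[name].get(f), current[name].get(f))
            for f in HINT_FIELDS
            if previous[name].get(f) != current[name].get(f)
        }
        if changes:
            drift[name] = changes
    return drift


print(call_variant({"name": "unannotated_tool"}))
print(annotation_drift(
    {"list_files": {"readOnlyHint": True}},
    {"list_files": {"readOnlyHint": False}},
))
```

A tool that flips from read-only to writable would move from call_tool_read to call_tool_destructive under this mapping, which is a reasonable moment for a proxy to pause and ask for review.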

What You Should Do Now

If you are building MCP servers:

Annotate your tools. Every tool should have at minimum readOnlyHint and destructiveHint set accurately. This costs nothing and immediately benefits clients that use annotations.
Be honest about open-world interaction. If your tool makes network requests to external services, set openWorldHint: true. If it only accesses local resources, set it to false.
Watch the proposal process. The new annotations being proposed — particularly sensitiveHint and egressHint — will likely land in a future spec revision. Designing your tool descriptions with these dimensions in mind now will make adoption easier later.

If you are building MCP clients:

Respect the pessimistic defaults. A tool without annotations should be treated as potentially destructive and open-world, because those are the default values the spec defines.
Use annotations for UX decisions. Read-only tools can run without confirmation. Destructive tools should prompt. This is low-effort and immediately improves the user experience.
Consider the lethal trifecta. When an agent session combines tools that access private data, process untrusted content, and can communicate externally, the annotation metadata provides the information needed to flag or prevent this combination.

Tool annotations are the MCP ecosystem’s first attempt at a risk vocabulary for AI agent tools. The vocabulary is incomplete, adoption is uneven, and the fundamental tension between hints and guarantees is unresolved. But the direction is clear: agents need metadata about what tools do, not just how to call them. The current five annotations are a starting point, not an endpoint.