
Nobody Is Checking: What Three Independent Scans of 14,000+ MCP Servers Reveal

Three independent teams scanned 14,000+ MCP servers in 30 days. All found the same vulnerabilities. All ended with the same recommendation. None of them could enforce it.

Algis Dumbris

In March and April 2026, three independent research teams — with no coordination between them — scanned a combined 14,000+ MCP servers for security vulnerabilities. They used different methodologies, different tooling, and different scoring frameworks. They arrived at the same conclusion: the MCP ecosystem has a systemic admission control problem that scanning alone cannot fix.

This is not a story about one bad package or one clever exploit. It is a story about convergent evidence at scale, and the structural gap it reveals.

The Scan Convergence

Within a 30-day window, three separate studies published their results:

AgentSeal scanned 8,400+ MCP servers from public registries and package repositories. Their automated analysis filed 87 CVEs against servers with confirmed vulnerabilities. The scope was broad: authentication checks, input validation, default configurations, and known vulnerability patterns across the full public server corpus.

Dominion Observatory took a different approach, applying dynamic behavioral scoring to 4,584 servers. Rather than static analysis alone, their methodology included runtime behavior observation — watching what servers actually did when invoked, not just what their schemas declared. Each server received a composite security score based on observed behavior patterns.

Protodex scored all 2,013 servers in their dataset, producing per-server security ratings covering authentication, transport security, input handling, and privilege exposure. Every server was evaluated. No sampling, no selection bias.

Three teams. Three methodologies. At least 14,997 servers scanned in total. The overlap in their findings is what makes this significant.

What Every Scan Found

The vulnerability categories were consistent across all three studies. The specific numbers varied, but the patterns did not.

Missing authentication is endemic. Across all three datasets, a majority of servers exposed tools with no authentication mechanism. No API keys, no OAuth, no token validation. Any client that could reach the server could invoke its tools. This was not limited to hobby projects — servers from established vendors and enterprise-focused packages showed the same pattern.
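The missing control is not exotic. A minimal bearer-token check, sketched here in Python, is roughly what the scans found absent; the names (`EXPECTED_TOKEN`, `is_authorized`, `handle_tool_call`) are illustrative assumptions, not part of any MCP SDK.

```python
import hmac

# Illustrative shared secret; a real deployment would load this from a
# secrets manager, never hard-code it.
EXPECTED_TOKEN = "replace-with-a-real-secret"

def is_authorized(headers: dict) -> bool:
    """Reject tool invocations that lack a valid bearer token."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    token = auth[len("Bearer "):]
    # Constant-time comparison to avoid timing side channels.
    return hmac.compare_digest(token, EXPECTED_TOKEN)

def handle_tool_call(headers: dict, tool: str, args: dict):
    """Gate every tool dispatch behind the auth check."""
    if not is_authorized(headers):
        raise PermissionError("missing or invalid token")
    ...  # dispatch to the actual tool implementation
```

A dozen lines like these are the difference between "any client that can reach the server can invoke its tools" and a server that at least requires a credential.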

Input validation is rare. SSRF, path traversal, and command injection vulnerabilities appeared at scale in all three scans. Servers accepted arbitrary user-controlled input and passed it directly to system calls, file operations, or network requests without sanitization. AgentSeal’s 87 CVE filings were dominated by these categories. Protodex’s per-server scores reflected the same distribution. Dominion Observatory’s behavioral scoring flagged servers that accepted and executed payloads that no production system should allow.
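For concreteness, the path traversal class the scans flagged is preventable with a short guard like the following Python sketch; `BASE_DIR` and `safe_resolve` are hypothetical names, not code from any scanned server.

```python
from pathlib import Path

# Illustrative sandbox root for file-serving tools.
BASE_DIR = Path("/srv/mcp-data").resolve()

def safe_resolve(user_path: str) -> Path:
    """Resolve a user-supplied path and refuse anything escaping BASE_DIR."""
    candidate = (BASE_DIR / user_path).resolve()
    # is_relative_to requires Python 3.9+.
    if not candidate.is_relative_to(BASE_DIR):
        raise ValueError(f"path traversal rejected: {user_path!r}")
    return candidate
```

The servers in these scans were passing `user_path` straight to file operations; the scan findings suggest even this minimal resolve-then-check step was absent at scale.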

Dangerous defaults are the norm. Servers binding to 0.0.0.0 instead of localhost. No TLS in transit. Elevated privilege requirements with no justification. These are not edge cases — they represent the default configuration for a significant portion of the scanned corpus.
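Each of these defaults is mechanically checkable before the server ever starts. A hedged sketch of such a config lint, with hypothetical field names (`bind`, `tls`, `run_as_root`) rather than any real MCP server's config schema:

```python
def audit_config(cfg: dict) -> list[str]:
    """Flag the dangerous defaults the scans found endemic."""
    findings = []
    # Note the default mirrors the bad real-world default: all interfaces.
    if cfg.get("bind", "0.0.0.0") == "0.0.0.0":
        findings.append("binds to all interfaces; prefer 127.0.0.1")
    if not cfg.get("tls", False):
        findings.append("TLS disabled; traffic is unencrypted in transit")
    if cfg.get("run_as_root", False):
        findings.append("elevated privileges; drop to an unprivileged user")
    return findings
```

The point of the sketch is that an empty config should fail such a lint, which is the inverse of how the scanned corpus behaves today.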

The vulnerability distribution is uniform across vendor tiers. This is perhaps the most striking finding. Community-maintained packages and enterprise-published packages fail at the same rates. The assumption that “serious” vendors ship more secure servers is not supported by the data. The problem is structural, not a matter of individual developer competence.

What Scanners Tell You vs. What They Do Not

Pre-deploy vulnerability scanning is valuable. Finding and reporting 87 CVEs before those servers are deployed in production environments is unambiguously good work. Scoring every server in a registry gives operators data they did not have before. Dynamic behavioral analysis catches things static analysis misses.

But scanning has inherent limits that the ecosystem needs to be honest about.

Scanners detect known vulnerability patterns. They find SSRF because SSRF has a signature. They find path traversal because path traversal has a signature. They find missing authentication because authentication presence is a checkable property. This is useful and necessary work.

Scanners cannot detect what does not yet have a signature. Zero-day vulnerabilities, by definition, are not in the scanner’s pattern database. A server that is clean today can become compromised tomorrow through a supply chain swap — a dependency update that introduces malicious behavior without changing the server’s own code. Behavioral changes after a patch, a new CVE class that nobody has written detection rules for yet — these are outside the scanner’s reach.

The practical consequence is that “scan it first, run it anyway” is not a complete security posture. A scan gives you a point-in-time assessment against known patterns. It does not give you ongoing runtime assurance, and it does not protect against novel attack vectors.

This is not a criticism of scanning tools. It is a statement about the limits of any pre-deploy check operating in isolation.

The Activation Gap

Every scan report published in this 30-day window ended with some version of the same recommendation: check servers before installing them. Review tool schemas. Validate configurations. Verify authentication.

This is good advice. It is also advice that has no enforcement mechanism.

The MCP protocol, as currently specified, allows any server to connect to a client and expose tools with zero admission control. There is no built-in checkpoint between “a server exists” and “a server’s tools are available to agents.” The protocol does not define a quarantine state, does not require approval before tool activation, and does not specify any mechanism for an operator to review what a server does before it starts doing it.

The gap is not in knowledge — the scan data makes the risks clear. The gap is in architecture. There is no structural mechanism that prevents a vulnerable or malicious server from activating its tools the moment it connects. The protocol assumes that if a server is configured, it is trusted.

This is the activation gap: the distance between “we know servers should be checked” and “the system prevents activation until they are checked.” Every scan report identifies the problem. None of them can close it, because the gap exists at the protocol layer, not the scanning layer.

What the Ecosystem Needs

The architectural pattern that addresses the activation gap is admission control — a gateway layer that sits between MCP servers and the agents that consume their tools, preventing any new server from surfacing tools until it has been explicitly reviewed and approved.

Multiple teams are converging on this pattern independently. MCPProxy implements gateway-layer quarantine where new servers are held in a non-active state until an operator approves them. IBM’s ContextForge enforces policy-based admission at the infrastructure layer. Gravitee’s MCP gateway applies API management patterns — rate limiting, authentication enforcement, schema validation — to MCP server traffic.

The specific tool matters less than having the pattern in place. What these approaches share is a common architectural principle: default-deny for new servers, with explicit operator approval required before tools become available to agents. This inverts the current protocol default of implicit trust.

The admission control pattern also creates a natural integration point for scanning. Rather than “scan and hope operators read the report,” scan results feed into the admission decision. A server with known CVEs stays in quarantine. A server with no authentication stays in quarantine. A server that passes scanning proceeds to human review. Scanning becomes one input to an automated workflow rather than a standalone advisory.
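That workflow can be sketched as a small state machine in which scan findings gate the transitions. This is a minimal illustration of the default-deny pattern, not MCPProxy's or ContextForge's actual implementation; all names here (`State`, `ScanResult`, `admission_decision`) are assumptions for the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum

class State(Enum):
    QUARANTINED = "quarantined"        # scan findings block activation
    PENDING_REVIEW = "pending_review"  # clean scan, awaiting an operator
    ADMITTED = "admitted"              # explicitly approved; tools visible

@dataclass
class ScanResult:
    known_cves: list[str] = field(default_factory=list)
    has_auth: bool = False

def admission_decision(scan: ScanResult, operator_approved: bool) -> State:
    """Default-deny: findings keep a server quarantined, and even a clean
    scan only advances it to human review, never straight to admitted."""
    if scan.known_cves or not scan.has_auth:
        return State.QUARANTINED
    if not operator_approved:
        return State.PENDING_REVIEW
    return State.ADMITTED
```

The essential property is that no code path reaches `ADMITTED` without both a clean scan and explicit approval; the scanner's output becomes an input to enforcement rather than an advisory report.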

Post-approval, the pattern extends to runtime isolation — network egress controls, resource limits, and behavioral monitoring that continues after a server is admitted. Admission control is not a one-time gate; it is the entry point to an ongoing trust lifecycle.

The Numbers That Matter

The convergent data from these three independent studies tells a clear story:

  • 14,000+ servers scanned across three studies
  • 87 CVEs filed from AgentSeal’s analysis alone
  • 30+ CVEs disclosed across the MCP ecosystem in the past 90 days
  • Uniform vulnerability distribution across community and enterprise packages
  • Zero protocol-level mechanisms enforcing pre-activation review

The data says the same thing the protocol architecture says: trust-by-default is not a defensible posture for an ecosystem where agents execute tool calls with real-world consequences.

Scanning is necessary. It is not sufficient. The ecosystem needs the structural layer that turns scan results into enforceable admission decisions — and the MCP protocol needs to acknowledge that servers should not be trusted simply because they exist.


Sources: AgentSeal Discussion #720 (8,400+ servers scanned), Protodex security scoring (2,013 servers, dev.to April 2026), Dominion Observatory behavioral analysis (4,584 servers, dynamic scoring).