MCProbe
A stdio MCP server that audits other MCP servers over the live protocol. It connects to any MCP target (stdio or HTTP), lints every tool's schema for agent-usability, then actually calls the tools with deliberately broken inputs to see how the server handles them, and returns a 0–100 conformance score with a per-dimension breakdown rendered as Markdown.
The behavioral pass is the part that matters. Static schema audits tell you
that a tool exists and looks reasonable. MCProbe then picks up a phone and
dials each tool with `missing_required`, `wrong_type`, `out_of_enum`, and
`extra_garbage` inputs (the same mistakes a language model will make on a
bad day) and classifies the response. A server that says "OK" to garbage is
graded harshly. A server that crashes the JSON-RPC transport is graded
more harshly still. A server that returns a clean `isError: true` is graded as correct.
Problem statement
The Model Context Protocol is new. Servers proliferate. Most ship with
tool schemas that an agent can call, but few ship with tool schemas that
an agent can call correctly: parameters are untyped, descriptions are
missing, names are not snake_case, and a quick look at the code reveals
that the handler is doing Number(x) / Number(y) with no guard at all.
The convention in the wider ecosystem is to ship a static schema audit
that flags the obvious smells and then declare the server ready. The
smells are real, but a static audit cannot tell you whether the server
behaves: it cannot tell you that divide("x", "y") silently returns
NaN, or that an extra unknown key is just stripped and ignored.
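To make that failure mode concrete, here is a minimal sketch contrasting an unguarded handler with one that rejects bad input. The `ToolResult` shape and both function names are hypothetical and are not taken from MCProbe or from any real server; only the `isError` flag mirrors the tool-result convention discussed in this document.

```typescript
// Hypothetical, simplified tool-result shape (an assumption for illustration).
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Unguarded: coerces whatever arrives and reports success, even when the result is NaN.
function divideUnguarded(x: unknown, y: unknown): ToolResult {
  const value = Number(x) / Number(y);
  return { content: [{ type: "text", text: String(value) }] };
}

// Guarded: rejects non-numeric input and division by zero with isError: true.
function divideGuarded(x: unknown, y: unknown): ToolResult {
  if (
    typeof x !== "number" || typeof y !== "number" ||
    !Number.isFinite(x) || !Number.isFinite(y)
  ) {
    return { content: [{ type: "text", text: "x and y must be finite numbers" }], isError: true };
  }
  if (y === 0) {
    return { content: [{ type: "text", text: "division by zero" }], isError: true };
  }
  return { content: [{ type: "text", text: String(x / y) }] };
}
```

A static audit sees the same schema for both handlers; only a live call with `("x", "y")` tells them apart.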
MCProbe does both, on a single connection:
- Static lint. Eleven rules over every tool's schema: missing or thin descriptions, duplicate or unusual names, an empty or non-object schema, untyped or undocumented parameters, and a server-wide rule for "I said I had tools but I have none."
- Behavioral fuzz. For each tool, the generator produces one valid case and at least three malformed variants, calls the target over the live JSON-RPC transport, and classifies the outcome as `ok` (the tool shrugged), `toolError` (graceful rejection), or `protocolCrash` (worst case). A malformed case that comes back without `isError: true` is flagged as `silentlyAccepted`, exactly the failure mode the linter cannot see.
- Scoring. The findings and the fuzz results are combined into a 0–100 score on four dimensions, mapped to an A–F grade, and rendered as a Markdown report the host (or a human) can read.
Install
npm install
npm run build   # tsc -p tsconfig.json && tsc -p examples/demo-target/tsconfig.json
The build emits:
- `dist/index.js`: the probe (run this as a stdio MCP server).
- `examples/demo-target/dist/index.js`: a deliberately flawed MCP server used by the tests and the demo.
To launch the probe as a stdio MCP server so any host can talk to it:
npm start
No port, no daemon, no config file. The probe speaks JSON-RPC on stdin/stdout and writes operator logs to stderr.
The six probe_* tools
MCProbe registers four core tools and two optional helpers. The core four cover the full lint → fuzz → score pipeline; the two helpers cover the everyday ergonomics of managing connections.
| Tool | Purpose | Returns |
| --- | --- | --- |
| `probe_connect` | Open a connection to a target. | |
| `probe_lint` | Run the 11 lint rules over the target's cached tool summaries. | |
| `probe_fuzz` | Generate valid + malformed inputs per tool, call each, classify the outcome. | |
| `probe_report` | Run lint (and fuzz when requested), score, render Markdown. | |
| | (optional) Enumerate the target's tools. | |
| `probe_disconnect` | (optional) Close one connection (by id) or every connection. | |
All tools default to the most recently opened connection when
connectionId is omitted, so a single-target audit is a three-call
sequence: probe_connect → probe_report → probe_disconnect.
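The "most recently opened connection" default can be sketched as a small registry. This is an illustrative assumption about how such a default might be implemented, not MCProbe's actual code; the class name, the id format, and the method names are all hypothetical.

```typescript
// Hypothetical registry: names, id format, and behavior are assumptions for illustration.
class ConnectionRegistry<T> {
  private readonly connections = new Map<string, T>();
  private nextId = 1;

  // Store a connection and return its id.
  open(connection: T): string {
    const id = `conn-${this.nextId++}`;
    this.connections.set(id, connection);
    return id;
  }

  // Resolve an explicit id, or fall back to the most recently opened connection.
  resolve(connectionId?: string): T {
    if (connectionId !== undefined) {
      const found = this.connections.get(connectionId);
      if (found === undefined) throw new Error(`unknown connectionId: ${connectionId}`);
      return found;
    }
    // Map preserves insertion order, so the last value is the most recent one still open.
    const all = Array.from(this.connections.values());
    const latest = all[all.length - 1];
    if (latest === undefined) throw new Error("no open connections");
    return latest;
  }

  // Close one connection by id, or every connection when no id is given.
  // Returns the number of connections that were closed.
  close(connectionId?: string): number {
    if (connectionId === undefined) {
      const count = this.connections.size;
      this.connections.clear();
      return count;
    }
    return this.connections.delete(connectionId) ? 1 : 0;
  }
}
```

With a registry like this, a single-target audit never needs to pass `connectionId` explicitly, which is what makes the three-call sequence possible.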
probe_connect
Two transports: stdio (spawns a child process) and http (speaks
the streamable HTTP transport, with SSE fallback). For stdio,
command is required; for http, url is required. The target's
initialize handshake is run synchronously, the server's identity
and capabilities are cached, and a stable connectionId is returned.
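The per-transport requirement (`command` for stdio, `url` for http) can be checked before anything is spawned or dialed. The sketch below is an assumption about what such a check might look like; `ConnectArgs` and `parseConnectArgs` are hypothetical names, and the real tool may validate its input differently (for example with a schema library).

```typescript
// Hypothetical argument shape; field names follow the prose above.
type ConnectArgs =
  | { transport: "stdio"; command: string; args: string[] }
  | { transport: "http"; url: string };

// Validate raw tool arguments before any process is spawned or any URL is dialed.
function parseConnectArgs(input: Record<string, unknown>): ConnectArgs {
  const transport = input["transport"];
  if (transport === "stdio") {
    const command = input["command"];
    if (typeof command !== "string" || command.length === 0) {
      throw new Error("stdio transport requires a non-empty `command`");
    }
    const rawArgs = input["args"];
    // Missing args default to an empty list; present args are coerced to strings.
    const args = Array.isArray(rawArgs) ? rawArgs.map((a) => String(a)) : [];
    return { transport: "stdio", command, args };
  }
  if (transport === "http") {
    const url = input["url"];
    if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
      throw new Error("http transport requires an http(s) `url`");
    }
    return { transport: "http", url };
  }
  throw new Error('transport must be "stdio" or "http"');
}
```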
probe_lint
A pure pass over the connection's cached tool summaries — no extra
round-trip. Each finding carries a stable code, a severity
(error, warning, info), a human-readable message, a
location ({ tool, param? }), and a hint with a concrete fix.
The eleven rules are:
| Code | Severity | What it catches |
| --- | --- | --- |
| | error | A tool with no description at all. |
| | warning | A description under 12 characters. |
| | error | Two tools registered with the same name. |
| | warning | A name that is not `snake_case`. |
| | warning | An empty or missing `inputSchema`. |
| | error | A schema that fails to compile (Ajv). |
| | warning | A root schema type other than `object`. |
| | info | Properties declared but no … |
| | warning | A property with no `type`. |
| | warning | A property with no `description`. |
| | warning | The server claims the `tools` capability but lists no tools. |
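As an illustration of the finding shape and of the two description rules, here is a minimal sketch. The `Finding` fields follow the prose above; the `ToolSummary` shape, the function name, and the two rule codes are hypothetical placeholders, since the real codes are not reproduced here.

```typescript
type Severity = "error" | "warning" | "info";

// Fields as described above: code, severity, message, location, hint.
interface Finding {
  code: string;
  severity: Severity;
  message: string;
  location: { tool: string; param?: string };
  hint: string;
}

// Hypothetical shape of a cached tool summary.
interface ToolSummary {
  name: string;
  description?: string;
}

const MIN_DESCRIPTION_LENGTH = 12; // threshold taken from the rules table

// Pure function: no I/O, findings are emitted in tool order.
function lintDescriptions(tools: ToolSummary[]): Finding[] {
  const findings: Finding[] = [];
  for (const tool of tools) {
    const description = (tool.description ?? "").trim();
    if (description.length === 0) {
      findings.push({
        code: "example-missing-description", // placeholder code, not MCProbe's
        severity: "error",
        message: `Tool "${tool.name}" has no description.`,
        location: { tool: tool.name },
        hint: "Add a sentence saying what the tool does and when to call it.",
      });
    } else if (description.length < MIN_DESCRIPTION_LENGTH) {
      findings.push({
        code: "example-thin-description", // placeholder code, not MCProbe's
        severity: "warning",
        message: `Tool "${tool.name}" has a description under ${MIN_DESCRIPTION_LENGTH} characters.`,
        location: { tool: tool.name },
        hint: "Expand the description so an agent can choose the tool correctly.",
      });
    }
  }
  return findings;
}
```

Because the function only reads its argument, it can run over cached summaries without another round-trip to the target.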
probe_fuzz
For every tool (capped at maxTools, default 10), the generator
emits one valid case and at least three malformed variants:
- `missing_required:<field>`: drop each required field in turn.
- `wrong_type:<field>`: replace each typed field with a value of a different primitive type.
- `out_of_enum:<field>`: for `enum` or `const` fields, send a value the schema forbids.
- `extra_garbage`: append a sentinel key to the valid args.
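The four variant families can be sketched as a pure generator over a simplified schema. This is an assumption-laden illustration, not the shipped generator: the `PropSchema`, `InputSchema`, and `FuzzCase` shapes, the sentinel key `__garbage__`, and the replacement values are all hypothetical, and the sketch assumes the out-of-enum sentinel string is not itself an allowed value.

```typescript
// Hypothetical, minimal view of a JSON Schema object type.
interface PropSchema {
  type?: "string" | "number" | "integer" | "boolean";
  enum?: unknown[];
  const?: unknown;
}
interface InputSchema {
  properties?: Record<string, PropSchema>;
  required?: string[];
}
interface FuzzCase {
  label: string;
  args: Record<string, unknown>;
  malformed: boolean;
}

// Pure generator: one valid case, then malformed variants derived from the schema.
function generateCases(schema: InputSchema, validArgs: Record<string, unknown>): FuzzCase[] {
  const cases: FuzzCase[] = [{ label: "valid", args: { ...validArgs }, malformed: false }];

  // missing_required: drop each required field in turn.
  for (const field of schema.required ?? []) {
    const args = { ...validArgs };
    delete args[field];
    cases.push({ label: `missing_required:${field}`, args, malformed: true });
  }

  const properties = schema.properties ?? {};
  for (const field of Object.keys(properties)) {
    const prop = properties[field];
    if (prop === undefined) continue;
    // wrong_type: strings become a number, every other primitive becomes a string.
    if (prop.type !== undefined) {
      const wrong: unknown = prop.type === "string" ? 12345 : `not-a-${prop.type}`;
      cases.push({ label: `wrong_type:${field}`, args: { ...validArgs, [field]: wrong }, malformed: true });
    }
    // out_of_enum: send a value assumed to be outside the allowed set.
    if (prop.enum !== undefined || prop.const !== undefined) {
      cases.push({ label: `out_of_enum:${field}`, args: { ...validArgs, [field]: "__not_in_enum__" }, malformed: true });
    }
  }

  // extra_garbage: append a sentinel key to otherwise valid args.
  cases.push({ label: "extra_garbage", args: { ...validArgs, __garbage__: true }, malformed: true });
  return cases;
}
```

For a tool with one required enum-typed string parameter, this sketch yields the valid case plus four malformed ones.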
Each case is sent to the target over the live JSON-RPC transport. The classifier assigns one of three outcomes:
| Outcome | Meaning |
| --- | --- |
| `ok` | The target returned a result without `isError: true`. |
| `toolError` | The target returned a result with `isError: true`. |
| `protocolCrash` | The call rejected or the transport closed. |
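The three outcomes, plus the `silentlyAccepted` flag described earlier, can be expressed as a small pure function. This sketch assumes the call has already settled (for example via `Promise.allSettled`) and that only the `isError` flag of the result matters; the `Settled` and `Classified` shapes and the function name are hypothetical.

```typescript
type Outcome = "ok" | "toolError" | "protocolCrash";

// Hypothetical view of a settled tools/call round-trip, shaped like PromiseSettledResult.
type Settled =
  | { status: "fulfilled"; value: { isError?: boolean } }
  | { status: "rejected"; reason: unknown };

interface Classified {
  outcome: Outcome;
  silentlyAccepted: boolean;
}

// Classify one fuzz case. `malformed` says whether the input was deliberately broken.
function classifyOutcome(settled: Settled, malformed: boolean): Classified {
  if (settled.status === "rejected") {
    // The call rejected or the transport closed.
    return { outcome: "protocolCrash", silentlyAccepted: false };
  }
  const outcome: Outcome = settled.value.isError === true ? "toolError" : "ok";
  // A malformed input that comes back without isError: true was silently accepted.
  return { outcome, silentlyAccepted: malformed && outcome === "ok" };
}
```

Note that `ok` is only good news for the valid case; for a malformed case it is the outcome the scorer penalizes.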
probe_report
The convenience entry point. Calls probe_lint (always) and
probe_fuzz (when fuzz: true), scores the result on the four
dimensions described below, and returns the structured
ConformanceReport and a rendered Markdown string. The
Markdown is the canonical payload; downstream tools that need the
numbers can pull them out of the structured fields.
Scoring model — four dimensions
The scorecard is subtractive. Every dimension starts at 10/10 and
loses points only for concrete, observed problems. The overall
0–100 score is the mean of the measured dimensions; dimensions
that were not measured (e.g. the two behavioral ones when
fuzz: false) are reported as "not measured" and excluded from the
average rather than penalized with a fake value. This is what lets
a static audit of a clean server still score 100/100.
Letter grades: A ≥ 90, B ≥ 75, C ≥ 60, D ≥ 40, F < 40.
| Dimension | Always measured? | What it captures |
| --- | --- | --- |
| Metadata & Documentation | yes | Server identity (name, version), advertised capabilities, presence of … |
| Schema Quality | yes | Deducted 1 per … |
| Error Handling | only with `fuzz: true` | Deducted 2 per … |
| Liveness & Performance | only with `fuzz: true` | Deducted 4 per … |
The full deduction list and the top-offender breakdown for each dimension are emitted in the Markdown report so the score is auditable by a human.
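The rollup arithmetic can be sketched as follows. The grade thresholds and the "exclude unmeasured dimensions from the mean" rule come from the text above; the `DimensionScore` shape, the function names, the clamp at zero, the rounding, and the ×10 rescale from a 0–10 dimension to a 0–100 total are assumptions made for illustration.

```typescript
// Hypothetical shape; the real ConformanceReport fields may differ.
interface DimensionScore {
  name: string;
  score: number | null; // 0 to 10, or null when the dimension was not measured
}

type Grade = "A" | "B" | "C" | "D" | "F";

// Subtractive scoring: start at 10 and remove observed deductions (clamped at 0 by assumption).
function scoreDimension(deductions: number[]): number {
  const total = deductions.reduce((sum, d) => sum + d, 0);
  return Math.max(0, 10 - total);
}

// Mean of the measured dimensions only, rescaled to 0–100.
function overallScore(dimensions: DimensionScore[]): number {
  const measured: number[] = [];
  for (const d of dimensions) {
    if (d.score !== null) measured.push(d.score);
  }
  if (measured.length === 0) return 0; // assumption: nothing measured scores 0
  const mean = measured.reduce((sum, s) => sum + s, 0) / measured.length;
  return Math.round(mean * 10);
}

// Thresholds from the text: A ≥ 90, B ≥ 75, C ≥ 60, D ≥ 40, F < 40.
function letterGrade(score: number): Grade {
  if (score >= 90) return "A";
  if (score >= 75) return "B";
  if (score >= 60) return "C";
  if (score >= 40) return "D";
  return "F";
}
```

Under these assumptions, a lint-only audit with two clean static dimensions and two unmeasured behavioral ones averages to 100, which matches the behavior described above.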
30-second demo
The probe ships with a deliberately flawed demo target at
examples/demo-target/ and a smoke script that runs the full
probe_report pipeline against it. From a clean clone:
npm install
npm run build
node scripts/smoke-report.mjs
The script spawns the probe as a stdio MCP server, opens a
connection to the demo target, calls probe_report with
fuzz: true, and prints the Markdown report to stdout. The demo
target is wired to fail loudly: greet has no description,
divide returns NaN on bad input, set_mode has a thin
description, and well_behaved is the only clean tool. The report
will show a low overall score with concrete findings and a fuzz
table that classifies the broken cases.
For an interactive tour, the official MCP inspector works as a host against the built probe:
npx @modelcontextprotocol/inspector node dist/index.js
The inspector UI lists the six probe_* tools; calling them
manually is a good way to see the request/response shape.
External server example
The probe is not coupled to the demo target. To audit any other
MCP server, swap the command/args in probe_connect:
// tool call: probe_connect
{
  "transport": "stdio",
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem@latest", "/tmp"]
}
The probe runs the initialize handshake against the spawned
process, caches its tools, and is ready for probe_lint /
probe_fuzz / probe_report. The same pattern works for HTTP
targets: pass transport: "http" and a url instead.
A real transcript of this audit (run against
@modelcontextprotocol/server-filesystem@latest and saved to
examples/transcripts/external-server.md) is included in the
repository. The script that produced it is
scripts/external-audit.mjs. A self-audit (a second copy of the
probe scoring the first) lives at
examples/transcripts/self-audit.md.
Architecture
MCProbe plays two roles at once: it is a stdio MCP server to its host, and an MCP client to whatever it is auditing. The split mirrors the source layout.
+--------------------+       stdio / http        +-------------------+
| host               | <-----------------------> | target MCP        |
| (claude code, etc) |                           | server            |
+--------------------+                           +-------------------+
          ^                                                ^
          | JSON-RPC on stdin/stdout                       | JSON-RPC
          |                                                | over the
          v                                                v chosen
+--------------------+       spawn / dial        +-------------------+
| src/index.ts       | ------------------------> | src/target-client |
| (the probe)        |                           | (outbound client) |
+--------------------+                           +-------------------+
          |
          | calls the pure modules
          v
+--------------------+ +---------------+ +-----------------+ +--------------+
| src/schema-lint    | | src/fuzz.ts   | | src/conformance | | src/report   |
| (11 rules)         | | (generator)   | | (4-dim score)   | | (markdown)   |
+--------------------+ +---------------+ +-----------------+ +--------------+
| Module | Role | I/O? |
| --- | --- | --- |
| | Shared … | none |
| `src/target-client` | Outbound MCP client, … | yes: spawns / dials |
| `src/schema-lint` | The 11 lint rules. Pure: no I/O, deterministic ordering. | none |
| `src/fuzz.ts` | Case generator + runner + … | none on the generator; the runner calls the target |
| `src/conformance` | Per-dimension scoring + rollup. Pure. | none |
| `src/report` | Pure Markdown renderer. Same input → same output every run. | none |
| `src/index.ts` | … | yes: owns the stdio transport |
The four pure modules (schema-lint, fuzz generator,
conformance, report) are deliberately side-effect-free so the
vitest suite can exercise them in milliseconds without spawning a
target. The integration test in tests/demo-target.test.ts is the
only piece that touches a live process; it is the smallest test
that proves the build artifact loads over the real protocol.
Limitations
- The four runtime dependencies are frozen: `@modelcontextprotocol/sdk`, `ajv`, `ajv-formats`, `zod`. The probe deliberately does not depend on any CLI framework, HTTP server, or transport library beyond what the SDK already exposes. Adding a runtime dependency is an explicit change to the spec.
- The probe is a stdio MCP server, full stop. It does not expose an HTTP endpoint. Run it as a subprocess of your host.
- The fuzzer is shallow, not adversarial. It exercises the surface documented by the tool's `inputSchema`; it does not attempt to discover server-side bugs that are out of band of the tool contract. The point of MCProbe is conformance, not general-purpose server fuzzing.
- The scoring is subtractive and dimension-local. A perfect score on one dimension does not rescue a failure on another. The four dimensions are weighted equally when measured.
- Behavioral scores need a real protocol round-trip. When `fuzz: false` is passed to `probe_report`, the `Error Handling` and `Liveness & Performance` dimensions are reported as "not measured" and excluded from the rollup. A "lint-only" audit can still score 100/100 on a clean server, but it cannot tell you whether the server would survive a bad input.
- Tooling is four cores + two helpers, no more. The spec pins the surface area. Adding a `probe_*` tool is an explicit change to the spec.
- The optional helpers are still required at startup. The `McpServer` is constructed with the `tools` capability only; it does not advertise `resources` or `prompts`. The probe itself is an audit tool, not a content server.