🧭 Quartermaster

Issues your agent exactly the tools the mission needs, nothing more.

An offline, zero-dependency tool-router for MCP. It funnels N tools down to a ranked shortlist for a natural-language query, so the model reads ~8 tools instead of 200, with no embedding model, no network, and no API key.

Quick start · Getting started · How it works · Which one? · Benchmarks


Status: alpha, on npm (npx quartermaster-mcp). The ranker is extracted from a production system (see Heritage); the proxy (quartermaster-mcp) is built, published, and runnable end-to-end (federation + retrieve_tools + call_tool); the Claude Code plugin is still scaffolded.

Verdict: GO. Zero-dependency BM25 is a genuinely good router: 91.5% recall@8 on a real 171-tool manifest, beating a substring baseline everywhere. Optional offline synonym expansion is a large win on terse/vocabulary-poor manifests (the common case: 5–9× recall@1 at 500–1000 tools) and, with weighting, only marginally trails BM25 at recall@8 on rich descriptions while leading on MRR, so it ships opt-in and corpus-tuned. We do not claim to beat hybrid embeddings; we claim competitive routing with no model dependency at all. Numbers: benchmarks.

The problem

Give a model 200 tools and two things break: every tool's schema is loaded into context on every turn (token tax), and the model has to pick the right one from 200 lookalikes (accuracy drops as the count grows). This is well-documented prior art: RAG-MCP names "prompt bloat and selection complexity," and ToolRet (ACL 2025) shows generic retrievers do poorly on tool selection specifically.

The shape: funnel advises, model decides

  query                 query
    │                     │
    ▼                     ▼
┌────────┐         ┌──────────────┐  offline BM25 over
│  LLM   │◄ 200    │ Quartermaster│  tool descriptions
└───┬────┘ schemas └──────┬───────┘  (zero deps, no model)
    │  picks wrong,        │ top-8 shortlist + guidance
    │  huge context        ▼
    ▼                ┌──────────────┐
 a tool              │     LLM      │ reads a small,
                     └──────┬───────┘ relevant set → picks
                            ▼
                       right tool(s)

Quartermaster doesn't decide. It returns a scored shortlist; the host LLM, already in the loop and free, makes the final call. So we optimize for recall@K ("is the right tool in the top K?"), not top-1.
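
To make the two metrics quoted in this README concrete, here is a minimal sketch of recall@K and MRR. The EvalCase shape and the sample data are illustrative assumptions, not the project's benchmark harness.

```typescript
// One evaluation case: the ranked tool names a router returned for a query,
// plus the single tool labeled as correct. (Hypothetical shape for this sketch.)
interface EvalCase {
  ranked: string[];
  expected: string;
}

// recall@K: fraction of cases whose expected tool appears in the top K results.
function recallAtK(cases: EvalCase[], k: number): number {
  if (cases.length === 0) return 0;
  const hits = cases.filter((c) => c.ranked.slice(0, k).indexOf(c.expected) >= 0).length;
  return hits / cases.length;
}

// MRR: mean of 1/rank of the expected tool (a case contributes 0 when it is missing).
function meanReciprocalRank(cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  let sum = 0;
  for (const c of cases) {
    const idx = c.ranked.indexOf(c.expected);
    if (idx >= 0) sum += 1 / (idx + 1);
  }
  return sum / cases.length;
}

// Made-up sample data: the expected tool is ranked 1st, 3rd, and not at all.
const sampleCases: EvalCase[] = [
  { ranked: ["create_issue", "list_issues", "get_repo"], expected: "create_issue" },
  { ranked: ["get_repo", "list_issues", "create_issue"], expected: "create_issue" },
  { ranked: ["get_repo", "list_issues", "search_code"], expected: "create_issue" },
];

console.log(recallAtK(sampleCases, 1)); // 1 of 3 cases hit at rank 1
console.log(recallAtK(sampleCases, 3)); // 2 of 3 cases hit within the top 3
console.log(meanReciprocalRank(sampleCases)); // (1 + 1/3 + 0) / 3
```

Because the host LLM reads the whole shortlist, a hit at rank 3 counts fully toward recall@3 even though it adds only 1/3 to MRR.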

What makes it different

The MCP-router space is crowded (Anthropic's native Tool Search, mcpproxy-go, mcp-funnel, MCPJungle, …). We are honest about that; see the comparison. The seam Quartermaster fills:

  • Zero embedding model. No torch, no model download, nothing to warm up. The whole ranker is a few hundred lines of dependency-free TypeScript.

  • Host-agnostic. Works outside the Anthropic API: any MCP client, any model.

  • Advises, doesn't decide. Returns a shortlist + guidance, never a forced pick.

  • Offline & private. Nothing phones home; suitable for air-gapped / regulated environments.

We do not claim best-in-class retrieval accuracy. The benchmarks show the honest picture: zero-dependency BM25 is a strong router, and offline query expansion adds a large recall boost on terse manifests (where the vocabulary gap bites) while adding noise on rich ones, so expansion is an opt-in toggle, not a silver bullet. The bet that paid off: you can get competitive tool routing with no embedding model at all.
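
For readers unfamiliar with the technique, a dependency-free BM25 ranker with opt-in synonym expansion can be sketched roughly as follows. This is not the code in packages/core: the ToolDoc shape, the function names, the tokenizer, the unweighted expansion, and the k1 = 1.2 / b = 0.75 defaults are all assumptions chosen for illustration. The IDF is the non-negative variant log(1 + (N - n + 0.5) / (n + 0.5)).

```typescript
// Hypothetical tool shape: only a name and a description are assumed.
interface ToolDoc {
  name: string;
  description: string;
}

// Lowercase and split on anything that is not a letter or digit (no stopword handling).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Opt-in expansion: add caller-supplied synonyms for each query term (unweighted here).
function expandQuery(terms: string[], synonyms: Record<string, string[]>): string[] {
  const out = new Set<string>(terms);
  for (const t of terms) for (const s of synonyms[t] ?? []) out.add(s);
  return Array.from(out);
}

// Okapi BM25 over "name + description", returning at most topK positive-scoring tools.
function rankTools(
  tools: ToolDoc[],
  query: string,
  topK = 8,
  synonyms: Record<string, string[]> = {},
  k1 = 1.2,
  b = 0.75,
): { name: string; score: number }[] {
  const docs = tools.map((t) => tokenize(`${t.name} ${t.description}`));
  const avgLen = docs.reduce((n, d) => n + d.length, 0) / Math.max(docs.length, 1);

  // Document frequency: in how many tool docs each term occurs.
  const df = new Map<string, number>();
  for (const d of docs) new Set(d).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));

  const qTerms = expandQuery(tokenize(query), synonyms);
  const scored = tools.map((tool, i) => {
    const doc = docs[i] ?? [];
    let score = 0;
    for (const term of qTerms) {
      const tf = doc.filter((w) => w === term).length;
      if (tf === 0) continue;
      const n = df.get(term) ?? 0;
      const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return { name: tool.name, score };
  });
  return scored.filter((s) => s.score > 0).sort((x, y) => y.score - x.score).slice(0, topK);
}

// Made-up manifest: "bug" never appears, so the plain query finds nothing.
const demoTools: ToolDoc[] = [
  { name: "create_issue", description: "Open a new issue in a repository" },
  { name: "list_commits", description: "List commits on a branch" },
  { name: "send_email", description: "Send an email message to a recipient" },
];
const plainNames = rankTools(demoTools, "report bug").map((r) => r.name);
const expandedNames = rankTools(demoTools, "report bug", 8, { bug: ["issue"] }).map((r) => r.name);
console.log(plainNames, expandedNames);
```

The demo shows the vocabulary gap in miniature: the query shares no term with any description until the synonym table maps "bug" to "issue". On richer descriptions the same expansion can pull in unrelated tools, which is consistent with keeping it opt-in.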

Quick start

Quartermaster is a single package: quartermaster-mcp. Put it in front of N MCP servers; agents load retrieve_tools + call_tool instead of every downstream schema. Point it at a quartermaster.json:

{
  "servers": [
    { "id": "github", "command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}" } }
  ]
}

npx quartermaster-mcp --config ./quartermaster.json

It spawns the downstream servers, aggregates their tools, and serves a ranked, schema-hydrated shortlist via retrieve_tools; the model then calls the chosen tool through call_tool. See packages/proxy.
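
The aggregate, shortlist, dispatch flow described above can be pictured with a small in-memory sketch. Everything here is hypothetical: real downstream servers are spawned processes speaking MCP, the "serverId.toolName" naming is an assumption made to avoid collisions, and the word-overlap scoring is only a stand-in for the BM25 ranker.

```typescript
// Hypothetical in-memory stand-in for a tool exposed by a downstream MCP server.
interface DownstreamTool {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  handler: (args: Record<string, unknown>) => unknown;
}

interface DownstreamServer {
  id: string;
  tools: DownstreamTool[];
}

interface ShortlistEntry {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>;
  score: number;
}

class ToolFederation {
  // Qualified name ("serverId.toolName", an assumed convention) mapped to its tool.
  private readonly catalog = new Map<string, DownstreamTool>();

  constructor(servers: DownstreamServer[]) {
    for (const server of servers) {
      for (const tool of server.tools) {
        this.catalog.set(`${server.id}.${tool.name}`, tool);
      }
    }
  }

  // Returns up to topK matching tools with their schemas attached ("schema-hydrated").
  // Scoring is a crude count of query words found in the name or description.
  retrieveTools(query: string, topK = 8): ShortlistEntry[] {
    const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
    const results: ShortlistEntry[] = [];
    this.catalog.forEach((tool, qualifiedName) => {
      const text = `${qualifiedName} ${tool.description}`.toLowerCase();
      const score = words.filter((w) => text.indexOf(w) >= 0).length;
      if (score > 0) {
        results.push({ name: qualifiedName, description: tool.description, inputSchema: tool.inputSchema, score });
      }
    });
    return results.sort((x, y) => y.score - x.score).slice(0, topK);
  }

  // Dispatches a call to whichever downstream tool owns the qualified name.
  callTool(qualifiedName: string, args: Record<string, unknown>): unknown {
    const tool = this.catalog.get(qualifiedName);
    if (!tool) throw new Error(`Unknown tool: ${qualifiedName}`);
    return tool.handler(args);
  }
}

const federation = new ToolFederation([
  { id: "github", tools: [{ name: "create_issue", description: "Open a new issue", inputSchema: { type: "object" }, handler: (a) => `created: ${String(a["title"])}` }] },
  { id: "mail", tools: [{ name: "send_email", description: "Send an email message", inputSchema: { type: "object" }, handler: () => "sent" }] },
]);

const shortlist = federation.retrieveTools("create issue");
console.log(shortlist.map((t) => t.name));
console.log(federation.callTool("github.create_issue", { title: "Crash on start" }));
```

The point of the shape is that the agent only ever sees two tool definitions up front; every downstream schema arrives on demand inside a shortlist.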

Host recipe: Use Quartermaster in Cursor (the same mcpServers config works for Claude Desktop).
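
As a rough idea of what such a host entry might look like, the sketch below builds an mcpServers object that launches the proxy over stdio and prints it as JSON. The "quartermaster" label and the config path are placeholders, and the linked recipe remains the authoritative version; only the npx command and --config flag come from the Quick start above.

```typescript
// Shape of one stdio server entry under "mcpServers" in a host config file.
interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Builds a host config that starts the proxy via npx. The path is caller-supplied;
// an absolute path is used in the example because hosts may start servers from
// a different working directory.
function hostConfig(configPath: string): { mcpServers: Record<string, StdioServerEntry> } {
  return {
    mcpServers: {
      // "quartermaster" is an arbitrary label chosen for this example.
      quartermaster: {
        command: "npx",
        args: ["-y", "quartermaster-mcp", "--config", configPath],
      },
    },
  };
}

const exampleHostConfig = hostConfig("/absolute/path/to/quartermaster.json");
console.log(JSON.stringify(exampleHostConfig, null, 2));
```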

What ships

One package, quartermaster-mcp: the drop-in MCP proxy that federates downstream servers behind one retrieve_tools + call_tool. The zero-dependency BM25/TF-IDF ranker that powers it lives in packages/core and is bundled into the proxy; it is not published separately, so the proxy installs self-contained (its only runtime dependency is the MCP SDK). A .claude-plugin/ manifest is also included for the Claude Code tool-search seam.

Heritage

Extracted and generalized from the semantic funnel in sf-intelligence, a read-only intelligence layer that routes ~170 tools for one Salesforce org. The fork makes the tool corpus and synonyms injectable, and upgrades the default ranker from TF-IDF cosine to BM25.

License

MIT © 2026 Pranav Nagrecha. See LICENSE.
