shishirreddyyk

mcp-docpilot-server

An MCP server that exposes document retrieval as tools any LLM provider can call. It puts a single, stable interface in front of a vector index (built from DocPilot's ingestion pipeline) so a model never has to know how the documents are stored or which embedding backend is in use - it just calls docpilot_search.

The server provides the tools and data access; the connected model does the generation. That split is what makes it provider-agnostic: Claude Desktop, or any client that speaks MCP, gets the same retrieval tools.

Tools

| Tool | What it does |
| --- | --- |
| docpilot_search | Semantic search over the corpus; returns ranked chunks with source and score |
| docpilot_list_sources | Lists indexed source documents with per-source chunk counts |

Both tools are read-only.
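To make the two result shapes concrete, here is a toy in-memory sketch. It is not the server's code: bag-of-words cosine similarity stands in for real embeddings, and the function names, the top_k parameter, and the result field names (source, text, score, chunks) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the vector index: (source, chunk text) pairs.
CHUNKS = [
    ("install.md", "Install the package with pip and create a virtual environment."),
    ("install.md", "Run the ingest script to build the index."),
    ("faq.md", "The server speaks MCP over stdio or HTTP."),
]


def _vector(text):
    """Bag-of-words term counts; a crude substitute for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, top_k=2):
    """Return the top_k chunks ranked by similarity, each with source and score."""
    q = _vector(query)
    scored = [
        {"source": source, "text": text, "score": round(_cosine(q, _vector(text)), 4)}
        for source, text in CHUNKS
    ]
    return sorted(scored, key=lambda r: r["score"], reverse=True)[:top_k]


def list_sources():
    """Return each source document with the number of chunks indexed for it."""
    counts = Counter(source for source, _ in CHUNKS)
    return [{"source": s, "chunks": n} for s, n in sorted(counts.items())]
```

Neither function mutates CHUNKS, which mirrors the read-only property of the real tools.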

How it works

docs/*.md ──ingest.py──> chunk + embed ──> ChromaDB (persistent)
                                              │
                          server.py exposes ──┤── docpilot_search
                          MCP tools over      └── docpilot_list_sources
                          stdio or HTTP
                                              │
            Claude Desktop / any MCP client ──┘  (model calls the tools)
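The chunk step in the diagram above can be sketched as a sliding character window. This is a minimal sketch, assuming ingest.py splits on raw character offsets using the chunk size and overlap listed under Configuration; the function name chunk_text is hypothetical, and the real script may split on different boundaries.

```python
def chunk_text(text, size=800, overlap=100):
    """Split text into windows of `size` characters, each sharing `overlap`
    characters with the previous window."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in the range [0, size)")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.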

Setup

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Build the index from the docs folder (swap in your own .txt/.md files)
python ingest.py ./docs

Embeddings use ChromaDB's local default model, so ingestion and search run with no API key. To use a hosted embedding provider instead, set a ChromaDB embedding function in ingest.py and server.py - the rest of the pipeline is unchanged.

Run

stdio (local clients like Claude Desktop):

python server.py

Streamable HTTP (remote server):

DOCPILOT_TRANSPORT=http python server.py
# serves MCP at http://localhost:8000/mcp

The MCP SDK's Streamable HTTP transport supersedes the older SSE transport; point HTTP-based MCP clients at the /mcp endpoint.
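One way the environment variable could be translated into a transport name is sketched below. This is an assumption about how server.py might work, not its actual code: resolve_transport is a hypothetical helper, and the mapping targets the transport names that FastMCP.run() accepts in the MCP Python SDK, which the server may or may not use.

```python
import os

# Assumed mapping from DOCPILOT_TRANSPORT values to the transport names
# accepted by FastMCP.run() in the MCP Python SDK.
_TRANSPORTS = {"stdio": "stdio", "http": "streamable-http"}


def resolve_transport(env=None):
    """Read DOCPILOT_TRANSPORT (default: stdio) and return the SDK transport name."""
    env = os.environ if env is None else env
    value = env.get("DOCPILOT_TRANSPORT", "stdio").strip().lower()
    if value not in _TRANSPORTS:
        raise ValueError(
            f"DOCPILOT_TRANSPORT must be one of {sorted(_TRANSPORTS)}, got {value!r}"
        )
    return _TRANSPORTS[value]
```

In a FastMCP-based server the result would typically be passed as mcp.run(transport=resolve_transport()). Rejecting unknown values up front keeps a typo from silently falling back to stdio.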

Connect to Claude Desktop

Add this to claude_desktop_config.json:

{
  "mcpServers": {
    "docpilot": {
      "command": "python",
      "args": ["/absolute/path/to/mcp-docpilot-server/server.py"],
      "env": {
        "DOCPILOT_CHROMA_PATH": "/absolute/path/to/mcp-docpilot-server/chroma"
      }
    }
  }
}

Or, for an HTTP server, register it with the Claude Code CLI:

claude mcp add --transport http docpilot http://localhost:8000/mcp
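Because the stdio entry above needs absolute paths, a small script can generate it rather than typing the paths by hand. This is a convenience sketch, not part of the repository: desktop_config_entry is a hypothetical helper, and it assumes server.py and the chroma directory sit at the repository root as in the example.

```python
import json
from pathlib import Path


def desktop_config_entry(repo_dir, python="python"):
    """Build the mcpServers entry for the stdio server with absolute paths.

    `python` can be set to the virtualenv's interpreter so the server runs
    with the dependencies installed during Setup.
    """
    repo = Path(repo_dir).expanduser().resolve()
    return {
        "mcpServers": {
            "docpilot": {
                "command": python,
                "args": [str(repo / "server.py")],
                "env": {"DOCPILOT_CHROMA_PATH": str(repo / "chroma")},
            }
        }
    }


if __name__ == "__main__":
    # Print the entry for the current directory; merge it into an existing
    # claude_desktop_config.json instead of overwriting other servers.
    print(json.dumps(desktop_config_entry("."), indent=2))
```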

Configuration

| Env var | Default | Meaning |
| --- | --- | --- |
| DOCPILOT_CHROMA_PATH | ./chroma | Persistent ChromaDB store |
| DOCPILOT_COLLECTION | docpilot | Collection name |
| DOCPILOT_TRANSPORT | stdio | stdio or http |
| DOCPILOT_CHUNK_SIZE | 800 | Characters per chunk (ingest) |
| DOCPILOT_CHUNK_OVERLAP | 100 | Overlap between chunks (ingest) |
| DOCPILOT_EMBEDDINGS | default | default (local ONNX model) or hash (offline, for CI/tests) |

Switching the embedding backend changes the vector space, so re-ingest into a fresh store when you change it (rm -rf chroma && python ingest.py ./docs). All backend selection lives in embeddings.py - that one file is the seam for the embedding lifecycle.
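The variables and defaults listed under Configuration could be loaded in one place along these lines. This is a minimal sketch under stated assumptions: the Settings class and load_settings function are hypothetical, and the repository may read each variable where it is used instead.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Defaults mirror the documented configuration values."""
    chroma_path: str = "./chroma"
    collection: str = "docpilot"
    transport: str = "stdio"
    chunk_size: int = 800
    chunk_overlap: int = 100
    embeddings: str = "default"


def load_settings(env=None):
    """Build Settings from DOCPILOT_* variables, falling back to the defaults."""
    env = os.environ if env is None else env
    d = Settings()
    settings = Settings(
        chroma_path=env.get("DOCPILOT_CHROMA_PATH", d.chroma_path),
        collection=env.get("DOCPILOT_COLLECTION", d.collection),
        transport=env.get("DOCPILOT_TRANSPORT", d.transport),
        chunk_size=int(env.get("DOCPILOT_CHUNK_SIZE", d.chunk_size)),
        chunk_overlap=int(env.get("DOCPILOT_CHUNK_OVERLAP", d.chunk_overlap)),
        embeddings=env.get("DOCPILOT_EMBEDDINGS", d.embeddings),
    )
    if settings.embeddings not in ("default", "hash"):
        raise ValueError("DOCPILOT_EMBEDDINGS must be 'default' or 'hash'")
    if not 0 <= settings.chunk_overlap < settings.chunk_size:
        raise ValueError("DOCPILOT_CHUNK_OVERLAP must be smaller than DOCPILOT_CHUNK_SIZE")
    return settings
```

Passing a plain dict as env makes the loader easy to exercise in tests without touching the real environment.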

Test

pytest -q

The test ingests a tiny corpus and confirms retrieval ranks the expected document first. CI runs it on every push (.github/workflows/ci.yml).
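The shape of such a test can be sketched with a deterministic hashing embedding, similar in spirit to the hash backend meant for CI. Everything here is illustrative and assumed: hash_embed and rank are hypothetical names, the feature-hashing scheme is a guess at what an offline backend might do, and a plain dict replaces the ChromaDB store. hashlib is used rather than the built-in hash() because string hashes from hash() vary between Python processes.

```python
import hashlib
import math
import re

DIM = 1024  # wide enough that unrelated tokens rarely share a bucket


def hash_embed(text, dim=DIM):
    """Map each token to a bucket via SHA-256 and return a unit-length vector."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def rank(query, docs):
    """Return document names ordered by cosine similarity to the query."""
    q = hash_embed(query)
    scores = {
        name: sum(a * b for a, b in zip(q, hash_embed(text)))
        for name, text in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def test_expected_document_ranks_first():
    # Tiny corpus: only deploy.md shares vocabulary with the query.
    docs = {
        "billing.md": "Invoices are issued monthly and paid by card.",
        "deploy.md": "Deploy the server with docker compose.",
    }
    assert rank("how do I deploy with docker", docs)[0] == "deploy.md"
```

A hashing embedding only matches on shared tokens, so it checks the ingest and query plumbing rather than semantic quality, which is usually what a CI test needs.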
