mcp-ai-workspace

by nickbiird

Give an LLM a tool that searches your own documents — and measure whether it retrieves the right ones.

A small, from-scratch implementation of the pattern behind every serious "chat with your data" product: an MCP server that exposes document retrieval as a tool, a vector-RAG pipeline underneath it, and an evals harness that scores retrieval quality. No framework magic — ~200 lines you can read in one sitting.

The problem it solves

An LLM on its own can't see your private documents, and when asked about them it will confidently make things up. The fix is retrieval: look up the relevant real passages first, then answer from them, with citations.

The interesting question is how an agent gets access to that retrieval. The answer here is MCP (the Model Context Protocol) — the emerging standard for giving models tools. This repo wires retrieval up as an MCP tool, so any MCP-capable client (Claude Desktop, an agent framework, a self-hosted chat UI) can call it and answer grounded in your documents instead of guessing.

What it does

Indexes a folder of markdown docs into a local vector store.
Serves a single MCP tool, search_knowledge_base, that returns the passages most relevant to a question, each with its source file and a similarity score.
Ships an evals harness that checks, over a set of known question→document pairs, whether retrieval actually surfaces the right source (hit@k + MRR).

The bundled corpus is a fictional company handbook (corpus/), so the whole thing runs end-to-end with no setup beyond pip install.
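
Indexing starts by cutting each markdown file into passages. The sketch below shows one way to chunk on a character budget, packing whole paragraphs into a chunk until the budget is reached. It is an illustration only: the function name `chunk_markdown`, the 500-character default, and the paragraph-packing rule are assumptions, not a copy of what ingest.py does.

```python
def chunk_markdown(text: str, max_chars: int = 500) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Hypothetical helper for illustration; the budget and the packing rule
    are assumptions, not taken from the repo.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        # A paragraph longer than the budget is hard-split into budget-sized pieces.
        pieces = [para[i:i + max_chars] for i in range(0, len(para), max_chars)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate      # still fits: keep packing
            else:
                chunks.append(current)   # budget reached: close the chunk
                current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "aaa\n\nbbb\n\nccc"
passages = chunk_markdown(sample, max_chars=8)
```

With an 8-character budget the first two paragraphs share a chunk and the third starts a new one. A budget like this is simple and predictable, but it can split a thought in the middle of a section.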

Architecture

flowchart LR
    subgraph Offline["Indexing (ingest.py)"]
        D[corpus/*.md] --> C[chunk into passages]
        C --> E1[embed]
        E1 --> Q[(Qdrant<br/>vector store)]
    end

    subgraph Online["Serving (server.py)"]
        U[MCP client / LLM] -->|calls tool| T[search_knowledge_base]
        T --> E2[embed query]
        E2 --> Q
        Q -->|top-k passages + sources| T
        T -->|grounded context| U
    end

    EV[evals/run_evals.py] -.->|same retrieval path| Q

The model on the left never talks to the vector store directly. It calls the tool; the tool does the retrieval. That indirection is the whole point of MCP.
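
The embed, store, and search path in the diagram can be sketched with the standard library alone. This is a toy stand-in, not the repo's code: a bag-of-words `Counter` replaces the fastembed model, an in-memory list replaces Qdrant, and the names `ToyIndex`, `embed`, and `cosine` plus the sample file names are hypothetical. The shape of the result (passage, source, score) mirrors what the README says the tool returns.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase word counts (the repo uses a real model).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """In-memory stand-in for the vector store."""

    def __init__(self) -> None:
        self.points: list[tuple[Counter, str, str]] = []

    def add(self, text: str, source: str) -> None:
        self.points.append((embed(text), text, source))

    def search(self, query: str, top_k: int = 3) -> list[dict]:
        q = embed(query)
        hits = [
            {"text": text, "source": source, "score": cosine(q, vec)}
            for vec, text, source in self.points
        ]
        hits.sort(key=lambda h: h["score"], reverse=True)
        return hits[:top_k]


index = ToyIndex()
index.add("Employees may work remotely up to three days per week.", "remote-work.md")
index.add("Expense reports are due by the fifth of each month.", "expenses.md")
hits = index.search("How many days can I work remotely?", top_k=1)
```

The caller only ever sees `search`. Swapping the toy embedding or the list for real components would not change what the tool hands back to the model.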

How the agent knows what it can do

An MCP tool is defined by three things, and the model reads all three to decide when and how to call it:

Part	In this repo	What it's for
name	`search_knowledge_base`	how the model refers to the tool
description	the tool's docstring in `server.py`	the model reads this to decide when to call it
input schema	the typed arguments (`query: str`, `top_k: int`)	tells the model how to call it

That contract — name + description + schema — is the entire interface between the model and your code. Get the description right and the model uses the tool well; that's most of the "prompt engineering" in an agentic system.
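
To make the contract concrete, the sketch below derives all three parts from an ordinary Python function using only the standard library. It approximates what an SDK does when a function is registered as a tool; it is not FastMCP's implementation. The helper `describe_tool`, the docstring wording, and the `top_k` default of 3 are assumptions. `inputSchema` is the field name the MCP specification uses for a tool's input schema.

```python
import inspect
from typing import get_type_hints

# Minimal Python-type to JSON-Schema-type mapping for this illustration.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(fn) -> dict:
    """Build a name + description + schema descriptor from a typed function."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = inspect.signature(fn).parameters
    properties = {name: {"type": JSON_TYPES[tp]} for name, tp in hints.items()}
    # Arguments without a default are the ones the model must supply.
    required = [n for n in properties if params[n].default is inspect.Parameter.empty]
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def search_knowledge_base(query: str, top_k: int = 3) -> list:
    """Search the company handbook and return the most relevant passages with their sources."""
    return []  # retrieval omitted; only the signature matters here


spec = describe_tool(search_knowledge_base)
```

Everything the model knows about the tool is in `spec`, which is why the docstring deserves as much care as the code behind it.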

Run it

make install      # create a venv, install deps
make ingest       # build the vector index from corpus/
make evals        # score retrieval quality
make serve        # run the MCP server (stdio)
# or just:
make demo         # ingest + evals, end to end

To use it from an MCP client (e.g. Claude Desktop), register the server using the example entry in mcp-client-config.example.json, correcting the absolute path for your machine. After you restart the client, the model gains a search_knowledge_base tool.
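
Because the client launches the server itself, the config must contain absolute paths. The sketch below generates an entry with the paths resolved for the current machine. The `mcpServers`, `command`, and `args` layout is the one Claude Desktop reads; the `.venv/bin/python` interpreter location and `server.py` sitting at the repo root are assumptions, and the bundled example file may differ.

```python
import json
from pathlib import Path


def client_config_entry(repo_dir: str) -> dict:
    """Return a client config entry with absolute paths (layout assumed, see above)."""
    repo = Path(repo_dir).resolve()
    return {
        "mcpServers": {
            "mcp-ai-workspace": {
                # Assumed venv layout on Linux/macOS; Windows uses Scripts\python.exe.
                "command": str(repo / ".venv" / "bin" / "python"),
                "args": [str(repo / "server.py")],
            }
        }
    }


entry = client_config_entry(".")
print(json.dumps(entry, indent=2))
```

Run it from the repo root and merge the printed entry into the client's configuration file.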

Evals

make evals runs evals/evalset.json — questions whose correct source document is known — and reports:

hit@k — fraction of questions where the right document is in the top-k
MRR — mean reciprocal rank, which rewards ranking the right doc first

The run exits non-zero if hit@k falls below the threshold, so it can gate CI. "I built RAG" is cheap; "I measure RAG, and here's the number" is the point.
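
Both metrics are a few lines each. The sketch below computes them from a list of (ranked sources, expected source) pairs and turns the result into an exit code. The input shape, the function names, the sample file names, and the 0.6 threshold are assumptions made for illustration; evals/run_evals.py and evalset.json may be structured differently.

```python
def reciprocal_rank(ranked_sources: list[str], expected: str) -> float:
    """1/rank of the first correct source, or 0.0 if it was not retrieved."""
    for rank, source in enumerate(ranked_sources, start=1):
        if source == expected:
            return 1.0 / rank
    return 0.0


def score_retrieval(results: list[tuple[list[str], str]], k: int) -> dict:
    """Compute hit@k and MRR over (ranked_sources, expected_source) pairs."""
    n = len(results)
    hits = sum(expected in ranked[:k] for ranked, expected in results)
    rr_total = sum(reciprocal_rank(ranked, expected) for ranked, expected in results)
    return {"hit@k": hits / n, "mrr": rr_total / n}


def exit_code_for(hit_at_k: float, threshold: float) -> int:
    # Non-zero means failure, which is what lets a CI job gate on the result.
    return 0 if hit_at_k >= threshold else 1


results = [
    (["remote-work.md", "expenses.md", "security.md"], "remote-work.md"),  # rank 1
    (["expenses.md", "security.md", "remote-work.md"], "security.md"),     # rank 2
    (["expenses.md", "remote-work.md", "onboarding.md"], "security.md"),   # miss
]
metrics = score_retrieval(results, k=3)
code = exit_code_for(metrics["hit@k"], threshold=0.6)  # threshold is hypothetical
```

Here two of three questions hit within the top 3, and the reciprocal ranks are 1, 1/2, and 0, so MRR is 0.5. In a real script the returned code would be passed to `sys.exit`.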

Stack

Layer	Choice	Why
Tool protocol	MCP (`mcp` SDK, FastMCP)	the standard way to expose tools to an LLM
Vector store	Qdrant (local, on-disk)	a real vector DB API with no service to run
Embeddings	fastembed (`BAAI/bge-small-en-v1.5`)	ONNX, CPU-only, no torch, no API key

Every choice is swappable, and the pipeline is extensible: you can substitute a stronger embedding model or a hosted Qdrant, or add a synthesis step that calls an LLM to write the final answer from the retrieved passages.

What I'd add next

An answer tool that calls an LLM to synthesise a cited answer from the retrieved passages (kept out of the core so the repo runs with no API key).
Chunking by semantics rather than character budget.
A reranker, and reporting precision/recall per document, not just hit@k.

Acknowledgements

The architecture here — a self-hosted LLM fronted by MCP tools and a vector-RAG layer — follows the pattern I learned from my DevOps professor, Oriol Rius, whose course stack first showed me how these pieces fit together. This repo is my own from-scratch, minimal re-implementation, written to internalise the concepts and demonstrate them honestly in code I wrote myself.

License

MIT — see LICENSE.
