wandering-rag-mcp
A local RAG (Retrieval-Augmented Generation) knowledge base MCP server that exposes semantic document search as tools. Uses zvec (Alibaba's embedded vector database) for vector storage and Qwen3-Embedding-0.6B for text embedding.
No external LLM required — the MCP server handles retrieval, and the client (QoderWork, Claude Desktop, etc.) provides generation.
Features
Multi-format support: Plain text files (40+ types: md, txt, py, js, ts, go, rs, etc.) and binary documents (PDF, DOCX, PPTX, XLSX)
Embedded vector DB: zvec — zero-config, no Docker, WAL-persistent, HNSW-indexed
Local embedding: Qwen3-Embedding-0.6B (0.6B params, 1024-dim, 32K context, bilingual CN/EN)
Three transport modes: stdio, SSE, Streamable HTTP
Multi-collection: Isolate documents into separate knowledge bases
Quick Start
Prerequisites
Python >= 3.10
Install
git clone <repo-url>
cd wandering-rag-mcp
pip install -e .
Run
# stdio mode (default, for QoderWork / Claude Desktop)
python server.py
# SSE mode
python server.py --mode sse --port 8000
# Streamable HTTP mode
python server.py --mode streamable-http --host 0.0.0.0 --port 8000
Environment variables are also supported:
Variable | Description | Default |
| Transport mode | |
| Bind host | |
| Bind port | |
| Embedding model name | |
| Vector data directory | |
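As an illustration of how command-line flags and environment variables can be combined, here is a minimal sketch in which a flag overrides an environment variable, which in turn overrides a built-in default. The flags `--mode`, `--host`, and `--port` come from the commands above. The variable names `RAG_MODE`, `RAG_HOST`, and `RAG_PORT`, the function `parse_config`, the precedence order, and the host and port defaults are assumptions for illustration, not the server's actual names or behavior.

```python
import argparse
import os


def parse_config(argv=None, env=None):
    """Resolve settings: CLI flag first, then environment variable, then default."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="RAG MCP server (sketch)")
    # RAG_MODE, RAG_HOST and RAG_PORT are hypothetical variable names.
    parser.add_argument(
        "--mode",
        choices=["stdio", "sse", "streamable-http"],
        default=env.get("RAG_MODE", "stdio"),  # stdio is the documented default
    )
    # The host and port defaults below are assumptions.
    parser.add_argument("--host", default=env.get("RAG_HOST", "127.0.0.1"))
    parser.add_argument("--port", type=int, default=int(env.get("RAG_PORT", "8000")))
    return parser.parse_args(argv)


# The flag sets the mode, the environment sets the port, the host falls back.
config = parse_config(["--mode", "sse"], env={"RAG_PORT": "9000"})
print(config.mode, config.host, config.port)
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.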
Client Configuration
stdio Mode (QoderWork / Claude Desktop)
{
  "mcpServers": {
    "wandering-rag-mcp": {
      "command": "python",
      "args": ["D:\\repos\\rag-mcp\\server.py"]
    }
  }
}
SSE Mode
{
  "mcpServers": {
    "wandering-rag-mcp": {
      "url": "http://your-server:8000/sse"
    }
  }
}
Streamable HTTP Mode
{
  "mcpServers": {
    "wandering-rag-mcp": {
      "url": "http://your-server:8000/mcp"
    }
  }
}
MCP Tools
search
Search the knowledge base with natural language queries.
Parameter | Type | Default | Description |
| string | (required) | Natural language search query |
| int | 5 | Number of results to return |
| string | | Collection to search |
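To illustrate what a top-k search returns, here is a minimal sketch that ranks chunks by cosine similarity using a brute-force scan. It is a stand-in, not the server's implementation: zvec performs the nearest-neighbor search over its own index, and real vectors come from the embedding model. The function names, the dictionary keys, the choice of cosine similarity, and the three-dimensional toy vectors are all assumptions for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, chunks, top_k=5):
    """Return the top_k chunks most similar to query_vec, best match first."""
    scored = [(cosine(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"score": round(score, 4), "text": c["text"], "source": c["source"]}
        for score, c in scored[:top_k]
    ]


# Toy data: real vectors would have 1024 dimensions.
chunks = [
    {"text": "Gradient descent updates weights.", "source": "ml.md", "vector": [0.9, 0.1, 0.0]},
    {"text": "Borrow checker rules.", "source": "rust.md", "vector": [0.0, 0.2, 0.9]},
    {"text": "Neural networks learn features.", "source": "ml.md", "vector": [0.8, 0.3, 0.1]},
]
for hit in search([1.0, 0.0, 0.0], chunks, top_k=2):
    print(hit["score"], hit["source"], hit["text"])
```

A linear scan like this is only practical for small collections; an index-based search trades a little accuracy for much faster lookups as the collection grows.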
ingest_file
Import a single file into the knowledge base.
Parameter | Type | Default | Description |
| string | (required) | Path to the file |
| string | | Target collection |
| int | 500 | Max characters per chunk |
Supported formats: .md, .txt, .py, .js, .ts, .pdf, .docx, .pptx, .xlsx, and 40+ more.
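The effect of the chunk-size parameter can be sketched with a simple sliding window over characters. This is a deliberately simpler technique than the recursive chunking the project uses, and it only shows how a character limit and an overlap interact. The function name `chunk_text` and the overlap of 50 characters are assumptions; only the default of 500 characters per chunk comes from the table above.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into windows of at most chunk_size characters.

    Each window after the first starts `overlap` characters before the end
    of the previous one, so neighboring chunks share that much text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(1200))
pieces = chunk_text(sample)
print([len(piece) for piece in pieces])
```

Overlap keeps a sentence that straddles a boundary retrievable from either side, at the cost of storing some text twice.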
ingest_directory
Batch import all files in a directory.
Parameter | Type | Default | Description |
| string | (required) | Directory path |
| string | | Target collection |
| bool | | Scan subdirectories |
| string | | Comma-separated extensions filter (empty = all supported) |
| int | 500 | Max characters per chunk |
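The directory scan described by these parameters could look roughly like the following sketch, which walks a directory, optionally descends into subdirectories, and applies a comma-separated extension filter where an empty filter means all supported types. The function name `collect_files`, its parameter names, and the `SUPPORTED` set (a subset of the formats listed earlier) are assumptions for illustration.

```python
import tempfile
from pathlib import Path

# Illustrative subset of the supported extensions; the real list is longer.
SUPPORTED = {".md", ".txt", ".py", ".js", ".ts", ".pdf", ".docx", ".pptx", ".xlsx"}


def collect_files(directory, recursive=True, extensions=""):
    """Return sorted file paths under directory that match the extension filter."""
    wanted = {
        ext if ext.startswith(".") else f".{ext}"
        for ext in (part.strip().lower() for part in extensions.split(","))
        if ext
    }
    allowed = wanted or SUPPORTED  # an empty filter means all supported types
    root = Path(directory)
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(p for p in candidates if p.is_file() and p.suffix.lower() in allowed)


# Small demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "readme.md").write_text("hello")
    (base / "src").mkdir()
    (base / "src" / "main.py").write_text("print('hi')")
    print([p.name for p in collect_files(tmp)])
    print([p.name for p in collect_files(tmp, recursive=False)])
    print([p.name for p in collect_files(tmp, extensions="py, .txt")])
```

Normalizing the filter (trimming spaces, lowercasing, adding the leading dot) lets callers write `py,txt` or `.py, .txt` interchangeably.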
list_collections
List all knowledge base collections.
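For clients that speak the protocol directly, an MCP tool is invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `list_collections`, which takes no arguments. The helper name `tool_call_request` is hypothetical, and a real client would also perform the MCP initialization handshake before sending it.

```python
import json


def tool_call_request(name, arguments=None, request_id=1):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


print(json.dumps(tool_call_request("list_collections"), indent=2))
```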
list_documents
List all documents in a collection.
Parameter | Type | Default | Description |
| string | | Collection name |
delete_document
Remove a document and all its chunks from the knowledge base.
Parameter | Type | Default | Description |
| string | (required) | Path used during import |
| string | | Collection name |
Architecture
flowchart TB
subgraph Client["MCP Client (QoderWork, etc.)"]
direction LR
C1["User question"] --> C2["Call MCP tools"] --> C3["LLM answer"]
end
Client <-->|"stdio / SSE / Streamable HTTP"| Server
subgraph Server["RAG MCP Server (FastMCP)"]
direction LR
subgraph Tools[" "]
direction TB
T1["Ingest Pipeline"] ~~~ T2["Search Pipeline"] ~~~ T3["Collection Manager"]
end
Tools --> Embed & Vec
Embed["sentence-transformers<br/>Qwen3-Embedding-0.6B"]
Vec["zvec<br/>./data/"]
end
style Client fill:#e8f4f8,stroke:#2196F3
style Server fill:#f5f5f5,stroke:#333
style Tools fill:#fff3e0,stroke:#FF9800
style Embed fill:#fce4ec,stroke:#E91E63
style Vec fill:#f3e5f5,stroke:#9C27B0
Project Structure
wandering-rag-mcp/
├── pyproject.toml # Dependencies and entry point
├── server.py # MCP server entry + 6 tool definitions
├── core/
│ ├── chunker.py # Recursive text chunking
│ ├── embeddings.py # sentence-transformers wrapper (lazy load)
│ └── vector_store.py # zvec wrapper (CRUD + search)
├── data/ # zvec storage (auto-created at runtime)
│ └── default/
└── .gitignore
How It Works
Ingest: File is read (plain text or converted via markitdown) → split into overlapping chunks → each chunk embedded into a 1024-dim vector → stored in zvec with metadata (text, source path, chunk index)
Search: Query text → embedded into vector → zvec ANN search returns top-k nearest chunks with similarity scores → returned as formatted text with source references
Document ID: SHA256 hash of the file path (first 16 chars) is used as a stable document ID, enabling idempotent re-imports and deletion by file path.
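The document ID scheme, and why it makes re-imports idempotent, can be sketched as follows. The hash construction follows the description above; the UTF-8 encoding of the path, the use of the hex digest, the absence of path normalization, the `ToyStore` class, and the `<doc_id>_<index>` chunk key layout are assumptions for illustration, with a plain dictionary standing in for zvec.

```python
import hashlib


def document_id(path):
    """First 16 hex characters of the SHA-256 digest of the file path."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:16]


class ToyStore:
    """In-memory stand-in for the vector store; maps chunk keys to chunk text."""

    def __init__(self):
        self.chunks = {}

    def delete(self, path):
        """Remove every chunk belonging to path; return how many were removed."""
        prefix = document_id(path) + "_"
        stale = [key for key in self.chunks if key.startswith(prefix)]
        for key in stale:
            del self.chunks[key]
        return len(stale)

    def ingest(self, path, pieces):
        """Store the chunks for path, replacing any from an earlier import."""
        self.delete(path)
        doc = document_id(path)
        for index, piece in enumerate(pieces):
            self.chunks[f"{doc}_{index}"] = piece  # hypothetical key layout
        return len(pieces)


store = ToyStore()
store.ingest("notes/ml.md", ["chunk a", "chunk b", "chunk c"])
store.ingest("notes/ml.md", ["chunk a (edited)", "chunk b"])  # re-import
print(len(store.chunks))
print(store.delete("notes/ml.md"))
```

Because the ID depends only on the path string, the same file imported under a different path (for example, relative versus absolute) would be treated as a separate document in this sketch.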
Dependencies
Package | Purpose |
| MCP protocol SDK (FastMCP) |
| Embedded vector database by Alibaba |
| Load and run embedding models |
| Convert PDF/DOCX/PPTX/XLSX to Markdown |
Technical Documentation
For detailed architecture and technical stack explanation, see Architecture Document.
License
MIT