Markdown RAG Documentation

README.md•17.7 KiB

# mcp-markdown-ragdocs

A Model Context Protocol server that provides semantic search over local Markdown documentation using hybrid retrieval.

## What it is

This is an MCP server that indexes local Markdown files and exposes a `query_documents` tool for hybrid semantic search. The server identifies relevant document sections using semantic search, keyword matching, and graph traversal, enabling efficient discovery without loading entire documentation collections into LLM context.

## Why it exists

Technical documentation, personal notes, and project wikis are typically stored as Markdown files. Searching these collections manually or with grep is inefficient. This server provides a conversational interface to query documentation using natural language while automatically keeping the index synchronized with file changes.

Existing RAG solutions require manual database setup, explicit indexing steps, and ongoing maintenance. This server eliminates that friction with automatic file watching, zero-configuration defaults, and built-in index versioning.

## Features

- Hybrid search combining semantic embeddings (FAISS), keyword search (Whoosh), and graph traversal (NetworkX)
- **Community-based boosting**: Louvain clustering detects document communities; co-community results receive score boost
- **Score-aware dynamic fusion**: Adjusts vector/keyword weights based on score variance per query
- **HyDE (Hypothetical Document Embeddings)**: `search_with_hypothesis` tool for vague queries
- Cross-encoder re-ranking for improved precision (optional, ~50ms latency)
- Query expansion via concept vocabulary for better recall
- **Git history search:** Semantic search over commit history with metadata and delta context (parallel indexing for 2-4x speedup)
- **Multi-project support:** Manage isolated indices for multiple projects on one machine with automatic project detection
- Server-Sent Events (SSE) streaming for real-time response delivery
- CLI query command with rich formatted output
- Automatic file watching with debounced incremental indexing
- Zero-configuration operation with sensible defaults
- Index versioning with automatic rebuild on configuration changes
- **Pluggable parser architecture:** Markdown and plain text (.txt) support out-of-the-box
- Rich Markdown parsing: frontmatter, wikilinks, tags, transclusions
- Reciprocal Rank Fusion for multi-strategy result merging
- Recency bias for recently modified documents
- **Memory Management System:** Persistent AI memory bank with cross-corpus linking, **calibrated scoring** (sigmoid expansion for interpretable 0-1 range), exponential decay, and ghost node graph traversal
- Local-first architecture with no external dependencies

## Installation

Requires Python 3.13+.

```zsh
git clone https://github.com/yourusername/mcp-markdown-ragdocs.git
cd mcp-markdown-ragdocs
uv sync
```

## Quick Start

### For VS Code / MCP Clients (Recommended)

Start the stdio-based MCP server for use with VS Code or other MCP clients:

```zsh
uv run mcp-markdown-ragdocs mcp
```

The server will:

1. Scan for `*.md` and `*.txt` files in the current directory
2. Build vector, keyword, and graph indices
3. Start file watching for automatic updates
4. Expose the `query_documents` tool via stdio transport

See [MCP Integration](#mcp-integration) below for VS Code configuration.

### For HTTP API / Development

Start the HTTP server on default port 8000:

```zsh
uv run mcp-markdown-ragdocs run
```

The server will:

1. Index documents (same as the `mcp` command)
2. Expose the HTTP API at `http://127.0.0.1:8000`
3. Provide REST endpoints for queries

See [API Endpoints](#api-endpoints) below for HTTP usage.

## Basic Usage

### Configuration

Create `.mcp-markdown-ragdocs/config.toml` in your project directory or at `~/.config/mcp-markdown-ragdocs/config.toml`:

```toml
[server]
host = "127.0.0.1"
port = 8000

[indexing]
documents_path = "~/Documents/Notes"  # Path to your Markdown files
index_path = ".index_data/"           # Where to store indices

[parsers]
"**/*.md" = "MarkdownParser"    # Markdown files
"**/*.txt" = "PlainTextParser"  # Plain text files

[search]
semantic_weight = 1.0   # Weight for semantic search results
keyword_weight = 1.0    # Weight for keyword search results
recency_bias = 0.5      # Boost for recently modified documents
rrf_k_constant = 60     # Reciprocal Rank Fusion constant
min_confidence = 0.3    # Score threshold (default: 0.3)
max_chunks_per_doc = 2  # Per-document limit (default: 2)
dedup_enabled = true    # Semantic deduplication (default: true)
```

The server searches for configuration files in this order:

1. `.mcp-markdown-ragdocs/config.toml` in current directory
2. `.mcp-markdown-ragdocs/config.toml` in parent directories (walks up to root)
3. `~/.config/mcp-markdown-ragdocs/config.toml` (global fallback)

This supports **monorepo workflows** where you can place a shared configuration in the repository root.

If no configuration file exists, the server uses these defaults:

- Documents path: `.` (current directory)
- Server: `127.0.0.1:8000`
- Index storage: `.index_data/`

### CLI Commands

#### Start MCP Server (stdio)

```zsh
uv run mcp-markdown-ragdocs mcp
```

Starts the stdio-based MCP server for VS Code and compatible MCP clients. Runs persistently until stopped.

#### Start HTTP Server

```zsh
uv run mcp-markdown-ragdocs run
```

Starts the HTTP API server on port 8000 (default). Override with:

```zsh
uv run mcp-markdown-ragdocs run --host 0.0.0.0 --port 8080
```

#### Query Documents (CLI)

Query documents directly from the command line:

```zsh
uv run mcp-markdown-ragdocs query "How do I configure authentication?"
```

With options:

```zsh
# JSON output for scripting
uv run mcp-markdown-ragdocs query "authentication" --json

# Limit number of results
uv run mcp-markdown-ragdocs query "authentication" --top-n 3

# Specify project context
uv run mcp-markdown-ragdocs query "authentication" --project my-project

# Enable debug mode to see search internals
uv run mcp-markdown-ragdocs query "authentication" --debug
```

**Debug Mode (`--debug`):**

Displays two formatted tables showing search internals:

1. **Search Strategy Results**: Result counts from each search strategy
   - Vector (Semantic): FAISS embedding search results
   - Keyword (BM25): Whoosh keyword search results
   - Graph (PageRank): NetworkX graph traversal results
   - Code: Code-specific search results (if enabled)
   - Tag Expansion: Query expansion results (if enabled)
2. **Compression Pipeline**: Filtering stages with counts and items removed
   - Original (RRF Fusion): Combined results before filtering
   - Confidence Filter: Low-score results removed (threshold: `min_confidence`)
   - Content Dedup: Exact duplicate content removed
   - N-gram Dedup: Near-duplicate content removed (character n-grams)
   - Semantic Dedup: Semantically similar results removed (cosine similarity)
   - Doc Limit: Per-document chunk limit applied (`max_chunks_per_doc`)

**Example output:**

```
┌─ Search Strategy Results ────────────┐
│ Strategy          │ Count            │
├───────────────────┼──────────────────┤
│ Vector (Semantic) │ 15               │
│ Keyword (BM25)    │ 8                │
│ Graph (PageRank)  │ 3                │
└───────────────────┴──────────────────┘

┌─ Compression Pipeline ──────────────────────┐
│ Stage                      │ Count │ Removed│
├────────────────────────────┼───────┼────────┤
│ Original (RRF Fusion)      │ 26    │ -      │
│ After Confidence Filter    │ 20    │ 6      │
│ After Content Dedup        │ 18    │ 2      │
│ After N-gram Dedup         │ 16    │ 2      │
│ After Semantic Dedup       │ 12    │ 4      │
│ After Doc Limit            │ 5     │ 7      │
└────────────────────────────┴───────┴────────┘
```

**Use debug mode to:**

- Understand why certain results appear or are filtered
- Tune search configuration (weights, thresholds, dedup settings)
- Diagnose low-quality results (check if semantic or keyword search dominates)
- Identify over-aggressive deduplication (high removal in dedup stages)
- Optimize performance (balance precision vs. recall via thresholds)

#### Configuration Management

Check your configuration:

```zsh
uv run mcp-markdown-ragdocs check-config
```

Force a full index rebuild:

```zsh
uv run mcp-markdown-ragdocs rebuild-index
```

| Command | Purpose | Use When |
|---------|---------|----------|
| `mcp` | Stdio MCP server | Integrating with VS Code or MCP clients |
| `run` | HTTP API server | Development, testing, or HTTP-based integrations |
| `query` | CLI query with optional `--debug` flag | Scripting, quick searches, or debugging search behavior |
| `check-config` | Validate config | Debugging configuration issues |
| `rebuild-index` | Force full reindex (documents, git commits, vocabulary) | Config changes, corrupted indices, or force rebuild |

### MCP Integration

#### VS Code Configuration

Configure the MCP server in VS Code user settings or workspace settings.

**File:** `.vscode/settings.json` or `~/.config/Code/User/mcp.json`

```json
{
  "mcpServers": {
    "markdown-docs": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mcp-markdown-ragdocs",
        "run",
        "mcp-markdown-ragdocs",
        "mcp"
      ],
      "type": "stdio"
    }
  }
}
```

**With project override:**

```json
{
  "mcpServers": {
    "markdown-docs": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mcp-markdown-ragdocs",
        "run",
        "mcp-markdown-ragdocs",
        "mcp",
        "--project",
        "my-project"
      ],
      "type": "stdio"
    }
  }
}
```

#### Claude Desktop Configuration

**File:** `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)

```json
{
  "mcpServers": {
    "markdown-docs": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mcp-markdown-ragdocs",
        "run",
        "mcp-markdown-ragdocs",
        "mcp"
      ]
    }
  }
}
```

#### Available Tools

The server exposes three MCP tools:

**`query_documents`**: Search indexed documents using hybrid search and return ranked document chunks.

**`search_git_history`**: Search git commit history using natural language queries. Returns relevant commits with metadata, message, and diff context.


**`search_with_hypothesis`**: Search using a hypothesis about expected documentation content. Embeds the hypothesis directly for semantic search (HyDE technique). Useful for vague queries where describing expected content yields better results than the query itself.

**Parameters:**

- `query` (required): Natural language query or question
- `top_n` (optional): Maximum results to return (1-100, default: 5)
- `uniqueness_mode` (optional): Result uniqueness strategy - `"off"` (allow duplicates), `"document"` (one chunk per document), or `"content"` (semantic deduplication, default)
- `min_score` (optional): Minimum confidence threshold (0.0-1.0, default: 0.3)
- `similarity_threshold` (optional): Semantic deduplication threshold (0.5-1.0, default: 0.85)
- `show_stats` (optional): Show compression statistics (default: false)

**Note:** Compression is enabled by default (`min_score=0.3`, `max_chunks_per_doc=2`, `dedup_enabled=true`) to reduce token overhead by 40-60%. Results use compact format: `[N] file § section (score)\ncontent`

**Usage Pattern:**

1. Call `query_documents` to identify relevant sections
2. Review returned chunks to locate specific files and sections
3. Use file reading tools to access full document context

**Example query from MCP client:**

```json
{
  "query": "How do I configure authentication in the API?",
  "top_n": 5,
  "min_score": 0.3
}
```

The server returns ranked document chunks with file paths, header hierarchies, and relevance scores.

**`search_git_history`**: Search git commit history using natural language queries.


**Parameters:**

- `query` (required): Natural language query describing commits to find
- `top_n` (optional): Maximum commits to return (1-100, default: 5)
- `min_score` (optional): Minimum relevance threshold (0.0-1.0, default: 0.0)
- `file_pattern` (optional): Glob pattern to filter by changed files (e.g., `src/**/*.py`)
- `author` (optional): Filter commits by author name or email
- `after` (optional): Unix timestamp to filter commits after this date
- `before` (optional): Unix timestamp to filter commits before this date

**Note:** Git history search indexes up to 200 lines of diff per commit. Indexing processes 60 commits/sec on average. Search latency averages 5ms for 10k commits.

**Example query:**

```json
{
  "query": "fix authentication bug",
  "top_n": 5,
  "file_pattern": "src/auth/**",
  "after": 1704067200
}
```

The server returns ranked commits with hash, title, author, timestamp, message, files changed, and truncated diff.

### Memory Management

The server supports an AI memory bank for persistent cross-session knowledge storage.


**Enable in configuration:**

```toml
[memory]
enabled = true
storage_strategy = "project"  # "project" or "user"
score_threshold = 0.2         # Minimum score after calibration (0.0-1.0)

# Per-type exponential decay (defaults shown)
[memory.decay_journal]
decay_rate = 0.90        # 7-day half-life
floor_multiplier = 0.1
```

**Available tools:**

- `create_memory`, `read_memory`, `update_memory`, `append_memory`, `delete_memory`: CRUD operations (system auto-generates frontmatter for `create_memory`)
- `search_memories`: Hybrid search with recency boost, tag/type filtering, and time range filtering (absolute timestamps or relative days)
- `search_linked_memories`: Find memories linking to a specific document via ghost nodes
- `get_memory_stats`: Memory bank statistics
- `merge_memories`: Consolidate multiple memories into one

**Time range filtering examples:**

```json
// Last 7 days
{"query": "bug fixes", "relative_days": 7}

// Absolute range (Jan 1-31, 2024)
{"query": "features", "after_timestamp": 1704067200, "before_timestamp": 1706745600}

// Combined with tag filtering
{"query": "auth improvements", "relative_days": 30, "filter_tags": ["security"]}
```

See [Memory Management](docs/memory.md) for complete documentation.

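
The absolute timestamps in the time range examples are Unix timestamps. The following is a minimal sketch of how a client might compute them, assuming the values are interpreted as seconds since the epoch at midnight UTC (which matches the example values). The helper names `utc_timestamp` and `absolute_range` are hypothetical and not part of the server. Only the `query`, `after_timestamp`, and `before_timestamp` field names come from the examples.

```python
from datetime import datetime, timezone


def utc_timestamp(year: int, month: int, day: int) -> int:
    """Return the Unix timestamp (seconds) for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def absolute_range(
    query: str,
    start: tuple[int, int, int],
    end: tuple[int, int, int],
) -> dict:
    """Build search arguments using the field names shown in the examples.

    Hypothetical helper: `end` is passed as the first midnight after the
    range, mirroring the Jan 1-31 example, which uses Feb 1 as the bound.
    """
    return {
        "query": query,
        "after_timestamp": utc_timestamp(*start),
        "before_timestamp": utc_timestamp(*end),
    }


# Reproduces the "Absolute range (Jan 1-31, 2024)" example payload.
args = absolute_range("features", (2024, 1, 1), (2024, 2, 1))
print(args)
```

Using an explicit `timezone.utc` avoids the local-time offset that a naive `datetime` would introduce. Whether the server treats the bounds as inclusive or exclusive is not stated here, so check [Memory Management](docs/memory.md) before relying on edge cases.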
### API Endpoints

Health check:

```zsh
curl http://127.0.0.1:8000/health
```

Server status (document count, queue size, failed files):

```zsh
curl http://127.0.0.1:8000/status
```

Query endpoint (standard):

```zsh
curl -X POST http://127.0.0.1:8000/query_documents \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication configuration"}'
```

Query endpoint (streaming SSE):

```zsh
curl -X POST http://127.0.0.1:8000/query_documents_stream \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication configuration", "top_n": 3}' \
  -N
```

The streaming endpoint returns Server-Sent Events:

```
event: search_complete
data: {"count": 3}

event: token
data: {"token": "Authentication"}

event: token
data: {"token": " is"}

event: done
data: {"results": [{"content": "...", "file_path": "auth.md", "header_path": ["Configuration"], "score": 1.0}]}
```

Example response (standard endpoint):

```json
{
  "results": [
    {
      "chunk_id": "authentication_0",
      "content": "Authentication is configured in the auth section...",
      "file_path": "docs/authentication.md",
      "header_path": ["Configuration", "Authentication"],
      "score": 0.92
    },
    {
      "chunk_id": "security_2",
      "content": "Security settings include authentication tokens...",
      "file_path": "docs/security.md",
      "header_path": ["Security", "API Keys"],
      "score": 0.78
    }
  ]
}
```

**MCP Stdio Format (Compact):**

For MCP clients (VS Code, Claude Desktop), results use compact format:

```
[1] docs/authentication.md § Configuration > Authentication (0.92)
Authentication is configured in the auth section...

[2] docs/security.md § Security > API Keys (0.78)
Security settings include authentication tokens...
```

Factual queries (e.g., "getUserById function", "configure auth") truncate content to 200 characters:

```
[1] docs/api.md § Functions > getUserById (0.92)
getUserById(id: string): User | null Retrieves user by ID. Returns null if not found. Example: const user = getUserById("123"); if (user) ...
```

Each result contains:

- `chunk_id`: Unique identifier for the document chunk
- `content`: The text from the matching document chunk
- `file_path`: Source file path relative to documents directory
- `header_path`: Document structure showing nested headers (semantic "breadcrumbs")
- `score`: Calibrated confidence score [0, 1] representing absolute match quality (>0.9 = excellent, 0.7-0.9 = good, 0.5-0.7 = moderate, <0.3 = noise)

## Configuration Details

See [docs/configuration.md](docs/configuration.md) for an exhaustive configuration reference including all TOML options, defaults, and environment variable support.

## Documentation

- [Architecture](docs/architecture.md) - System design, component overview, data flow
- [Configuration](docs/configuration.md) - Complete configuration reference
- [Hybrid Search](docs/hybrid-search.md) - Search strategies and RRF fusion algorithm
- [Integration](docs/integration.md) - VS Code MCP setup and client integration
- [Memory Management](docs/memory.md) - AI memory bank, CRUD tools, ghost nodes
- [Troubleshooting](docs/troubleshooting.md) - Debug mode usage, tuning guide, common issues
- [Development](docs/development.md) - Development setup, testing, contributing

## License

MIT
