Markdown RAG Documentation

# Configuration Reference

This document provides an exhaustive reference for all configuration options and CLI commands in mcp-markdown-ragdocs.

## CLI Commands

### mcp

Start stdio-based MCP server for VS Code and compatible MCP clients.

```zsh
uv run mcp-markdown-ragdocs mcp [OPTIONS]
```

**Options:**

- `--project TEXT`: Override project detection by specifying project name or absolute path

**Usage:**

Starts a persistent MCP server using stdio transport. VS Code or Claude Desktop manages the server lifecycle. The server remains running until terminated by the client or user.

**Examples:**

```zsh
# Start with automatic project detection
uv run mcp-markdown-ragdocs mcp

# Start with specific project
uv run mcp-markdown-ragdocs mcp --project monorepo

# Start with project path
uv run mcp-markdown-ragdocs mcp --project /home/user/myproject
```

**When to Use:**

- Integrating with VS Code MCP extension
- Integrating with Claude Desktop
- Any MCP client supporting stdio transport

**Behavior:**

- Indexes documents on startup (or loads existing index)
- Starts file watcher for automatic updates
- Exposes `query_documents` tool via stdio
- Runs persistently until stopped

### run

Start HTTP API server for development and testing.

```zsh
uv run mcp-markdown-ragdocs run [OPTIONS]
```

**Options:**

- `--host TEXT`: IP address to bind (default: 127.0.0.1)
- `--port INTEGER`: TCP port to bind (default: 8000)
- `--project TEXT`: Override project detection

**Usage:**

Starts an HTTP server exposing REST API endpoints. Suitable for development, testing, or custom HTTP-based integrations.

**Examples:**

```zsh
# Start on default host/port
uv run mcp-markdown-ragdocs run

# Listen on all interfaces
uv run mcp-markdown-ragdocs run --host 0.0.0.0

# Custom port
uv run mcp-markdown-ragdocs run --port 8080

# With project override
uv run mcp-markdown-ragdocs run --project my-docs
```

**When to Use:**

- Development and testing
- Direct HTTP API access
- Custom integrations not using MCP
- Debugging query behavior

**Behavior:**

- Same indexing and file watching as `mcp` command
- Exposes HTTP endpoints: `/health`, `/status`, `/query_documents`
- Runs persistently until stopped (Ctrl+C)

### query

Query documents directly from command line.

```zsh
uv run mcp-markdown-ragdocs query QUERY_TEXT [OPTIONS]
```

**Arguments:**

- `QUERY_TEXT`: Natural language query or question (required)

**Options:**

- `--json`: Output results as JSON instead of formatted text
- `--top-n INTEGER`: Maximum number of results (default: 5, max: 100)
- `--project TEXT`: Override project detection

**Usage:**

Executes a one-time query against the indexed documents and outputs results to stdout. Suitable for scripting, testing, or quick searches.

**Examples:**

```zsh
# Basic query with formatted output
uv run mcp-markdown-ragdocs query "How do I configure authentication?"

# JSON output for scripting
uv run mcp-markdown-ragdocs query "authentication" --json

# Limit results
uv run mcp-markdown-ragdocs query "deployment" --top-n 3

# Query specific project
uv run mcp-markdown-ragdocs query "API reference" --project monorepo
```

**Output Formats:**

**Formatted (default):**

```
Query: How do I configure authentication?

Found 3 results:

╭─ #1 Score: 0.8542 ───────────────────────────────╮
│ Document: authentication                         │
│ Section: Configuration > OAuth                   │
│ File: docs/auth.md                               │
│                                                  │
│ To configure authentication, set the auth        │
│ section in config.toml...                        │
╰──────────────────────────────────────────────────╯
```

**JSON:**

```json
{
  "query": "How do I configure authentication?",
  "results": [
    {
      "doc_id": "authentication",
      "content": "To configure authentication...",
      "file_path": "docs/auth.md",
      "header_path": "Configuration > OAuth",
      "score": 0.8542
    }
  ]
}
```

**When to Use:**

- Quick documentation searches from terminal
- Shell scripts processing query results
- Testing search behavior
- Verifying indexed content

**Behavior:**

- Loads existing index (does not rebuild)
- Fails if no index exists (run `rebuild-index` first)
- Suppresses logging for clean output
- Exits after displaying results

### check-config

Validate configuration and display resolved settings.

```zsh
uv run mcp-markdown-ragdocs check-config [OPTIONS]
```

**Options:**

- `--project TEXT`: Override project detection

**Usage:**

Loads configuration, validates all settings, and displays effective values including project detection results.

**Examples:**

```zsh
# Check configuration
uv run mcp-markdown-ragdocs check-config

# Check with project override
uv run mcp-markdown-ragdocs check-config --project monorepo
```

**Output:**

```
╭─ Configuration ─────────────────────────────────╮
│ Setting              │ Value                    │
├──────────────────────┼──────────────────────────┤
│ Server Host          │ 127.0.0.1                │
│ Server Port          │ 8000                     │
│ Documents Path       │ /home/user/docs          │
│ Index Path           │ .index_data/             │
│ Recursive            │ True                     │
│                      │                          │
│ Registered Projects  │ 3 project(s)             │
│   • monorepo         │ /home/user/work/mono     │
│   • notes            │ /home/user/notes         │
│                      │                          │
│ Active Project       │ ✅ monorepo              │
│                      │                          │
│ Semantic Weight      │ 1.0                      │
│ Keyword Weight       │ 1.0                      │
│ Recency Bias         │ 0.5                      │
╰─────────────────────────────────────────────────╯

✅ Configuration is valid
📊 Index exists at: /home/user/.local/share/mcp-markdown-ragdocs/monorepo
```

**When to Use:**

- Debugging configuration issues
- Verifying project detection
- Checking resolved paths before indexing
- Confirming multi-project setup

**Behavior:**

- Loads and validates
  configuration
- Does not modify any files
- Does not build or load indices
- Exits after displaying information

### rebuild-index

Force a full index rebuild.

```zsh
uv run mcp-markdown-ragdocs rebuild-index [OPTIONS]
```

**Options:**

- `--project TEXT`: Override project detection

**Usage:**

Forces a complete reindex of all documents. Deletes existing indices and rebuilds from scratch. If enabled via configuration, also indexes git commit history and builds concept vocabulary for query expansion.

**Examples:**

```zsh
# Rebuild index for current directory/project
uv run mcp-markdown-ragdocs rebuild-index

# Rebuild specific project
uv run mcp-markdown-ragdocs rebuild-index --project monorepo
```

**Output:**

```
Indexing documents... ████████████████████ 147/147 00:00:45
✅ Successfully rebuilt index: 147 documents indexed
Found 2 git repository(ies)
Indexing git commits... ████████████████████ 523/523 00:01:32
✅ Successfully indexed 523 git commits from 2 repositories
Building concept vocabulary...
✅ Successfully built concept vocabulary: 1847 terms
```

**When to Use:**

- Configuration changes (embedding model, parsers, chunking)
- Corrupted or missing index
- After bulk document changes
- Testing indexing behavior
- Force rebuild of git commit index or concept vocabulary

**Normal Indexing vs. Rebuild:**

By default, the system uses **incremental indexing**:

- Documents: Only modified files reindexed via file watcher
- Git commits: Only new commits indexed on startup
- State persisted: Last indexed timestamp stored per repository

Use `rebuild-index` to force a **full rebuild**:

- Documents: All files reindexed from scratch
- Git commits: Database cleared, all commits reindexed
- State reset: Incremental indexing resumes after rebuild

**Behavior:**

1. **Document Indexing Phase:**
   - Discovers all matching files based on `include`/`exclude` patterns
   - Displays progress bar during indexing
   - Persists new index and manifest
   - Overwrites existing index files

2. **Git Commit Indexing Phase (if `git_indexing.enabled = true`):**
   - **Clears existing commit index** (forces full rebuild)
   - Discovers git repositories respecting exclusion patterns
   - Counts total commits across all repositories
   - Displays progress bar showing commit indexing progress
   - Indexes commit metadata, message, and truncated diffs
   - Non-fatal: continues if git binary unavailable or indexing fails
   - Outputs commit count and repository count on success
   - **Note:** Path normalization ensures consistent state tracking

3. **Concept Vocabulary Building Phase (if `search.query_expansion_enabled = true`):**
   - Extracts unique terms from indexed documents
   - Filters by minimum frequency threshold
   - Embeds terms for query expansion
   - Displays vocabulary size on completion
   - Non-fatal: continues if vocabulary building fails

**Edge Cases:**

- **Git binary unavailable:** Displays warning and skips git commit indexing
- **No repositories found:** Displays informational message
- **No new commits:** Displays message indicating no commits to index (during normal startup)
- **Git indexing disabled:** Skips git commit indexing phase silently
- **Query expansion disabled:** Skips concept vocabulary building phase silently
- **Indexing failures:** Logs error and continues with remaining phases

**Incremental Indexing After Rebuild:**

After running `rebuild-index`, subsequent server startups use incremental indexing:

- Git commits: Only commits added after last rebuild are indexed
- Documents: File watcher detects changes and indexes modified files
- State persistence: Repository paths normalized for consistent timestamp tracking

**Performance Notes:**

- Document indexing: Depends on document count and size
- Git commit indexing: ~60 commits/sec on average
- Incremental git indexing: Near-instant when no new commits (zero-commit optimization)
- Concept vocabulary: Scales with corpus size and term count (5000 terms default)
- Large repositories (10k+ commits): May take several
  minutes for full rebuild
- Subsequent startups: Incremental indexing fetches only new commits since last index

**Technical Implementation:**

Git commit incremental indexing relies on:

- Per-repository timestamp tracking in SQLite
- Path normalization (absolute, no `.git` suffix, no trailing slashes)
- Zero-commit optimization skips repositories with no new commits
- Logging distinguishes incremental vs first-time indexing for observability

## Configuration File Discovery

The server searches for `config.toml` in the following order. The first file found is used:

1. `.mcp-markdown-ragdocs/config.toml` in current directory
2. `.mcp-markdown-ragdocs/config.toml` in parent directories (walks up to filesystem root)
3. `$HOME/.config/mcp-markdown-ragdocs/config.toml` (user-global configuration)

This discovery order supports **monorepo workflows** where a single configuration can be placed in a parent directory and shared across multiple projects. The server walks up the directory tree until it finds a `.mcp-markdown-ragdocs/config.toml` file or reaches the filesystem root.

If no configuration file is found, all options use their default values.

## Multi-Project Support

The global configuration file (`~/.config/mcp-markdown-ragdocs/config.toml`) supports registering multiple projects. The server automatically detects which project you're working in based on your current directory and uses isolated indices for each project.

See [Multi-Project Setup Guide](guides/multi-project-setup.md) for complete details.

### [[projects]]

Define multiple projects with automatic detection and isolated storage.

```toml
[[projects]]
name = "monorepo"
path = "/home/user/work/monorepo"

[[projects]]
name = "personal-notes"
path = "/home/user/Documents/notes"
```

#### `name`

- **Type:** string
- **Required:** yes
- **Constraints:** Alphanumeric, hyphens, and underscores only
- **Description:** Unique identifier for the project. Used as directory name in data storage.
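
The `name` constraint can be checked before a project is added to the global configuration. The following is only a minimal sketch: the helper `is_valid_project_name` and its regular expression are hypothetical, not part of mcp-markdown-ragdocs, and it assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Hypothetical pattern mirroring the documented constraint:
# alphanumeric characters, hyphens, and underscores only.
# Assumption: "alphanumeric" is limited to ASCII letters and digits.
_PROJECT_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_project_name(name: str) -> bool:
    """Return True if `name` contains only letters, digits, hyphens, and underscores."""
    return _PROJECT_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("monorepo", "personal-notes", "my docs"):
        print(candidate, is_valid_project_name(candidate))
```

Because the name doubles as a directory name in data storage, rejecting spaces and path separators up front avoids surprises on disk.
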

#### `path`

- **Type:** string
- **Required:** yes
- **Constraints:** Must be absolute path
- **Description:** Project root directory. The server detects the project when your current directory is under this path.

**Project Detection:**

- If CWD matches or is a subdirectory of a registered project path, that project is active
- For nested projects, the deepest match wins
- Indices stored in `~/.local/share/mcp-markdown-ragdocs/{project-name}/`

## Configuration Sections

### [server]

Controls the HTTP server behavior.

#### `host`

- **Type:** string
- **Default:** `"127.0.0.1"`
- **Description:** IP address to bind the server to. Use `"0.0.0.0"` to listen on all interfaces.
- **Example:**
  ```toml
  [server]
  host = "0.0.0.0"  # Listen on all interfaces
  ```

#### `port`

- **Type:** integer
- **Default:** `8000`
- **Description:** TCP port for the HTTP server.
- **Example:**
  ```toml
  [server]
  port = 8080
  ```

### [indexing]

Controls document discovery and index management.

#### `documents_path`

- **Type:** string
- **Default:** `"."`
- **Description:** Path to the directory containing documents to index. Supports tilde expansion (`~`) and relative paths (resolved to absolute).
- **Security Note:** Avoid pointing to high-level system directories (e.g., `/`, `/etc`, `$HOME`). Scope to specific project or notes folders.
- **Example:**
  ```toml
  [indexing]
  documents_path = "~/Documents/ProjectDocs"
  ```

#### `index_path`

- **Type:** string
- **Default:** `".index_data/"`
- **Description:** Directory where persistent indices are stored. Supports tilde expansion and relative paths.
- **Storage Structure:**
  ```
  {index_path}/
  ├── index.manifest.json
  ├── vector/
  ├── keyword/
  └── graph/
  ```
- **Example:**
  ```toml
  [indexing]
  index_path = "~/.cache/mcp-ragdocs-indices"
  ```

#### `recursive`

- **Type:** boolean
- **Default:** `true`
- **Description:** Whether to search subdirectories recursively when discovering documents.
- **Example:**
  ```toml
  [indexing]
  recursive = false  # Only index top-level directory
  ```

#### `include`

- **Type:** list[string]
- **Default:** `["**/*"]`
- **Description:** Glob patterns for files to include in indexing. Only files matching at least one include pattern will be indexed. Uses glob syntax similar to `.gitignore`.
- **Example:**
  ```toml
  [indexing]
  include = ["**/*.md", "**/*.txt"]  # Only markdown and text files
  ```

#### `exclude`

- **Type:** list[string]
- **Default:** `["**/.venv/**", "**/venv/**", "**/build/**", "**/dist/**", "**/.git/**", "**/node_modules/**", "**/__pycache__/**", "**/.pytest_cache/**"]`
- **Description:** Glob patterns for files/directories to exclude from indexing. Exclude patterns take precedence over include patterns.
- **Common patterns:**
  - Virtual environments: `**/.venv/**`, `**/venv/**`
  - Build artifacts: `**/build/**`, `**/dist/**`, `**/target/**`
  - Version control: `**/.git/**`, `**/.svn/**`
  - Dependencies: `**/node_modules/**`, `**/vendor/**`
  - Python cache: `**/__pycache__/**`, `**/.pytest_cache/**`
- **Example:**
  ```toml
  [indexing]
  exclude = ["**/drafts/**", "**/archive/**", "**/templates/**"]
  ```

#### `exclude_hidden_dirs`

- **Type:** boolean
- **Default:** `true`
- **Description:** Automatically exclude files in hidden directories (directories starting with `.`). When enabled, any file path containing a directory component that starts with a dot will be excluded, regardless of include/exclude patterns. This is useful for avoiding indexing of directories like `.stversions`, `.cache`, `.config`, etc.
- **Behavior:** Hidden directory exclusion is checked before include/exclude pattern matching. Set to `false` to disable this feature and rely only on explicit exclude patterns.
- **Example:**
  ```toml
  [indexing]
  exclude_hidden_dirs = false  # Disable automatic hidden directory exclusion
  ```
- **Note:** This setting works independently of the exclude patterns.
  Even if `.git` is in your exclude list, setting `exclude_hidden_dirs = true` will also exclude `.stversions`, `.cache`, and any other hidden directories without needing to list them explicitly.

#### `reconciliation_interval_seconds`

- **Type:** integer
- **Default:** `3600` (1 hour)
- **Description:** Interval in seconds between automatic reconciliation checks. Reconciliation compares the filesystem with indexed files and automatically removes stale entries (deleted files or newly excluded files) and indexes new files. Set to `0` to disable periodic reconciliation.
- **Behavior:**
  - Reconciliation always runs once on server startup
  - If enabled, runs periodically in the background at the specified interval
  - Catches edge cases like files deleted while server was offline, or config changes
- **Range:** 0 (disabled) to any positive integer (recommended: 1800-7200 seconds)
- **Example:**
  ```toml
  [indexing]
  reconciliation_interval_seconds = 1800  # Every 30 minutes
  # reconciliation_interval_seconds = 0   # Disable periodic reconciliation
  ```
- **Performance Impact:** Reconciliation is lightweight (just filesystem scan + comparison), typically adds <1s overhead per run.

### [parsers]

Maps file glob patterns to parser class names. Enables extending the server to new file types.

- **Type:** dict[string, string]
- **Default:**
  ```toml
  [parsers]
  "**/*.md" = "MarkdownParser"
  "**/*.markdown" = "MarkdownParser"
  "**/*.txt" = "PlainTextParser"
  ```
- **Description:** Keys are glob patterns matched against file paths. Values are parser class names registered in `src/parsers/`.
- **Behavior:** First matching pattern wins (pattern order matters).
- **Example:**
  ```toml
  [parsers]
  "**/*.md" = "MarkdownParser"
  "**/*.txt" = "PlainTextParser"
  "docs/api/*.md" = "APIDocParser"  # Custom parser for API docs
  ```
- **Available Parsers:**
  - `MarkdownParser`: Full markdown support with frontmatter, wikilinks, tags, and tree-sitter AST parsing
  - `PlainTextParser`: Plain text files with paragraph-based chunking (UTF-8 with fallback encoding support)

### [chunking]

Controls document chunking strategy for vector indexing.

#### `strategy`

- **Type:** string
- **Default:** `"header_based"`
- **Description:** Chunking strategy to use. Currently only `"header_based"` is supported, which splits documents at Markdown headers.
- **Note:** Changing this value triggers a full index rebuild on next startup.
- **Example:**
  ```toml
  [chunking]
  strategy = "header_based"
  ```

#### `min_chunk_chars`

- **Type:** integer
- **Default:** `200`
- **Description:** Minimum chunk size in characters. Chunks smaller than this will be merged with adjacent chunks.
- **Range:** 50 to 10000 (typical: 100 to 500)
- **Effect:** Smaller values create more granular chunks; larger values create broader context chunks.
- **Example:**
  ```toml
  [chunking]
  min_chunk_chars = 200
  ```

#### `max_chunk_chars`

- **Type:** integer
- **Default:** `2000`
- **Description:** Maximum chunk size in characters. Chunks larger than this will be split at sentence boundaries.
- **Range:** 500 to 20000 (typical: 1000 to 4000)
- **Effect:** Smaller values create more focused chunks; larger values preserve more context.
- **Example:**
  ```toml
  [chunking]
  max_chunk_chars = 1500  # Smaller chunks for focused retrieval
  ```

#### `overlap_chars`

- **Type:** integer
- **Default:** `100`
- **Description:** Number of overlapping characters between adjacent chunks. Preserves context across chunk boundaries.
- **Range:** 0 to 500 (typical: 50 to 200)
- **Effect:** Larger overlap preserves more context across boundaries but increases storage overhead.
- **Example:**
  ```toml
  [chunking]
  overlap_chars = 100
  ```

#### `include_parent_headers`

- **Type:** boolean
- **Default:** `true`
- **Description:** Whether to include parent section headers in chunk metadata. Enables semantic "breadcrumbs" in search results.
- **Example:**
  ```toml
  [chunking]
  include_parent_headers = true
  ```

#### `parent_retrieval_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable two-level chunking with parent document retrieval. When enabled, documents are chunked at two levels: larger parent sections (return unit) and smaller child chunks (retrieval unit). Search matches child chunks but returns parent sections for better context.
- **Effect:** Improves retrieval precision while providing sufficient context for LLM consumption.
- **Note:** Requires index rebuild when changing this setting.
- **Example:**
  ```toml
  [chunking]
  parent_retrieval_enabled = true
  ```

#### `parent_chunk_min_chars`

- **Type:** integer
- **Default:** `1500`
- **Description:** Minimum size in characters for parent chunks. Child chunks are grouped into parent sections until this minimum is reached.
- **Range:** 500 to 10000 (typical: 1000 to 2000)
- **Requires:** `parent_retrieval_enabled = true`
- **Example:**
  ```toml
  [chunking]
  parent_retrieval_enabled = true
  parent_chunk_min_chars = 1500
  ```

#### `parent_chunk_max_chars`

- **Type:** integer
- **Default:** `2000`
- **Description:** Maximum size in characters for parent chunks. When accumulated child content exceeds this, a new parent section begins.
- **Range:** 1000 to 20000 (typical: 1500 to 3000)
- **Requires:** `parent_retrieval_enabled = true`
- **Example:**
  ```toml
  [chunking]
  parent_retrieval_enabled = true
  parent_chunk_max_chars = 2000
  ```

**Chunking Trade-offs:**

| Setting | Small Values | Large Values |
|---------|-------------|--------------|
| `min_chunk_chars` | More granular results, may lose context | Better context, fewer results |
| `max_chunk_chars` | Focused results, less context | Broader context, may dilute relevance |
| `overlap_chars` | Less storage, less context | Better boundary matching, more storage |

### [git_indexing]

Controls git commit history indexing and search.

#### `enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable git commit history indexing and search. When enabled, the server discovers `.git` directories and indexes commit history for semantic search.
- **Requirements:** Git binary must be in PATH
- **Example:**
  ```toml
  [git_indexing]
  enabled = true
  ```

#### `exclude_patterns`

- **Type:** list[string]
- **Default:** `[".venv", "venv", "node_modules", "build", "dist", ".git", "__pycache__", ".pytest_cache", ".stversions"]`
- **Description:** Directory names (not globs) to exclude when discovering git repositories. Matches against directory basename only. Uses same list as document indexing by default.
- **Example:**
  ```toml
  [git_indexing]
  exclude_patterns = [".venv", "archive", "vendor"]
  ```

#### `max_delta_lines`

- **Type:** integer
- **Default:** `200`
- **Description:** Maximum number of diff lines to store per commit. Diffs exceeding this limit are truncated, with an indicator showing the omitted line count. Prevents embedding explosion for large commits.
- **Range:** 50 to 1000 (typical: 100 to 300)
- **Effect:** Smaller values reduce storage and embedding size; larger values preserve more diff context.
- **Example:**
  ```toml
  [git_indexing]
  max_delta_lines = 200
  ```

#### `parallel_workers`

- **Type:** integer
- **Default:** `4`
- **Description:** Number of parallel workers for git commit parsing. Uses ThreadPoolExecutor (not multiprocessing) since git subprocesses release the GIL. Provides 2-4x speedup for 100+ commits.
- **Range:** 1 to 16 (typical: 2 to 8)
- **Effect:** Higher values improve indexing throughput at cost of CPU usage.
- **Example:**
  ```toml
  [git_indexing]
  parallel_workers = 8  # Use 8 threads for faster indexing
  ```

#### `embed_batch_size`

- **Type:** integer
- **Default:** `32`
- **Description:** Batch size for embedding generation during git commit indexing. Commits are batched together before embedding to improve throughput.
- **Range:** 8 to 128 (typical: 16 to 64)
- **Effect:** Larger batches improve embedding throughput; smaller batches reduce memory usage.
- **Example:**
  ```toml
  [git_indexing]
  embed_batch_size = 64  # Larger batches for high-memory systems
  ```

**Complete Example:**

```toml
[git_indexing]
enabled = true
exclude_patterns = [".venv", "node_modules", "build"]
max_delta_lines = 200
parallel_workers = 4
embed_batch_size = 32
```

**Behavior:**

- Repository discovery respects `indexing.exclude` patterns
- Commits indexed from all branches (`git log --all`)
- Delta truncation applied per commit (early lines prioritized)
- Embedding model shared with document indexing (BAAI/bge-small-en-v1.5)
- Incremental updates via GitWatcher monitoring `.git/HEAD` and `.git/refs/`

**Performance:**

- Indexing: 60-240 commits/sec depending on `parallel_workers` (2-4x speedup with parallelization)
- Query: 5ms average for 10k commits
- Storage: ~2KB per commit (metadata + embedding + truncated delta)
- Parallelization: ThreadPoolExecutor used (not ProcessPoolExecutor) because git subprocesses release the GIL

### [search]

Controls hybrid search behavior and result fusion.
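
To make "result fusion" concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion (RRF) using the `score = 1 / (k + rank)` formula documented for `rrf_k_constant`. It is an illustration under stated assumptions, not the project's implementation: the function `weighted_rrf`, the strategy names, 1-based ranks, and the choice to multiply each strategy's contribution by its weight are all hypothetical.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    rankings: dict mapping a strategy name to doc IDs ordered best first.
    weights:  dict mapping a strategy name to its weight multiplier.
    k:        RRF constant; larger values flatten the advantage of top ranks.
    """
    scores = defaultdict(float)
    for strategy, ranked_ids in rankings.items():
        weight = weights.get(strategy, 1.0)  # assumed default weight of 1.0
        for rank, doc_id in enumerate(ranked_ids, start=1):  # assumed 1-based ranks
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    fused = weighted_rrf(
        rankings={"semantic": ["a", "b"], "keyword": ["b", "c"]},
        weights={"semantic": 1.0, "keyword": 1.0},
        k=60,
    )
    for doc_id, score in fused:
        print(f"{doc_id}: {score:.4f}")
```

In this sketch a document that appears in both lists accumulates two contributions, so it outranks documents found by only one strategy. Raising one strategy's weight scales only that strategy's contributions.
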

#### `semantic_weight`

- **Type:** float
- **Default:** `1.0`
- **Description:** Weight multiplier for semantic (vector) search results in RRF fusion. Higher values increase influence of semantic similarity.
- **Range:** 0.0 to infinity (typical: 0.5 to 2.0)
- **Example:**
  ```toml
  [search]
  semantic_weight = 1.5  # Prefer semantic matches
  ```

#### `keyword_weight`

- **Type:** float
- **Default:** `1.0`
- **Description:** Weight multiplier for keyword (BM25) search results in RRF fusion. Higher values increase influence of exact term matches.
- **Range:** 0.0 to infinity (typical: 0.5 to 2.0)
- **Example:**
  ```toml
  [search]
  keyword_weight = 1.0  # Default weight; raise above 1.0 to favor keyword matches
  ```

#### `recency_bias`

- **Type:** float
- **Default:** `0.5`
- **Description:** Configuration option for recency boost. **Note:** The current implementation uses fixed tier multipliers (1.2x for 7 days, 1.1x for 30 days) regardless of this setting. Reserved for future dynamic tier calculation.
- **Range:** 0.0 (no recency bias) to 1.0 (maximum recency bias)
- **Recency Tiers (applied during fusion):**
  - Last 7 days: 1.2x
  - Last 30 days: 1.1x
  - Over 30 days: 1.0x
- **Example:**
  ```toml
  [search]
  recency_bias = 0.0  # (Currently unused, tiers are fixed)
  ```

#### `rrf_k_constant`

- **Type:** integer
- **Default:** `60`
- **Description:** Constant `k` in Reciprocal Rank Fusion formula: `score = 1 / (k + rank)`. Higher values dampen the effect of top-ranked results.
- **Range:** 1 to infinity (typical: 20 to 100)
- **Effect:** Lower values make top results more dominant. Higher values distribute scores more evenly.
- **Example:**
  ```toml
  [search]
  rrf_k_constant = 30  # Increase influence of top-ranked results
  ```

#### `min_confidence`

- **Type:** float
- **Default:** `0.3`
- **Description:** Minimum calibrated confidence threshold for results. Results below this threshold are filtered out. Set to `0.0` to disable filtering.
- **Range:** 0.0 to 1.0 (recommended: 0.3 for filtering low-relevance results)
- **Effect:** Higher values return fewer but more relevant results. When no results meet the threshold, an empty list is returned.
- **Notes:** Threshold applies **after sigmoid calibration** of RRF+recency scores. Legacy min-max normalization is deprecated and not used for final scores.
- **Example:**
  ```toml
  [search]
  min_confidence = 0.3  # Filter results below 30% confidence
  ```

#### `score_calibration_threshold`

- **Type:** float
- **Default:** `0.035`
- **Description:** RRF score corresponding to 50% confidence in sigmoid calibration. Controls the midpoint of the calibration curve. Raw RRF scores above this threshold map to >0.5 confidence, scores below map to <0.5 confidence.
- **Range:** 0.01 to 0.10 (typical: 0.03 to 0.05)
- **Effect:** Lower values shift the curve left (higher confidence for same raw score). Higher values shift right (lower confidence for same raw score).
- **Tuning:** Adjust based on corpus characteristics. Use higher values for stricter confidence requirements.
- **Example:**
  ```toml
  [search]
  score_calibration_threshold = 0.035  # Balanced threshold
  ```

#### `score_calibration_steepness`

- **Type:** float
- **Default:** `150.0`
- **Description:** Controls steepness of the sigmoid calibration curve. Higher values create sharper transitions between low and high confidence. Lower values create gentler transitions.
- **Range:** 50.0 to 300.0 (typical: 100.0 to 200.0)
- **Effect:** Higher values emphasize score differences near the threshold, creating more separation. Lower values compress score differences, reducing separation.
- **Tuning:** Increase for clearer distinction between good and weak matches. Decrease for smoother confidence gradation.
- **Example:**
  ```toml
  [search]
  score_calibration_steepness = 150.0  # Standard steepness
  ```

#### `max_chunks_per_doc`

- **Type:** integer
- **Default:** `2`
- **Description:** Maximum number of chunks from a single document in results. Prevents result lists dominated by one document with many matching sections. Set to `0` to disable.
- **Range:** 0 (disabled) to any positive integer (recommended: 2-3)
- **Effect:** Lower values increase result diversity across documents.
- **Example:**
  ```toml
  [search]
  max_chunks_per_doc = 3  # Max 3 chunks per document
  ```

#### `dedup_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable semantic deduplication of results. When enabled, chunks with high cosine similarity are clustered, and only one representative per cluster is returned. Reduces redundancy in results.
- **Example:**
  ```toml
  [search]
  dedup_enabled = true
  ```

#### `dedup_similarity_threshold`

- **Type:** float
- **Default:** `0.80`
- **Description:** Cosine similarity threshold for clustering chunks during deduplication. Chunks with similarity above this threshold are considered duplicates.
- **Range:** 0.0 to 1.0 (recommended: 0.80 to 0.90)
- **Effect:** Lower values cluster more aggressively (fewer results). Higher values preserve more distinct chunks.
- **Requires:** `dedup_enabled = true`
- **Example:**
  ```toml
  [search]
  dedup_enabled = true
  dedup_similarity_threshold = 0.85  # Stricter deduplication
  ```

#### `rerank_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable cross-encoder re-ranking. When enabled, a cross-encoder model re-scores the top candidates after fusion and filtering, computing query-document relevance jointly for higher precision.
- **Performance:** Adds ~50ms latency for 10 candidates on CPU.
- **Example:**
  ```toml
  [search]
  rerank_enabled = true
  ```

#### `rerank_model`

- **Type:** string
- **Default:** `"cross-encoder/ms-marco-MiniLM-L-6-v2"`
- **Description:** HuggingFace model identifier for the cross-encoder. The model is downloaded on first use and cached locally.
- **Requires:** `rerank_enabled = true`
- **Options:**
  - `cross-encoder/ms-marco-MiniLM-L-6-v2` (22MB, ~50ms/10 docs, recommended)
  - `cross-encoder/ms-marco-TinyBERT-L-2-v2` (17MB, ~30ms/10 docs, faster)
  - `BAAI/bge-reranker-base` (110MB, ~150ms/10 docs, higher quality)
- **Example:**
  ```toml
  [search]
  rerank_enabled = true
  rerank_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
  ```

#### `rerank_top_n`

- **Type:** integer
- **Default:** `10`
- **Description:** Maximum number of candidates to pass to the cross-encoder for re-ranking. The re-ranker processes this many top results from the fusion pipeline.
- **Range:** 1 to 100 (recommended: 5 to 20)
- **Effect:** Higher values may improve recall at the cost of latency (~5ms per additional candidate).
- **Requires:** `rerank_enabled = true`
- **Example:**
  ```toml
  [search]
  rerank_enabled = true
  rerank_top_n = 10
  ```

#### `adaptive_weights_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable automatic query type classification and adaptive weight adjustment. When enabled, the system detects query intent (factual, navigational, exploratory) and adjusts search strategy weights accordingly.
- **Query Types:**
  - **Factual**: Queries with code identifiers, versions, quoted phrases → keyword weight × 1.5
  - **Navigational**: Queries mentioning sections, guides, documentation → graph weight × 1.5
  - **Exploratory**: Questions (what, how, why) → semantic weight × 1.3
- **Example:**
  ```toml
  [search]
  adaptive_weights_enabled = true
  ```

#### `code_search_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable specialized code block search index.
  When enabled, code blocks extracted from Markdown are indexed separately with code-aware tokenization that handles camelCase, snake_case, and programming identifiers.
- **Effect:** Improves retrieval of code snippets, function names, and technical identifiers.
- **Example:**
  ```toml
  [search]
  code_search_enabled = true
  ```

#### `code_search_weight`

- **Type:** float
- **Default:** `1.0`
- **Description:** Weight multiplier for code search results in RRF fusion.
- **Range:** 0.0 to infinity (typical: 0.5 to 2.0)
- **Requires:** `code_search_enabled = true`
- **Example:**
  ```toml
  [search]
  code_search_enabled = true
  code_search_weight = 1.2
  ```

#### `query_expansion_enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable query expansion via concept vocabulary. When enabled, queries are expanded with semantically similar terms from a pre-built vocabulary to improve recall. Disabling this feature skips vocabulary building and reduces startup time significantly for large document collections.
- **Effect:** Improves search recall by finding documents with related but different terminology. Vocabulary building embeds unique terms from the corpus, which can be expensive for large collections.
- **Performance:** Vocabulary building time scales with corpus size (up to 5,000 terms by default, per `query_expansion_max_terms`). For large notebases (10,000+ documents), building can take several minutes.
- **Example:**
  ```toml
  [search]
  query_expansion_enabled = false  # Disable for faster startup on large corpora
  ```

#### `query_expansion_max_terms`

- **Type:** integer
- **Default:** `5000`
- **Description:** Maximum number of unique terms to include in the concept vocabulary. Reducing this value significantly decreases vocabulary building time at the cost of query expansion coverage.
- **Range:** 100 to 10000 (recommended: 1000-3000 for large corpora, 5000-10000 for small corpora)
- **Effect:** Lower values = faster build time, less comprehensive expansion.
  Higher values = slower build time, more comprehensive expansion.
- **Performance:** Each term requires one embedding forward pass (~5-10ms on CPU). 5000 terms ≈ 25-50 seconds, 2000 terms ≈ 10-20 seconds.
- **Requires:** `query_expansion_enabled = true`
- **Example:**
  ```toml
  [search]
  query_expansion_enabled = true
  query_expansion_max_terms = 2000  # Reduce from 5000 for faster builds
  ```

#### `query_expansion_min_frequency`

- **Type:** integer
- **Default:** `2`
- **Description:** Minimum number of occurrences required for a term to be included in the concept vocabulary. Terms appearing fewer times are filtered out as noise. Higher values reduce vocabulary size and build time.
- **Range:** 1 to any positive integer (recommended: 2-5)
- **Effect:** Higher values filter more terms, reducing vocabulary size and build time at the cost of coverage for rare but meaningful terms.
- **Performance:** Filtering by frequency happens before embedding, so this provides computational savings proportional to filtered terms.
- **Requires:** `query_expansion_enabled = true`
- **Example:**
  ```toml
  [search]
  query_expansion_enabled = true
  query_expansion_min_frequency = 3  # Only embed terms appearing 3+ times
  ```

#### `mmr_enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable Maximal Marginal Relevance (MMR) for result selection. MMR balances relevance with diversity by penalizing results similar to already-selected items. When enabled, replaces per-document limiting as the diversity mechanism.
- **Example:**
  ```toml
  [search]
  mmr_enabled = true
  ```

#### `mmr_lambda`

- **Type:** float
- **Default:** `0.7`
- **Description:** Lambda parameter for MMR selection. Controls the trade-off between relevance (1.0) and diversity (0.0).
- **Range:** 0.0 to 1.0
  - `1.0`: Pure relevance ranking (no diversity)
  - `0.7`: Balanced (default, recommended)
  - `0.5`: Equal weight to relevance and diversity
  - `0.3`: Diversity-focused
- **Requires:** `mmr_enabled = true`
- **Example:**
  ```toml
  [search]
  mmr_enabled = true
  mmr_lambda = 0.7
  ```

#### `ngram_dedup_enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable n-gram overlap deduplication as a fast pre-filter before semantic deduplication. Uses character trigrams and Jaccard similarity to detect near-duplicate content.
- **Effect:** Removes obvious duplicates cheaply, reducing candidates for expensive embedding-based dedup.
- **Example:**
  ```toml
  [search]
  ngram_dedup_enabled = true
  ```

#### `ngram_dedup_threshold`

- **Type:** float
- **Default:** `0.7`
- **Description:** Jaccard similarity threshold for n-gram deduplication. Chunks with n-gram similarity above this threshold are considered duplicates.
- **Range:** 0.0 to 1.0 (recommended: 0.6 to 0.8)
- **Effect:** Lower values cluster more aggressively (fewer results). Higher values preserve more distinct chunks.
- **Requires:** `ngram_dedup_enabled = true`
- **Example:**
  ```toml
  [search]
  ngram_dedup_enabled = true
  ngram_dedup_threshold = 0.7
  ```

### [search.advanced]

Advanced search features including community detection, score-aware fusion, and hypothesis-driven search.

#### `community_detection_enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable community detection using the Louvain algorithm. When enabled, documents are clustered into communities based on wikilink connectivity. Results in the same community as highly-ranked documents receive a score boost.
- **Effect:** Improves retrieval for queries touching related documentation sections.
- **Example:**
  ```toml
  [search.advanced]
  community_detection_enabled = true
  ```

#### `community_boost_factor`

- **Type:** float
- **Default:** `1.1`
- **Description:** Score multiplier applied to results sharing a community with top-ranked documents. A value of 1.1 gives co-community results a 10% score boost.
- **Range:** 1.0 to 2.0 (recommended: 1.05 to 1.2)
- **Requires:** `community_detection_enabled = true`
- **Example:**
  ```toml
  [search.advanced]
  community_detection_enabled = true
  community_boost_factor = 1.15
  ```

#### `dynamic_weights_enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable score-aware dynamic fusion weights. When enabled, the system analyzes score variance from vector and keyword searches per query. Low variance (flat scores) indicates uncertain matches, reducing that strategy's weight.
- **Effect:** Improves fusion by down-weighting unreliable signals automatically.
- **Example:**
  ```toml
  [search.advanced]
  dynamic_weights_enabled = true
  ```

#### `variance_threshold`

- **Type:** float
- **Default:** `0.1`
- **Description:** Variance threshold for dynamic weight adjustment. Strategies with score variance below this threshold have their weights reduced proportionally.
- **Range:** 0.0 to 1.0 (recommended: 0.05 to 0.2)
- **Requires:** `dynamic_weights_enabled = true`
- **Example:**
  ```toml
  [search.advanced]
  dynamic_weights_enabled = true
  variance_threshold = 0.1
  ```

#### `hyde_enabled`

- **Type:** boolean
- **Default:** `true`
- **Description:** Enable Hypothetical Document Embeddings (HyDE) search. When enabled, the `search_with_hypothesis` tool accepts a hypothesis describing expected documentation content and searches using that embedding directly.
- **Effect:** Improves retrieval for vague or abstract queries where the user can describe what they expect to find.
- **Example:**
  ```toml
  [search.advanced]
  hyde_enabled = true
  ```

#### `default_edge_type`

- **Type:** string
- **Default:** `"links_to"`
- **Description:** Default edge type for graph relationships when no explicit type is inferred from document context.
- **Allowed values:** `"links_to"`, `"implements"`, `"tests"`, `"related"`
- **Example:**
  ```toml
  [search.advanced]
  default_edge_type = "links_to"
  ```

**Complete Example:**

```toml
[search.advanced]
community_detection_enabled = true
community_boost_factor = 1.1
dynamic_weights_enabled = true
variance_threshold = 0.1
hyde_enabled = true
default_edge_type = "links_to"
```

### [llm]

Controls embedding model and LLM provider configuration.

#### `embedding_model`

- **Type:** string
- **Default:** `"local"`
- **Description:** Identifier for the embedding model. Currently only `"local"` is supported, which uses HuggingFace `BAAI/bge-small-en-v1.5` (384 dimensions).
- **Note:** Changing this value triggers a full index rebuild on next startup.
- **Example:**
  ```toml
  [llm]
  embedding_model = "local"
  ```

### [memory]

Controls the AI Memory Management System for persistent cross-session storage.

#### `enabled`

- **Type:** boolean
- **Default:** `false`
- **Description:** Enable the Memory Management System. When enabled, the server exposes memory CRUD tools and maintains a separate memory corpus with its own indices.
- **Example:**
  ```toml
  [memory]
  enabled = true
  ```

#### `storage_strategy`

- **Type:** string (enum)
- **Default:** `"project"`
- **Allowed values:** `"project"`, `"user"`
- **Description:** Storage location for memory files:
  - `"project"`: Store in `.memories/` within project directory (project-isolated)
  - `"user"`: Store in `~/.local/share/mcp-markdown-ragdocs/memories/` (shared across projects)
- **Example:**
  ```toml
  [memory]
  storage_strategy = "user"  # Shared memory bank
  ```

#### `score_threshold`

- **Type:** float
- **Default:** `0.2`
- **Range:** `[0.0, 1.0]`
- **Description:** Minimum score threshold after calibration and decay. Memories scoring below this value are filtered from search results. The threshold applies to calibrated scores (0-1 range) after sigmoid expansion and exponential decay.
- **Purpose:** Controls precision/recall trade-off by filtering low-confidence results
- **Recommendations:**
  - **Default (0.2)**: Balanced precision/recall for general use
  - **Lower (0.15 or 0.1)**: Prioritize recall, explore broadly, useful for older memory banks
  - **Higher (0.25 or 0.3)**: Prioritize precision, reduce noise, strict filtering
- **Tuning Guide:**
  - Too few results? Lower to 0.15 or 0.1
  - Too many irrelevant results? Raise to 0.25 or 0.3
  - Check decay configuration if scores seem compressed despite tuning
- **Example:**
  ```toml
  [memory]
  score_threshold = 0.2  # Default: filter results < 20% confidence

  # For high-precision retrieval (use instead of the line above)
  # score_threshold = 0.3

  # For exploratory search (use instead of the line above)
  # score_threshold = 0.15
  ```
- **Technical Details:**
  - Threshold applies after: RRF fusion → Calibration → Decay
  - Calibration uses sigmoid: `1 / (1 + exp(-steepness * (rrf_score - threshold)))`
  - Decay formula: `calibrated_score × max(floor, decay_rate^days_old)`
  - See [Memory Search Scoring](memory.md#search-scoring--calibration) for details

### [memory.recency_*]

Per-memory-type exponential additive recency boost configuration.
Memory search adds an exponentially decaying bonus to recent memories: `final_score = base_score + (boost_decay_rate^age_days × max_boost_amount)` for memories within the boost window. Older memories retain their base score (no penalty).

**Available types:** `recency_journal`, `recency_plan`, `recency_fact`, `recency_observation`, `recency_reflection`

#### `boost_window_days`

- **Type:** integer
- **Default:** Type-dependent (see table below)
- **Description:** Number of days within which to apply recency boost. Memories older than this retain base score without penalty.
- **Range:** 7 to 60 days (typical: 14 to 30 days)
- **Example:**
  ```toml
  [memory.recency_journal]
  boost_window_days = 14  # Boost journals created in last 14 days
  ```

#### `max_boost_amount`

- **Type:** float
- **Default:** Type-dependent (see table below)
- **Description:** Maximum bonus added to score at age 0. Decays exponentially as memory ages.
- **Range:** 0.0 to 0.5 (typical: 0.1 to 0.2)
- **Example:**
  ```toml
  [memory.recency_journal]
  max_boost_amount = 0.2  # Add up to 0.2 to base score
  ```

#### `boost_decay_rate`

- **Type:** float
- **Default:** Type-dependent (see table below)
- **Description:** Exponential decay rate for boost amount. Higher values = slower decay of boost.
- **Range:** 0.85 to 0.99 (typical: 0.90 to 0.98)
- **Formula:** Bonus is `(boost_decay_rate^age_days) × max_boost_amount`
- **Example:**
  ```toml
  [memory.recency_journal]
  boost_decay_rate = 0.95  # Boost decays by 5% per day
  ```

**Default Recency Boost Configurations:**

| Memory Type | Window | Max Boost | Decay Rate | Section |
|-------------|--------|-----------|------------|---------|
| `journal` | 14 days | 0.2 | 0.95 | `[memory.recency_journal]` |
| `plan` | 21 days | 0.15 | 0.93 | `[memory.recency_plan]` |
| `fact` | 7 days | 0.1 | 0.98 | `[memory.recency_fact]` |
| `observation` | 14 days | 0.2 | 0.92 | `[memory.recency_observation]` |
| `reflection` | 30 days | 0.15 | 0.98 | `[memory.recency_reflection]` |

**Example configuration:**

```toml
[memory]
enabled = true
storage_strategy = "project"
score_threshold = 0.1  # Filter results below 10% confidence

# Aggressive boost for ephemeral journals
[memory.recency_journal]
boost_window_days = 7    # Shorter window
max_boost_amount = 0.3   # Higher boost
boost_decay_rate = 0.90  # Faster decay

# Minimal boost for evergreen facts
[memory.recency_fact]
boost_window_days = 30   # Longer window
max_boost_amount = 0.05  # Small boost
boost_decay_rate = 0.99  # Very slow decay
```

**Deprecated fields:** `recency_boost_days` and `recency_boost_factor` are deprecated and ignored if present. Use per-type recency configs instead. Deprecation warnings are logged at startup.
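
The additive recency boost described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the project's implementation: the function names `recency_bonus` and `boosted_score` are hypothetical, and it assumes that a memory exactly `boost_window_days` old still falls inside the window.

```python
def recency_bonus(
    age_days: float,
    boost_window_days: int,
    max_boost_amount: float,
    boost_decay_rate: float,
) -> float:
    """Bonus added to a memory's base score (hypothetical helper).

    Assumption: a memory exactly `boost_window_days` old is still boosted;
    anything older receives no bonus and no penalty.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days > boost_window_days:
        return 0.0  # Outside the window: the base score is kept as-is
    # Documented formula: (boost_decay_rate ^ age_days) * max_boost_amount
    return (boost_decay_rate ** age_days) * max_boost_amount


def boosted_score(
    base_score: float,
    age_days: float,
    boost_window_days: int = 14,   # journal default from the table
    max_boost_amount: float = 0.2,  # journal default from the table
    boost_decay_rate: float = 0.95,  # journal default from the table
) -> float:
    """final_score = base_score + recency bonus."""
    return base_score + recency_bonus(
        age_days, boost_window_days, max_boost_amount, boost_decay_rate
    )


if __name__ == "__main__":
    # Compare a fresh journal entry with progressively older ones.
    for age in (0, 7, 14, 15):
        print(age, round(boosted_score(0.5, age), 4))
```

With the journal defaults, a memory created today gains the full `max_boost_amount`, the bonus shrinks by 5% for each day of age, and a memory older than 14 days keeps its base score unchanged.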
## Complete Example Configuration

```toml
# .mcp-markdown-ragdocs/config.toml

[server]
host = "127.0.0.1"
port = 8000

[indexing]
# Path to documentation directory
documents_path = "~/Projects/my-project/docs"

# Store indices in project-local directory
index_path = ".ragdocs-index/"

# Search subdirectories recursively
recursive = true

# Optional: Filter files by pattern
include = ["**/*.md", "**/*.txt"]
exclude = ["**/drafts/**", "**/archive/**"]

[parsers]
# Use MarkdownParser for all .md and .markdown files
"**/*.md" = "MarkdownParser"
"**/*.markdown" = "MarkdownParser"

[chunking]
# Chunking strategy for vector indexing
strategy = "header_based"
min_chunk_chars = 200
max_chunk_chars = 1500
overlap_chars = 100
include_parent_headers = true

# Parent document retrieval (two-level chunking)
parent_retrieval_enabled = false
parent_chunk_min_chars = 1500
parent_chunk_max_chars = 2000

[search]
# Balance semantic and keyword search equally
semantic_weight = 1.0
keyword_weight = 1.0

# Moderate recency bias (prioritize documents modified in last 30 days)
recency_bias = 0.6

# Standard RRF constant
rrf_k_constant = 60

# Result filtering
min_confidence = 0.3               # Filter results below 30% confidence
max_chunks_per_doc = 2             # Limit chunks per document for diversity
dedup_enabled = true               # Enable semantic deduplication
dedup_similarity_threshold = 0.85  # Similarity threshold for clustering

# N-gram deduplication (fast pre-filter)
ngram_dedup_enabled = true
ngram_dedup_threshold = 0.7

# MMR diversity selection (alternative to max_chunks_per_doc)
mmr_enabled = false
mmr_lambda = 0.7

# Query type classification
adaptive_weights_enabled = false

# Code search
code_search_enabled = false
code_search_weight = 1.0

# Query expansion optimization
query_expansion_enabled = true     # Disable for faster startup on very large corpora
query_expansion_max_terms = 5000   # Reduce (e.g., 2000) for faster vocabulary builds
query_expansion_min_frequency = 2  # Increase (e.g., 3-5) to filter low-frequency terms

# Re-ranking (adds ~50ms latency)
rerank_enabled = true
rerank_model = "cross-encoder/ms-marco-MiniLM-L-6-v2"
rerank_top_n = 10

[llm]
# Use local embedding model
embedding_model = "local"

[memory]
# Enable AI memory bank
enabled = false
storage_strategy = "project"  # "project" (local) or "user" (shared)

# Per-type decay (defaults shown)
[memory.decay_journal]
decay_rate = 0.90
floor_multiplier = 0.1

[memory.decay_plan]
decay_rate = 0.85
floor_multiplier = 0.1

[memory.decay_fact]
decay_rate = 0.98
floor_multiplier = 0.2
```

## Environment Variables

No environment variables are currently supported. All configuration is file-based.

## Configuration Scenarios

### Scenario 1: Personal Notes (Obsidian Vault)

```toml
# .mcp-markdown-ragdocs/config.toml

[indexing]
documents_path = "~/Documents/ObsidianVault"
index_path = "~/.cache/ragdocs-obsidian"

[search]
semantic_weight = 1.2  # Prefer conceptual connections
keyword_weight = 0.8
recency_bias = 0.7     # Prioritize recent notes
```

### Scenario 2: Project Documentation

```toml
# .mcp-markdown-ragdocs/config.toml

[indexing]
documents_path = "~/Projects/myapp/docs"
index_path = ".ragdocs-index/"

[search]
semantic_weight = 1.0
keyword_weight = 1.2  # Prefer exact API/function names
recency_bias = 0.3    # Documentation changes less frequently
```

### Scenario 3: Research Papers

```toml
# .mcp-markdown-ragdocs/config.toml

[indexing]
documents_path = "~/Research/Papers"
index_path = "~/.cache/ragdocs-research"
recursive = true

[search]
semantic_weight = 1.5  # Emphasize semantic similarity
keyword_weight = 0.5
recency_bias = 0.0     # Publication date more relevant than file mtime
```

### Scenario 4: Multi-Team Documentation Server

```toml
# .mcp-markdown-ragdocs/config.toml

[server]
host = "0.0.0.0"  # Listen on all interfaces
port = 8080

[indexing]
documents_path = "/opt/team-docs"
index_path = "/var/lib/ragdocs-indices"

[search]
semantic_weight = 1.0
keyword_weight = 1.0
recency_bias = 0.5
```

### Scenario 5: Monorepo Workflow

For
monorepo setups, place `.mcp-markdown-ragdocs/config.toml` in the repository root. All projects within the monorepo will inherit this configuration:

```
monorepo/
├── .mcp-markdown-ragdocs/
│   └── config.toml      # Shared configuration
├── project-a/
│   └── docs/
├── project-b/
│   └── docs/
└── shared-docs/
```

When running the server from `monorepo/project-a/`, it will discover and use `monorepo/.mcp-markdown-ragdocs/config.toml`.

```toml
# .mcp-markdown-ragdocs/config.toml (in monorepo root)

[indexing]
# Index all docs in the monorepo
documents_path = "."
index_path = ".index_data/"
recursive = true

[search]
semantic_weight = 1.0
keyword_weight = 1.0
recency_bias = 0.5
```

## Index Rebuild Triggers

A full index rebuild is automatically triggered on server startup if:

1. **No manifest exists**: First run or corrupted index
2. **spec_version changed**: Index format upgrade
3. **embedding_model changed**: Different embedding dimensions or model
4. **parsers changed**: Different parser configuration

To force a manual rebuild at any time:

```zsh
uv run mcp-markdown-ragdocs rebuild-index
```

## Configuration Validation

Check your configuration file for errors:

```zsh
uv run mcp-markdown-ragdocs check-config
```

Output shows resolved paths and all configuration values.

## CLI Override Options

The `run` command accepts command-line overrides for server settings:

```zsh
uv run mcp-markdown-ragdocs run --host 0.0.0.0 --port 8080
```

These override any values in the configuration file for that server instance only.
