# Embeddings Searcher for Claude Code Documentation
A focused embeddings-based search system for navigating markdown documentation in code repositories.
## Features
- **Semantic Search**: Uses sentence transformers to find relevant documentation based on meaning, not just keywords
- **Markdown-Focused**: Optimized for markdown documentation with intelligent chunking
- **Repository-Aware**: Organizes and searches across multiple repositories
- **MCP Integration**: Provides an MCP server for integration with Cursor/Claude
- **UV Package Management**: Uses UV for fast dependency management
## Quick Start for Claude Code
### 1. Clone and setup
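A minimal setup sketch, assuming UV is installed (the repository URL is a placeholder):

```bash
# Clone the project (placeholder URL) and install dependencies with UV
git clone https://github.com/your-org/kb.git
cd kb
uv sync
```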
### 2. Add your documentation
Place your documentation repositories in the `repos/` directory.
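For example (the repository name here is illustrative):

```bash
# Each subdirectory of repos/ is indexed as its own repository
mkdir -p repos
git clone https://github.com/your-org/project-docs.git repos/project-docs
```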
### 3. Index your documentation
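A sketch of the indexing command; the `--index` flag is an assumption, so check the script's `--help` output for the real interface:

```bash
uv run embeddings_searcher.py --index
```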
### 4. (Optional) Convert to ONNX for faster inference
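One possible shape of the conversion command (the flag name is an assumption; see the ONNX Model Conversion section below):

```bash
uv run embeddings_searcher.py --convert-onnx
```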
### 5. Add MCP server to Claude Code
Replace `/absolute/path/to/kb` with the actual path to your project directory.
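One way to register the server, assuming it is launched with `uv run mcp_server.py` (the server name `docs-search` is arbitrary):

```bash
claude mcp add docs-search -- uv run --directory /absolute/path/to/kb mcp_server.py
```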
### 6. Use in Claude Code
Ask Claude Code questions like:
- "Search for authentication patterns"
- "Find API documentation"
- "Look up configuration options"
The MCP server will automatically search through your indexed documentation and return relevant results.
## Quick Start
### 1. Index Documentation
First, index all markdown documentation in your repositories:
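A sketch of the call (the `--index` flag is the same assumption used above):

```bash
uv run embeddings_searcher.py --index
```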
This will:
- Find all `.md`, `.markdown`, and `.txt` files in the `repos/` directory
- Chunk them intelligently based on markdown structure
- Generate embeddings using sentence transformers
- Store everything in a SQLite database
### 2. Search Documentation
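Query the index from the command line. The `--search` and `--repo` flags below are assumptions about the CLI shape:

```bash
# Search across all indexed repositories
uv run embeddings_searcher.py --search "authentication patterns"

# Restrict the search to a single repository (hypothetical flag)
uv run embeddings_searcher.py --search "API documentation" --repo project-docs
```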
### 3. Get Statistics
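Print counts of indexed documents, chunks, and repositories (flag name assumed):

```bash
uv run embeddings_searcher.py --stats
```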
## MCP Server Integration
The project includes an MCP server for integration with Cursor/Claude:
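To run the server manually for a smoke test (assuming it communicates over stdio, as most MCP servers do):

```bash
uv run mcp_server.py
```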
### MCP Tools Available
- `search_docs`: Search through documentation using semantic similarity
- `list_repos`: List all indexed repositories
- `get_stats`: Get indexing statistics
- `get_document`: Retrieve full document content by path
## Project Structure
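A sketch of the layout, inferred from the files referenced elsewhere in this README; the database filename is an assumption:

```
kb/
├── embeddings_searcher.py   # Indexing and search CLI
├── mcp_server.py            # MCP server exposing the search tools
├── repos/                   # Documentation repositories to index
│   └── your-docs-repo/
├── embeddings.db            # SQLite database of chunks and embeddings (name assumed)
└── pyproject.toml           # UV-managed project metadata and dependencies
```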
## How It Works
### Intelligent Chunking
The system chunks markdown documents based on:
- Header structure (H1, H2, H3, etc.)
- Content length (500 words per chunk)
- Semantic boundaries
### Embedding Generation
- Uses the `all-MiniLM-L6-v2` sentence transformer model by default
- Supports ONNX models for faster inference
- Caches embeddings for efficient updates
### Search Algorithm
- Generates embedding for your query
- Compares against all document chunks using cosine similarity
- Returns ranked results with context and metadata
- Supports repository-specific searches
## CLI Options
### `embeddings_searcher.py`
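The authoritative list of flags comes from the script itself, assuming a standard `--help` flag:

```bash
uv run embeddings_searcher.py --help
```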
### `mcp_server.py`
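Likewise for the server (again assuming a standard `--help` flag):

```bash
uv run mcp_server.py --help
```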
## ONNX Model Conversion
For faster inference, you can convert the sentence transformer model to ONNX format:
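A sketch of the conversion step (the `--convert-onnx` flag is the same assumption used in the quick start):

```bash
uv run embeddings_searcher.py --convert-onnx
```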
## Example Usage
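A few end-to-end examples using the same hypothetical CLI shape as above:

```bash
# Index everything under repos/
uv run embeddings_searcher.py --index

# Ask a semantic question
uv run embeddings_searcher.py --search "how do I configure logging?"

# Show indexing statistics
uv run embeddings_searcher.py --stats
```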
## Performance
- Indexing: ~1400 documents in ~1 minute
- Search: Sub-second response times
- Storage: ~50MB for embeddings database with 6500+ chunks
- Memory: ~500MB during indexing, ~200MB during search
## Troubleshooting
### Unicode Errors
Some files may have encoding issues. The system automatically falls back to `latin-1` encoding for problematic files.
### Large Files
Files larger than 1MB are automatically skipped to prevent memory issues.
### Model Loading
If `sentence-transformers` is not available, the system will attempt to use ONNX models or fall back to dummy embeddings for testing.