# Embeddings Searcher for Claude Code Documentation
A focused embeddings-based search system for navigating markdown documentation in code repositories.
## Features
- **Semantic Search**: Uses sentence transformers to find relevant documentation based on meaning, not just keywords
- **Markdown-Focused**: Optimized for markdown documentation with intelligent chunking
- **Repository-Aware**: Organizes and searches across multiple repositories
- **MCP Integration**: Provides an MCP server for use with Claude Code and other MCP clients such as Cursor
- **UV Package Management**: Uses UV for fast dependency management
## Quick Start for Claude Code
### 1. Clone and setup
```bash
git clone <this-repo>
cd kb
uv sync
```
### 2. Add your documentation
Place your documentation repositories in the `repos/` directory.
### 3. Index your documentation
```bash
uv run python embeddings_searcher.py --index
```
### 4. (Optional) Convert to ONNX for faster inference
```bash
uv run python onnx_convert.py --convert --test
```
### 5. Add MCP server to Claude Code
```bash
claude mcp add documentation-searcher -- uv run --directory /absolute/path/to/kb python mcp_server.py
```
*Replace `/absolute/path/to/kb` with the actual path to your project directory.*
### 6. Use in Claude Code
Ask Claude Code questions like:
- "Search for authentication patterns"
- "Find API documentation"
- "Look up configuration options"
The MCP server will automatically search through your indexed documentation and return relevant results.
## Command-Line Usage
### 1. Index Documentation
First, index all markdown documentation in your repositories:
```bash
uv run python embeddings_searcher.py --index
```
This will:
- Find all `.md`, `.markdown`, and `.txt` files in the `repos/` directory
- Chunk them intelligently based on markdown structure
- Generate embeddings using sentence transformers
- Store everything in a SQLite database (see the schema sketch below)
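The database layout is defined in `embeddings_searcher.py`; as a rough illustration, the stored data amounts to something like the following. The table and column names here are hypothetical:
```python
import sqlite3

# Hypothetical schema for illustration only -- the real tables are
# created by embeddings_searcher.py and may differ.
conn = sqlite3.connect("embeddings_docs.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS chunks (
        id        INTEGER PRIMARY KEY,
        repo      TEXT NOT NULL,   -- e.g. "my-project.git"
        path      TEXT NOT NULL,   -- file path relative to repos/
        heading   TEXT,            -- nearest enclosing markdown header
        content   TEXT NOT NULL,   -- chunk text
        embedding BLOB NOT NULL    -- float32 vector, serialized
    )
""")
conn.commit()
```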
### 2. Search Documentation
```bash
# Basic search
uv run python embeddings_searcher.py --query "API documentation"
# Search within a specific repository
uv run python embeddings_searcher.py --query "authentication" --repo "my-project.git"
# Limit results and set similarity threshold
uv run python embeddings_searcher.py --query "configuration" --max-results 5 --min-similarity 0.2
```
### 3. Get Statistics
```bash
# Show indexing statistics
uv run python embeddings_searcher.py --stats
# List indexed repositories
uv run python embeddings_searcher.py --list-repos
```
## MCP Server Integration
The project includes an MCP server for integration with Claude Code and other MCP clients such as Cursor:
```bash
# Start the MCP server
uv run python mcp_server.py
```
### MCP Tools Available
1. **search_docs**: Search through documentation using semantic similarity
2. **list_repos**: List all indexed repositories
3. **get_stats**: Get indexing statistics
4. **get_document**: Retrieve full document content by path
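Under the hood, an MCP client invokes these tools with JSON-RPC `tools/call` requests. The request shape below is standard MCP; the argument names (`query`, `max_results`) are assumptions, so check `mcp_server.py` for the exact tool schemas:
```python
# Illustrative MCP "tools/call" request sent over JSON-RPC by the client.
# The argument names below are assumptions; the authoritative schemas are
# whatever mcp_server.py registers for each tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_docs",
        "arguments": {"query": "user authentication", "max_results": 5},
    },
}
```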
## Project Structure
```
kb/
├── embeddings_searcher.py # Main searcher implementation
├── mcp_server.py # MCP server for Claude Code integration
├── onnx_convert.py # ONNX model conversion utility
├── pyproject.toml # UV project configuration
├── embeddings_docs.db # SQLite database with embeddings
├── sentence_model.onnx # ONNX model (generated)
├── model_config.json # Model configuration (generated)
├── tokenizer/ # Tokenizer files (generated)
├── repos/ # Your documentation repositories
│ ├── project1.git/
│ ├── project2.git/
│ └── documentation.git/
└── README.md # This file
```
## How It Works
### Intelligent Chunking
The system chunks markdown documents based on:
- Header structure (H1, H2, H3, etc.)
- Content length (500 words per chunk)
- Semantic boundaries
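A minimal sketch of this strategy, treating the 500-word figure as a per-chunk cap (the real chunker in `embeddings_searcher.py` also tracks header hierarchy and semantic boundaries):
```python
def chunk_markdown(text: str, max_words: int = 500) -> list[str]:
    """Split markdown into chunks at header boundaries, capping size."""
    chunks: list[str] = []
    current: list[str] = []
    words = 0
    for line in text.splitlines():
        # Flush the running chunk at each header or once the cap is hit.
        if current and (line.startswith("#") or words >= max_words):
            chunks.append("\n".join(current))
            current, words = [], 0
        current.append(line)
        words += len(line.split())
    if current:
        chunks.append("\n".join(current))
    return chunks
```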
### Embedding Generation
- Uses the `all-MiniLM-L6-v2` sentence transformer model by default
- Supports ONNX models for faster inference
- Caches embeddings for efficient updates
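In sentence-transformers terms, the core of this step looks like the following sketch; `all-MiniLM-L6-v2` produces 384-dimensional vectors:
```python
from sentence_transformers import SentenceTransformer

# The default model; override with --model. Weights are downloaded
# on first use.
model = SentenceTransformer("all-MiniLM-L6-v2")

# encode() returns one 384-dimensional float32 vector per input text.
embeddings = model.encode(["chunk one text", "chunk two text"])
print(embeddings.shape)  # (2, 384)
```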
### Search Algorithm
1. Generates embedding for your query
2. Compares against all document chunks using cosine similarity
3. Returns ranked results with context and metadata
4. Supports repository-specific searches
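A sketch of the ranking step with NumPy (the actual searcher loads the stored chunk embeddings from the SQLite database):
```python
import numpy as np

def top_k(query_vec: np.ndarray, chunk_vecs: np.ndarray,
          k: int = 10, min_similarity: float = 0.1) -> list[tuple[int, float]]:
    """Rank chunk embeddings against a query vector by cosine similarity."""
    # Normalize so a plain dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = c @ q
    best = np.argsort(sims)[::-1][:k]
    return [(int(i), float(sims[i])) for i in best if sims[i] >= min_similarity]
```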
## CLI Options
### embeddings_searcher.py
```bash
# Indexing
--index # Index all repositories
--force # Force reindex of all documents
# Search
--query "search terms" # Search query
--repo "repo-name" # Search within specific repository
--max-results 10 # Maximum results to return
--min-similarity 0.1 # Minimum similarity threshold
# Information
--stats # Show indexing statistics
--list-repos # List indexed repositories
# Configuration
--kb-path /path/to/kb # Path to knowledge base
--db-path embeddings.db # Path to embeddings database
--model model-name # Sentence transformer model
--ignore-dirs [DIRS...] # Directories to ignore during indexing
```
### mcp_server.py
```bash
--kb-path /path/to/kb # Path to knowledge base
--docs-db-path embeddings.db # Path to docs embeddings database
--model model-name # Sentence transformer model
```
## ONNX Model Conversion
For faster inference, you can convert the sentence transformer model to ONNX format:
```bash
# Convert model to ONNX
uv run python onnx_convert.py --convert
# Test ONNX model
uv run python onnx_convert.py --test
# Convert and test in one command
uv run python onnx_convert.py --convert --test
```
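Once converted, inference can run through `onnxruntime`. A sketch, assuming the exported model's first output is the token-embedding tensor; the actual input and output names are recorded in `model_config.json`:
```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Both artifacts are generated by onnx_convert.py.
tokenizer = AutoTokenizer.from_pretrained("tokenizer/")
session = ort.InferenceSession("sentence_model.onnx")

encoded = tokenizer(["example query"], padding=True, truncation=True,
                    return_tensors="np")
# Feed only the inputs the exported graph actually declares.
input_names = {i.name for i in session.get_inputs()}
feed = {k: v for k, v in encoded.items() if k in input_names}
token_embeddings = session.run(None, feed)[0]  # assumed first output

# Mean-pool token embeddings into one sentence vector, ignoring padding.
mask = encoded["attention_mask"][..., None]
sentence_vec = (token_embeddings * mask).sum(axis=1) / mask.sum(axis=1)
```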
## Example Usage
```bash
# Index documentation
uv run python embeddings_searcher.py --index
# Search for API documentation
uv run python embeddings_searcher.py --query "API endpoints"
# Search for authentication in specific repository
uv run python embeddings_searcher.py --query "user authentication" --repo "my-project.git"
# Get detailed statistics
uv run python embeddings_searcher.py --stats
```
## Performance
- **Indexing**: ~1400 documents in ~1 minute
- **Search**: Sub-second response times
- **Storage**: ~50MB for embeddings database with 6500+ chunks
- **Memory**: ~500MB during indexing, ~200MB during search
## Troubleshooting
### Unicode Errors
Some files may have encoding issues. The system automatically falls back to latin-1 encoding for problematic files.
### Large Files
Files larger than 1MB are automatically skipped to prevent memory issues.
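Taken together, those two safeguards behave roughly like this sketch (`MAX_FILE_SIZE` and `read_doc` are illustrative names, not the actual identifiers):
```python
from pathlib import Path

MAX_FILE_SIZE = 1024 * 1024  # files above 1 MB are skipped

def read_doc(path: Path) -> str | None:
    """Read a documentation file, or return None if it is too large."""
    if path.stat().st_size > MAX_FILE_SIZE:
        return None  # keep memory use bounded
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte, so this read cannot fail to decode.
        return path.read_text(encoding="latin-1")
```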
### Model Loading
If sentence-transformers is not available, the system will attempt to use ONNX models or fall back to dummy embeddings for testing.