# LODA MCP Server
> **LLM-Optimized Document Access** - A Model Context Protocol server for token-efficient document search in Claude Desktop and Claude Code.
---
## What is LODA?
**LODA** (LLM-Optimized Document Access) is a search strategy designed specifically for how LLMs consume documents. Instead of returning raw matches or arbitrary chunks, LODA understands document structure and returns the most relevant *sections* within your token budget.
### The Problem
When LLMs work with large documents, they face a fundamental challenge:
| Traditional Approach | Problem |
|---------------------|---------|
| Load entire document | Exceeds context limits |
| Keyword search | No relevance ranking, returns too much |
| RAG/Vector search | Requires infrastructure, 200-500ms latency |
| Chunk-based retrieval | Arbitrary boundaries break coherence |
We also found a **"gap zone"**: for content located 25-35% of the way into a document, traditional smart retrieval actually performed *worse* than brute-force loading.
### The Solution
LODA combines lightweight techniques to achieve **vector search quality** at **grep-like speeds**:
```
┌─────────────────┐     ┌──────────────────────┐     ┌───────────────────┐
│ Large Document  │────▶│  LODA Search Engine  │────▶│ Relevant Sections │
│ (5000+ lines)   │     │  • Bloom Filters     │     │ within budget     │
│                 │     │  • Token Budget      │     │ (~200 tokens)     │
│                 │     │  • Relevance Scoring │     │                   │
└─────────────────┘     │  • Smart Caching     │     └───────────────────┘
                        └──────────────────────┘
```
**Results:**
- **70-95% token savings** compared to loading the full document
- **1-5ms search latency** (cached) vs 200-500ms for RAG
- **Zero external dependencies** - no vector database needed
---
## Quick Start
### 1. Installation
```bash
git clone https://github.com/patrickkarle/loda-mcp-server.git
cd loda-mcp-server
npm install
```
### 2. Configure Claude Desktop
Find your config file:
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
Add this to the file:
```json
{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}
```
### 3. Configure Claude Code
Add to your project's `.claude/settings.json` or global `~/.claude/settings.json`:
```json
{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}
```
### 4. Use It!
Ask Claude:
> "Use loda_search to find the authentication section in api-docs.md"
> "Search architecture.md for deployment instructions with a 500 token budget"
---
## How LODA Works
### 1. Bloom Filter Elimination
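Before scoring, LODA uses [Bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) to eliminate sections that definitely don't contain your search terms. Each membership check runs in constant time regardless of section length, and this step typically eliminates 80%+ of sections. A Bloom filter can report false positives but never false negatives, so a section that really contains a term is never discarded here.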
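To make the elimination step concrete, here is a minimal Bloom filter sketch. It is not the code in `loda/bloom_filter.js`: the `BloomFilter` class, the FNV-1a-style hash, the filter size, and the rule of keeping a section when *any* query term might be present are all assumptions chosen for illustration.

```javascript
// Hypothetical sketch of per-section Bloom filters (not LODA's actual implementation).
class BloomFilter {
  constructor(bits = 1024, hashes = 3) {
    this.bits = bits;
    this.hashes = hashes;
    this.bitmap = new Uint8Array(Math.ceil(bits / 8));
  }

  // FNV-1a style hash; the seed yields several different hash functions.
  hash(term, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < term.length; i++) {
      h ^= term.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h % this.bits;
  }

  add(term) {
    for (let s = 0; s < this.hashes; s++) {
      const idx = this.hash(term, s);
      this.bitmap[idx >> 3] |= 1 << (idx & 7);
    }
  }

  // false means "definitely absent"; true means "possibly present".
  mightContain(term) {
    for (let s = 0; s < this.hashes; s++) {
      const idx = this.hash(term, s);
      if ((this.bitmap[idx >> 3] & (1 << (idx & 7))) === 0) return false;
    }
    return true;
  }
}

// Build one filter from the lowercased words of a section.
function buildSectionFilter(text) {
  const filter = new BloomFilter();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) filter.add(word);
  return filter;
}

// Assumption: a section stays a candidate if at least one query term might be present.
function candidateSections(sections, query) {
  const terms = query.toLowerCase().match(/[a-z0-9]+/g) || [];
  return sections.filter((s) => terms.some((t) => s.filter.mightContain(t)));
}

const demoSections = [
  { id: "section-1", filter: buildSectionFilter("OAuth 2.0 tokens and scopes") },
  { id: "section-2", filter: buildSectionFilter("") },
];
console.log(candidateSections(demoSections, "oauth").map((s) => s.id));
```

Only the sections that survive this check need to be scored, which is where the savings come from on large documents.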
### 2. Section-Aware Parsing
LODA respects your document's structure. It understands markdown headings and returns complete logical sections, not arbitrary text chunks.
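As a rough illustration of section-aware parsing, the sketch below splits markdown on ATX headings (`#` to `######`) and ignores `#` lines inside fenced code blocks. The function name `parseSections` is hypothetical, the field names mirror the example response in the API Reference, and dropping text that precedes the first heading is a simplification of this sketch rather than documented LODA behavior.

```javascript
// Hypothetical sketch: split markdown into heading-delimited sections.
function parseSections(markdown) {
  const lines = markdown.split(/\r?\n/);
  const sections = [];
  let current = null;
  let inFence = false;

  lines.forEach((line, i) => {
    // Toggle on ``` or ~~~ so that "# comment" lines inside code are not headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const m = inFence ? null : /^(#{1,6})\s+(.*\S)\s*$/.exec(line);
    if (m) {
      if (current) sections.push(current);
      current = {
        id: `section-${sections.length + 1}`,
        header: m[2],
        level: m[1].length,
        lineRange: [i + 1, i + 1], // 1-indexed, inclusive
        content: [],
      };
    } else if (current) {
      current.content.push(line);
    }
    // Lines before the first heading are dropped in this simplified version.
    if (current) current.lineRange[1] = i + 1;
  });

  if (current) sections.push(current);
  return sections.map((s) => ({ ...s, content: s.content.join("\n") }));
}

const demoDoc = ["# Setup", "Install it.", "```", "# not a heading", "```", "## Auth", "Use OAuth."].join("\n");
console.log(parseSections(demoDoc).map((s) => `${s.id}: ${s.header} ${JSON.stringify(s.lineRange)}`));
```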
### 3. Relevance Scoring
Each candidate section is scored based on:
- **Query term presence** in content (0.8 base score)
- **Header match bonus** (+0.2 for header matches)
- **Multi-term coverage** (all terms weighted equally)
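One way to read the three rules above is sketched below. The exact formula in `loda/relevance_scorer.js` is not documented here, so treat this as an assumption: the 0.8 base is scaled by the fraction of query terms found in the content, and the 0.2 bonus by the fraction found in the header. `tokenize` and `scoreSection` are hypothetical names.

```javascript
// Hypothetical scoring sketch; the weighting scheme is an assumption, not LODA's source.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function scoreSection(section, query) {
  const terms = [...new Set(tokenize(query))];
  if (terms.length === 0) return 0;

  const contentWords = new Set(tokenize(section.content));
  const headerWords = new Set(tokenize(section.header));

  // Every term carries the same weight: 1 / terms.length.
  const contentCoverage = terms.filter((t) => contentWords.has(t)).length / terms.length;
  const headerCoverage = terms.filter((t) => headerWords.has(t)).length / terms.length;

  // Round to two decimals to hide floating-point noise.
  return Math.round((0.8 * contentCoverage + 0.2 * headerCoverage) * 100) / 100;
}

const oauthSection = {
  header: "OAuth 2.0 Authentication",
  content: "Use OAuth for authentication against the API.",
};
console.log(scoreSection(oauthSection, "authentication oauth"));
```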
### 4. Token Budget Selection
You specify a token budget, and LODA returns the highest-scoring sections that fit:
```javascript
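// "I need info about auth, but only have 500 tokens of context"
// Arguments for a loda_search call (the variable name is illustrative)
const searchArgs = {
  documentPath: "api-docs.md",
  query: "authentication",
  contextBudget: 500
};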
```
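The selection step can be sketched as a greedy pass over the scored sections. This is an illustration under assumptions, not the code in `loda/budget_manager.js`: `selectWithinBudget` is a hypothetical name, and skipping a section that does not fit (instead of stopping) is a design choice of this sketch. Returning the first section even when it exceeds the budget follows the `EXCEEDED` status described in the API Reference.

```javascript
// Hypothetical greedy budget selection (assumed behavior, see lead-in).
function selectWithinBudget(scored, contextBudget = null, maxSections = 5) {
  const ranked = scored
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score);

  const selected = [];
  let totalTokens = 0;

  for (const section of ranked) {
    if (selected.length >= maxSections) break;
    const fits =
      contextBudget === null || totalTokens + section.tokenEstimate <= contextBudget;
    // The best section is always returned, even if it alone exceeds the budget.
    if (fits || selected.length === 0) {
      selected.push(section);
      totalTokens += section.tokenEstimate;
    }
  }
  return { selected, totalTokens };
}

const scoredDemo = [
  { id: "section-4", score: 0.8, tokenEstimate: 66 },
  { id: "section-5", score: 1.0, tokenEstimate: 88 },
  { id: "section-9", score: 0.4, tokenEstimate: 400 },
];
console.log(selectWithinBudget(scoredDemo, 500, 3));
```

With a 500-token budget, the two small sections fit (154 tokens in total) and the 400-token section is skipped.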
### 5. Aggressive Caching
Document structures and Bloom filters are cached with TTL (60s default). Repeated searches on the same document are 10x+ faster.
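A cache with a TTL and least-recently-used eviction can be sketched in a few lines. The class below is hypothetical and is not the code in `loda/loda_index.js`; the 60-second default comes from the text above, while `maxEntries` and the injectable `now` clock (useful for tests) are assumptions.

```javascript
// Hypothetical TTL + LRU cache sketch for parsed document structures.
class TtlLruCache {
  constructor({ ttlMs = 60000, maxEntries = 50, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
    this.entries = new Map(); // Map keeps insertion order: oldest entry first
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}

const indexCache = new TtlLruCache();
indexCache.set("api-docs.md", { sections: [], filters: [] });
console.log(indexCache.get("api-docs.md") !== undefined); // cache hit within the TTL
```

A production cache would probably also invalidate an entry when the file on disk changes; that is left out here.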
---
## API Reference
### loda_search
The main search tool.
**Parameters:**
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `documentPath` | string | Yes | - | Path to document (relative to staging or absolute) |
| `query` | string | Yes | - | Search keywords or phrase |
| `contextBudget` | number | No | null | Maximum tokens to return (null = unlimited) |
| `maxSections` | number | No | 5 | Maximum sections to return |
**Example Request:**
```json
{
  "documentPath": "api-docs.md",
  "query": "authentication oauth",
  "contextBudget": 500,
  "maxSections": 3
}
```
**Example Response:**
```json
{
  "query": "authentication oauth",
  "documentPath": "/path/to/api-docs.md",
  "sections": [
    {
      "id": "section-5",
      "header": "OAuth 2.0 Authentication",
      "level": 3,
      "score": 1.0,
      "lineRange": [27, 41],
      "tokenEstimate": 88
    },
    {
      "id": "section-4",
      "header": "API Key Authentication",
      "level": 3,
      "score": 0.8,
      "lineRange": [15, 26],
      "tokenEstimate": 66
    }
  ],
  "metadata": {
    "totalSections": 21,
    "candidatesAfterBloom": 5,
    "scoredAboveZero": 3,
    "returnedSections": 2,
    "totalTokens": 154,
    "budgetStatus": "SAFE",
    "truncated": false,
    "cacheHit": true
  }
}
```
### Budget Status Values
| Status | Meaning |
|--------|---------|
| `UNLIMITED` | No budget was specified |
| `SAFE` | Total tokens under 80% of budget |
| `WARNING` | Total tokens between 80% and 100% of budget |
| `EXCEEDED` | Over budget (first section always returned) |
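The table maps directly onto a small classification function. The sketch below is illustrative: `budgetStatus` is a hypothetical name, and treating exactly 80% and exactly 100% as `WARNING` is an assumption about how the boundaries are handled.

```javascript
// Hypothetical helper mirroring the status table above.
function budgetStatus(totalTokens, contextBudget = null) {
  if (contextBudget === null) return "UNLIMITED";
  if (totalTokens > contextBudget) return "EXCEEDED";
  // Assumption: the 80% boundary itself counts as WARNING.
  if (totalTokens / contextBudget >= 0.8) return "WARNING";
  return "SAFE";
}

// Using the numbers from the example response: 154 tokens against a 500-token budget.
console.log(budgetStatus(154, 500)); // SAFE
```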
### Other Tools
| Tool | Description |
|------|-------------|
| `list_document_sections` | Get hierarchical structure of document |
| `read_section` | Read specific section by ID with context |
| `read_lines` | Read specific line range |
| `search_content` | Basic regex search (no LODA optimization) |
---
## Staging Directory
By default, LODA looks for documents in the `staging/` subdirectory:
```
loda-mcp-server/
├── staging/ ← Put documents here
│ ├── api-docs.md
│ ├── architecture.md
│ └── user-guide.md
└── document_access_mcp_server.js
```
You can also use absolute paths to search any document on your system.
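The lookup rule (relative paths resolve against `staging/`, absolute paths are used as given) could be implemented roughly as follows. `resolveDocumentPath` and the `stagingDir` value are hypothetical; only Node's built-in `path` module is used.

```javascript
const path = require("node:path");

// Hypothetical sketch of the documentPath lookup rule.
function resolveDocumentPath(documentPath, stagingDir) {
  return path.isAbsolute(documentPath)
    ? path.normalize(documentPath) // absolute: use as given
    : path.resolve(stagingDir, documentPath); // relative: look inside staging/
}

// Assumed layout: staging/ sits in the current working directory.
const stagingDir = path.resolve("staging");
console.log(resolveDocumentPath("api-docs.md", stagingDir));
```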
---
## HTTP Mode (Development/Testing)
For testing without Claude, run the server in HTTP mode:
```bash
node document_access_mcp_server.js --mode=http --port=49400
```
Then test with curl:
```bash
# Health check
curl http://localhost:49400/health
# List tools
curl http://localhost:49400/tools
# Search
curl -X POST http://localhost:49400/tools/loda_search \
-H "Content-Type: application/json" \
-d '{"documentPath": "api-docs.md", "query": "authentication"}'
```
---
## Performance
| Metric | Target | Achieved |
|--------|--------|----------|
| Search latency (cached) | <10ms | **1-5ms** |
| Search latency (cold) | <100ms | **20-50ms** |
| Token savings | >70% | **70-95%** |
| Bloom filter effectiveness | >80% | **~85%** |
| Cache hit rate | >80% | **~90%** |
---
## Testing
```bash
# Run all LODA tests
npm test
# Run specific component tests
npm test -- tests/loda_search_handler.test.js
# Run with coverage
npm test -- --coverage
```
**Test Results: 46/46 Passing**
| Component | Tests | Status |
|-----------|-------|--------|
| token_estimator | 6 | ✅ |
| relevance_scorer | 8 | ✅ |
| budget_manager | 6 | ✅ |
| bloom_filter | 10 | ✅ |
| loda_index | 8 | ✅ |
| loda_search_handler | 8 | ✅ |
---
## Architecture
```
loda/
├── token_estimator.js # Pure token estimation (~4 chars/token)
├── relevance_scorer.js # Section relevance scoring
├── budget_manager.js # Token budget selection
├── loda_index.js # Cached document structure (TTL + LRU)
├── bloom_filter.js # O(1) section elimination
├── loda_search_handler.js # Main orchestrator
└── index.js # Module entry
document_access_mcp_server.js # MCP server with 5 tools
```
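The "~4 chars/token" heuristic noted for `token_estimator.js` is simple enough to sketch, along with how a savings percentage could be derived from it. Both function names are hypothetical, and rounding up with `Math.ceil` is an assumption; a real tokenizer will give different counts.

```javascript
// Hypothetical sketch of a character-based token estimate (~4 characters per token).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Percentage of tokens saved by returning `returnedText` instead of `fullText`.
function tokenSavingsPercent(fullText, returnedText) {
  const fullTokens = estimateTokens(fullText);
  if (fullTokens === 0) return 0; // avoid dividing by zero for empty documents
  return Math.round((1 - estimateTokens(returnedText) / fullTokens) * 100);
}

const fullDoc = "x".repeat(4000); // roughly 1000 tokens
const excerpt = "x".repeat(400); // roughly 100 tokens
console.log(`${tokenSavingsPercent(fullDoc, excerpt)}% saved`);
```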
---
## Research & Development
This project was built using the **Continuum Development Process (CDP)**, a 13-phase methodology that emphasizes traceability and quality gates.
### Why We Built This
We tried several approaches before arriving at LODA:
| Approach | Why It Failed |
|----------|---------------|
| **Semantic Chunking** | Arbitrary boundaries split logical units |
| **RAG + Vector Search** | Too much infrastructure for single-doc access |
| **JIT-Steg Retrieval** | "Gap zone" at 25-35% where overhead exceeded brute-force |
| **Simple Grep** | No relevance ranking, no token awareness |
LODA combines the best of each: section awareness, fast elimination, budget control, and zero external dependencies.
### Research Documents
- [ULTRATHINK Analysis](warehouse/blueprints/cdp-loda-decomposition/loda-mcp-server/00-ultrathink/) - Problem analysis from 5+ perspectives
- [Research Notes](warehouse/blueprints/cdp-loda-decomposition/loda-mcp-server/01-research-notes/) - Literature review and approach comparison
- [Implementation Plan](warehouse/blueprints/cdp-loda-decomposition/loda-mcp-server/02-plan/) - Technical architecture
- [Testing Plan](warehouse/blueprints/cdp-loda-decomposition/loda-mcp-server/04-testing-plan/) - 61 test cases specified
---
## Configuration Examples
### Claude Desktop (Windows)
`%APPDATA%\Claude\claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["C:/Users/YourName/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}
```
### Claude Desktop (macOS/Linux)
`~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `~/.config/Claude/claude_desktop_config.json` (Linux):
```json
{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/home/yourname/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}
```
### Claude Code (Project-level)
`.claude/settings.json`:
```json
{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}
```
---
## Contributing
1. Fork the repository
2. Create a feature branch
3. Write tests for new functionality
4. Submit a PR with documentation
---
## License
MIT License - see [LICENSE](LICENSE) for details.
---
## Acknowledgments
- Built for [Model Context Protocol](https://modelcontextprotocol.io/)
- Developed using [Continuum Development Process](https://github.com/continuum-persistence/continuum)
- Inspired by information retrieval research on probabilistic data structures
---
**Made with 🧠 for LLMs that need to read documents efficiently.**