# Lenny RAG MCP Server
An MCP server providing hierarchical RAG over 299 Lenny Rachitsky podcast transcripts. Enables product development brainstorming by retrieving relevant insights, real-world examples, and full transcript context.
## Quick Start
```bash
# Clone the repository (includes pre-built index via Git LFS)
git clone git@github.com:mpnikhil/lenny-rag-mcp.git
cd lenny-rag-mcp
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate
# Install the package
pip install -e .
```
### Claude Code
```bash
claude mcp add lenny --scope user -- /path/to/lenny-rag-mcp/venv/bin/python -m src.server
```
Or add to `~/.claude.json`:
```json
{
"mcpServers": {
"lenny": {
"type": "stdio",
"command": "/path/to/lenny-rag-mcp/venv/bin/python",
"args": ["-m", "src.server"],
"cwd": "/path/to/lenny-rag-mcp"
}
}
}
```
### Claude Desktop
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
"mcpServers": {
"lenny": {
"command": "/path/to/lenny-rag-mcp/venv/bin/python",
"args": ["-m", "src.server"],
"cwd": "/path/to/lenny-rag-mcp"
}
}
}
```
### Cursor
Add to `.cursor/mcp.json` in your project or `~/.cursor/mcp.json` globally:
```json
{
"mcpServers": {
"lenny": {
"command": "/path/to/lenny-rag-mcp/venv/bin/python",
"args": ["-m", "src.server"],
"cwd": "/path/to/lenny-rag-mcp"
}
}
}
```
> Replace `/path/to/lenny-rag-mcp` with your actual clone location in all configs.
---
## MCP Tools
### `search_lenny`
Semantic search across the entire corpus. Returns lightweight pointers (episode and topic IDs) rather than full content, supporting progressive disclosure.
| Parameter | Type | Description |
|-----------|------|-------------|
| `query` | string | Search query (e.g., "pricing B2B products", "founder mode") |
| `top_k` | integer | Number of results (default: 5, max: 20) |
| `type_filter` | string | Filter by type: `insight`, `example`, `topic`, `episode` |
**Returns:** Ranked results with relevance scores, episode references, and topic IDs for drilling down.
### `get_chapter`
Load a specific topic with full context. Use after `search_lenny` to get details.
| Parameter | Type | Description |
|-----------|------|-------------|
| `episode` | string | Episode filename (e.g., "Brian Chesky.txt") |
| `topic_id` | string | Topic ID (e.g., "topic_3") |
**Returns:** Topic summary, all insights, all examples, and raw transcript segment.
### `get_full_transcript`
Load complete episode transcript with metadata.
| Parameter | Type | Description |
|-----------|------|-------------|
| `episode` | string | Episode filename (e.g., "Brian Chesky.txt") |
**Returns:** Full transcript (10-40K tokens), episode metadata, and topic list.
### `list_episodes`
Browse available episodes, optionally filtered by expertise.
| Parameter | Type | Description |
|-----------|------|-------------|
| `expertise_filter` | string | Filter by tag (e.g., "growth", "pricing", "AI") |
**Returns:** List of 299 episodes with guest names and expertise tags.
---
## Data Curation Approach
### Hierarchical Extraction
Each transcript is processed into a hierarchy of four element types (episode, topic, insight, example), enabling progressive disclosure:
```
Episode
├── Topics (10-20 per episode)
│ ├── Insights (2-4 per topic)
│ └── Examples (1-3 per topic)
```
This allows Claude to start with lightweight search results and drill down only when needed, keeping context windows efficient.
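The drill-down pattern can be sketched in a few lines. This is a toy illustration with hypothetical in-memory data, not the real retrieval code in `src/retrieval.py`: a search hit carries only pointers, and full topic context is resolved on demand.

```python
# Hypothetical episode record mirroring the hierarchy above.
episode = {
    "guest": "Guest Name",
    "topics": {
        "topic_1": {
            "title": "Pricing experiments",
            "summary": "How the guest iterated on pricing.",
            "insights": ["Anchor on value, not cost."],
            "examples": ["Raised prices 2x with no churn increase."],
        }
    },
}

def search_hit(episode_name, topic_id, score):
    """A lightweight pointer, like a `search_lenny` result."""
    return {"episode": episode_name, "topic_id": topic_id, "score": score}

def get_chapter(ep, topic_id):
    """Resolve a pointer into full topic context, like `get_chapter`."""
    return ep["topics"][topic_id]

hit = search_hit("Guest Name.txt", "topic_1", 0.87)
chapter = get_chapter(episode, hit["topic_id"])
print(chapter["title"])  # Pricing experiments
```

Only the small pointer dict enters the context window at search time; the heavier topic payload is fetched only for hits worth expanding.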
### Extraction Schema
```json
{
"episode": {
"guest": "Guest Name",
"expertise_tags": ["growth", "pricing", "leadership"],
"summary": "150-200 word episode summary",
"key_frameworks": ["Framework 1", "Framework 2"]
},
"topics": [{
"id": "topic_1",
"title": "Searchable topic title",
"summary": "Topic summary",
"line_start": 1,
"line_end": 150
}],
"insights": [{
"id": "insight_1",
"text": "Actionable insight or contrarian take",
"context": "Additional context",
"topic_id": "topic_1",
"line_start": 45,
"line_end": 52
}],
"examples": [{
"id": "example_1",
"explicit_text": "The story as told in the transcript",
"inferred_identity": "Airbnb",
"confidence": "high",
"tags": ["marketplace", "growth", "launch strategy"],
"lesson": "Specific lesson from this example",
"topic_id": "topic_1",
"line_start": 60,
"line_end": 85
}]
}
```
### Implicit Anchor Detection
Many guests reference companies without naming them ("at my previous company..."). The extraction prompt instructs the model to infer identities based on the guest's background:
- Brian Chesky saying "when we started" → Airbnb (high confidence)
- A marketplace expert saying "one ride-sharing company" → likely Uber/Lyft (medium confidence)
This surfaces examples that wouldn't be found by keyword search alone.
### Quality Thresholds
Each transcript extraction is validated against minimum thresholds:
| Element | Minimum | Typical |
|---------|---------|---------|
| Topics | 10 | 15-20 |
| Insights | 15 | 25-35 |
| Examples | 10 | 18-25 |
Extractions below thresholds trigger warnings for manual review.
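A validator for these minimums is straightforward. The real check presumably lives in the preprocessing script; this is a sketch of the logic using the thresholds from the table:

```python
# Minimum counts per element type, mirroring the thresholds table.
THRESHOLDS = {"topics": 10, "insights": 15, "examples": 10}

def validate_extraction(extraction: dict) -> list[str]:
    """Return one warning per element type that falls below its minimum."""
    warnings = []
    for element, minimum in THRESHOLDS.items():
        count = len(extraction.get(element, []))
        if count < minimum:
            warnings.append(f"{element}: {count} < {minimum}")
    return warnings

ok = {"topics": [None] * 12, "insights": [None] * 20, "examples": [None] * 11}
thin = {"topics": [None] * 12, "insights": [None] * 9, "examples": [None] * 11}
print(validate_extraction(ok))    # []
print(validate_extraction(thin))  # ['insights: 9 < 15']
```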
---
## Models & Tech Stack
| Component | Model/Tool | Purpose |
|-----------|------------|---------|
| Preprocessing | **Claude Haiku** (via Claude CLI) | Extract structured hierarchy from transcripts |
| Embeddings | **bge-small-en-v1.5** | Semantic similarity for search |
| Vector DB | **ChromaDB** | Persistent vector storage |
| MCP Framework | **mcp** (Python SDK) | Tool interface for Claude |
### Why Claude Haiku for Preprocessing?
- **Quality**: Haiku follows complex extraction prompts reliably
- **Cost**: ~$0.02-0.03 per transcript (~$6-9 total for 299 episodes)
- **Speed**: ~30 seconds per transcript
### Why bge-small-en-v1.5 for Embeddings?
- **Performance**: Top-tier retrieval quality for its size
- **Efficiency**: 384 dimensions, fast inference
- **Local**: Runs entirely on CPU, no API calls needed
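bge-family embeddings are typically L2-normalized, and for unit vectors cosine similarity reduces to a plain dot product, which keeps ranking cheap. A stdlib-only sketch with tiny stand-in vectors (real bge-small-en-v1.5 embeddings are 384-dimensional):

```python
import math

def normalize(v):
    """Scale a vector to unit length (L2 norm = 1)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Tiny stand-ins for 384-dim embeddings.
query = normalize([0.2, 0.9, 0.1])
doc_a = normalize([0.25, 0.85, 0.05])  # points in a similar direction
doc_b = normalize([0.9, 0.1, 0.4])     # points in a different direction

# With unit vectors, cosine similarity == dot product.
assert dot(query, doc_a) > dot(query, doc_b)
```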
---
## Corpus Statistics
| Metric | Count |
|--------|-------|
| Episodes | 299 |
| Topics | 6,183 |
| Insights | 8,840 |
| Examples | 6,502 |
| Avg topics/episode | 20.7 |
| Avg insights/episode | 29.6 |
| Avg examples/episode | 21.7 |
---
## Rebuilding the Index
The repo includes a pre-built ChromaDB index. To rebuild from scratch:
### Reprocess Transcripts (requires Claude CLI)
```bash
# Process all unprocessed transcripts
python scripts/preprocess_haiku.py
# Process specific file
python scripts/preprocess_haiku.py --file "Brian Chesky.txt"
# Parallel processing (4 batches of 50)
python scripts/preprocess_haiku.py --limit 50 --offset 0 &
python scripts/preprocess_haiku.py --limit 50 --offset 50 &
python scripts/preprocess_haiku.py --limit 50 --offset 100 &
python scripts/preprocess_haiku.py --limit 50 --offset 150 &
wait  # block until all four batches finish
```
### Rebuild Embeddings
```bash
# Incremental (only new files)
python scripts/embed.py
# Full rebuild
python scripts/embed.py --rebuild
```
---
## Project Structure
```
lenny-rag-mcp/
├── transcripts/ # 299 raw .txt podcast transcripts
├── preprocessed/ # Extracted JSON hierarchy (one per episode)
├── chroma_db/ # Vector embeddings (Git LFS)
├── prompts/
│ └── extraction.md # Haiku extraction prompt
├── src/
│ ├── server.py # MCP server & tool definitions
│ ├── retrieval.py # LennyRetriever class (ChromaDB wrapper)
│ └── utils.py # File loading utilities
├── scripts/
│ ├── preprocess_haiku.py # Claude CLI preprocessing
│ └── embed.py # ChromaDB embedding pipeline
└── pyproject.toml
```
---
## License
MIT