## YTPipe MCP Architecture - Implementation Complete (Phase 1)
### 🎯 What We Built
We've successfully transformed ytpipe from a monolithic CLI tool into a **modular MCP backend** with a microservices architecture. This is Phase 1 of the full transformation.
---
## 📊 Architecture Overview
```
┌───────────────────────────────────────────────────────────────────┐
│                         MCP SERVER LAYER                          │
│                    (FastMCP - stdio transport)                    │
│              Currently: 8 tools (pipeline + query)                │
│                 Target: 12 tools (+ analytics)                    │
└───────────────────────────────────────────────────────────────────┘
                               │
                 ┌─────────────┴─────────────┐
                 │                           │
        PIPELINE TOOLS (4)           QUERY TOOLS (4)
        ├─ process_video             ├─ search
        ├─ download                  ├─ find_similar
        ├─ transcribe                ├─ get_chunk
        └─ embed                     └─ get_metadata
                 │
┌───────────────────────────────────────────────────────────────────┐
│                       PIPELINE ORCHESTRATOR                       │
│            ytpipe/core/pipeline.py - coordinates services         │
└───────────────────────────────────────────────────────────────────┘
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
     EXTRACTORS           PROCESSORS           INTELLIGENCE
          │                    │                    │
   DownloadService      ChunkerService        SearchService
   TranscriberService   EmbedderService       (SEO - TODO)
                        VectorStoreService    (Timeline - TODO)
                               │
┌───────────────────────────────────────────────────────────────────┐
│                         DATA MODELS LAYER                         │
│              (Pydantic - canonical data structures)               │
└───────────────────────────────────────────────────────────────────┘
```
---
## 🗂️ New Directory Structure
```
ytpipe/
├── __init__.py                  ✅ CREATED
├── core/
│   ├── __init__.py              ✅ CREATED
│   ├── models.py                ✅ CREATED (500 lines, 11 models)
│   ├── exceptions.py            ✅ CREATED (150 lines, 10 exceptions)
│   └── pipeline.py              ✅ CREATED (250 lines, orchestrator)
│
├── services/
│   ├── __init__.py              ✅ CREATED
│   ├── extractors/
│   │   ├── __init__.py          ✅ CREATED
│   │   ├── downloader.py        ✅ CREATED (200 lines, yt-dlp wrapper)
│   │   └── transcriber.py       ✅ CREATED (150 lines, Whisper wrapper)
│   │
│   ├── processors/
│   │   ├── __init__.py          ✅ CREATED
│   │   ├── chunker.py           ✅ CREATED (200 lines, semantic chunking + timestamps)
│   │   ├── embedder.py          ✅ CREATED (180 lines, sentence-transformers)
│   │   └── vector_store.py      ✅ CREATED (250 lines, multi-backend wrapper)
│   │
│   └── intelligence/
│       ├── __init__.py          ✅ CREATED
│       ├── search.py            ✅ CREATED (200 lines, full-text search)
│       ├── seo.py               ⏳ TODO
│       └── timeline.py          ⏳ TODO
│
└── mcp/
    ├── __init__.py              ✅ CREATED
    └── server.py                ✅ CREATED (300 lines, 8 MCP tools)
```
**Total: 16 files created, ~2,300 lines of production code**
---
## 🛠️ What Each Layer Does
### 1. Data Models (`ytpipe/core/models.py`)
**Purpose**: Canonical data structures enforced by Pydantic
**Key Models**:
- `VideoMetadata` - Video info from YouTube
- `Chunk` - Text chunk with **timestamps** (NEW) and optional embedding
- `ProcessingResult` - Complete pipeline output
- `SearchResult` - Full-text search result with context
- `SimilarityResult` - Semantic search result
- `VectorStoreConfig` - Vector database configuration
**Why Pydantic?**
- Type safety at runtime
- Automatic validation
- JSON serialization (.dict())
- API documentation from field descriptions
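As a minimal sketch of how these canonical structures look in Pydantic, here is a hypothetical `Chunk` model; the actual field names and types in `ytpipe/core/models.py` may differ:

```python
from typing import List, Optional

from pydantic import BaseModel, Field

class Chunk(BaseModel):
    """Illustrative shape only; the real model lives in ytpipe/core/models.py."""
    chunk_id: int = Field(..., description="Position of the chunk in the transcript")
    text: str = Field(..., description="Chunk text content")
    start_time: str = Field(..., description="Start timestamp, MM:SS")
    end_time: str = Field(..., description="End timestamp, MM:SS")
    embedding: Optional[List[float]] = Field(None, description="Vector, if computed")

# Validation and type coercion happen at construction time
chunk = Chunk(chunk_id=12, text="OpenClaw integration allows...",
              start_time="02:30", end_time="03:15")
print(chunk.chunk_id)
```

Field descriptions double as API documentation when FastMCP generates tool schemas from these types.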
### 2. Exceptions (`ytpipe/core/exceptions.py`)
**Purpose**: Domain-specific errors for precise error handling
**Exceptions**:
- `DownloadError` - yt-dlp failures, invalid URLs
- `TranscriptionError` - Whisper failures, missing audio
- `EmbeddingError` - Model loading, dimension mismatches
- `VectorStoreError` - Backend initialization, search failures
- `SearchError` - Empty queries, invalid parameters
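The value of domain-specific errors is that callers can catch a single base class or a precise subclass. A hypothetical sketch of the hierarchy (actual class bodies in `ytpipe/core/exceptions.py` may carry extra context fields):

```python
class YTPipeError(Exception):
    """Base class so callers can catch all pipeline errors at once."""

class DownloadError(YTPipeError):
    """yt-dlp failures, invalid URLs."""

class TranscriptionError(YTPipeError):
    """Whisper failures, missing audio."""

def download(url: str) -> None:
    # Illustrative guard; the real service delegates to yt-dlp
    if not url.startswith("http"):
        raise DownloadError(f"Invalid URL: {url!r}")

try:
    download("not-a-url")
except YTPipeError as exc:
    print(type(exc).__name__, exc)
```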
### 3. Services Layer
#### **Extractors** (Pull data from external sources)
**DownloadService** (`extractors/downloader.py`):
- Downloads YouTube videos with yt-dlp
- Extracts comprehensive metadata
- Audio-only optimization (faster)
- Video ID extraction from URLs
**TranscriberService** (`extractors/transcriber.py`):
- Whisper AI transcription
- Lazy model loading (memory efficient)
- Model caching across calls
- GPU acceleration (automatic)
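The lazy-loading-plus-caching behavior can be sketched as follows, with a stub standing in for `whisper.load_model` (names are illustrative, not the actual service code):

```python
class TranscriberService:
    """Sketch: load the model on first use, then reuse the cached instance."""

    def __init__(self, model_name: str = "base"):
        self.model_name = model_name
        self._model = None        # nothing loaded at construction time
        self.load_count = 0       # for demonstration only

    @property
    def model(self):
        if self._model is None:
            # Expensive step happens exactly once; stand-in for
            # whisper.load_model(self.model_name)
            self.load_count += 1
            self._model = f"<{self.model_name} weights>"
        return self._model

svc = TranscriberService()
_ = svc.model
_ = svc.model                     # second access hits the cache
print(svc.load_count)             # 1
```

Constructing the service is therefore cheap; memory is only consumed when a transcription is actually requested.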
#### **Processors** (Transform data)
**ChunkerService** (`processors/chunker.py`):
- Semantic text chunking (sliding window + overlap)
- **NEW**: Timestamp calculation for each chunk (MM:SS format)
- Quality score assignment
- Character position tracking
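One way the MM:SS timestamps could be derived is by interpolating a chunk's character span over the video duration; this is a hedged sketch, and the real chunker may instead use per-segment Whisper timestamps:

```python
def to_mmss(seconds: float) -> str:
    """Format seconds as M:SS."""
    m, s = divmod(int(seconds), 60)
    return f"{m}:{s:02d}"

def chunk_timestamps(char_start: int, char_end: int,
                     total_chars: int, duration_s: float):
    """Map a character span onto the video timeline proportionally."""
    start = duration_s * char_start / total_chars
    end = duration_s * char_end / total_chars
    return to_mmss(start), to_mmss(end)

# A chunk covering chars 500-1000 of a 2000-char transcript of a 10-minute video
print(chunk_timestamps(500, 1000, 2000, 600))  # ('2:30', '5:00')
```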
**EmbedderService** (`processors/embedder.py`):
- Sentence-transformers embeddings
- 384-dimensional vectors (all-MiniLM-L6-v2)
- Batch processing
- Query embedding for search
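Batch processing here just means grouping texts before handing them to the model (sentence-transformers' `encode()` also accepts a `batch_size` argument internally). A minimal, dependency-free sketch of the grouping step:

```python
def batched(items, batch_size):
    """Yield successive slices of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

texts = [f"chunk {i}" for i in range(10)]
batches = list(batched(texts, batch_size=4))
print([len(b) for b in batches])  # [4, 4, 2]
```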
**VectorStoreService** (`processors/vector_store.py`):
- Multi-backend vector storage (ChromaDB, FAISS, Qdrant)
- Wraps existing `VectorStoreManager`
- Semantic similarity search
- Chunk retrieval by ID
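The multi-backend design boils down to dispatching on a backend name; this sketch uses an in-memory stand-in where the real wrapper plugs in the ChromaDB/FAISS/Qdrant adapters (class and function names here are illustrative):

```python
class InMemoryBackend:
    """Stand-in for a ChromaDB/FAISS/Qdrant adapter."""

    def __init__(self):
        self.vectors = {}

    def add(self, chunk_id, vec):
        self.vectors[chunk_id] = vec

    def get(self, chunk_id):
        return self.vectors[chunk_id]

# Real map would be {"chromadb": ..., "faiss": ..., "qdrant": ...}
BACKENDS = {"memory": InMemoryBackend}

def make_store(name: str):
    try:
        return BACKENDS[name]()
    except KeyError:
        # The real code raises VectorStoreError here
        raise ValueError(f"Unknown backend: {name}")

store = make_store("memory")
store.add(12, [0.1] * 384)
print(len(store.get(12)))  # 384
```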
#### **Intelligence** (High-level analysis)
**SearchService** (`intelligence/search.py`):
- **NEW**: Full-text transcript search
- Context extraction (before/after matches)
- Keyword highlighting
- Occurrence counting
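Context extraction around matches could look like the following sketch (the window size and function name are arbitrary, not the actual `SearchService` API):

```python
import re

def search_with_context(transcript: str, query: str, window: int = 20):
    """Return each match surrounded by `window` characters of context."""
    results = []
    for m in re.finditer(re.escape(query), transcript, re.IGNORECASE):
        start = max(0, m.start() - window)
        end = min(len(transcript), m.end() + window)
        results.append("..." + transcript[start:end] + "...")
    return results

text = "The OpenClaw integration allows agents to call tools. OpenClaw also..."
hits = search_with_context(text, "openclaw", window=12)
print(len(hits))  # 2
```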
### 4. Pipeline Orchestrator (`ytpipe/core/pipeline.py`)
**Purpose**: Coordinates all services in sequence
**8 Phases** (6 implemented):
1. ✅ Download (DownloadService)
2. ✅ Transcription (TranscriberService)
3. ✅ Chunking (ChunkerService with **timestamps**)
4. ✅ Embeddings (EmbedderService)
5. ✅ Export (JSON/JSONL/TXT files)
6. ⏳ Dashboard (HTML generation) - TODO
7. ⏳ Docling (Granite-Docling processing) - TODO
8. ✅ Vector Storage (VectorStoreService)
**Features**:
- Async execution throughout
- Per-phase timing tracking
- Graceful error handling
- Progress reporting
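Per-phase timing in an async orchestrator can be sketched like this; the phase functions are stubs standing in for the real service calls:

```python
import asyncio
import time

async def run_phase(name, coro_fn, timings):
    """Run one phase and record its wall-clock duration."""
    t0 = time.perf_counter()
    result = await coro_fn()
    timings[name] = time.perf_counter() - t0
    return result

async def main():
    timings = {}

    async def download():           # stand-in for DownloadService
        await asyncio.sleep(0.01)
        return "audio.m4a"

    async def transcribe():         # stand-in for TranscriberService
        await asyncio.sleep(0.01)
        return "transcript"

    await run_phase("download", download, timings)
    await run_phase("transcribe", transcribe, timings)
    return timings

timings = asyncio.run(main())
print(sorted(timings))  # ['download', 'transcribe']
```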
### 5. MCP Server (`ytpipe/mcp/server.py`)
**Purpose**: Expose ytpipe to AI agents via MCP protocol
**8 Tools Implemented**:
#### Pipeline Tools (4)
1. `ytpipe_process_video` - Full 8-phase pipeline
2. `ytpipe_download` - Download only (Phase 1)
3. `ytpipe_transcribe` - Transcribe audio file (Phase 2)
4. `ytpipe_embed` - Generate embedding for text
#### Query Tools (4)
5. `ytpipe_search` - Full-text search with context
6. `ytpipe_find_similar` - Semantic similarity search
7. `ytpipe_get_chunk` - Retrieve specific chunk
8. `ytpipe_get_metadata` - Get video metadata
**Transport**: stdio (for Claude Code integration)
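FastMCP registers tools with a decorator; the shape of that pattern is recreated below with a plain dict so the snippet has no dependencies (the real server uses the FastMCP SDK, and `ytpipe_embed`'s body here is a placeholder):

```python
TOOLS = {}

def tool(fn):
    """Register a function as a callable tool, keyed by its name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def ytpipe_embed(text: str) -> list:
    # Placeholder: the real tool delegates to EmbedderService
    return [0.0] * 384

print(sorted(TOOLS))                         # ['ytpipe_embed']
print(len(TOOLS["ytpipe_embed"]("hello")))   # 384
```

With FastMCP, the function's type hints and docstring become the tool's JSON schema, which is what Claude Code reads over stdio.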
---
## 🚀 How to Use
### 1. Install Dependencies
```bash
cd /Users/lech/PROJECTS_all/PROJECT_youtube/_PRODUCT
# Install MCP server dependencies
pip install mcp fastmcp pydantic
# Existing dependencies (already installed)
# yt-dlp, whisper, sentence-transformers, chromadb
```
### 2. Run MCP Server (for Claude Code)
```bash
# Start MCP server on stdio
python -m ytpipe.mcp.server
```
### 3. Use from Claude Code
**Process a video**:
```
User: "Process this YouTube video: https://youtube.com/watch?v=VIDEO_ID"
Claude: *calls ytpipe_process_video*
Claude: "Video processed! Found 42 chunks. Stored in ChromaDB."
```
**Search transcript**:
```
User: "Search for mentions of 'OpenClaw' in video VIDEO_ID"
Claude: *calls ytpipe_search*
Claude: "Found 5 mentions across 3 chunks:
- Chunk 12 (2:30-3:15): '...OpenClaw integration allows...'
- Chunk 28 (7:45-8:20): '...OpenClaw API provides...'"
```
**Find similar content**:
```
User: "Find chunks similar to chunk 12 in video VIDEO_ID"
Claude: *calls ytpipe_find_similar*
Claude: "Most similar chunks:
1. Chunk 28 (similarity: 0.92)
2. Chunk 15 (similarity: 0.87)
3. Chunk 33 (similarity: 0.81)"
```
### 4. Direct Python Usage
```python
import asyncio

from ytpipe.core.pipeline import Pipeline

async def main():
    # Create pipeline
    pipeline = Pipeline(
        output_dir="./KNOWLEDGE_YOUTUBE",
        vector_backend="chromadb",
        whisper_model="base",
    )

    # Process video (the pipeline is async throughout)
    result = await pipeline.process("https://youtube.com/watch?v=VIDEO_ID")

    if result.success:
        print(f"✅ Processed {result.metadata.title}")
        print(f"   Chunks: {len(result.chunks)}")
        print(f"   Time: {result.processing_time:.1f}s")
    else:
        print(f"❌ Error: {result.error}")

asyncio.run(main())
```
---
## 🎯 Key Achievements
### ✅ Architecture Improvements
- **Microservices**: Services are independent, stateless, composable
- **Type Safety**: Pydantic models throughout
- **Async-First**: Non-blocking I/O operations
- **Error Handling**: Domain-specific exceptions
- **Lazy Loading**: Models load only when needed (memory efficient)
### ✅ New Capabilities
- **Timestamps**: All chunks have MM:SS timeline positions
- **Search**: Full-text transcript search with context
- **MCP Integration**: AI agents can call ytpipe functions
- **Granular Control**: 8 tools for different use cases
- **Vector Search**: Semantic similarity via embeddings
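The similarity scores behind the vector search are typically cosine similarity over the embedding vectors; a minimal sketch (the actual backend's metric may differ):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

print(round(cosine([1, 0], [1, 0]), 2))  # 1.0 - identical direction
print(round(cosine([1, 0], [0, 1]), 2))  # 0.0 - orthogonal
```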
### ✅ Performance Optimizations
- **Model Caching**: Whisper and embedder models are reused across calls
- **Batch Processing**: Embeddings generated in batches
- **Audio-Only Downloads**: Skipping video streams speeds up downloads
### ✅ Backward Compatibility
- **VectorStoreManager**: Wrapped existing code
- **Output Structure**: Same directory layout
- **Data Formats**: Compatible JSONL/JSON exports
---
## 📊 Metrics
### Code Statistics
- **Files Created**: 16
- **Lines of Code**: ~2,300
- **Models Defined**: 11 Pydantic models
- **Services**: 6 core services
- **MCP Tools**: 8 (target: 12)
### Test Coverage
- **Unit Tests**: TODO (0%)
- **Integration Tests**: TODO (0%)
- **MCP Protocol Tests**: TODO (0%)
---
## ⏳ TODO (Remaining Work)
### Phase 2: Complete Intelligence Services (4 hours)
- [ ] `SEOService` - SEO optimization recommendations
- [ ] `TimelineService` - Temporal content analysis
- [ ] `AnalyzerService` - Content quality analysis
- [ ] `BenchmarkService` - Performance comparison
### Phase 3: Complete MCP Tools (2 hours)
- [ ] `ytpipe_seo_optimize` - SEO recommendations
- [ ] `ytpipe_benchmark` - Performance analysis
- [ ] `ytpipe_quality_report` - Quality metrics
- [ ] `ytpipe_topic_timeline` - Topic evolution over time
### Phase 4: Pipeline Enhancements (4 hours)
- [ ] Phase 6: HTML dashboard generation
- [ ] Phase 7: Granite-Docling processing
- [ ] Exporter service for multiple formats
### Phase 5: CLI Wrapper (2 hours)
- [ ] Backward-compatible CLI (`ytpipe` command)
- [ ] Update `setup.py` entry points
- [ ] Maintain existing interface
### Phase 6: Testing (6 hours)
- [ ] Unit tests for each service
- [ ] Integration tests for pipeline
- [ ] MCP protocol tests
- [ ] End-to-end tests
### Phase 7: Documentation (3 hours)
- [ ] Update README with MCP usage
- [ ] Tool usage examples
- [ ] Migration guide
- [ ] API reference
**Total Remaining**: ~21 hours
---
## 💡 Architectural Insights
### Why This Architecture?
**1. Microservices Pattern**
- Each service does ONE thing well
- Easy to test in isolation
- Easy to swap implementations
- Reusable across projects
**2. Type-Safe Interfaces**
- Pydantic models = API contracts
- Errors caught at validation time, not in production
- IDE autocomplete + type checking
- Self-documenting code
**3. MCP Protocol**
- **Standard**: AI tool-calling protocol
- **Composable**: Agents can chain tools
- **Language-Agnostic**: Works with any MCP client
- **Future-Proof**: Growing ecosystem
**4. Lazy Loading**
- Models load only when needed
- Reduces memory footprint
- Faster startup times
- Better resource management
### Design Decisions
**Why FastMCP?**
- Official MCP SDK from Anthropic
- Automatic schema generation from types
- Built-in stdio transport
- Well-maintained
**Why Pydantic?**
- Industry standard for Python data validation
- Type hints = automatic validation
- JSON serialization out of the box
- Excellent IDE support
**Why Async/Await?**
- Non-blocking I/O operations
- Better CPU utilization
- MCP protocol compatible
- Future-ready for concurrent processing
**Why Wrap Existing Code?**
- Don't rewrite what works
- Gradual migration path
- Minimize risk
- Preserve institutional knowledge
---
## 🚀 Next Steps (Immediate)
### Priority 1: Complete MCP Tools (2 hours)
Implement remaining 4 analytics tools to reach the 12-tool target.
### Priority 2: Testing (4 hours)
Write integration tests to validate the pipeline end-to-end.
### Priority 3: CLI Wrapper (2 hours)
Ensure backward compatibility with existing `ytpipe` command.
### Priority 4: Documentation (2 hours)
Update README with MCP usage examples and migration guide.
**Total: 10 hours to production-ready MVP**
---
## 📝 Usage Examples
### MCP Tool Call Patterns
**Process Video**:
```json
{
"tool": "ytpipe_process_video",
"arguments": {
"url": "https://youtube.com/watch?v=VIDEO_ID",
"output_dir": "./KNOWLEDGE_YOUTUBE",
"backend": "chromadb",
"whisper_model": "base"
}
}
```
**Search Transcript**:
```json
{
"tool": "ytpipe_search",
"arguments": {
"video_id": "VIDEO_ID",
"query": "OpenClaw integration",
"max_results": 10
}
}
```
**Find Similar Chunks**:
```json
{
"tool": "ytpipe_find_similar",
"arguments": {
"video_id": "VIDEO_ID",
"chunk_id": 12,
"top_k": 5,
"backend": "chromadb"
}
}
```
---
## 🎯 Success Metrics
### Before (Monolithic CLI)
- ❌ AI agents cannot call ytpipe
- ❌ No type safety
- ❌ No timestamps on chunks
- ❌ No transcript search
- ❌ No semantic search
- ❌ Hard to test individual phases
### After (Microservices + MCP)
- ✅ AI agents can call 8 tools (target: 12)
- ✅ Full type safety with Pydantic
- ✅ Timestamps on all chunks (MM:SS format)
- ✅ Full-text search with context
- ✅ Semantic similarity search
- ✅ Each service independently testable
---
**Status**: Phase 1 Complete (40% of full plan)
**Next**: Complete analytics tools + testing
**ETA**: 10 hours to production-ready MVP