# Architecture Overview
## System Architecture
```
┌───────────────────────────────────────────────────────────────────────────┐
│                             YOUTUBE KB SYSTEM                             │
├───────────────────────────────────────────────────────────────────────────┤
│                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                        DATA LAYER (Supabase)                        │  │
│  │                                                                     │  │
│  │    ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐    │  │
│  │    │   videos    │    │   chunks    │    │     embeddings      │    │  │
│  │    │    table    │◄──►│    table    │◄──►│     (pgvector)      │    │  │
│  │    └─────────────┘    └─────────────┘    └─────────────────────┘    │  │
│  │                                                                     │  │
│  │    ┌───────────────────────────────────────────────────────────┐    │  │
│  │    │             Full-Text Search Index (tsvector)             │    │  │
│  │    └───────────────────────────────────────────────────────────┘    │  │
│  │                                                                     │  │
│  └──────────────────────────────────┬──────────────────────────────────┘  │
│                                     │                                     │
│                                     │ SQL + pgvector                      │
│                                     ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                       SERVICE LAYER (Vercel)                        │  │
│  │                                                                     │  │
│  │   ┌─────────────────────────────────────────────────────────────┐   │  │
│  │   │                   MCP Server (TypeScript)                   │   │  │
│  │   │                                                             │   │  │
│  │   │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │   │  │
│  │   │ │   search   │ │list_domains│ │   stats    │ │ list_vids  │ │   │  │
│  │   │ │    tool    │ │    tool    │ │    tool    │ │    tool    │ │   │  │
│  │   │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │   │  │
│  │   │                                                             │   │  │
│  │   └─────────────────────────────────────────────────────────────┘   │  │
│  │                                                                     │  │
│  │              Transport: StreamableHTTP (POST /api/mcp)              │  │
│  │                                                                     │  │
│  └──────────────────────────────────┬──────────────────────────────────┘  │
│                                     │                                     │
│                                     │ HTTPS                               │
│                                     ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                        CLIENT LAYER (Plugin)                        │  │
│  │                                                                     │  │
│  │   ┌─────────────────────────────────────────────────────────────┐   │  │
│  │   │                  Claude Code Plugin (~5KB)                  │   │  │
│  │   │                                                             │   │  │
│  │   │      plugin.json ──► .mcp.json ──► skills/ ──► README       │   │  │
│  │   │                                                             │   │  │
│  │   └─────────────────────────────────────────────────────────────┘   │  │
│  │                                                                     │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                                                           │
├───────────────────────────────────────────────────────────────────────────┤
│                                                                           │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │                    INGESTION PIPELINE (Offline)                     │  │
│  │                                                                     │  │
│  │  YouTube ──► Transcripts ──► Chunking ──► Embeddings ──► Supabase   │  │
│  │                                                                     │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                                                           │
└───────────────────────────────────────────────────────────────────────────┘
```
---
## Component Breakdown
### 1. Data Layer (Supabase)
**Purpose**: Store and index all video metadata and embeddings for fast retrieval.
**Technology**: PostgreSQL with pgvector extension
**Key Tables and Indexes**:
- `videos` - Video metadata (title, channel, URL, domain)
- `chunks` - Transcript text chunks with their embeddings (pgvector)
- Full-text search index (`tsvector`) for hybrid queries
**Why Supabase?**
- Free tier sufficient for initial scale
- Native pgvector support
- Hybrid search (semantic + keyword)
- Open source (can self-host)
- Official MCP server as reference
### 2. Service Layer (Vercel)
**Purpose**: Host the MCP server that handles search requests.
**Technology**: TypeScript + mcp-handler + Vercel Functions
**Key Features**:
- StreamableHTTP transport (current MCP standard)
- Auto-scaling to zero when idle
- Edge caching for common queries
- Target response time under 500 ms
**Why Vercel?**
- Free tier with generous limits
- Native mcp-handler support
- Optimized for MCP workloads
- Easy deployment from GitHub
### 3. Client Layer (Claude Code Plugin)
**Purpose**: Provide easy installation and usage for end users.
**Technology**: Claude Code plugin format
**Components**:
- `plugin.json` - Manifest with metadata
- `.mcp.json` - Connection to hosted server
- `skills/` - Domain-specific search helpers
- Documentation
**Why Plugin?**
- One-command installation
- Automatic MCP registration
- Marketplace distribution
- Version management
### 4. Ingestion Pipeline (Offline)
**Purpose**: Fetch, process, and index YouTube content.
**Technology**: Python + yt-dlp + OpenAI
**Steps**:
1. Fetch video metadata from YouTube with yt-dlp
2. Extract transcripts (manual captions or auto-generated)
3. Apply quality filtering
4. Chunk transcripts into ~500-token segments
5. Generate embeddings via OpenAI
6. Upload to Supabase
**Why Offline?**
- Cost control (embedding costs)
- Quality control (manual curation)
- No real-time requirements
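
The chunking step could look roughly like the sketch below. It is an illustration rather than the pipeline's actual code: it approximates tokens with whitespace-separated words (a real pipeline would more likely count tokens with the embedding model's tokenizer), and the `chunk_words` name and the `overlap` parameter are assumptions that do not appear elsewhere in this document.

```python
from typing import List


def chunk_words(text: str, max_tokens: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of at most max_tokens words, overlapping neighbours."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")

    words = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks
```

A small overlap keeps sentences that straddle a boundary retrievable from either neighbouring chunk, at the cost of embedding slightly more text.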
---
## Data Flow
### Search Request Flow
```
    User Query
         │
         ▼
┌─────────────────┐
│   Claude Code   │
│  (via plugin)   │
└────────┬────────┘
         │ MCP Tool Call
         ▼
┌─────────────────┐
│   Vercel MCP    │
│     Server      │
└────────┬────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌───────┐ ┌────────┐
│OpenAI │ │Supabase│
│Embed  │ │Query   │
└───┬───┘ └───┬────┘
    │         │
    └────┬────┘
         │
         ▼
┌─────────────────┐
│  Hybrid Search  │
│ (Vector + FTS)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Ranked Results  │
│ with Citations  │
└─────────────────┘
```
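
The diagram does not say how the vector and full-text rankings are combined in the hybrid search step. One common choice is Reciprocal Rank Fusion (RRF), sketched below as an assumption rather than a description of the deployed query. The function name, the chunk IDs, and the smoothing constant `k = 60` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists; an item scores 1 / (k + rank) in each list it appears in."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


vector_hits = ["c3", "c1", "c7"]   # hypothetical chunk IDs from the pgvector query
keyword_hits = ["c1", "c9", "c3"]  # hypothetical chunk IDs from the tsvector query
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Chunks that appear in both lists (`c1`, `c3`) rise to the top, which is the behaviour a hybrid query is meant to provide.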
### Ingestion Flow
```
  YouTube Channel
         │
         ▼
┌─────────────────┐
│     yt-dlp      │
│  (fetch meta)   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Transcript    │
│   Extraction    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Quality     │
│    Filtering    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Chunking     │
│  (~500 tokens)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     OpenAI      │
│   Embeddings    │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Supabase     │
│     Upload      │
└─────────────────┘
```
---
## Design Decisions
### D1: Supabase over Pinecone
| Criterion | Supabase | Pinecone |
|-----------|----------|----------|
| Cost | Free tier | Free tier (limited) |
| Open source | Yes | No |
| Hybrid search | Native | Limited |
| Self-hosting | Easy | No |
| MCP reference | Official server | Official server |
**Decision**: Supabase - open source, easy to self-host, and native hybrid search.
### D2: Vercel over Railway/Fly
| Criterion | Vercel | Railway | Fly.io |
|-----------|--------|---------|--------|
| MCP support | Native (mcp-handler) | Manual | Manual |
| Free tier | Generous | Limited | Limited |
| Scaling | Auto | Manual | Auto |
| Deployment | Git push | Git push | CLI |
**Decision**: Vercel - native MCP support via mcp-handler and the most generous free tier of the three.
### D3: StreamableHTTP over SSE
| Criterion | StreamableHTTP | SSE |
|-----------|----------------|-----|
| Status | Current standard | Deprecated |
| Efficiency | High (no persistent conn) | Low |
| Scaling | Excellent | Poor |
| SDK support | Full | Legacy only |
**Decision**: StreamableHTTP - it's the current MCP standard.
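
To make the transport concrete, the sketch below builds the headers and JSON-RPC body for a single `tools/call` request as it would be POSTed to `/api/mcp`. It only constructs the message and sends nothing. The `query` and `limit` argument names are hypothetical, since the `search` tool's schema is not given in this document, and a real client would normally perform the MCP `initialize` handshake first.

```python
import json
from typing import Any, Dict, Tuple


def build_tool_call(
    tool: str, arguments: Dict[str, Any], request_id: int = 1
) -> Tuple[Dict[str, str], str]:
    """Return (headers, body) for an MCP tools/call request over Streamable HTTP."""
    headers = {
        "Content-Type": "application/json",
        # The server may answer with plain JSON or with an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    body = json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })
    return headers, body


# "query" and "limit" are hypothetical argument names for the search tool.
headers, body = build_tool_call("search", {"query": "vector indexes", "limit": 5})
```

Because each call is an ordinary HTTP request, the server holds no long-lived connection between calls, which is what lets the function scale to zero when idle.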
### D4: OpenAI Embeddings over Open Source
| Criterion | OpenAI text-embedding-3-small | Sentence Transformers |
|-----------|------------------------------|----------------------|
| Quality | High | Medium-High |
| Cost | $0.02/1M tokens | Free (compute) |
| Setup | API call | Model hosting |
| Consistency | Stable | Version-dependent |
**Decision**: OpenAI - better quality and simpler setup for launch. The pipeline can migrate to open source models later if needed.
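
For a sense of scale, the embedding price in the table can be turned into a one-off ingestion cost. The sketch below assumes every chunk is about 500 tokens, as in the ingestion pipeline, and ignores chunk overlap and the per-query embedding calls made at search time. `embedding_cost` is an illustrative name.

```python
PRICE_PER_MILLION_TOKENS = 0.02  # USD, text-embedding-3-small price from the table


def embedding_cost(chunks: int, tokens_per_chunk: int = 500) -> float:
    """One-off cost in USD to embed `chunks` chunks of roughly equal size."""
    total_tokens = chunks * tokens_per_chunk
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


launch_cost = embedding_cost(50_000)        # about $0.50
long_term_cost = embedding_cost(1_000_000)  # about $10
```

Under these assumptions embedding the whole corpus costs well under the hosting bill at every planned scale, so API cost alone is unlikely to force a migration to self-hosted models.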
### D5: Plugin Distribution over npm
| Criterion | Claude Plugin | npm Package |
|-----------|---------------|-------------|
| Installation | `claude plugin install` | Manual setup |
| Discovery | Marketplace | Search |
| Updates | Automatic | Manual |
| MCP config | Automatic | Manual |
**Decision**: Plugin - one-command install, automatic MCP configuration, and marketplace visibility.
---
## Security Considerations
### Data Security
| Aspect | Approach |
|--------|----------|
| PII | None stored (only video metadata) |
| Credentials | Environment variables only |
| HTTPS | Enforced by Vercel |
| Database | Supabase RLS (disabled because access is read-only) |
### API Security
| Aspect | Approach |
|--------|----------|
| Authentication | None (public API) |
| Rate limiting | Vercel defaults (can add custom) |
| Input validation | Zod schemas |
| SQL injection | Parameterized queries |
### Content Security
| Aspect | Approach |
|--------|----------|
| Copyright | Embeddings only, not full text |
| Attribution | All results include source citation |
| Filtering | Quality filters on ingestion |
---
## Scalability Plan
### Current Scale (Launch)
- Videos: ~500
- Chunks: ~50,000
- Requests/day: ~100
- Cost: ~$0/month
### Near-term Scale (6 months)
- Videos: ~2,000
- Chunks: ~200,000
- Requests/day: ~1,000
- Cost: ~$25/month (Supabase Pro)
### Long-term Scale (1 year+)
- Videos: ~10,000
- Chunks: ~1,000,000
- Requests/day: ~10,000
- Cost: ~$100/month
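
The chunk counts above translate into raw embedding storage roughly as follows. This back-of-the-envelope sketch assumes the default 1536-dimension output of `text-embedding-3-small` and pgvector's layout of 4 bytes per dimension plus 8 bytes per vector. It ignores chunk text, the full-text index, and any vector index, so real usage will be higher. `vector_storage_mb` is an illustrative name.

```python
BYTES_PER_FLOAT = 4
VECTOR_HEADER_BYTES = 8  # per-value overhead of the pgvector `vector` type


def vector_storage_mb(chunks: int, dimensions: int = 1536) -> float:
    """Approximate raw embedding storage in MB (1 MB = 10**6 bytes)."""
    bytes_per_vector = BYTES_PER_FLOAT * dimensions + VECTOR_HEADER_BYTES
    return chunks * bytes_per_vector / 1_000_000


launch_mb = vector_storage_mb(50_000)        # about 308 MB
near_term_mb = vector_storage_mb(200_000)    # about 1,230 MB
long_term_mb = vector_storage_mb(1_000_000)  # about 6,152 MB
```

If these assumptions hold, vectors alone take a few hundred megabytes at launch, so database size is worth watching from the start. Requesting shorter vectors through the embedding API's `dimensions` parameter is one way to reduce it.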
### Scaling Triggers
| Metric | Threshold | Action |
|--------|-----------|--------|
| DB size | > 500MB | Upgrade Supabase |
| Latency P95 | > 1s | Add caching layer |
| Requests/day | > 10,000 | Add rate limiting |
| Error rate | > 5% | Debug and fix |
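
The trigger table can be checked mechanically. The sketch below copies its thresholds into a list and returns the actions whose thresholds are exceeded. The metric key names and the `triggered_actions` function are assumptions, and how the metrics are collected is left out.

```python
from typing import Dict, List

# (metric, threshold, action) rows copied from the table above.
TRIGGERS = [
    ("db_size_mb", 500, "Upgrade Supabase"),
    ("latency_p95_s", 1.0, "Add caching layer"),
    ("requests_per_day", 10_000, "Add rate limiting"),
    ("error_rate", 0.05, "Debug and fix"),
]


def triggered_actions(metrics: Dict[str, float]) -> List[str]:
    """Return actions whose thresholds are exceeded; missing metrics are skipped."""
    return [
        action
        for name, threshold, action in TRIGGERS
        if name in metrics and metrics[name] > threshold
    ]


# Hypothetical snapshot: database over the limit, latency fine, errors elevated.
actions = triggered_actions({"db_size_mb": 620, "latency_p95_s": 0.4, "error_rate": 0.07})
```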
---
## Monitoring & Observability
### Metrics to Track
| Category | Metrics |
|----------|---------|
| Performance | Latency (P50, P95, P99), throughput |
| Availability | Uptime, error rate, 5xx count |
| Usage | Requests/day, unique users, popular queries |
| Cost | Vercel usage, Supabase usage, OpenAI usage |
### Tools
| Purpose | Tool |
|---------|------|
| APM | Vercel Analytics (built-in) |
| Logging | Vercel Logs |
| Uptime | Vercel monitoring |
| Alerting | Vercel + Supabase alerts |
---
## Disaster Recovery
### Backup Strategy
| Data | Frequency | Retention | Location |
|------|-----------|-----------|----------|
| Supabase DB | Daily | 7 days | Supabase backups |
| Embeddings | On ingestion | Permanent | Local LanceDB copy |
| Code | Every commit | Permanent | GitHub |
### Recovery Procedures
| Scenario | RTO | RPO | Procedure |
|----------|-----|-----|-----------|
| Vercel outage | 1 hour | 0 | Wait for recovery (stateless) |
| Supabase outage | 4 hours | 24h | Restore from backup |
| Data corruption | 4 hours | 24h | Restore + re-ingest |
| Complete failure | 8 hours | 24h | Full rebuild from backups |
---
## Future Considerations
### Potential Enhancements
1. **User Authentication** - Track usage, preferences
2. **Custom Domains** - Users add their own channels
3. **Real-time Ingestion** - Auto-add new videos
4. **Multi-language** - Support non-English content
5. **Caching Layer** - Redis for common queries
6. **Analytics Dashboard** - Usage insights
### Migration Paths
1. **Database**: Supabase → Neon/PlanetScale if needed
2. **Hosting**: Vercel → Railway/Fly if needed
3. **Embeddings**: OpenAI → Cohere/local models
4. **Search**: pgvector → Elasticsearch if scale requires