Enables processing of YouTube videos into structured knowledge bases, including downloading, transcription, semantic chunking, and metadata extraction, as well as SEO optimization and topic timeline analysis.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@YTPipe Turn this video into a searchable knowledge base: https://youtu.be/dQw4w9WgXcQ"
That's it! The server will respond to your query, and you can continue using it as needed.

YTPipe - AI-Native YouTube Processing Pipeline
Transform YouTube videos into LLM-ready knowledge bases with a production-ready MCP backend.
Quick Start • Features • Documentation • MCP Tools
Features
MCP Integration - 12 AI-callable tools for seamless agent integration
Smart Chunking - Semantic text chunking with timeline timestamps
Vector Embeddings - 384-dimensional embeddings for semantic search
Full-Text Search - Context-aware transcript search
SEO Intelligence - AI-powered title, tag, and description optimization
Timeline Analysis - Topic evolution and keyword density tracking
Microservices - 11 independent, composable services
Type-Safe - Pydantic models throughout
Async-First - Non-blocking I/O operations
Multi-Backend - ChromaDB, FAISS, Qdrant support
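To make the Smart Chunking feature concrete, here is a minimal sketch of how timestamped transcript segments could be grouped into chunks that keep their start and end times. It uses a plain word budget as a simpler stand-in for semantic boundaries, and `Segment`, `Chunk`, and `chunk_segments` are hypothetical names for illustration, not YTPipe's actual API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical shape)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Chunk:
    """A group of consecutive segments with its overall time span."""
    start: float
    end: float
    text: str


def _merge(segments: List[Segment]) -> Chunk:
    """Join segment texts; span runs from the first start to the last end."""
    return Chunk(
        start=segments[0].start,
        end=segments[-1].end,
        text=" ".join(s.text for s in segments),
    )


def chunk_segments(segments: List[Segment], max_words: int = 50) -> List[Chunk]:
    """Group consecutive segments until the word budget would be exceeded."""
    chunks: List[Chunk] = []
    current: List[Segment] = []
    word_count = 0
    for seg in segments:
        seg_words = len(seg.text.split())
        # Close the current chunk if this segment would exceed the budget.
        if current and word_count + seg_words > max_words:
            chunks.append(_merge(current))
            current, word_count = [], 0
        current.append(seg)
        word_count += seg_words
    if current:
        chunks.append(_merge(current))
    return chunks
```

Because every chunk carries the start time of its first segment and the end time of its last, a search hit can be mapped back to a position in the video.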
Quick Start
Result: Metadata + Transcript + Semantic Chunks + Embeddings + Vector Storage
Usage Examples
MCP Server (AI Agents)
Then from Claude Code:
CLI (Humans)
Python API (Developers)
MCP Tools
Pipeline (4 tools)
ytpipe_process_video - Full pipeline
ytpipe_download - Download only
ytpipe_transcribe - Transcribe audio
ytpipe_embed - Generate embeddings
Query (4 tools)
ytpipe_search - Full-text search
ytpipe_find_similar - Semantic search
ytpipe_get_chunk - Get chunk by ID
ytpipe_get_metadata - Get video info
Analytics (4 tools)
ytpipe_seo_optimize - SEO recommendations
ytpipe_quality_report - Quality metrics
ytpipe_topic_timeline - Topic evolution
ytpipe_benchmark - Performance analysis
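Under the Model Context Protocol, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request that names the tool and passes its arguments. The sketch below builds such a request for `ytpipe_process_video`. The `url` argument name is an assumption, since the tool's input schema is not listed here, and `build_tool_call` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The "url" argument name is assumed for illustration only.
request = build_tool_call(
    1,
    "ytpipe_process_video",
    {"url": "https://youtu.be/dQw4w9WgXcQ"},
)
print(request)
```

In practice an MCP client such as Claude Code builds and sends this message for you. The real argument names come from the schema the server reports in response to `tools/list`.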
Architecture
Services:
Extractors (2): Download, Transcriber
Processors (4): Chunker, Embedder, VectorStore, Docling
Intelligence (4): Search, SEO, Timeline, Analyzer
Exporters (1): Dashboard
8 Processing Phases:
1. Download → 2. Transcription → 3. Chunking → 4. Embeddings →
5. Export → 6. Dashboard → 7. Docling → 8. Vector Storage
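The phase ordering can be sketched as an async pipeline that awaits each phase in turn and passes a shared context along. The phase functions here are stubs, and `run_pipeline`, `make_stub_phase`, and the context dictionary are hypothetical names for illustration; the real services are not shown in this README.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Context = Dict[str, Any]
Phase = Callable[[Context], Awaitable[Context]]

# Phase names follow the order listed above.
PHASE_NAMES = [
    "download", "transcription", "chunking", "embeddings",
    "export", "dashboard", "docling", "vector_storage",
]


def make_stub_phase(name: str) -> Phase:
    """Create a placeholder phase that only records that it ran."""
    async def phase(ctx: Context) -> Context:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        ctx.setdefault("completed", []).append(name)
        return ctx
    return phase


async def run_pipeline(video_url: str, phases: List[Phase]) -> Context:
    """Run the phases one after another in the listed order."""
    ctx: Context = {"url": video_url}
    for phase in phases:
        ctx = await phase(ctx)
    return ctx


if __name__ == "__main__":
    stubs = [make_stub_phase(name) for name in PHASE_NAMES]
    result = asyncio.run(run_pipeline("https://youtu.be/dQw4w9WgXcQ", stubs))
    print(result["completed"])
```

Keeping each phase as an independent awaitable with the same signature is one way to let a single phase, such as download or transcription, be run on its own.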
Performance
| Metric | Value |
| --- | --- |
| Processing Speed | 4-13x real-time |
| Memory Usage | <2 GB peak |
| Chunk Quality | 85%+ high quality |
| Embedding Dimension | 384 |
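As a worked example of the processing speed figure: at 4-13x real-time, a video of a given duration takes between duration/13 and duration/4 to process. The helper below is a hypothetical illustration of that arithmetic, not part of YTPipe.

```python
from typing import Tuple


def estimate_processing_seconds(
    duration_s: float, min_speedup: float = 4.0, max_speedup: float = 13.0
) -> Tuple[float, float]:
    """Return (best_case, worst_case) processing time in seconds."""
    if duration_s < 0 or min_speedup <= 0 or max_speedup < min_speedup:
        raise ValueError("invalid duration or speedup range")
    return duration_s / max_speedup, duration_s / min_speedup


best, worst = estimate_processing_seconds(60 * 60)  # a one-hour video
print(f"{best / 60:.1f} to {worst / 60:.1f} minutes")  # 4.6 to 15.0 minutes
```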
Requirements
Python 3.8+
FFmpeg (for audio extraction)
4GB+ RAM recommended
GPU optional (CUDA for acceleration)
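Before a first run, the two hard requirements above can be checked with a few lines of standard-library Python. This is a minimal sketch; `check_requirements` is a hypothetical helper, not something YTPipe ships.

```python
import shutil
import sys
from typing import List


def check_requirements(min_python=(3, 8)) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: List[str] = []
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("Missing:", problem)
```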
Documentation
Contributing
Contributions welcome! Please read CONTRIBUTING.md first.
License
MIT License - see LICENSE for details.
Credits
Built with:
FastMCP - MCP server framework
OpenAI Whisper - Speech-to-text
sentence-transformers - Text embeddings
Model Context Protocol - AI tool standard
Contact
Leonardo Lech
Email: leonardo.lech@gmail.com
GitHub: @leolech14
Star this repo if you find it useful!
Transform YouTube → Knowledge Base in seconds