YouTube Transcript MCP Server


# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with this repository.

## Project Overview

This is a **production-ready** YouTube Transcript MCP Server that fetches transcripts from YouTube videos using the **yt-dlp CLI** for reliable subtitle extraction. It gives AI systems the ability to extract, search, and analyze YouTube video transcripts through the Model Context Protocol (MCP).

**Status:** Fully functional with a CLI-based implementation that bypasses YouTube's rate limiting.

## Quick Commands

### Running the Server

```bash
# Production (HTTP transport)
uvicorn src.server:app --host 0.0.0.0 --port 8080

# Development (STDIO transport)
python src/server.py

# Health check
curl http://localhost:8080/health
```

### Testing Tools

```bash
# Discover tools
mcp tools .venv/bin/python src/server.py

# Test transcript fetching
mcp call get_transcript --params '{"video_id":"jNQXAC9IVRw"}' .venv/bin/python src/server.py

# Test with time filtering
mcp call get_transcript --params '{"video_id":"jNQXAC9IVRw", "start_time": 10, "end_time": 60}' .venv/bin/python src/server.py

# Test search
mcp call search_transcript --params '{"video_id":"jNQXAC9IVRw", "query":"example"}' .venv/bin/python src/server.py

# Test summary analytics
mcp call get_transcript_summary --params '{"video_id":"jNQXAC9IVRw"}' .venv/bin/python src/server.py

# Test language detection
mcp call get_available_languages --params '{"video_id":"jNQXAC9IVRw"}' .venv/bin/python src/server.py
```

## Available Tools

1. **get_transcript** - Fetch video transcripts with optional time filtering
2. **search_transcript** - Search for specific text within transcripts with context
3. **get_transcript_summary** - Advanced analytics including pace, filler words, top words, content indicators
4. **get_available_languages** - List available transcript languages

## Key Features

- **Universal MCP Client Compatibility**: Time parameters accept integers, floats, strings, or nulls
- **CLI-Based Implementation**: Uses yt-dlp subprocess to avoid rate limiting
- **Multi-Language Support**: 100+ languages with auto-generated and manual transcripts
- **Advanced Analytics**: Speaking pace, filler words, content analysis, engagement metrics
- **Dual Transport**: Both STDIO and HTTP transport modes
- **Docker Support**: Multi-stage Alpine build with health checks

## Essential Dependencies

- `fastmcp>=0.9.0` - MCP server framework
- `yt-dlp>=2025.8.11` - YouTube transcript extraction via CLI
- `pydantic>=2.0.0` - Data validation and models
- `uvicorn>=0.24.0` - ASGI server for HTTP transport

## Environment Variables

```bash
YT_TRANSCRIPT_SERVER_PORT=8080    # Server port (default: 8080)
YT_TRANSCRIPT_SERVER_HOST=0.0.0.0 # Server host (default: 0.0.0.0)
YT_TRANSCRIPT_DEBUG=false         # Debug mode
```

## Package Management

This project uses `uv` for package management:

```bash
# Install dependencies
uv pip install -e .

# Add new dependency
uv add <package_name>
```

## Project Structure

```
src/
├── server.py              # Main MCP server
├── tools/                 # MCP tool implementations
│   └── transcript_tools.py
├── models/                # Pydantic data models
│   └── transcript.py
tests/                     # Test files
└── README.md
```

## Transport Compatibility

- **STDIO Transport** (Default): `python src/server.py`
- **HTTP Transport** (Production): `uvicorn src.server:app --port 8080`

The server automatically selects the transport based on the startup method.

## Common Patterns for Claude Code

When working with this codebase:

1. **Tool Enhancement**: Add new transcript tools in `src/tools/transcript_tools.py`
2. **Model Updates**: Extend Pydantic models in `src/models/transcript.py`
3. **Error Handling**: Use `ToolError` for user-facing errors
4. **Video ID Handling**: Support both video IDs and full YouTube URLs
5. **Testing**: Use `mcp call` commands with various video types and edge cases

## Configuration Patterns

Always use absolute paths and proper error handling. For configuration that needs to be set before tool execution, use the pattern:

```python
@mcp.tool()
async def my_tool():
    from src.config.settings import CONFIG_VALUE  # Import inside function
    # Use CONFIG_VALUE here
```

## Troubleshooting

- **Tool not found**: Check `@mcp.tool()` decorator
- **Validation errors**: Video IDs must be 11 characters, times must be non-negative
- **Time filtering issues**: Parameters accept multiple formats (int, float, string, null)
- **Transport issues**: Use `uvicorn` for HTTP, `python src/server.py` for STDIO
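
The validation rules above (11-character video IDs, IDs or full YouTube URLs as input, non-negative times given as int, float, string, or null) can be illustrated with a minimal sketch. The helper names `extract_video_id` and `normalize_time`, the set of URL shapes handled, and the treatment of empty or `"null"` strings as "no filter" are assumptions made for illustration, not the repository's actual code. The sketch raises `ValueError` to stay dependency-free; in the server itself, user-facing errors would be raised as `ToolError`, as noted under Common Patterns.

```python
import math
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: a video ID is 11 characters from the URL-safe base64 alphabet.
VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")


def extract_video_id(value: str) -> str:
    """Hypothetical helper: accept a bare video ID or a full YouTube URL."""
    value = value.strip()
    if VIDEO_ID_RE.match(value):
        return value

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in {"embed", "shorts", "live"}:
                candidate = parts[1]

    if not VIDEO_ID_RE.match(candidate):
        raise ValueError(f"Could not find an 11-character video ID in {value!r}")
    return candidate


def normalize_time(value) -> Optional[float]:
    """Hypothetical helper: coerce int, float, string, or null to seconds."""
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is a subclass of int; reject it explicitly.
        raise ValueError("Time must be a number, not a boolean")
    if isinstance(value, str):
        text = value.strip()
        if text == "" or text.lower() in {"null", "none"}:
            return None  # Assumption: empty or null-like strings mean "no filter"
        try:
            seconds = float(text)
        except ValueError:
            raise ValueError(f"Time is not numeric: {value!r}") from None
    elif isinstance(value, (int, float)):
        seconds = float(value)
    else:
        raise ValueError(f"Unsupported time type: {type(value).__name__}")

    if not math.isfinite(seconds) or seconds < 0:
        raise ValueError(f"Time must be a finite, non-negative number: {value!r}")
    return seconds
```

Normalizing inputs once at the tool boundary keeps the rest of the code working with a plain ID string and `Optional[float]` seconds, regardless of how a given MCP client serializes its arguments.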
