Provides tools for extracting YouTube video metadata (title, views, description, duration) and generating multilingual transcriptions with automatic language detection, translation support for 99 languages, and intelligent caching.
YouTube MCP Server
A powerful Model Context Protocol (MCP) server for YouTube video transcription and metadata extraction. This server provides advanced tools for AI agents to retrieve video metadata and generate high-quality transcriptions with native language support.
🌟 Features
Metadata Extraction: Retrieve comprehensive video details (title, description, views, duration, etc.) without downloading the video.
Smart Transcription:
In-Memory Processing: fast, efficient, and disk-I/O free pipeline.
VAD (Voice Activity Detection): uses Silero VAD for precise segmentation.
Multilingual Support: supports 99 languages.
Translation: Transcribe to any supported language.
Caching: Intelligent file-based caching to avoid redundant processing.
Optimized Performance:
Uses yt-dlp for robust extraction.
Hardware acceleration (MPS/CUDA) for Whisper inference.
Parallel processing for transcription segments.
🛠️ Prerequisites
Python 3.10+
ffmpeg: Required for audio processing.
Mac: brew install ffmpeg
Linux: sudo apt install ffmpeg
Windows: Download and add to PATH.
📦 Installation
Clone the repository:
git clone https://github.com/mourad-ghafiri/youtube-mcp-server
cd youtube-mcp-server
Install dependencies:
Using uv (recommended): uv sync
⚙️ Configuration
The server configuration is located in src/youtube_mcp_server/config.py. You can adjust the following parameters:
Directories
TRANSCRIPTIONS_DIR: Directory where transcription JSON files are cached (default: "transcriptions").
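As a rough illustration of how a file-based cache of this kind could work, the sketch below stores one JSON file per video and language under TRANSCRIPTIONS_DIR. The file-naming scheme and the helper names (cache_path, load_cached, save_cached) are assumptions made for this example, not the server's actual CacheService.

```python
import json
from pathlib import Path
from typing import Optional

TRANSCRIPTIONS_DIR = Path("transcriptions")  # mirrors the documented default


def cache_path(video_id: str, language: str, root: Path = TRANSCRIPTIONS_DIR) -> Path:
    # Hypothetical naming scheme: one JSON file per (video, language) pair.
    return root / f"{video_id}_{language}.json"


def load_cached(video_id: str, language: str, root: Path = TRANSCRIPTIONS_DIR) -> Optional[dict]:
    # Return the cached result, or None when nothing has been stored yet.
    path = cache_path(video_id, language, root)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def save_cached(video_id: str, language: str, result: dict, root: Path = TRANSCRIPTIONS_DIR) -> Path:
    # Create the cache directory on first use, then write the JSON file.
    root.mkdir(parents=True, exist_ok=True)
    path = cache_path(video_id, language, root)
    path.write_text(json.dumps(result, ensure_ascii=False, indent=2), encoding="utf-8")
    return path
```

Keying the file on both the video ID and the language keeps a translated result from being served for a request in a different language.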
Models
WHISPER_MODEL_NAME: OpenAI Whisper model to use. Options: "tiny", "base", "small", "medium", "large", "turbo" (default: "tiny").
Note: Larger models require more RAM and a GPU (CUDA/MPS).
SILERO_REPO: VAD model repository and ID.
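The hardware acceleration mentioned above usually comes down to choosing a PyTorch device before loading the model. The sketch below shows one way to do that. The helpers pick_device and load_whisper are hypothetical names for this example, and the server's own selection logic may differ. The PyTorch availability checks and whisper.load_model are real calls, imported lazily so the snippet still runs where those packages are missing.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch  # third-party; openai-whisper depends on it
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def load_whisper(model_name: str = "tiny"):
    """Load a Whisper model on the best available device (assumed usage)."""
    import whisper  # third-party: the openai-whisper package
    return whisper.load_model(model_name, device=pick_device())
```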
Audio Processing
SAMPLING_RATE: Audio sampling rate for Whisper/VAD (default: 16000 Hz).
SEGMENT_PADDING_MS: Padding added to each audio segment to avoid cutting off words (default: 200 ms).
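To make the relationship between these two settings concrete, the sketch below converts the padding from milliseconds to samples and widens a segment without running past the ends of the audio. The function pad_segment is a hypothetical helper, and the assumption that segment boundaries are expressed as sample indices is ours.

```python
SAMPLING_RATE = 16000      # Hz, the documented default
SEGMENT_PADDING_MS = 200   # ms, the documented default


def pad_segment(start: int, end: int, total_samples: int,
                sampling_rate: int = SAMPLING_RATE,
                padding_ms: int = SEGMENT_PADDING_MS) -> tuple:
    # 200 ms at 16000 Hz is 3200 samples of padding on each side.
    pad = sampling_rate * padding_ms // 1000
    # Clamp so the padded segment stays inside the audio buffer.
    return max(0, start - pad), min(total_samples, end + pad)
```

With the defaults, a segment covering samples 16000 to 32000 becomes 12800 to 35200.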
Concurrency
MAX_WORKERS: Number of parallel threads for transcribing audio segments (default: 4). Increasing this speeds up transcription but uses more CPU and memory.
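A thread pool is one common way to get this kind of parallelism. The sketch below is a minimal illustration using the standard library. The function transcribe_segments and its transcribe_one callback are hypothetical stand-ins, not the server's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List, Sequence

MAX_WORKERS = 4  # the documented default


def transcribe_segments(segments: Sequence[Any],
                        transcribe_one: Callable[[Any], str],
                        max_workers: int = MAX_WORKERS) -> List[str]:
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, even when segments
        # finish out of order, so the transcript stays chronological.
        return list(pool.map(transcribe_one, segments))
```

Each worker holds its own segment in memory while it runs, which is why a higher MAX_WORKERS trades memory for speed.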
🚀 Usage
1. Start the Server
The server runs on SSE (Server-Sent Events) transport at
2. Configure MCP Client
Add the server configuration to your MCP client:
🛠️ Tools Reference
get_video_info
Retrieves metadata for a given YouTube video.
Input: url (string)
Output: JSON object with title, views, description, thumbnails, etc.
{
  "id": "VIDEO_ID",
  "title": "Video Title",
  "description": "Video description...",
  "view_count": 1000000,
  "duration": 212,
  "uploader": "Channel Name",
  "upload_date": "20091025",
  "thumbnail": "https://i.ytimg.com/...",
  "tags": ["tag1", "tag2"],
  "categories": ["Music"]
}
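For readers curious how metadata can be fetched without downloading the video, the sketch below uses yt-dlp's Python API with download=False and then keeps only the fields shown in the example output. The helpers select_metadata and fetch_metadata and the FIELDS tuple are assumptions for illustration. The tool's real implementation may select fields differently.

```python
from typing import Any, Dict

# Field names taken from the example output above.
FIELDS = ("id", "title", "description", "view_count", "duration",
          "uploader", "upload_date", "thumbnail", "tags", "categories")


def select_metadata(info: Dict[str, Any]) -> Dict[str, Any]:
    # Keep only the documented fields; missing ones become None.
    return {key: info.get(key) for key in FIELDS}


def fetch_metadata(url: str) -> Dict[str, Any]:
    import yt_dlp  # third-party; imported lazily so select_metadata works without it

    options = {"quiet": True, "skip_download": True}
    with yt_dlp.YoutubeDL(options) as ydl:
        # download=False asks yt-dlp for metadata only.
        info = ydl.extract_info(url, download=False)
    return select_metadata(info)
```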
transcribe_video
Transcribes a video with optional translation.
Inputs:
url (string): Video URL.
language (string, default="auto"):
"auto": Transcribe in the detected language.
"en": Translate to English.
"fr", "es", etc.: Transcribe in the specified language.
Output: JSON with segments and metadata.
{
  "id": "VIDEO_ID",
  "title": "Video Title",
  "duration": 212,
  "transcription": [
    { "from": "00:00:00", "to": "00:00:05", "transcription": "First segment text..." },
    { "from": "00:00:05", "to": "00:00:10", "transcription": "Second segment text..." }
  ]
}
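The "from" and "to" values above are HH:MM:SS strings. As an illustration, the sketch below shows one way to produce them from offsets in seconds. The helpers format_timestamp and to_segment are hypothetical, and truncating (rather than rounding) fractional seconds is an assumption.

```python
def format_timestamp(seconds: float) -> str:
    total = int(seconds)  # assumption: fractional seconds are dropped
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_segment(start: float, end: float, text: str) -> dict:
    # Build one entry of the "transcription" list shown above.
    return {
        "from": format_timestamp(start),
        "to": format_timestamp(end),
        "transcription": text.strip(),
    }
```

For instance, the 212-second duration in the example corresponds to "00:03:32".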
🏗️ Technical Architecture
Services: DownloadService, VADService (Silero), WhisperService (OpenAI), CacheService.
In-Memory Pipeline: Audio is downloaded -> loaded to RAM -> segmented by VAD -> transcribed by Whisper -> cached.
Concurrency: Parallel segment transcription.
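The flow described in this section can be sketched as a single function that wires the stages together. This is only an outline under stated assumptions. The name run_pipeline and every callback parameter are hypothetical stand-ins for the services listed above, segments are transcribed sequentially for brevity, and the cache is keyed on the URL for simplicity.

```python
from typing import Any, Callable, Dict, List, Optional


def run_pipeline(url: str, language: str, *,
                 load_cache: Callable[[str, str], Optional[Dict[str, Any]]],
                 download_audio: Callable[[str], Any],
                 detect_speech: Callable[[Any], List[Any]],
                 transcribe: Callable[[Any, Any, str], str],
                 save_cache: Callable[[str, str, Dict[str, Any]], None]) -> Dict[str, Any]:
    # 1. Return early when a cached result exists.
    cached = load_cache(url, language)
    if cached is not None:
        return cached
    # 2. Download the audio and keep it in memory.
    audio = download_audio(url)
    # 3. Split the audio into speech segments (the VAD stage).
    segments = detect_speech(audio)
    # 4. Transcribe each segment (the Whisper stage).
    texts = [transcribe(audio, segment, language) for segment in segments]
    # 5. Cache the result so the next request skips steps 2 to 4.
    result = {"transcription": texts}
    save_cache(url, language, result)
    return result
```

Passing the stages in as callables keeps the orchestration testable without network access or model weights.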
🌍 Appendix: Supported Languages
Country (Primary/Region) | Language | Code |
South Africa | Afrikaans | af |
Ethiopia | Amharic | am |
Arab World | Arabic | ar |
India | Assamese | as |
Azerbaijan | Azerbaijani | az |
Russia | Bashkir | ba |
Belarus | Belarusian | be |
Bulgaria | Bulgarian | bg |
Bangladesh | Bengali | bn |
Tibet | Tibetan | bo |
France (Brittany) | Breton | br |
Bosnia and Herzegovina | Bosnian | bs |
Spain (Catalonia) | Catalan | ca |
Czech Republic | Czech | cs |
Wales | Welsh | cy |
Denmark | Danish | da |
Germany | German | de |
Greece | Greek | el |
USA / UK | English | en |
Spain | Spanish | es |
Estonia | Estonian | et |
Spain (Basque) | Basque | eu |
Iran | Persian | fa |
Finland | Finnish | fi |
Faroe Islands | Faroese | fo |
France | French | fr |
Spain (Galicia) | Galician | gl |
India | Gujarati | gu |
Nigeria | Hausa | ha |
Hawaii | Hawaiian | haw |
Israel | Hebrew | he |
India | Hindi | hi |
Croatia | Croatian | hr |
Haiti | Haitian Creole | ht |
Hungary | Hungarian | hu |
Armenia | Armenian | hy |
Indonesia | Indonesian | id |
Iceland | Icelandic | is |
Italy | Italian | it |
Japan | Japanese | ja |
Indonesia (Java) | Javanese | jw |
Georgia | Georgian | ka |
Kazakhstan | Kazakh | kk |
Cambodia | Khmer | km |
India | Kannada | kn |
South Korea | Korean | ko |
Ancient Rome | Latin | la |
Luxembourg | Luxembourgish | lb |
Congo | Lingala | ln |
Laos | Lao | lo |
Lithuania | Lithuanian | lt |
Latvia | Latvian | lv |
Madagascar | Malagasy | mg |
New Zealand | Maori | mi |
North Macedonia | Macedonian | mk |
India | Malayalam | ml |
Mongolia | Mongolian | mn |
India | Marathi | mr |
Malaysia | Malay | ms |
Malta | Maltese | mt |
Myanmar | Burmese | my |
Nepal | Nepali | ne |
Netherlands | Dutch | nl |
Norway | Nynorsk | nn |
Norway | Norwegian | no |
France (Occitania) | Occitan | oc |
India (Punjab) | Punjabi | pa |
Poland | Polish | pl |
Afghanistan | Pashto | ps |
Portugal / Brazil | Portuguese | pt |
Romania | Romanian | ro |
Russia | Russian | ru |
India | Sanskrit | sa |
Pakistan | Sindhi | sd |
Sri Lanka | Sinhala | si |
Slovakia | Slovak | sk |
Slovenia | Slovenian | sl |
Zimbabwe | Shona | sn |
Somalia | Somali | so |
Albania | Albanian | sq |
Serbia | Serbian | sr |
Indonesia | Sundanese | su |
Sweden | Swedish | sv |
East Africa | Swahili | sw |
India | Tamil | ta |
India | Telugu | te |
Tajikistan | Tajik | tg |
Thailand | Thai | th |
Turkmenistan | Turkmen | tk |
Philippines | Tagalog | tl |
Turkey | Turkish | tr |
Russia (Tatarstan) | Tatar | tt |
Ukraine | Ukrainian | uk |
Pakistan | Urdu | ur |
Uzbekistan | Uzbek | uz |
Vietnam | Vietnamese | vi |
Ashkenazi Jewish | Yiddish | yi |
Nigeria | Yoruba | yo |
China (Guangdong) | Cantonese | yue |
China | Chinese | zh |
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Fork the project
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request
📄 License
Distributed under the MIT License. See LICENSE for more information.