Provides tools for searching YouTube videos and fetching their transcripts, either from official YouTube transcripts or by downloading audio and generating transcripts using Whisper AI.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type `@` followed by the MCP server name and your instructions, e.g., `@MCP-YouTube-Transcribe transcribe the latest SpaceX Starship test flight video`.
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
MCP-YouTube-Transcribe
An MCP server that provides a tool to fetch transcripts from YouTube videos. It first attempts to retrieve a pre-existing, official transcript. If one is not available, it downloads the video's audio and uses OpenAI's Whisper model for local AI-powered transcription.
This project is designed to be a simple, self-contained tool that can be easily integrated into any system capable of communicating with an MCP server.
Features
YouTube Video Search: Finds the most relevant YouTube video based on a text query.
Official Transcript Priority: Intelligently fetches manually created or auto-generated YouTube transcripts first for speed and accuracy.
Fast AI-Powered Transcription: Uses whisper.cpp (if available) for blazing-fast transcription. Falls back to OpenAI's Python Whisper `tiny` model if whisper.cpp is not installed.
MCP Server Interface: Exposes the transcription functionality as a simple tool (`get_youtube_transcript`) via the lightweight Model Context Protocol.
Requirements
Python 3.12+
uv: A fast Python package installer and resolver. You will need to install it on your system first.
FFmpeg: Must be installed and available in your system's PATH. Required for audio processing.
whisper.cpp (Highly recommended): MCP-YouTube-Transcribe will first try to use whisper.cpp for lightning-fast local transcription and only fall back to Python Whisper if the executable is not found.
macOS:

```shell
brew install whisper-cpp
```

Linux: Build from source following the whisper.cpp installation guide.

Windows: Build from source or grab a pre-built binary from the releases page.

After installation, make sure the `whisper-cli` (or `whisper-cpp` on older versions) command is in your PATH. Finally, download a Whisper model. The tiny model offers the best speed-to-quality ratio for most use cases:

```shell
mkdir -p models
curl -L -o models/ggml-tiny.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
```

Place additional models in the same `models/` folder if you wish to experiment.
Installation with uv
Using uv is recommended as it's extremely fast and handles both environment creation and package installation
seamlessly.
Clone the repository:

```shell
git clone https://github.com/<your-username>/YouTubeTranscriber.git
cd YouTubeTranscriber
```

Create a virtual environment: This command creates a `.venv` folder in your project directory. `uv` will automatically use this environment for all subsequent commands.

```shell
uv venv
```

Install the project and its dependencies: This command reads the `pyproject.toml` file and installs all required libraries into the virtual environment.

```shell
uv sync
```
Usage
Running the MCP Server
Once installed, you can start the server by running the mcp_server.py script (e.g., `python mcp_server.py` from the project root). The server listens for JSON-RPC requests on stdin and sends responses to stdout.
The server will log its activity to a file named mcp_server.log in the project's root directory.
Connecting to Gemini CLI on Windows
You can connect this MCP server to the Google Gemini CLI to use the function as a native tool directly from your terminal. These instructions are for a Windows environment.
Step 1: Create a Startup Script run_server.bat
The Gemini CLI needs a single, reliable command to start your server. A batch script is the perfect way to handle this on Windows, as it ensures the correct virtual environment and Python interpreter are used.
Create a new file named `run_server.bat` in the root of your project directory. Copy and paste the following content into the file:
This script activates the virtual environment in your project and then runs the server, ensuring all the correct dependencies are available.
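For illustration, a minimal sketch of what `run_server.bat` could contain (the repository's actual script may differ; this assumes the `.venv` layout created by `uv venv` in the installation steps):

```bat
@echo off
REM Switch to the directory this script lives in, so relative paths resolve.
cd /d "%~dp0"
REM Activate the project's virtual environment.
call .venv\Scripts\activate.bat
REM Start the MCP server on stdin/stdout.
python mcp_server.py
```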
Step 2: Configure the Gemini CLI
Now, you need to tell the Gemini CLI how to find and run your new server.
Locate your Gemini CLI `config.json` file. On Windows, this is typically found at: `C:\Users\<Your-Username>\.gemini\config.json`
Open the `config.json` file in a text editor. Add the following entry to the `mcpServers` object. If `mcpServers` doesn't exist, create it as shown below.
Crucially, you must replace both instances of the path placeholder with the full, absolute path to where you cloned the YouTubeTranscriber repository.
Example: If your project is located at C:\dev\YouTubeTranscriber, the entry would look like this:
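As an illustration only (the exact entry the repository expects may differ), a minimal `mcpServers` entry of this shape, assuming the project lives at `C:\dev\YouTubeTranscriber` and uses the `run_server.bat` from Step 1:

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "C:\\dev\\YouTubeTranscriber\\run_server.bat"
    }
  }
}
```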
Note: JSON requires backslashes to be escaped, so you must use double backslashes (\\) in your paths.
Step 3: Verify the Connection
After saving the config.json file, you can verify that Gemini CLI recognizes and can use your new tool.
Run the Gemini CLI and press Ctrl+T.
You should see MCP-YouTube-Transcribe listed as an available tool.
Connecting to Gemini CLI on Mac/Unix
You can also connect this MCP server to the Google Gemini CLI on Mac or other Unix-like systems. The process is similar to Windows but uses a shell script instead of a batch file.
Step 1: Prepare the Startup Script
The repository already includes a run_server.sh script; just make it executable with `chmod +x run_server.sh`.
Step 2: Configure the Gemini CLI
Locate your Gemini CLI `config.json` file. On Mac/Unix systems, this is typically found at: `~/.gemini/config.json`
Open the `config.json` file in a text editor. Add the following entry to the `mcpServers` object. If `mcpServers` doesn't exist, create it as shown below:
Replace both instances of the path placeholder with the absolute path to where you cloned the repository.
Example: If your project is located at /Users/username/MCP-YouTube-Transcribe, the entry would look like this:
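As an illustration only (the exact entry may differ), a minimal `mcpServers` entry of this shape, assuming the project lives at `/Users/username/MCP-YouTube-Transcribe` and uses the `run_server.sh` from Step 1:

```json
{
  "mcpServers": {
    "MCP-YouTube-Transcribe": {
      "command": "/Users/username/MCP-YouTube-Transcribe/run_server.sh"
    }
  }
}
```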
Step 3: Verify the Connection
After saving the config.json file, you can verify that Gemini CLI recognizes and can use your new tool.
Run the Gemini CLI and press Ctrl+T.
You should see MCP-YouTube-Transcribe listed as an available tool.
MCP Client Example
You can interact with the server using any client that supports the MCP protocol over stdio. The server exposes one
primary tool: get_youtube_transcript.
Here is an example of a call_tool request to get a transcript for the query "What is an API? by MuleSoft".
Request:
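The exact wire format depends on the MCP version in use; a request of roughly this shape (the `tools/call` method name follows the MCP convention, and the tool and argument names come from this README):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_youtube_transcript",
    "arguments": {
      "query": "What is an API? by MuleSoft",
      "force_whisper": false
    }
  }
}
```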
query: The search term for the YouTube video.
force_whisper: (Optional) A boolean that, if true, skips the check for an official transcript and generates one directly with Whisper. Defaults to false.
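A client can construct such a request programmatically. The sketch below is illustrative only: `build_transcript_request` is a hypothetical helper (not part of this repository), and it assumes the standard MCP `tools/call` envelope with the tool and argument names described above.

```python
import json


def build_transcript_request(query: str, force_whisper: bool = False,
                             request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request calling get_youtube_transcript.

    Hypothetical helper for illustration; assumes the MCP ``tools/call``
    convention and the argument names documented in this README.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_youtube_transcript",
            "arguments": {"query": query, "force_whisper": force_whisper},
        },
    }
    # MCP stdio servers typically expect one JSON object per line.
    return json.dumps(request)


print(build_transcript_request("What is an API? by MuleSoft"))
```

The resulting line can be written to the server's stdin; the response arrives as a JSON object on stdout.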
Testing
This project includes a test suite to verify its functionality.
Core Function Test (`simple.py`): This script tests the server's handler functions directly without needing to run a separate server process. It's the quickest way to check if the core logic is working.

```shell
python simple.py
```

Full Server Test (`test_mcp.py`): This script starts the MCP server as a subprocess and sends it live JSON-RPC requests, providing an end-to-end test of the server's functionality.

```shell
python test_mcp.py
```
Configuration
Logging: Server activity is logged to `mcp_server.log`.
Audio Cache: When Whisper is used, downloaded audio files are temporarily stored in `testing/audio_cache/`. You may wish to change this path in `youtube_tool.py` for a production environment.
Contributing
Contributions are welcome! If you'd like to improve the YouTube Transcriber, please feel free to fork the repository and submit a pull request.
Please read our CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests to us.
License
This project is licensed under the MIT License - see the LICENSE file for details.