Enables downloading and transcribing news videos from CNN using yt-dlp and OpenAI Whisper, with support for multiple output formats and transcription models.
Enables downloading and transcribing educational videos from Coursera using yt-dlp and OpenAI Whisper, with support for multiple output formats and transcription models.
Enables downloading and transcribing videos from Dailymotion using yt-dlp and OpenAI Whisper, with support for multiple output formats and language detection.
Enables downloading and transcribing educational videos from edX using yt-dlp and OpenAI Whisper, with support for multiple output formats and language detection.
Enables downloading and transcribing videos from Facebook using yt-dlp and OpenAI Whisper, with support for multiple output formats and language detection.
Enables downloading and transcribing videos from Instagram using yt-dlp and OpenAI Whisper, with support for multiple output formats and transcription models.
Enables downloading and transcribing educational videos from Khan Academy using yt-dlp and OpenAI Whisper, with support for multiple output formats and configurable transcription settings.
Enables downloading and transcribing news videos from NBC using yt-dlp and OpenAI Whisper, with support for multiple output formats and language options.
Enables downloading and transcribing videos from Reddit using yt-dlp and OpenAI Whisper, with support for multiple output formats and configurable transcription settings.
Enables downloading and transcribing videos from TikTok using yt-dlp and OpenAI Whisper, with support for multiple output formats and configurable transcription settings.
Enables downloading and transcribing videos from Twitch using yt-dlp and OpenAI Whisper, with support for multiple output formats and language options.
Enables downloading and transcribing educational videos from Udemy using yt-dlp and OpenAI Whisper, with support for multiple output formats and language options.
Enables downloading and transcribing videos from Vimeo using yt-dlp and OpenAI Whisper, with support for multiple output formats and language options.
Enables downloading and transcribing videos from YouTube using yt-dlp and OpenAI Whisper, with support for multiple output formats (TXT, JSON, Markdown) and configurable transcription models.
Video Transcriber MCP Server
A Model Context Protocol (MCP) server that transcribes videos from 1000+ platforms using OpenAI's Whisper model. Built with TypeScript for type safety and available via npx for easy installation.
๐ Looking for better performance? Check out the Rust version which uses whisper.cpp for significantly faster transcription with lower memory usage. Available on crates.io with
cargo install video-transcriber-mcp.
โจ What's New
๐ Multi-Platform Support: Now supports 1000+ video platforms (YouTube, Vimeo, TikTok, Twitter/X, Facebook, Instagram, Twitch, educational sites, and more) via yt-dlp
๐ป Cross-Platform: Works on macOS, Linux, and Windows
๐๏ธ Configurable Whisper Models: Choose from tiny, base, small, medium, or large models
๐ Language Support: Transcribe in 90+ languages or use auto-detection
๐ Automatic Retries: Network failures are handled automatically with exponential backoff
๐ฏ Platform Detection: Automatically detects the video platform
๐ List Supported Sites: New tool to see all 1000+ supported platforms
โก Improved Error Handling: More specific and helpful error messages
๐ Better Filename Handling: Improved sanitization preserving more characters
โ ๏ธ Legal Notice
This tool is intended for educational, accessibility, and research purposes only.
Before using this tool, please understand:
Most platforms' Terms of Service generally prohibit downloading content
You are responsible for ensuring your use complies with applicable laws
This tool should primarily be used for:
โ Your own content
โ Creating accessibility features (captions for deaf/hard of hearing)
โ Educational and research purposes (where permitted)
โ Content you have explicit permission to download
Please read
We do not encourage or endorse violation of any platform's Terms of Service or copyright infringement. Use responsibly and ethically.
Features
๐ฅ Download audio from 1000+ video platforms (powered by yt-dlp)
๐ Transcribe local video files** (mp4, avi, mov, mkv, and more)
๐ค Transcribe using OpenAI Whisper (local, no API key needed)
๐๏ธ Configurable Whisper models (tiny, base, small, medium, large)
๐ Support for 90+ languages with auto-detection
๐ Generate transcripts in multiple formats (TXT, JSON, Markdown)
๐ List and read previous transcripts as MCP resources
๐ Integrate seamlessly with Claude Code or any MCP client
โก TypeScript + npx for easy installation
๐ Full type safety with TypeScript
๐ Automatic dependency checking
๐ Automatic retry logic for network failures
๐ฏ Platform detection (shows which platform you're transcribing from)
Supported Platforms
Thanks to yt-dlp, this tool supports 1000+ video platforms including:
Social Media: YouTube, TikTok, Twitter/X, Facebook, Instagram, Reddit, LinkedIn
Video Hosting: Vimeo, Dailymotion, Twitch
Educational: Coursera, Udemy, Khan Academy, LinkedIn Learning, edX
News: BBC, CNN, NBC, PBS
Conference/Tech: YouTube (tech talks), Vimeo (conferences)
And many, many more!
Run the list_supported_sites tool to see the complete list of 1000+ supported platforms.
Prerequisites
macOS
Linux
Windows
Option 1: Using pip (recommended)
Option 2: Using winget
Verify installations (all platforms)
Quick Start
For End Users (Using npx)
Add to your Claude Code config (~/.claude/settings.json):
Or use directly from GitHub:
That's it! No installation needed. npx will automatically download and run the package.
For Local Development
Usage
From Claude Code
Once configured, you can use these tools in Claude Code:
Transcribe a video from any platform
Transcribe a local video file
Claude will use the transcribe_video tool automatically with optional parameters for model and language.
List all supported platforms
List available transcripts
Check dependencies
Read a transcript
Programmatic Usage
If you install the package:
You can import and use it programmatically:
Output
Transcripts are saved to ~/Downloads/video-transcripts/ by default.
For each video, three files are generated:
.txt- Plain text transcript.json- JSON with timestamps and metadata.md- Markdown with video metadata and formatted transcript
Example
MCP Tools
transcribe_video
Transcribe videos from 1000+ platforms or local video files to text.
Parameters:
url(required): Video URL from any supported platform OR path to a local video file (mp4, avi, mov, mkv, etc.)output_dir(optional): Output directory pathmodel(optional): Whisper model - "tiny", "base" (default), "small", "medium", "large"language(optional): Language code (ISO 639-1: "en", "es", "fr", etc.) or "auto" (default)
Model Comparison:
Model | Speed | Accuracy | Use Case |
tiny | โกโกโกโกโก | โญโญ | Quick drafts, testing |
base | โกโกโกโก | โญโญโญ | General use (default) |
small | โกโกโก | โญโญโญโญ | Better accuracy |
medium | โกโก | โญโญโญโญโญ | High accuracy |
large | โก | โญโญโญโญโญโญ | Best accuracy, slow |
list_transcripts
List all available transcripts with metadata.
Parameters:
output_dir(optional): Directory to list
check_dependencies
Verify that all required dependencies are installed.
list_supported_sites
List all 1000+ supported video platforms.
Configuration Examples
Claude Code (Recommended)
From GitHub (Latest)
Local Development
Development
Setup
Project Structure
Scripts
Command | Description |
| Compile TypeScript to JavaScript |
| Development mode with hot reload (Bun) |
| TypeScript type checking |
| Remove dist/ directory |
| Pre-publish build (automatic) |
Publishing
Troubleshooting
Dependencies not installed
See the Prerequisites section above for platform-specific installation instructions.
npx can't find the package
Make sure the package is:
Published to npm, OR
Available on GitHub with proper package.json
TypeScript errors
Permission denied
The build process automatically makes dist/index.js executable via the fix-shebang script.
"Unsupported URL" error
The platform might not be supported by yt-dlp. Run list_supported_sites to see all supported platforms.
Performance
Video Length | Processing Time (base model) | Output Size |
5 minutes | ~1-2 minutes | ~5-10 KB |
10 minutes | ~2-4 minutes | ~10-20 KB |
30 minutes | ~5-10 minutes | ~30-50 KB |
1 hour | ~10-20 minutes | ~60-100 KB |
Times are approximate and depend on CPU speed and model choice
Advanced Configuration
Custom Whisper Model
Specify in the tool call parameters:
Custom Language
Specify the language code:
Custom Output Directory
Specify in the tool call:
Contributing
Contributions welcome! Please:
Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request
License
MIT License - see LICENSE file for details
TypeScript vs Rust Version
This is the TypeScript version - great for:
โ Quick setup with npx (no installation)
โ Node.js ecosystem familiarity
โ Easy to modify and extend
โ Good for learning and prototyping
Consider the Rust version if you need:
๐ Faster transcription (uses whisper.cpp)
๐พ Lower memory usage
โก Native performance
๐ฆ Standalone binary (no Node.js required)
Both versions support the same MCP protocol and work identically with Claude Code!
Links
๐ฆ Rust Version โ For better performance
Acknowledgments
OpenAI Whisper for transcription
yt-dlp for multi-platform video downloading (1000+ sites)
Model Context Protocol SDK
Claude by Anthropic
Made with โค๏ธ for the MCP community