# video-toolkit-mcp

## Server Configuration

The environment variables used to configure the server (all optional).
| Name | Required | Description | Default |
|---|---|---|---|
| DEBUG | No | Enable debug logging | 0 |
| FFMPEG_PATH | No | Path to ffmpeg binary | ffmpeg |
| YT_DLP_PATH | No | Path to yt-dlp binary | yt-dlp |
| OPENAI_API_KEY | No | OpenAI API key for Whisper-based subtitle generation | |
| WHISPER_MODEL_PATH | No | Path to whisper model (for local whisper) | Auto-download |
| WHISPER_BINARY_PATH | No | Path to local whisper binary | whisper |
| VIDEO_TOOLKIT_STORAGE_DIR | No | Default directory for downloaded videos | ~/.video-toolkit/downloads |
| VIDEO_TOOLKIT_WHISPER_ENGINE | No | Preferred whisper engine: openai, local, or auto | auto |
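As an illustration, a setup that pins the binary paths and uses the OpenAI Whisper engine might look like this (the paths and key below are example values, not defaults shipped by the server):

```shell
# Illustrative values only — every variable is optional.
export FFMPEG_PATH=/usr/local/bin/ffmpeg
export YT_DLP_PATH=/usr/local/bin/yt-dlp
export OPENAI_API_KEY=sk-your-key-here
export VIDEO_TOOLKIT_STORAGE_DIR="$HOME/.video-toolkit/downloads"
export VIDEO_TOOLKIT_WHISPER_ENGINE=openai
```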
## Capabilities

Features and capabilities supported by this server.
| Capability | Details |
|---|---|
| tools | {} |
## Tools

Functions exposed to the LLM to take actions.
| Name | Description |
|---|---|
| get-transcript | Retrieve the transcript of a video from supported platforms (YouTube, Bilibili, Vimeo, etc.). Accepts various URL formats and returns the full transcript with timestamps. |
| list-transcript-languages | List all available transcript languages for a video from any supported platform. |
| download-video | Download a video from any supported platform (YouTube, Vimeo, etc.) to local storage. Returns the file path of the downloaded video. |
| list-downloads | List all downloaded video files in the storage directory or a specified directory. |
| generate-subtitles | Generate subtitles for a local video file using AI speech-to-text (OpenAI Whisper or local whisper). Creates an SRT or VTT file alongside the video. |
| transcribe-audio | Transcribes audio via Whisper. Preferred: `audio_url` (most token-efficient; the server fetches the bytes). `audio_base64` is for small clips only (<= ~60KB raw per call). `audio_path` only works when the MCP host shares a filesystem with the caller (often false on Claude.ai / Claude Code). For larger payloads in sandboxed environments, use `transcribe_upload_start` / `transcribe_upload_append` / `transcribe_upload_finalize`. The server re-encodes to Opus 16kHz mono 16kbps before Whisper unless `skip_compression=true`. Long audio (>5min) or `async=true` returns a `job_id`; poll `transcribe_get_job`. |
| transcribe_upload_start | Begin a chunked audio upload for large payloads. Returns `upload_id` and `max_chunk_bytes` (~60KB). |
| transcribe_upload_append | Append one base64 chunk to an upload session. |
| transcribe_upload_finalize | Finalize a chunked upload, run compression + Whisper, and return structured JSON (or text with `as_text`). |
| transcribe_get_job | Poll an async transcription job created by transcribe-audio. |
| transcribe_cancel_job | Cancel an async transcription job (best-effort). |
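The chunked-upload flow above can be sketched as follows. The helper splits raw audio bytes into base64 chunks whose decoded size respects the server's `max_chunk_bytes` limit; the tool names come from the table, but the `call` helper and its parameter names (`upload_id`, `chunk_base64`) are hypothetical and depend on your MCP client library:

```python
import base64


def make_chunks(raw: bytes, max_chunk_bytes: int = 60_000) -> list[str]:
    """Split raw audio bytes into base64-encoded chunks whose *decoded*
    size stays within the server's max_chunk_bytes limit."""
    return [
        base64.b64encode(raw[i : i + max_chunk_bytes]).decode("ascii")
        for i in range(0, len(raw), max_chunk_bytes)
    ]


# Sketch of the call sequence (shown as comments, since the actual
# invocation mechanism depends on your MCP client library):
#   session = call("transcribe_upload_start")   # returns upload_id, max_chunk_bytes
#   for chunk in make_chunks(audio, session["max_chunk_bytes"]):
#       call("transcribe_upload_append", upload_id=session["upload_id"],
#            chunk_base64=chunk)
#   result = call("transcribe_upload_finalize", upload_id=session["upload_id"])
```

Chunking on the raw bytes (rather than the base64 text) matters: base64 inflates the payload by about a third, so a 60KB raw chunk becomes roughly 80KB of text on the wire.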
## Prompts

Interactive templates invoked by user choice.
| Name | Description |
|---|---|
| No prompts | |
## Resources

Contextual data attached and managed by the client.
| Name | Description |
|---|---|
| No resources | |
## MCP directory API

We provide all the information about MCP servers via our MCP API:

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/JamesANZ/transcript-mcp'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.