# video-toolkit-mcp

## Server Configuration

The environment variables used to configure the server (all optional).
| Name | Required | Description | Default |
|---|---|---|---|
| DEBUG | No | Enable debug logging | 0 |
| FFMPEG_PATH | No | Path to ffmpeg binary | ffmpeg |
| YT_DLP_PATH | No | Path to yt-dlp binary | yt-dlp |
| OPENAI_API_KEY | No | OpenAI API key for Whisper-based subtitle generation | |
| WHISPER_MODEL_PATH | No | Path to whisper model (for local whisper) | Auto-download |
| WHISPER_BINARY_PATH | No | Path to local whisper binary | whisper |
| VIDEO_TOOLKIT_STORAGE_DIR | No | Default directory for downloaded videos | ~/.video-toolkit/downloads |
| VIDEO_TOOLKIT_WHISPER_ENGINE | No | Preferred whisper engine: openai, local, or auto | auto |
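As an illustration, a setup that pins the binary paths and uses the OpenAI Whisper engine might look like this (the paths and key below are example values, not defaults shipped by the server):

```shell
# Illustrative values only — every variable is optional.
export FFMPEG_PATH=/usr/local/bin/ffmpeg
export YT_DLP_PATH=/usr/local/bin/yt-dlp
export OPENAI_API_KEY=sk-your-key-here
export VIDEO_TOOLKIT_STORAGE_DIR="$HOME/.video-toolkit/downloads"
export VIDEO_TOOLKIT_WHISPER_ENGINE=openai
```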
## Capabilities

Features and capabilities supported by this server.
| Capability | Details |
|---|---|
| tools | {} |
## Tools

Functions exposed to the LLM to take actions.
| Name | Description |
|---|---|
| get-transcript | Retrieve the transcript of a video from supported platforms (YouTube, Bilibili, Vimeo, etc.). Accepts various URL formats and returns the full transcript with timestamps. |
| list-transcript-languages | List all available transcript languages for a video from any supported platform. |
| download-video | Download a video from any supported platform (YouTube, Vimeo, etc.) to local storage. Returns the file path of the downloaded video. |
| list-downloads | List all downloaded video files in the storage directory or a specified directory. |
| generate-subtitles | Generate subtitles for a local video file using AI speech-to-text (OpenAI Whisper or local whisper). Creates an SRT or VTT file alongside the video. |
| transcribe-audio | Transcribes audio via Whisper. Preferred: `audio_url` (most token-efficient; the server fetches the bytes). `audio_base64` is for small clips only (<= ~60KB raw per call). `audio_path` only works when the MCP host shares a filesystem with the caller (often false on Claude.ai / Claude Code). For larger payloads in sandboxed environments, use `transcribe_upload_start` / `transcribe_upload_append` / `transcribe_upload_finalize`. The server re-encodes to Opus 16kHz mono 16kbps before Whisper unless `skip_compression=true`. Long audio (>5min) or `async=true` returns a `job_id`; poll `transcribe_get_job`. |
| transcribe_upload_start | Begin a chunked audio upload for large payloads. Returns `upload_id` and `max_chunk_bytes` (~60KB). |
| transcribe_upload_append | Append one base64 chunk to an upload session. |
| transcribe_upload_finalize | Finalize a chunked upload, run compression + Whisper, and return structured JSON (or text with `as_text`). |
| transcribe_get_job | Poll an async transcription job created by transcribe-audio. |
| transcribe_cancel_job | Cancel an async transcription job (best-effort). |
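The chunked-upload flow above can be sketched as follows. The helper splits raw audio bytes into base64 chunks whose decoded size respects the server's `max_chunk_bytes` limit; the tool names come from the table, but the `call` helper and its parameter names (`upload_id`, `chunk_base64`) are hypothetical and depend on your MCP client library:

```python
import base64


def make_chunks(raw: bytes, max_chunk_bytes: int = 60_000) -> list[str]:
    """Split raw audio bytes into base64-encoded chunks whose *decoded*
    size stays within the server's max_chunk_bytes limit."""
    return [
        base64.b64encode(raw[i : i + max_chunk_bytes]).decode("ascii")
        for i in range(0, len(raw), max_chunk_bytes)
    ]


# Sketch of the call sequence (shown as comments, since the actual
# invocation mechanism depends on your MCP client library):
#   session = call("transcribe_upload_start")   # returns upload_id, max_chunk_bytes
#   for chunk in make_chunks(audio, session["max_chunk_bytes"]):
#       call("transcribe_upload_append", upload_id=session["upload_id"],
#            chunk_base64=chunk)
#   result = call("transcribe_upload_finalize", upload_id=session["upload_id"])
```

Chunking on the raw bytes (rather than the base64 text) matters: base64 inflates the payload by about a third, so a 60KB raw chunk becomes roughly 80KB of text on the wire.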
## Prompts

Interactive templates invoked by user choice.
| Name | Description |
|---|---|
| No prompts | |
## Resources

Contextual data attached and managed by the client.
| Name | Description |
|---|---|
| No resources | |
## MCP directory API

We provide all the information about MCP servers via our MCP API:

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/JamesANZ/transcript-mcp'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.