whisper-windows-mcp
Server Configuration
Describes the environment variables used to configure the server. Variables marked as required must be set before the server will run.
| Name | Required | Description | Default |
|---|---|---|---|
| FFMPEG_PATH | No | Path to ffmpeg if not in system PATH | |
| WHISPER_MODEL | Yes | Path to model .bin file | |
| WHISPER_THREADS | No | CPU thread count override | |
| WHISPER_CLI_PATH | Yes | Path to whisper-cli.exe | |
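The required/optional split in the table above can be checked before launching the server. The following is a minimal sketch, not part of the server itself: the helper name `missing_required` and the example paths are hypothetical, and only the variable names come from the table.

```python
import os

# Variable names and their required/optional status come from the table above.
REQUIRED_VARS = ("WHISPER_MODEL", "WHISPER_CLI_PATH")
OPTIONAL_VARS = ("FFMPEG_PATH", "WHISPER_THREADS")


def missing_required(env):
    """Return the required variable names that are unset or blank in env.

    Hypothetical helper for illustration; the server may validate differently.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Check the current process environment and report anything missing.
    missing = missing_required(dict(os.environ))
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

For example, `missing_required({"WHISPER_MODEL": "C:\\models\\model.bin"})` (a made-up path) reports that `WHISPER_CLI_PATH` still needs to be set.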
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| transcribe_audio | Transcribe a single audio or video file using whisper.cpp on Windows. Natively supports mp3 and wav. Automatically converts mp4, mkv, avi, mov, webm, m4a, flac, ogg etc. via FFmpeg, so no manual conversion is needed. Output defaults to timestamps format (with time codes). For files that may take more than 4 minutes, set background=true to run as a detached job and use check_progress to monitor it. ⚠️ Privacy: transcript text returned by this tool is processed by Claude's API. Pass privacy_mode=true to this tool to enable metadata-only responses per call; no transcript text will be transmitted. Set WHISPER_PRIVACY_MODE=true in env to enable globally. When privacy mode is active, a confirmation is required before every operation. |
| check_progress | Check the status of a background transcription job started with transcribe_audio (background=true). Returns current progress, elapsed time, last processed timestamp, and the transcript when complete. Call this repeatedly until the job shows as complete or failed. ⚠️ Privacy: transcript text returned on completion is processed by Claude's API. Pass privacy_mode=true to return metadata only for this check, regardless of how the job was started. |
| transcribe_batch | Transcribe multiple audio/video files in a folder interactively, one file at a time. Shows a preview of each transcript and waits for confirmation before continuing. Saves each transcript as a .txt file next to its source. Files already transcribed (with matching .txt) are shown as done and skipped. Supported formats: mp3, wav, mp4, mkv, avi, mov, webm, m4a, flac, ogg. NOTE: For large unattended batch jobs, use start_batch instead. ⚠️ Privacy: transcript previews are processed by Claude's API. Pass privacy_mode=true to suppress previews and return metadata only. When privacy mode is active, confirmation is required before each file. |
| generate_subtitles | Generate subtitle files for an audio or video file using whisper.cpp. Set language='auto' to detect the spoken language automatically. Set translate_to_english=true to also generate an English translation subtitle file. Supports SRT and WebVTT (VTT) output formats. When both native and translation are requested, two files are saved: one in the original language and one English translation. Load SRT in VLC via Subtitle → Add Subtitle File. VTT works in web players and HTML5 video. Supports all standard formats plus .3gp and .ts. |
| check_config | Verify whisper-cli.exe, model, and FFmpeg are all available. Run this first if anything fails. |
| start_batch | Start an automated sequential batch transcription of all untranscribed files in a folder. Scans for files without a matching .txt, sorts by duration (shortest first), and processes them one at a time as background jobs. Each file is validated after completion: empty or suspiciously short outputs are flagged. Batch self-advances without polling when each file finishes. Returns a batch ID to use with check_batch_progress. ⚠️ Privacy: when privacy_mode is active, one confirmation is required before the batch starts. All files then process unattended. No transcript text is returned to the API. |
| check_batch_progress | Check the status of a batch started with start_batch. Automatically advances to the next file when the current one finishes. Returns overall progress, current file, failed files, and elapsed time. Call repeatedly until the batch shows as complete. |
| analyze_media | Analyze one or more media files using FFprobe before transcribing. For a single file: returns duration, size, codec, and estimated transcription time on CPU and GPU. For a folder: scans all supported media files and returns a sorted table with the same info for each. Use this to plan batch work, estimate how long transcription will take, or check what's already been transcribed. |
| check_system | Detect GPU hardware and verify Vulkan acceleration is available. Reports GPU name, VRAM, whether the Vulkan binary is installed, and recommends the best Whisper model for your hardware. |
| list_models | List all Whisper model files installed in your models directory. Shows filename, size, whether it is currently active, quantization status, and recommended use case for each model. No network calls; reads the local filesystem only. |
| download_model | Download a Whisper model from Hugging Face directly into your models directory. Accepts a model name (e.g. large-v3-turbo, medium.en-q5_0) and handles the download automatically. Downloads only from trusted Hugging Face namespaces (ggerganov/whisper.cpp and ggml-org). After downloading, use switch_model to activate it for the current session. |
| switch_model | Switch the active Whisper model for the current session without restarting Claude Desktop. Accepts a model filename (e.g. ggml-large-v3-turbo.bin) or full path. The model must already be installed in your models directory. The change is session-scoped and does not persist after Claude Desktop restarts. |
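The background workflow described for transcribe_audio and check_progress (start a detached job, then call check_progress repeatedly until it reports complete or failed) could be driven by a loop like the sketch below. This is an illustration under stated assumptions: `call_tool` is a hypothetical callable supplied by whatever MCP client is in use, and the `job_id` argument name and `status` result field are guesses, since the listing does not document the tool schemas. Only the tool name and the privacy_mode flag come from the table.

```python
import time


def wait_for_job(call_tool, job_id, interval_s=5.0, max_checks=100, privacy_mode=False):
    """Poll check_progress until the job reports complete or failed.

    call_tool: hypothetical callable (tool_name, arguments) -> dict.
    "job_id" and the "status" field are assumed names, not documented ones.
    """
    for _ in range(max_checks):
        result = call_tool(
            "check_progress",
            {"job_id": job_id, "privacy_mode": privacy_mode},
        )
        # Stop as soon as the job reaches a terminal state.
        if result.get("status") in ("complete", "failed"):
            return result
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_checks} checks")
```

Capping the number of checks keeps a stalled job from blocking the caller indefinitely; the interval and cap are arbitrary defaults that would need tuning for long recordings.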
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/eviscerations/whisper-windows-mcp'
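The same request can be made from Python using only the standard library. This is a minimal sketch: the URL is the one in the curl command above, the helper names `server_url` and `fetch_server` are hypothetical, and decoding the body as JSON is an assumption about the response format.

```python
import json
import urllib.request

# Base URL taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner, name, timeout_s=10.0):
    """GET the server entry; assumes the response body is JSON."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    info = fetch_server("eviscerations", "whisper-windows-mcp")
    print(json.dumps(info, indent=2))
```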
If you have feedback or need assistance with the MCP directory API, please join our Discord server.