Whisper Speech Recognition MCP Server


Chinese Documentation (中文文档)

A high-performance speech recognition MCP server based on Faster Whisper, providing efficient audio transcription capabilities.

Features

  • Integrated with Faster Whisper for efficient speech recognition

  • Batch processing acceleration for improved transcription speed

  • Automatic CUDA acceleration (if available)

  • Support for multiple model sizes (tiny to large-v3)

  • Output formats include VTT subtitles, SRT, and JSON

  • Support for batch transcription of audio files in a folder

  • Model instance caching to avoid repeated loading

  • Dynamic batch size adjustment based on GPU memory (see the sketch after this list)
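
The dynamic batch sizing might work along these lines. A minimal sketch, assuming the batch size scales with total GPU memory; the thresholds and the pick_batch_size helper are illustrative, not the server's actual code:

import torch

def pick_batch_size(default: int = 8) -> int:
    # Illustrative heuristic: scale the transcription batch size
    # with the total memory of the first CUDA device.
    if not torch.cuda.is_available():
        return 1  # CPU path: GPU batching does not apply
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    if total_gb >= 24:
        return 32
    if total_gb >= 12:
        return 16
    return default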

Installation

Dependencies

  • Python 3.10+

  • faster-whisper>=0.9.0

  • torch==2.6.0+cu126

  • torchaudio==2.6.0+cu126

  • mcp[cli]>=1.2.0
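
For reference, a requirements.txt consistent with the list above might look like this (the repository's actual file may pin different versions):

faster-whisper>=0.9.0
torch==2.6.0+cu126
torchaudio==2.6.0+cu126
mcp[cli]>=1.2.0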

Installation Steps

  1. Clone or download this repository

  2. Create and activate a virtual environment (recommended)

  3. Install dependencies:

pip install -r requirements.txt
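
For example, steps 2 and 3 together look like this (standard venv commands; pick the activation line for your shell):

python -m venv venv
venv\Scripts\activate         # Windows
source venv/bin/activate      # macOS/Linux
pip install -r requirements.txt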

PyTorch Installation Guide

Install the appropriate version of PyTorch based on your CUDA version:

  • CUDA 12.6:

    pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
  • CUDA 12.1:

    pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu121
  • CPU version:

    pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cpu

You can check your CUDA version with nvcc --version or nvidia-smi.
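
After installing, you can confirm that PyTorch sees your GPU:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"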

Usage

Starting the Server

On Windows, simply run start_server.bat.

On other platforms, run:

python whisper_server.py

Configuring Claude Desktop

  1. Open the Claude Desktop configuration file:

    • Windows: %APPDATA%\Claude\claude_desktop_config.json

    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

  2. Add the Whisper server configuration:

{ "mcpServers": { "whisper": { "command": "python", "args": ["D:/path/to/whisper_server.py"], "env": {} } } }
  3. Restart Claude Desktop

Available Tools

The server provides the following tools:

  1. get_model_info - Get information about available Whisper models

  2. transcribe - Transcribe a single audio file (example call below)

  3. batch_transcribe - Batch transcribe audio files in a folder
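
As an illustration, the sketch below starts the server as a subprocess and calls the transcribe tool through the MCP Python SDK over stdio. The audio_path argument name is an assumption; the real parameter names are defined in whisper_server.py and can be inspected with session.list_tools():

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch whisper_server.py as a subprocess and talk to it over stdio
    params = StdioServerParameters(command="python", args=["whisper_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # "audio_path" is a hypothetical argument name; check the
            # tool schema returned by session.list_tools() for the real one
            result = await session.call_tool(
                "transcribe", arguments={"audio_path": "speech.mp3"}
            )
            print(result)

asyncio.run(main())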

Performance Optimization Tips

  • Using CUDA acceleration significantly improves transcription speed

  • Batch processing mode is more efficient for large numbers of short audio files

  • Batch size is automatically adjusted based on GPU memory size

  • Using VAD (Voice Activity Detection) filtering improves accuracy for long audio (see the sketch after this list)

  • Specifying the correct language can improve transcription quality
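
To illustrate the batching, VAD, and language tips above, here is a minimal faster-whisper sketch (batched inference requires a recent faster-whisper release; the model size, batch size, and language are illustrative):

from faster_whisper import WhisperModel, BatchedInferencePipeline

# Load once and reuse -- this mirrors the server's model instance caching
model = WhisperModel("large-v3", device="cuda", compute_type="float16")
batched = BatchedInferencePipeline(model=model)

# VAD filtering skips silence; fixing the language avoids detection errors
segments, info = batched.transcribe(
    "speech.mp3", batch_size=16, vad_filter=True, language="en"
)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")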

Local Testing Methods

  1. Use MCP Inspector for quick testing:

mcp dev whisper_server.py
  2. Use Claude Desktop for integration testing

  3. Use command line direct invocation (requires mcp[cli]):

mcp run whisper_server.py

Error Handling

The server implements the following error handling mechanisms:

  • Audio file existence check (sketched below)

  • Model loading failure handling

  • Transcription process exception catching

  • GPU memory management

  • Batch processing parameter adaptive adjustment
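
As a sketch of the first two mechanisms (the function names are illustrative, not the server's actual code):

import os
from faster_whisper import WhisperModel

def check_audio(audio_path: str) -> None:
    # Audio file existence check before any transcription work starts
    if not os.path.isfile(audio_path):
        raise FileNotFoundError(f"Audio file not found: {audio_path}")

def load_model_safely(size: str = "base") -> WhisperModel:
    # Model loading failure handling: fall back to CPU if CUDA loading fails
    try:
        return WhisperModel(size, device="cuda", compute_type="float16")
    except Exception:
        return WhisperModel(size, device="cpu", compute_type="int8")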

Project Structure

  • whisper_server.py: Main server code

  • model_manager.py: Whisper model loading and caching

  • audio_processor.py: Audio file validation and preprocessing

  • formatters.py: Output formatting (VTT, SRT, JSON; see the sketch below)

  • transcriber.py: Core transcription logic

  • start_server.bat: Windows startup script
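
For example, an SRT formatter needs to render segment times as HH:MM:SS,mmm timecodes; a minimal sketch of that conversion (not the project's actual code):

def srt_timestamp(seconds: float) -> str:
    # Format a time in seconds as the HH:MM:SS,mmm timecode SRT requires
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

For instance, srt_timestamp(3661.5) returns "01:01:01,500".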

License

MIT

Acknowledgements

This project was developed with the assistance of AI tools and models. Special thanks to these incredible tools and the teams behind them.
