Provides audio transcription capabilities through OpenAI's Whisper model, supporting various models (base, small, medium, large, large-v2, large-v3) and output formats.
Enables execution of shell commands with basic security validation, running them in a specified working directory while blocking potentially dangerous operations.
Whisper CLI MCP Server
An MCP server that provides shell command execution and OpenAI Whisper transcription capabilities.
Features
- whisper_transcribe: Transcribe audio files using OpenAI Whisper
- shell_command: Execute shell commands safely with basic security validation
Installation
- Install dependencies:
- Make the server executable:
Usage
Running the Server
Tools Available
whisper_transcribe
Transcribe audio files using whisper-cli.
Parameters:
- audio_file (required): Path to the audio file
- model (optional): Whisper model (base, small, medium, large, large-v2, large-v3)
- language (optional): Language code for transcription
- output_format (optional): Output format (txt, vtt, srt, json)
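As a rough illustration of how these parameters fit together, the sketch below assembles and checks an argument set before it is sent to the tool. The helper name build_transcribe_args is hypothetical and not part of this server; only the parameter names and allowed values come from the list above.

```python
from typing import Optional

# Allowed values copied from the parameter list above.
ALLOWED_MODELS = {"base", "small", "medium", "large", "large-v2", "large-v3"}
ALLOWED_FORMATS = {"txt", "vtt", "srt", "json"}


def build_transcribe_args(
    audio_file: str,
    model: Optional[str] = None,
    language: Optional[str] = None,
    output_format: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble and check whisper_transcribe arguments."""
    if not audio_file:
        raise ValueError("audio_file is required")
    if model is not None and model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if output_format is not None and output_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format}")

    args = {"audio_file": audio_file}
    # Only include optional parameters that were actually supplied.
    optional = (("model", model), ("language", language), ("output_format", output_format))
    for key, value in optional:
        if value is not None:
            args[key] = value
    return args


if __name__ == "__main__":
    print(build_transcribe_args("meeting.wav", model="small", output_format="srt"))
```

Leaving out the optional keys lets the server apply whatever defaults it uses; this README does not state what those defaults are.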
shell_command
Execute shell commands with basic security validation.
Parameters:
- command (required): Shell command to execute
- working_directory (optional): Working directory for the command
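For readers who want a feel for what the two parameters do, here is a minimal sketch of running a command in an optional working directory with Python's standard subprocess module. This is an assumption about the general approach, not the server's actual code; run_shell_command is a hypothetical name.

```python
import subprocess
from typing import Optional


def run_shell_command(command: str, working_directory: Optional[str] = None) -> dict:
    """Hypothetical sketch: run a shell command and collect its output."""
    completed = subprocess.run(
        command,
        shell=True,             # interpret the string with the system shell
        cwd=working_directory,  # None means "use the current directory"
        capture_output=True,
        text=True,
    )
    return {
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "returncode": completed.returncode,
    }


if __name__ == "__main__":
    result = run_shell_command("echo hello")
    print(result["stdout"].strip(), result["returncode"])
```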
Security
The shell_command tool includes basic security validation to prevent execution of potentially dangerous commands. Commands containing the following patterns are blocked:
- rm -rf
- sudo
- chmod 777
- dd if=
- > /dev/
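The check described above can be sketched as a plain substring match over the listed patterns. This is a minimal illustration that assumes simple substring matching; the server's real validation may normalize or match differently, and find_blocked_pattern is a hypothetical name.

```python
from typing import Optional

# Patterns copied from the list above.
BLOCKED_PATTERNS = ("rm -rf", "sudo", "chmod 777", "dd if=", "> /dev/")


def find_blocked_pattern(command: str) -> Optional[str]:
    """Return the first blocked pattern found in the command, or None."""
    for pattern in BLOCKED_PATTERNS:
        if pattern in command:
            return pattern
    return None


if __name__ == "__main__":
    for cmd in ("ls -la", "sudo apt update", "echo hi > /dev/null"):
        hit = find_blocked_pattern(cmd)
        print(f"{cmd!r}: {'blocked by ' + repr(hit) if hit else 'allowed'}")
```

A substring check like this is easy to sidestep (for example with extra whitespace between rm and -rf), which is consistent with the validation being described as basic rather than as a sandbox.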
Configuration
To use this server with Claude Desktop, add the following to your claude_desktop_config.json:
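The exact entry is not reproduced here. As a hedged sketch, Claude Desktop reads MCP servers from a top-level mcpServers object, where each entry has a command and an args list. The snippet below prints such an entry as JSON; the server name whisper-cli, the python launcher, and the script path are all placeholders, not values taken from this project.

```python
import json

# Placeholder values: replace them with how you actually launch the server.
SERVER_COMMAND = "python"             # assumption: the launcher may differ
SERVER_SCRIPT = "/path/to/server.py"  # hypothetical path, not from this README

config = {
    "mcpServers": {
        "whisper-cli": {  # hypothetical name for this server entry
            "command": SERVER_COMMAND,
            "args": [SERVER_SCRIPT],
        }
    }
}

# Print the JSON to merge into claude_desktop_config.json.
print(json.dumps(config, indent=2))
```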