MCP Server Whisper

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
`OPENAI_API_KEY`	Yes	Your OpenAI API key for accessing Whisper and GPT-4o models
`AUDIO_FILES_PATH`	Yes	Path to your audio files directory
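
Since both variables are required, it can help to check for them before launching the server. The snippet below is a minimal sketch, not part of the server itself: the variable names come from the table above, while `REQUIRED_VARS` and `missing_vars` are hypothetical names introduced for illustration.

```python
import os

# Variable names are taken from the configuration table above.
# REQUIRED_VARS and missing_vars are hypothetical helpers, not part of the server.
REQUIRED_VARS = ("OPENAI_API_KEY", "AUDIO_FILES_PATH")


def missing_vars(env):
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dictionary.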

Capabilities

Server capabilities have not been inspected yet.

Tools

Functions exposed to the LLM to take actions

Name	Description
get_latest_audio	Get the most recent audio file from the audio path. ONLY USE THIS IF THE USER ASKS FOR THE LATEST FILE.
list_audio_files	List, filter, and sort audio files from the audio path. Supports regex pattern matching, filtering by metadata (size, duration, date, format), and sorting.
convert_audio	A tool used to convert audio files to mp3 or wav, both of which are GPT-4o compatible.
compress_audio	A tool used to compress audio files larger than 25 MB. ONLY USE THIS IF THE USER REQUESTS COMPRESSION OR IF OTHER TOOLS FAIL DUE TO FILES BEING TOO LARGE.
transcribe_audio	A tool used to transcribe audio files. Use `gpt-4o-mini-transcribe` by default. If the user wants maximum performance, use `gpt-4o-transcribe`. `whisper-1` is the least performant option and should rarely be used, but it is available if needed. You can use prompts to guide the transcription process based on the user's preference.
chat_with_audio	A tool used to chat with audio files. The response will be a response to the audio file sent. It is recommended to use `gpt-4o-audio-preview` by default for best results. Note: `gpt-4o-mini-audio-preview` has limitations with audio chat and may not process audio correctly.
transcribe_with_enhancement	Transcribe audio with GPT-4 using specific enhancement prompts. Enhancement types: `detailed` (provides a detailed description including tone, emotion, and background); `storytelling` (transforms the transcription into a narrative); `professional` (formats the transcription in a formal, business-appropriate way); `analytical` (includes analysis of speech patterns, key points, and structure). Args: `input_file_name` (name of the input audio file to process); `enhancement_type` (type of enhancement to apply to the transcription); `model` (the transcription model to use); `response_format` (the response format); `timestamp_granularities` (optional timestamp granularities). Returns: `TranscriptionResult` with the enhanced transcription.
create_audio	Create text-to-speech audio using OpenAI's TTS API with model and voice selection.
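
As an illustration of how a client might invoke one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the method MCP uses for tool invocation. The argument names and enhancement types come from the `transcribe_with_enhancement` description above. The helper name `build_tool_call`, the file name `meeting.mp3`, and the choice of `gpt-4o-mini-transcribe` as the default `model` for this tool are assumptions made for the example, and the server's actual input schema may differ.

```python
import json

# Enhancement types as listed in the transcribe_with_enhancement description.
ENHANCEMENT_TYPES = ("detailed", "storytelling", "professional", "analytical")


def build_tool_call(request_id, input_file_name, enhancement_type,
                    model="gpt-4o-mini-transcribe"):
    """Build a JSON-RPC 2.0 `tools/call` request for transcribe_with_enhancement.

    The default model is an assumption; check the server's schema before relying on it.
    """
    if enhancement_type not in ENHANCEMENT_TYPES:
        raise ValueError(f"unknown enhancement type: {enhancement_type!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "transcribe_with_enhancement",
            "arguments": {
                "input_file_name": input_file_name,
                "enhancement_type": enhancement_type,
                "model": model,
            },
        },
    }


# "meeting.mp3" is a hypothetical file in the configured audio directory.
print(json.dumps(build_tool_call(1, "meeting.mp3", "professional"), indent=2))
```

In practice an MCP client library would normally construct and send this message; the sketch only shows the shape of the payload.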

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/arcaputo3/mcp-server-whisper'
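
The same request can be made from Python with only the standard library. This is a rough sketch: the base URL and the `arcaputo3/mcp-server-whisper` path are taken from the curl command above, while the helper names `server_url` and `fetch_server` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed here.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for a server, escaping each path segment."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner, name, timeout=10.0):
    """GET the server record. Assumes the API responds with a JSON body."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("arcaputo3", "mcp-server-whisper")` would issue the same GET request as the curl command.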

If you have feedback or need assistance with the MCP directory API, please join our Discord server.