Server Configuration

Describes the environment variables required to run the server.

Name              Required  Description                                                  Default
OPENAI_API_KEY    Yes       Your OpenAI API key for accessing Whisper and GPT-4o models
AUDIO_FILES_PATH  Yes       Path to your audio files directory
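
Because both variables are required, it can help to check them before starting the server. The following is a minimal sketch of such a check, not the server's own startup code; the `load_config` helper and its return shape are hypothetical names introduced for illustration.

```python
import os
from pathlib import Path

REQUIRED_VARS = ("OPENAI_API_KEY", "AUDIO_FILES_PATH")


def load_config(env=None):
    """Read the two required settings and fail early if either is unusable.

    Hypothetical helper: the real server may validate its environment differently.
    """
    env = os.environ if env is None else env

    # Collect every missing or empty variable so the error lists all of them.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))

    audio_dir = Path(env["AUDIO_FILES_PATH"]).expanduser()
    if not audio_dir.is_dir():
        raise RuntimeError(f"AUDIO_FILES_PATH is not a directory: {audio_dir}")

    return {"api_key": env["OPENAI_API_KEY"], "audio_dir": audio_dir}
```

Calling `load_config()` with no arguments reads from the process environment; passing a dictionary makes the check easy to exercise in isolation.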

Capabilities

Server capabilities have not been inspected yet.

Tools

Functions exposed to the LLM to take actions

Name    Description
get_latest_audio

Get the most recent audio file from the audio path. ONLY USE THIS IF THE USER ASKS FOR THE LATEST FILE.

list_audio_files

List, filter, and sort audio files from the audio path. Supports regex pattern matching, filtering by metadata (size, duration, date, format), and sorting.
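
To illustrate the kind of listing described above, here is a rough sketch of regex filtering, size filtering, and sorting over a directory. It is an assumption-laden stand-in, not the tool's implementation: the function signature, the `AUDIO_SUFFIXES` set, and the returned field names are all hypothetical, and duration filtering is left out because it would require decoding the audio.

```python
import re
from pathlib import Path

# Assumed set of extensions for illustration; the server's supported formats may differ.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
SORT_KEYS = {"name", "size_bytes", "modified"}


def list_audio_files(audio_dir, pattern=None, min_size_bytes=None,
                     max_size_bytes=None, sort_by="name", reverse=False):
    """Return metadata for audio files in audio_dir, filtered and sorted."""
    if sort_by not in SORT_KEYS:
        raise ValueError(f"sort_by must be one of {sorted(SORT_KEYS)}")
    regex = re.compile(pattern) if pattern else None

    entries = []
    for path in Path(audio_dir).iterdir():
        # Skip directories and anything without a recognised audio extension.
        if not path.is_file() or path.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        if regex and not regex.search(path.name):
            continue
        stat = path.stat()
        if min_size_bytes is not None and stat.st_size < min_size_bytes:
            continue
        if max_size_bytes is not None and stat.st_size > max_size_bytes:
            continue
        entries.append({
            "name": path.name,
            "size_bytes": stat.st_size,
            "modified": stat.st_mtime,
            "format": path.suffix.lower().lstrip("."),
        })

    return sorted(entries, key=lambda entry: entry[sort_by], reverse=reverse)
```

For example, `list_audio_files(audio_dir, pattern=r"^meeting", sort_by="modified", reverse=True)` would return files whose names start with "meeting", newest first.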

convert_audio

A tool used to convert audio files to mp3 or wav, both of which are gpt-4o compatible.

compress_audio

A tool used to compress audio files larger than 25 MB. ONLY USE THIS IF THE USER REQUESTS COMPRESSION OR IF OTHER TOOLS FAIL DUE TO FILES BEING TOO LARGE.
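
A client can check a file against the size threshold before deciding whether compression is worth requesting. The sketch below assumes the 25 MB figure means 25 × 1024 × 1024 bytes; that reading, the `MAX_UPLOAD_BYTES` constant, and the `needs_compression` helper are illustrative assumptions, not part of the server.

```python
import os

# Assumption: the 25 MB threshold is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_compression(path, limit_bytes=MAX_UPLOAD_BYTES):
    """Return True when the file at path is larger than limit_bytes."""
    return os.path.getsize(path) > limit_bytes
```

Keeping the limit as a parameter makes it easy to adjust if the threshold turns out to be defined differently.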

transcribe_audio

A tool used to transcribe audio files. gpt-4o-mini-transcribe is the recommended default. If the user wants maximum performance, use gpt-4o-transcribe. Use whisper-1 only rarely, as it is the least performant, but it is available if needed. You can use prompts to guide the transcription process based on the user's preference.

chat_with_audio

A tool used to chat with audio files. The model replies to the content of the audio file sent. It is recommended to use gpt-4o-audio-preview by default for best results. Note: gpt-4o-mini-audio-preview has limitations with audio chat and may not process audio correctly.

transcribe_with_enhancement

Transcribe audio with GPT-4 using specific enhancement prompts.

Enhancement types:
- detailed: Provides detailed description including tone, emotion, and background
- storytelling: Transforms the transcription into a narrative
- professional: Formats the transcription in a formal, business-appropriate way
- analytical: Includes analysis of speech patterns, key points, and structure

Args:
- input_file_name: Name of the input audio file to process
- enhancement_type: Type of enhancement to apply to the transcription
- model: The transcription model to use
- response_format: The response format
- timestamp_granularities: Optional timestamp granularities

Returns:
- TranscriptionResult with enhanced transcription

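
One plausible way to wire the four enhancement types to prompts is a simple lookup table. The sketch below is hypothetical: the prompt wording is paraphrased from the descriptions above rather than taken from the server, and `ENHANCEMENT_PROMPTS` and `build_enhancement_prompt` are names introduced only for illustration.

```python
# Hypothetical prompt texts, paraphrased from the enhancement descriptions;
# the server's actual prompts are not shown in this listing.
ENHANCEMENT_PROMPTS = {
    "detailed": "Transcribe the audio and describe tone, emotion, and background.",
    "storytelling": "Transcribe the audio and rewrite it as a narrative.",
    "professional": "Transcribe the audio in a formal, business-appropriate style.",
    "analytical": "Transcribe the audio and analyse speech patterns, key points, and structure.",
}


def build_enhancement_prompt(enhancement_type):
    """Look up the prompt for an enhancement type, rejecting unknown names."""
    try:
        return ENHANCEMENT_PROMPTS[enhancement_type]
    except KeyError:
        valid = ", ".join(sorted(ENHANCEMENT_PROMPTS))
        raise ValueError(
            f"Unknown enhancement_type {enhancement_type!r}; expected one of: {valid}"
        ) from None
```

Rejecting unknown names up front gives the caller a clear error instead of a transcription made with an empty prompt.
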
create_audio

Create text-to-speech audio using OpenAI's TTS API with model and voice selection.

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/arcaputo3/mcp-server-whisper'
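
The same request can be made from Python with only the standard library. This is a sketch under two assumptions: that the URL follows the owner/name pattern visible in the curl command, and that the endpoint returns JSON. The `server_url` and `fetch_server` helpers are hypothetical names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server (pattern inferred from the curl example)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner, name, timeout=10):
    """GET the server record and decode it, assuming the body is JSON."""
    with urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)
```

`fetch_server("arcaputo3", "mcp-server-whisper")` issues the same GET request as the curl command above.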

If you have feedback or need assistance with the MCP directory API, please join our Discord server.