Glama

Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
|---|---|---|---|
| WHISPER_MODEL | No | Whisper model size to use for local inference (e.g., tiny, base, small, medium, large-v3, turbo). | base |
| OPENAI_API_KEY | No | OpenAI API key. Required if using the 'openai' backend. | (none) |
| WHISPER_BACKEND | No | Transcription backend to use: 'auto' (tries local first, falls back to OpenAI), 'local' (faster-whisper), or 'openai' (Whisper API). | auto |
| WHISPER_LANGUAGE | No | ISO-639-1 language code (e.g., 'en'). Defaults to auto-detection if not specified. | (none) |
| TELEGRAM_BOT_TOKEN | No | Telegram Bot API token. Required for the transcribe_telegram_voice tool. | (none) |
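These variables are typically supplied through the MCP client's server configuration. A minimal sketch (the server name, launch command, and package name below are assumptions for illustration, not taken from this page):

```json
{
  "mcpServers": {
    "whisper-telegram": {
      "command": "uvx",
      "args": ["whisper-telegram-mcp"],
      "env": {
        "WHISPER_MODEL": "base",
        "WHISPER_BACKEND": "auto",
        "TELEGRAM_BOT_TOKEN": "<your-bot-token>"
      }
    }
  }
}
```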

Capabilities

Features and capabilities supported by this server

| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |

Tools

Functions exposed to the LLM to take actions

transcribe_audio

Transcribe an audio file to text using Whisper.

Supports OGG (Telegram voice), WAV, MP3, FLAC, and most common audio formats.

Args:
  file_path: Absolute path to the audio file to transcribe.
  language: Optional ISO-639-1 language code (e.g. 'en', 'fr'). None = auto-detect.
  word_timestamps: If True, include word-level timestamps in segments.

Returns: a dict with keys: text, language, language_probability, duration, segments, backend, success, error.
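A sketch of consuming that return shape on the client side. The field names come from the description above; the helper function and the sample values are hypothetical:

```python
def extract_text(result):
    """Return the transcript from a transcribe_audio result, raising on failure."""
    if not result.get("success"):
        raise RuntimeError(result.get("error") or "transcription failed")
    return result["text"]

# Illustrative result dict using the documented keys (values are made up).
sample = {
    "text": "hello world",
    "language": "en",
    "language_probability": 0.98,
    "duration": 1.4,
    "segments": [],
    "backend": "local",
    "success": True,
    "error": None,
}

transcript = extract_text(sample)  # "hello world"
```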

transcribe_telegram_voice

Download and transcribe a Telegram voice message.

Downloads the voice message from Telegram, transcribes it, then deletes the temp file.

Args:
  file_id: The file_id from a Telegram voice message (from the Message object).
  bot_token: Telegram bot token. Falls back to the TELEGRAM_BOT_TOKEN env var.
  language: Optional ISO-639-1 language code. None = auto-detect.
  word_timestamps: Include word-level timestamps in segments.

Returns: Same dict structure as transcribe_audio.
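The download step described above follows the standard Telegram Bot API flow: a getFile call resolves the file_id to a server-side path, which is then fetched from a direct file URL. A minimal sketch of the two URLs involved (helper names are illustrative):

```python
def file_info_url(bot_token, file_id):
    """Bot API getFile call that resolves a file_id to a server-side file path."""
    return f"https://api.telegram.org/bot{bot_token}/getFile?file_id={file_id}"

def download_url(bot_token, file_path):
    """Direct download URL for the file_path returned by getFile."""
    return f"https://api.telegram.org/file/bot{bot_token}/{file_path}"
```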

list_models

List available Whisper model sizes with performance characteristics.

Configure the active model via the WHISPER_MODEL environment variable. The default, 'base', offers a good balance of speed and accuracy for voice messages.

check_backends

Check which transcription backends are available and configured.

Call this first to verify your setup before transcribing.
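The 'auto' fallback order described for WHISPER_BACKEND can be sketched as follows. The function and parameter names are illustrative, not the server's actual code:

```python
def pick_backend(local_available, openai_key, preference="auto"):
    """Mimic the documented 'auto' behaviour: try local first, fall back to OpenAI."""
    if preference in ("local", "openai"):
        return preference  # an explicit choice is honoured as-is
    if local_available:
        return "local"     # faster-whisper on this machine
    if openai_key:
        return "openai"    # Whisper API fallback
    raise RuntimeError("no transcription backend available")
```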

speak_text

Convert text to speech and return an OGG/Opus audio file path.

Plays as a native voice note in Telegram when sent as an attachment.

TTS backends (in priority order):

  1. Kokoro (local, free, natural-sounding) -- auto-starts via uvx kokoro-fastapi

  2. OpenAI TTS (cloud, requires OPENAI_API_KEY, ~$0.015/1k chars)

  3. macOS say (Mac only fallback, sounds robotic)

Configure via TTS_BACKEND env var: "auto" | "kokoro" | "openai" | "macos"

Args:
  text: Text to synthesise.
  voice: Voice name. Kokoro voices: af_sky, af_bella, af_sarah, am_adam, am_michael, bf_emma, bm_george, bm_lewis. OpenAI voices: alloy, echo, fable, onyx, nova, shimmer. Configure the default via the TTS_VOICE env var.
  output_path: Optional absolute path for the output .ogg file.

Returns: a dict with keys: file_path (absolute .ogg path), backend, voice, success, error.
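The voice lists above imply a simple voice-to-backend mapping; a hypothetical helper for checking a requested voice before calling speak_text (the mapping data is taken from the documented lists, the helper itself is an assumption):

```python
# Voice names per backend, as documented for the speak_text tool.
VOICES = {
    "kokoro": {"af_sky", "af_bella", "af_sarah", "am_adam",
               "am_michael", "bf_emma", "bm_george", "bm_lewis"},
    "openai": {"alloy", "echo", "fable", "onyx", "nova", "shimmer"},
}

def backend_for_voice(voice):
    """Infer which TTS backend a voice name belongs to, or None if unknown."""
    for backend, names in VOICES.items():
        if voice in names:
            return backend
    return None
```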

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client


No resources


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/abid-mahdi/whisper-telegram-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.