speak_text
Converts text to speech and writes an OGG/Opus audio file that plays as a native Telegram voice note when sent as an attachment. The TTS backend is configurable: local Kokoro, OpenAI TTS, or macOS say.
Instructions
Convert text to speech and return an OGG/Opus audio file path.
The file plays as a native voice note in Telegram when sent as an attachment.
TTS backends (in priority order):
1. Kokoro (local, free, natural-sounding); auto-starts via uvx kokoro-fastapi
2. OpenAI TTS (cloud, requires OPENAI_API_KEY, ~$0.015/1k chars)
3. macOS say (Mac-only fallback, sounds robotic)
Configure via TTS_BACKEND env var: "auto" | "kokoro" | "openai" | "macos"
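The priority order and the TTS_BACKEND setting can be sketched as a small resolver. This is only an illustration, not the tool's actual code: the function names, the availability checks (uvx on PATH, OPENAI_API_KEY set, running on macOS), and treating an unset TTS_BACKEND as "auto" are all assumptions.

```python
import shutil
import sys

# Priority order as listed above.
PRIORITY = ("kokoro", "openai", "macos")


def available_backends(env, platform=sys.platform, has_uvx=None):
    """Return the backends that look usable, in priority order.

    The checks are assumptions for illustration, not the tool's real probes.
    """
    if has_uvx is None:
        has_uvx = shutil.which("uvx") is not None
    checks = {
        "kokoro": has_uvx,                          # assumed: Kokoro needs uvx
        "openai": bool(env.get("OPENAI_API_KEY")),  # stated requirement
        "macos": platform == "darwin",              # Mac-only fallback
    }
    return [name for name in PRIORITY if checks[name]]


def resolve_backend(env, platform=sys.platform, has_uvx=None):
    """Pick a backend from TTS_BACKEND; 'auto' takes the first usable one."""
    requested = env.get("TTS_BACKEND", "auto").strip().lower()
    if requested not in ("auto",) + PRIORITY:
        raise ValueError(f"unknown TTS_BACKEND: {requested!r}")
    if requested != "auto":
        return requested
    usable = available_backends(env, platform, has_uvx)
    if not usable:
        raise RuntimeError("no TTS backend available")
    return usable[0]
```

Passing the environment as a plain dict (for example `dict(os.environ)`) keeps the resolver easy to exercise without touching the real process environment.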
Args:
text: Text to synthesise.
voice: Voice name. Kokoro voices: af_sky, af_bella, af_sarah, am_adam, am_michael, bf_emma, bm_george, bm_lewis. OpenAI voices: alloy, echo, fable, onyx, nova, shimmer. Configure default via TTS_VOICE env var.
output_path: Optional absolute path for the output .ogg file.
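As a rough illustration of how the voice lists relate to the backends, the following sketch resolves a voice name and checks it against the list for a backend. The helper names and the precedence (explicit argument, then TTS_VOICE, then af_sky) are assumptions; how the tool itself treats an unknown voice is not documented here.

```python
# Voice names copied from the argument description above.
VOICES = {
    "kokoro": ("af_sky", "af_bella", "af_sarah", "am_adam",
               "am_michael", "bf_emma", "bm_george", "bm_lewis"),
    "openai": ("alloy", "echo", "fable", "onyx", "nova", "shimmer"),
}


def resolve_voice(voice=None, env=None):
    """Assumed precedence: explicit argument, then TTS_VOICE, then af_sky."""
    env = env or {}
    return voice or env.get("TTS_VOICE") or "af_sky"


def is_known_voice(backend, voice):
    """True if the voice is listed for the backend.

    Backends without a documented list (such as macos) accept any name here.
    """
    listed = VOICES.get(backend)
    return listed is None or voice in listed
```

A caller could use `is_known_voice` to warn before requesting, say, a Kokoro voice from the OpenAI backend.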
Returns: dict with: file_path (absolute .ogg path), backend, voice, success, error
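A caller would typically check success before using file_path. The helper below is a minimal sketch of that check, assuming the result is a plain dict with the keys listed above; the function name voice_note_path is hypothetical.

```python
from pathlib import Path


def voice_note_path(result):
    """Return the .ogg path from a speak_text result, or raise on failure."""
    if not result.get("success"):
        backend = result.get("backend")
        raise RuntimeError(f"speak_text failed ({backend}): {result.get('error')}")
    path = Path(result["file_path"])
    if path.suffix.lower() != ".ogg":
        raise ValueError(f"expected an .ogg file, got {path.name!r}")
    return path
```

The returned path is what would then be attached to a Telegram message so that it plays as a voice note.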
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
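| text | Yes | Text to synthesise. | |
| voice | No | Voice name. Default configurable via the TTS_VOICE env var. | af_sky |
| output_path | No | Optional absolute path for the output .ogg file. | |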
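To make the input schema concrete, here is a hedged sketch of building and validating an arguments payload before calling the tool. The function build_arguments is hypothetical, and checking output_path as a POSIX-style absolute .ogg path is an assumption drawn from the argument description, not documented validation.

```python
from pathlib import PurePosixPath


def build_arguments(text, voice=None, output_path=None):
    """Build a speak_text arguments dict; only text is required."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be non-empty")
    args = {"text": text}
    if voice is not None:
        args["voice"] = voice  # omitted: the tool applies its own default
    if output_path is not None:
        path = PurePosixPath(output_path)
        # Assumed check: absolute path ending in .ogg.
        if not path.is_absolute() or path.suffix.lower() != ".ogg":
            raise ValueError("output_path must be an absolute .ogg path")
        args["output_path"] = str(path)
    return args
```

For example, `build_arguments("Hello", voice="af_bella", output_path="/tmp/hello.ogg")` yields a dict with all three keys, while `build_arguments("Hello")` leaves the optional ones out.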
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |