generate_audio
Convert text to speech (TTS). Create voiceovers, narrations, or podcast audio with customizable voice, format, and speed.
Instructions
Generate audio from text using text-to-speech (TTS) synthesis. Create voiceovers, narrations, podcast audio, or any other spoken audio from text input. Supports six voices (alloy, echo, fable, onyx, nova, shimmer), four output formats (MP3, FLAC, AAC, Opus), and playback speeds from 0.25 to 4.0 times normal. For article content, use the intent-specific generate_article_narration tool instead.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to synthesize into audio (max 4096 characters). | |
| voice | No | Voice to use for synthesis. | nova |
| format | No | Output audio format. | mp3 |
| speed | No | Playback speed multiplier, 0.25–4.0. | 1.0 |
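
The schema above can be checked on the client side before the tool is called. The following is a minimal sketch, not the tool's own implementation: `build_generate_audio_args` and `to_tool_call` are hypothetical helper names, the lowercase format identifiers other than `mp3` are assumed from the format list in the instructions, and the JSON-RPC `tools/call` envelope assumes the tool is exposed through an MCP server.

```python
import json

# Voices, limits, and defaults come from the documented schema.
# Lowercase format names other than "mp3" are an assumption.
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "flac", "aac", "opus"}
MAX_TEXT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_generate_audio_args(text, voice="nova", format="mp3", speed=1.0):
    """Validate inputs against the documented schema and return an arguments dict."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} characters")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    if format not in FORMATS:
        raise ValueError(f"unknown format: {format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {"text": text, "voice": voice, "format": format, "speed": speed}


def to_tool_call(arguments, request_id=1):
    """Wrap the arguments in a JSON-RPC tools/call request (assumes an MCP server)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_audio", "arguments": arguments},
    }


if __name__ == "__main__":
    args = build_generate_audio_args("Welcome to the show.", voice="onyx", speed=1.25)
    print(json.dumps(to_tool_call(args), indent=2))
```

Validating locally keeps obviously invalid requests (text that is too long, an unsupported voice, an out-of-range speed) from reaching the server. The server may apply additional checks of its own.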