speak
Convert text to speech using the VOICEVOX TTS MCP server. Text is processed line by line, which supports multi-character conversations, with configurable playback controls.
Instructions
Convert text to speech and play it. Text is split by line breaks (\n) into separate speech units. Each line is processed as an independent audio segment.
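
The line-splitting rule above, together with the optional `speakerId:` prefix described for the `text` parameter, can be sketched as follows. This is a minimal illustration and not the server's actual parser: `SpeechUnit` and `parse_speech_units` are hypothetical names, and skipping blank lines and treating any leading `digits:` as a speaker prefix are assumptions.

```python
import re
from typing import NamedTuple, Optional


class SpeechUnit(NamedTuple):
    """One line of input, synthesized and played as its own audio segment."""
    speaker: Optional[int]
    text: str


# Assumption: a speaker prefix is one or more digits followed by a colon.
_PREFIX = re.compile(r"^(\d+):(.*)$")


def parse_speech_units(text: str, default_speaker: Optional[int] = None) -> list[SpeechUnit]:
    """Split text on line breaks and extract an optional per-line speaker ID."""
    units = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines produce no audio
        match = _PREFIX.match(line)
        if match:
            units.append(SpeechUnit(int(match.group(1)), match.group(2).strip()))
        else:
            # No prefix: fall back to the default speaker, if any.
            units.append(SpeechUnit(default_speaker, line))
    return units


units = parse_speech_units("1:Hello\n2:World")
```

Under these assumptions, a line such as `12:30 meeting` would be read as speaker 12, so text that starts with a time or ratio may need rephrasing.
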
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text split by line breaks (\n). IMPORTANT: Each line = one speech unit (processed and played separately). Keep the FIRST LINE SHORT for quick playback start - audio begins as soon as the first line is synthesized. Example: "Hi!\nThis is a longer explanation that follows." Optional speaker prefix per line: "1:Hello\n2:World" | |
| query | No | Voice synthesis query | |
| speaker | No | Default speaker ID (optional) | |
| speedScale | No | Playback speed (optional, default from environment) | |
| immediate | No | If true, stops current playback and plays new audio immediately. If false, waits for current playback to finish. Default depends on environment variable. | |
| waitForStart | No | Wait for playback to start (optional, default: false) | |
| waitForEnd | No | Wait for playback to end (optional, default: false) | |
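
As a rough sketch of how these parameters fit together, the snippet below builds an MCP `tools/call` request for `speak`. The helper `build_speak_request` is hypothetical, and the option values (`speedScale=1.2`, `immediate=True`, `waitForEnd=True`) are illustrative choices rather than recommended settings. How the request is sent depends on your MCP client and is not shown.

```python
import json


def build_speak_request(request_id: int, text: str, **options) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for the speak tool (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "speak",
            # 'text' is the only required argument; the rest are optional.
            "arguments": {"text": text, **options},
        },
    }


request = build_speak_request(
    1,
    # Short first line so playback can begin quickly; second line uses speaker 1.
    "Hi!\n1:This is a longer explanation that follows.",
    speedScale=1.2,   # illustrative value
    immediate=True,   # stop current playback and play this audio now
    waitForEnd=True,  # return only after playback has finished
)

print(json.dumps(request, indent=2))
```

Omitted options fall back to the server's defaults, which for `speedScale` and `immediate` come from environment variables.
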