speak

Convert text to speech using the VOICEVOX TTS MCP server. Text is processed line by line, which supports multi-character conversations, and playback controls are configurable.

Instructions

Convert text to speech and play it. Text is split by line breaks (\n) into separate speech units. Each line is processed as an independent audio segment.
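The line-by-line splitting described above can be sketched as follows. This is a minimal illustration, not the server's actual parser: the function name `split_speech_units`, the `SpeechUnit` type, the skipping of blank lines, and the exact prefix pattern (`digits:` at the start of a line, as in "1:Hello") are all assumptions.

```python
import re
from typing import List, NamedTuple, Optional


class SpeechUnit(NamedTuple):
    """One line of input, treated as an independent audio segment."""
    speaker: Optional[int]  # None means "use the default speaker"
    text: str


# Assumed prefix format: one or more digits, a colon, then the spoken text.
_SPEAKER_PREFIX = re.compile(r"^(\d+):(.*)$")


def split_speech_units(text: str, default_speaker: Optional[int] = None) -> List[SpeechUnit]:
    """Split text on line breaks and extract an optional speaker prefix per line."""
    units = []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines produce no audio
        match = _SPEAKER_PREFIX.match(line)
        if match:
            units.append(SpeechUnit(int(match.group(1)), match.group(2).strip()))
        else:
            units.append(SpeechUnit(default_speaker, line))
    return units


if __name__ == "__main__":
    for unit in split_speech_units("1:Hello\n2:World"):
        print(unit)
```

Under this pattern, a line such as "10:30 is the start time" would be read as speaker 10, so a real implementation may need a stricter rule.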

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| text | Yes | Text split by line breaks (\n). IMPORTANT: Each line = one speech unit (processed and played separately). Keep the FIRST LINE SHORT for quick playback start - audio begins as soon as the first line is synthesized. Example: "Hi!\nThis is a longer explanation that follows." Optional speaker prefix per line: "1:Hello\n2:World" | |
| query | No | Voice synthesis query | |
| speaker | No | Default speaker ID (optional) | |
| speedScale | No | Playback speed (optional, default from environment) | |
| immediate | No | If true, stops current playback and plays new audio immediately. If false, waits for current playback to finish. Default depends on environment variable. | |
| waitForStart | No | Wait for playback to start (optional, default: false) | |
| waitForEnd | No | Wait for playback to end (optional, default: false) | |
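As an illustration of how these parameters fit together, the sketch below builds an MCP `tools/call` request for the `speak` tool. The helper name `build_speak_request` is hypothetical, and the value types (integer speaker ID, numeric speed, boolean flags) are assumptions, since the schema above lists names and descriptions only.

```python
import json

# Optional parameter names taken from the input schema above.
OPTIONAL_PARAMS = {"query", "speaker", "speedScale", "immediate", "waitForStart", "waitForEnd"}


def build_speak_request(text: str, request_id: int = 1, **options) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for the speak tool."""
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown speak parameters: {sorted(unknown)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speak", "arguments": {"text": text, **options}},
    }


if __name__ == "__main__":
    # Short first line so playback can begin quickly; values are illustrative.
    request = build_speak_request(
        "Hi!\nThis is a longer explanation that follows.",
        speaker=1,
        immediate=True,
        waitForEnd=True,
    )
    print(json.dumps(request, indent=2))
```

Omitted options are left out of `arguments` so that the server's environment-based defaults apply.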

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kajidog/mcp-tts-voicevox'
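The same request can be made from Python using only the standard library. This is a rough sketch: the helper names `server_url` and `fetch_server_info` are hypothetical, and the response is assumed to be JSON, which the page does not state.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one MCP server."""
    return f"{API_BASE}/{urllib.parse.quote(owner, safe='')}/{urllib.parse.quote(name, safe='')}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0):
    """GET the server record; assumes the body is JSON."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    info = fetch_server_info("kajidog", "mcp-tts-voicevox")
    print(json.dumps(info, indent=2))
```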

If you have feedback or need assistance with the MCP directory API, please join our Discord server.