text_to_speech
Convert written text into spoken audio files. Specify the text, output path, and optionally choose voice and model.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to convert to speech | |
| output_path | Yes | Absolute path to write the output audio file (e.g. `/tmp/output.mp3`) | |
| voice | No | Voice ID to use | `af_heart` |
| model | No | TTS model ID to use | `speaches-ai/Kokoro-82M-v1.0-ONNX` |
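
When `voice` or `model` is omitted, the tool falls back to server-side defaults. A minimal sketch of that fallback logic — the `resolveTtsParams` helper and the `process.env` lookups are illustrative assumptions, not code from `src/index.js`:

```javascript
// Assumed env-style defaults mirroring the documented fallbacks.
const TTS_VOICE = process.env.TTS_VOICE || "af_heart";
const TTS_MODEL = process.env.TTS_MODEL || "speaches-ai/Kokoro-82M-v1.0-ONNX";

// Hypothetical helper: fill in defaults for omitted optional parameters.
function resolveTtsParams({ voice, model }) {
  return {
    voice: voice || TTS_VOICE,
    model: model || TTS_MODEL,
  };
}

// Defaults are applied only when the caller leaves the field out.
console.log(resolveTtsParams({}));
console.log(resolveTtsParams({ voice: "bm_lewis" }));
```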
Implementation Reference
- src/index.js:66-112 (registration)

  The `text_to_speech` tool is registered on the MCP server via `server.tool(...)`. It accepts `text`, `output_path`, and optional `voice` and `model` parameters. The handler calls the OpenAI-compatible TTS API, writes the audio buffer to disk, and returns a confirmation message.

  ```javascript
  server.tool(
    "text_to_speech",
    {
      text: z.string().describe("Text to convert to speech"),
      output_path: z
        .string()
        .describe("Absolute path to write the output audio file (e.g. /tmp/output.mp3)"),
      voice: z
        .string()
        .optional()
        .describe(`Voice ID to use. Defaults to ${TTS_VOICE}`),
      model: z
        .string()
        .optional()
        .describe(`TTS model ID to use. Defaults to ${TTS_MODEL}`),
    },
    async ({ text, output_path, voice, model }) => {
      const dir = path.dirname(output_path);
      if (!fs.existsSync(dir)) {
        return {
          content: [{ type: "text", text: `Error: output directory does not exist: ${dir}` }],
          isError: true,
        };
      }

      let buffer;
      try {
        const response = await client.audio.speech.create({
          model: model || TTS_MODEL,
          voice: voice || TTS_VOICE,
          input: text,
        });
        buffer = Buffer.from(await response.arrayBuffer());
      } catch (err) {
        return {
          content: [{ type: "text", text: `Error: TTS failed: ${err.message}` }],
          isError: true,
        };
      }

      fs.writeFileSync(output_path, buffer);
      return {
        content: [{ type: "text", text: `Audio saved to ${output_path}` }],
      };
    }
  );
  ```

- src/index.js:68-81 (schema)

  Input schema for the `text_to_speech` tool, defined with Zod: `text` (required string), `output_path` (required string), `voice` (optional string, falling back to the `TTS_VOICE` env var), and `model` (optional string, falling back to the `TTS_MODEL` env var).

  ```javascript
  {
    text: z.string().describe("Text to convert to speech"),
    output_path: z
      .string()
      .describe("Absolute path to write the output audio file (e.g. /tmp/output.mp3)"),
    voice: z
      .string()
      .optional()
      .describe(`Voice ID to use. Defaults to ${TTS_VOICE}`),
    model: z
      .string()
      .optional()
      .describe(`TTS model ID to use. Defaults to ${TTS_MODEL}`),
  },
  ```

- src/index.js:82-111 (handler)

  The handler validates that the output directory exists, calls `client.audio.speech.create()` (the OpenAI-compatible TTS endpoint), buffers the response, writes it to the output file, and returns a success or error result.

  ```javascript
  async ({ text, output_path, voice, model }) => {
    const dir = path.dirname(output_path);
    if (!fs.existsSync(dir)) {
      return {
        content: [{ type: "text", text: `Error: output directory does not exist: ${dir}` }],
        isError: true,
      };
    }

    let buffer;
    try {
      const response = await client.audio.speech.create({
        model: model || TTS_MODEL,
        voice: voice || TTS_VOICE,
        input: text,
      });
      buffer = Buffer.from(await response.arrayBuffer());
    } catch (err) {
      return {
        content: [{ type: "text", text: `Error: TTS failed: ${err.message}` }],
        isError: true,
      };
    }

    fs.writeFileSync(output_path, buffer);
    return {
      content: [{ type: "text", text: `Audio saved to ${output_path}` }],
    };
  }
  ```