speak
Convert text into speech using the VOICEVOX engine, enabling audio generation and playback for applications requiring voice synthesis.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text to synthesize into speech (string) | |
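
As an illustration of how this schema is used, an MCP client would invoke the tool with a JSON-RPC `tools/call` request whose `arguments` object carries the `text` field. The sketch below only builds that payload. The `buildSpeakRequest` helper, the `SpeakRequest` type, and the request `id` value are hypothetical and not part of this server's source.

```typescript
// Hypothetical client-side helper (not part of the server source).
// Shape of the JSON-RPC 2.0 message that calls the "speak" tool.
interface SpeakRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "speak"; arguments: { text: string } };
}

// Builds the request object; `id` is an arbitrary client-chosen request ID.
function buildSpeakRequest(id: number, text: string): SpeakRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "speak", arguments: { text } },
  };
}

// Example: serialize a request asking the server to speak a greeting.
const exampleRequest = JSON.stringify(buildSpeakRequest(1, "こんにちは"));
```

On success, the handler documented below responds with a single text content item containing `OK`.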
Implementation Reference
- src/index.ts:51-69 (handler) Handler function that takes the text input, generates speech audio through the VOICEVOX API helpers, saves it to /tmp/voicevox.wav, plays it using afplay, and returns a success response. It throws if the SPEAKER_ID environment variable is missing or invalid; the Japanese error message translates to "speaker_id is not specified, or the SPEAKER_ID environment variable is invalid".

  ```typescript
  async ({ text }) => {
    const resolvedSpeakerId = Number(process.env.SPEAKER_ID);
    if (!resolvedSpeakerId || isNaN(resolvedSpeakerId)) {
      throw new Error("speaker_idが指定されてないか、環境変数SPEAKER_IDが不正です");
    }
    const query = await createAudioQuery(text, resolvedSpeakerId);
    const buffer = await synthesizeVoice(query, resolvedSpeakerId);
    const filePath = "/tmp/voicevox.wav";
    saveAudioFile(buffer, filePath);
    playAudio(filePath);
    return {
      content: [
        {
          type: "text",
          text: "OK",
        }
      ]
    };
  }
  ```
- src/index.ts:50-50 (schema) Zod schema defining the input parameter 'text' as a required string.

  ```typescript
  { text: z.string() },
  ```
- src/index.ts:49-70 (registration) MCP server.tool call registering the 'speak' tool with its schema and handler function.

  ```typescript
  server.tool("speak",
    { text: z.string() },
    async ({ text }) => {
      const resolvedSpeakerId = Number(process.env.SPEAKER_ID);
      if (!resolvedSpeakerId || isNaN(resolvedSpeakerId)) {
        throw new Error("speaker_idが指定されてないか、環境変数SPEAKER_IDが不正です");
      }
      const query = await createAudioQuery(text, resolvedSpeakerId);
      const buffer = await synthesizeVoice(query, resolvedSpeakerId);
      const filePath = "/tmp/voicevox.wav";
      saveAudioFile(buffer, filePath);
      playAudio(filePath);
      return {
        content: [
          {
            type: "text",
            text: "OK",
          }
        ]
      };
    }
  )
  ```
- src/api.ts:10-15 (helper) Helper function that creates an audio synthesis query using the VOICEVOX API.

  ```typescript
  export async function createAudioQuery(text: string, speakerId: number) {
    const res = await fetch(`${VOICEVOX_API_URL}/audio_query?speaker=${speakerId}&text=${encodeURIComponent(text)}`, {
      method: "POST",
    });
    return await res.json();
  }
  ```
- src/api.ts:18-30 (helper) Helper function that synthesizes audio from the query using the VOICEVOX API. It sets the query's speedScale from the SPEED_SCALE environment variable, defaulting to 1.2 when the variable is unset.

  ```typescript
  export async function synthesizeVoice(query: any, speakerId: number) {
    const speedScale = process.env.SPEED_SCALE ? Number(process.env.SPEED_SCALE) : 1.2;
    query.speedScale = speedScale;
    const res = await fetch(`${VOICEVOX_API_URL}/synthesis?speaker=${speakerId}`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify(query),
    });
    const audioBlob = await res.blob();
    return Buffer.from(await audioBlob.arrayBuffer());
  }
  ```
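
Two details of the code above may be worth hardening. The handler's `!resolvedSpeakerId` check treats a SPEAKER_ID of `0` as missing, which matters if the engine assigns ID 0 to a voice style, and neither helper inspects the HTTP status before reading the response body. The following is a minimal sketch of stricter checks, not the repository's implementation. Both `parseSpeakerId` and `assertOk` are hypothetical names introduced here, and the error messages are illustrative.

```typescript
// Hypothetical helper (not in the repository): parse SPEAKER_ID strictly.
// Unlike the `!resolvedSpeakerId` check, this accepts "0".
function parseSpeakerId(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") {
    throw new Error("SPEAKER_ID is not set");
  }
  const id = Number(raw);
  if (!Number.isInteger(id) || id < 0) {
    throw new Error(`SPEAKER_ID is not a non-negative integer: ${raw}`);
  }
  return id;
}

// Hypothetical helper: fail fast on a non-2xx response instead of trying
// to parse an error body as JSON or audio. `step` names the API call.
function assertOk(res: { ok: boolean; status: number }, step: string): void {
  if (!res.ok) {
    throw new Error(`VOICEVOX ${step} failed with HTTP ${res.status}`);
  }
}
```

Under these assumptions, the handler could call `parseSpeakerId(process.env.SPEAKER_ID)` in place of the `Number(...)` conversion, and each helper could call `assertOk(res, "audio_query")` or `assertOk(res, "synthesis")` before reading the response.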