speak_response
Convert text to speech with language and emotion customization using the Zonos TTS MCP Server.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | The text to convert to speech | |
| language | No | Language code for the generated speech | `en-us` |
| emotion | No | One of `neutral`, `happy`, `sad`, `angry` | `neutral` |
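Only `text` is required; the other fields fall back to the defaults declared in the Zod schema. A minimal, dependency-free sketch of that defaulting behavior (the `withDefaults` helper is illustrative, not part of the server):

```typescript
// Mirrors the defaults from the Zod schema in src/server.ts:
// language defaults to "en-us", emotion defaults to "neutral".
type Emotion = "neutral" | "happy" | "sad" | "angry";

interface SpeakResponseArgs {
  text: string;
  language?: string;
  emotion?: Emotion;
}

function withDefaults(args: SpeakResponseArgs) {
  return {
    text: args.text,
    language: args.language ?? "en-us",
    emotion: args.emotion ?? "neutral",
  };
}

const resolved = withDefaults({ text: "Hello from Zonos" });
// resolved.language is "en-us", resolved.emotion is "neutral"
```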
Implementation Reference
- **src/server.ts:110-155 (handler)**: The handler function for the `speak_response` tool. It calls the Zonos TTS API with emotion parameters, saves the WAV audio to a temp file, plays it using platform-specific tools, cleans up, and returns a success message. On error, it logs details and rethrows.

  ```typescript
  async ({ text, language, emotion }: ZonosRequestParams) => {
    try {
      const emotionParams = this.emotionMap[emotion];
      console.log(`Converting to speech: "${text}" with ${emotion} emotion`);

      // Use new OpenAI-style endpoint
      const response = await axios.post(`${API_BASE_URL}/v1/audio/speech`, {
        model: "Zyphra/Zonos-v0.1-transformer",
        input: text,
        language: language,
        emotion: emotionParams,
        speed: 1.0,
        response_format: "wav" // Using WAV for better compatibility
      }, {
        responseType: 'arraybuffer'
      });

      // Save the audio response to a temporary file
      const tempAudioPath = `/tmp/tts_output_${Date.now()}.wav`;
      const fs = await import('fs/promises');
      await fs.writeFile(tempAudioPath, response.data);

      // Play the audio
      await this.playAudio(tempAudioPath);

      // Clean up the temporary file
      await fs.unlink(tempAudioPath);

      return {
        content: [
          {
            type: "text",
            text: `Successfully spoke: "${text}" with ${emotion} emotion`,
          },
        ],
      };
    } catch (error) {
      const errorMessage = error instanceof Error ? error.message : "Unknown error";
      console.error("TTS Error:", errorMessage);
      if (axios.isAxiosError(error) && error.response) {
        console.error("API Response:", error.response.data);
      }
      throw new Error(`TTS failed: ${errorMessage}`);
    }
  }
  ```
- **src/server.ts:105-109 (schema)**: Zod input schema for the `speak_response` tool: `text` (required string), `language` (string, default `"en-us"`), `emotion` (enum, default `"neutral"`).

  ```typescript
  {
    text: z.string(),
    language: z.string().default("en-us"),
    emotion: z.enum(["neutral", "happy", "sad", "angry"]).default("neutral"),
  },
  ```
- **src/server.ts:103-156 (registration)**: Registers the `speak_response` tool on the MCP server via `this.mcp.tool()`, passing the tool name, the input schema, and the inline handler. Called within `setupTools()`.

  ```typescript
  this.mcp.tool(
    "speak_response",
    {
      text: z.string(),
      language: z.string().default("en-us"),
      emotion: z.enum(["neutral", "happy", "sad", "angry"]).default("neutral"),
    },
    async ({ text, language, emotion }: ZonosRequestParams) => {
      // ... handler body as shown in the handler entry above
    }
  );
  ```
- **src/server.ts:158-189 (helper)**: Supporting helper called by the handler to play the generated TTS audio file using platform-specific commands: `afplay` on macOS, `paplay` on Linux (with PulseAudio environment variables), and PowerShell's `Media.SoundPlayer` on Windows.

  ```typescript
  private async playAudio(audioPath: string): Promise<void> {
    try {
      console.log("Playing audio from:", audioPath);
      switch (process.platform) {
        case "darwin":
          await execAsync(`afplay ${audioPath}`);
          break;
        case "linux":
          // Try paplay for PulseAudio
          const XDG_RUNTIME_DIR = process.env.XDG_RUNTIME_DIR || '/run/user/1000';
          const env = {
            ...process.env,
            PULSE_SERVER: `unix:${XDG_RUNTIME_DIR}/pulse/native`,
            PULSE_COOKIE: `${process.env.HOME}/.config/pulse/cookie`
          };
          await execAsync(`paplay ${audioPath}`, { env });
          break;
        case "win32":
          await execAsync(
            `powershell -c (New-Object Media.SoundPlayer '${audioPath}').PlaySync()`
          );
          break;
        default:
          throw new Error(`Unsupported platform: ${process.platform}`);
      }
    } catch (error) {
      const errorMessage = error instanceof Error ? error.message : "Unknown error";
      console.error("Audio playback error:", errorMessage);
      throw new Error(`Audio playback failed: ${errorMessage}`);
    }
  }
  ```
- **src/server.ts:36-41 (schema)**: TypeScript interface defining the parameters for the Zonos TTS request, used in the handler signature.

  ```typescript
  interface ZonosRequestParams {
    text: string;
    language: string;
    emotion: Emotion;
  }
  ```
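As a usage sketch, an MCP client invokes this tool with a standard `tools/call` request; the argument values below are illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "speak_response",
    "arguments": {
      "text": "Deployment complete.",
      "language": "en-us",
      "emotion": "happy"
    }
  }
}
```

On success the server plays the audio locally and returns a single text content item of the form `Successfully spoke: "..." with ... emotion`.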