
speak_response

Generates spoken responses from text in multiple languages and emotional tones, exposed to Claude through MCP. Tone and language can be customized for precise TTS output.

Input Schema

Name      Required  Description                                          Default
emotion   No        Emotion preset: neutral, happy, sad, or angry        neutral
language  No        Language code for the generated speech               en-us
text      Yes       The text to convert to speech
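
For example, an MCP client calling this tool might pass arguments like the following. The values are illustrative; language and emotion may be omitted to fall back to their defaults, and the client call in the comment assumes the MCP TypeScript SDK.

    // Illustrative arguments for a speak_response tool call.
    const args = {
      text: "The build finished successfully.",
      language: "en-us",         // optional, defaults to "en-us"
      emotion: "happy" as const, // optional: "neutral" | "happy" | "sad" | "angry"
    };

    // With an MCP TypeScript SDK client (names assumed):
    // await client.callTool({ name: "speak_response", arguments: args });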

Implementation Reference

  • src/server.ts:103-155 (registration)
    Registration of the 'speak_response' tool with the MCP server, including the input schema and an inline handler function that generates and plays TTS audio with the specified emotion. The snippet references API_BASE_URL and the ZonosRequestParams type, which are defined elsewhere in server.ts; a sketch of those assumed definitions follows this list.
    this.mcp.tool(
      "speak_response",
      {
        text: z.string(),
        language: z.string().default("en-us"),
        emotion: z.enum(["neutral", "happy", "sad", "angry"]).default("neutral"),
      },
      async ({ text, language, emotion }: ZonosRequestParams) => {
        try {
          const emotionParams = this.emotionMap[emotion];
          console.log(`Converting to speech: "${text}" with ${emotion} emotion`);

          // Use new OpenAI-style endpoint
          const response = await axios.post(`${API_BASE_URL}/v1/audio/speech`, {
            model: "Zyphra/Zonos-v0.1-transformer",
            input: text,
            language: language,
            emotion: emotionParams,
            speed: 1.0,
            response_format: "wav" // Using WAV for better compatibility
          }, {
            responseType: 'arraybuffer'
          });

          // Save the audio response to a temporary file
          const tempAudioPath = `/tmp/tts_output_${Date.now()}.wav`;
          const fs = await import('fs/promises');
          await fs.writeFile(tempAudioPath, response.data);

          // Play the audio
          await this.playAudio(tempAudioPath);

          // Clean up the temporary file
          await fs.unlink(tempAudioPath);

          return {
            content: [
              {
                type: "text",
                text: `Successfully spoke: "${text}" with ${emotion} emotion`,
              },
            ],
          };
        } catch (error) {
          const errorMessage = error instanceof Error ? error.message : "Unknown error";
          console.error("TTS Error:", errorMessage);
          if (axios.isAxiosError(error) && error.response) {
            console.error("API Response:", error.response.data);
          }
          throw new Error(`TTS failed: ${errorMessage}`);
        }
      }
    );
  • The core handler function for the speak_response tool: the async callback shown inline in the registration snippet above (src/server.ts:103-155). It calls a local TTS API with emotion parameters to generate WAV audio, saves the result to a temporary file, plays it with platform-specific playback, cleans up, and returns a success message. A standalone sketch of the same API request appears after this list.
  • Zod input schema defining parameters for the speak_response tool: text (required), language (default "en-us"), emotion (default "neutral"). A short example of how the defaults are applied appears after this list.
    {
      text: z.string(),
      language: z.string().default("en-us"),
      emotion: z.enum(["neutral", "happy", "sad", "angry"]).default("neutral"),
    },
  • Helper method to play the generated audio file using platform-specific commands (afplay on macOS, paplay on Linux, PowerShell on Windows). Called by the speak_response handler.
    private async playAudio(audioPath: string): Promise<void> {
      try {
        console.log("Playing audio from:", audioPath);
        switch (process.platform) {
          case "darwin":
            await execAsync(`afplay ${audioPath}`);
            break;
          case "linux":
            // Try paplay for PulseAudio
            const XDG_RUNTIME_DIR = process.env.XDG_RUNTIME_DIR || '/run/user/1000';
            const env = {
              ...process.env,
              PULSE_SERVER: `unix:${XDG_RUNTIME_DIR}/pulse/native`,
              PULSE_COOKIE: `${process.env.HOME}/.config/pulse/cookie`
            };
            await execAsync(`paplay ${audioPath}`, { env });
            break;
          case "win32":
            await execAsync(
              `powershell -c (New-Object Media.SoundPlayer '${audioPath}').PlaySync()`
            );
            break;
          default:
            throw new Error(`Unsupported platform: ${process.platform}`);
        }
      } catch (error) {
        const errorMessage = error instanceof Error ? error.message : "Unknown error";
        console.error("Audio playback error:", errorMessage);
        throw new Error(`Audio playback failed: ${errorMessage}`);
      }
    }
  • Emotion parameter mappings used by the speak_response handler to configure TTS emotions for the API call. The shared shape of these presets is sketched after this list.
    this.emotionMap = {
      neutral: { happiness: 0.2,  sadness: 0.2,  anger: 0.2,  disgust: 0.05, fear: 0.05, surprise: 0.1,  other: 0.1, neutral: 0.8 },
      happy:   { happiness: 1,    sadness: 0.05, anger: 0.05, disgust: 0.05, fear: 0.05, surprise: 0.2,  other: 0.1, neutral: 0.2 },
      sad:     { happiness: 0.05, sadness: 1,    anger: 0.05, disgust: 0.2,  fear: 0.2,  surprise: 0.05, other: 0.1, neutral: 0.2 },
      angry:   { happiness: 0.05, sadness: 0.2,  anger: 1,    disgust: 0.4,  fear: 0.2,  surprise: 0.2,  other: 0.1, neutral: 0.1 },
    };
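
The registration snippet (src/server.ts:103-155) relies on a few definitions that are not shown on this page: API_BASE_URL, execAsync, and the ZonosRequestParams type. Below is a minimal sketch of what those might look like; the names are taken from the snippets above, while the environment variable and fallback URL are assumptions for illustration, not the repository's actual values.

    import { exec } from "child_process";
    import { promisify } from "util";

    // Promisified exec used by playAudio for platform-specific playback commands.
    const execAsync = promisify(exec);

    // Base URL of the local Zonos TTS API. The env var name and default port
    // here are assumptions; the real definition in server.ts may differ.
    const API_BASE_URL = process.env.ZONOS_API_URL ?? "http://localhost:8000";

    // Parameter shape matching the Zod input schema of speak_response.
    interface ZonosRequestParams {
      text: string;
      language: string;
      emotion: "neutral" | "happy" | "sad" | "angry";
    }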
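
For testing the speech endpoint without going through MCP, the following is a standalone sketch of the same request the handler makes. The request body mirrors the handler above (with the "neutral" preset copied from emotionMap); the base URL and output path are assumptions, and the script is an illustration rather than part of the repository.

    import axios from "axios";
    import { writeFile } from "fs/promises";

    async function synthesize(text: string): Promise<void> {
      const response = await axios.post(
        "http://localhost:8000/v1/audio/speech", // assumed local Zonos API address
        {
          model: "Zyphra/Zonos-v0.1-transformer",
          input: text,
          language: "en-us",
          // "neutral" preset copied from emotionMap above
          emotion: { happiness: 0.2, sadness: 0.2, anger: 0.2, disgust: 0.05, fear: 0.05, surprise: 0.1, other: 0.1, neutral: 0.8 },
          speed: 1.0,
          response_format: "wav",
        },
        { responseType: "arraybuffer" }
      );
      // Write the WAV bytes to disk instead of playing them.
      await writeFile("tts_test.wav", Buffer.from(response.data));
    }

    synthesize("Hello from Zonos").catch(console.error);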
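
The Zod schema also determines how missing arguments are filled in: because language and emotion use .default(), parsing an arguments object that contains only text yields a fully populated parameter set. A small illustration (the schema is copied from the registration; the variable names are not from the repository):

    import { z } from "zod";

    const speakResponseShape = z.object({
      text: z.string(),
      language: z.string().default("en-us"),
      emotion: z.enum(["neutral", "happy", "sad", "angry"]).default("neutral"),
    });

    // Defaults are applied when optional fields are omitted:
    const params = speakResponseShape.parse({ text: "Hello there" });
    // -> { text: "Hello there", language: "en-us", emotion: "neutral" }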
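
Each preset in emotionMap follows the same eight-field weight shape. The interface below is inferred from the mapping, and the "excited" preset is purely illustrative; adding a real preset would also require extending the z.enum in the input schema.

    // Shape inferred from the emotionMap presets shown above.
    interface EmotionWeights {
      happiness: number;
      sadness: number;
      anger: number;
      disgust: number;
      fear: number;
      surprise: number;
      other: number;
      neutral: number;
    }

    // Hypothetical extra preset, for illustration only.
    const excited: EmotionWeights = {
      happiness: 0.9, sadness: 0.05, anger: 0.05, disgust: 0.05,
      fear: 0.05, surprise: 0.8, other: 0.1, neutral: 0.1,
    };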

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PhialsBasement/Zonos-TTS-MCP'
