respondAudio

Convert text prompts into audio responses using customizable voice and format options for accessible content creation.

Instructions

Generate an audio response to a text prompt

Input Schema

Name	Required	Description
`prompt`	Yes	The text prompt to respond to with audio
`voice`	No	Voice to use for audio generation (default: "alloy")
`format`	No	Format of the audio (mp3, wav, etc.); the handler defaults to "mp3"
`voiceInstructions`	No	Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone")
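
To make the table concrete, here is a minimal sketch of how a client might wrap these parameters in an MCP `tools/call` request. The helper name `buildRespondAudioRequest` and the sample argument values are illustrative assumptions, not part of the server's code; only the parameter names come from the table above.

```javascript
// Hypothetical helper: wraps respondAudio arguments in a JSON-RPC 2.0
// "tools/call" request, the message shape MCP clients send to invoke a tool.
function buildRespondAudioRequest(id, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "respondAudio", arguments: args },
  };
}

// Only `prompt` is required; the other three fields are optional.
const request = buildRespondAudioRequest(1, {
  prompt: "Welcome to the weekly update.",
  voice: "alloy",
  format: "mp3",
  voiceInstructions: "Use a calm tone",
});

console.log(JSON.stringify(request, null, 2));
```

Most MCP client libraries build this envelope for you, so in practice you would usually pass just the tool name and the arguments object.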

Implementation Reference

src/services/audioService.js:32-112 (handler)
The handler function that implements the core logic of the respondAudio tool. It takes a text prompt, generates audio using the Pollinations Text-to-Speech API, converts it to base64, optionally plays it, and returns an MCP response with the audio data.
async function respondAudio(params) {
  const {
    prompt,
    voice = "alloy",
    format = "mp3",
    voiceInstructions,
    audioPlayer,
    tempDir,
  } = params;

  if (!prompt || typeof prompt !== "string") {
    throw new Error("Prompt is required and must be a string");
  }

  // Prepare the query parameters
  const queryParams = {
    model: "openai-audio",
    voice,
    format,
  };

  // Prepare the prompt
  let finalPrompt = prompt;

  // Add voice instructions if provided
  if (voiceInstructions) {
    finalPrompt = `${voiceInstructions}\n\n${prompt}`;
  }

  // Build the URL using the utility function
  const url = buildUrl(
    AUDIO_API_BASE_URL,
    encodeURIComponent(finalPrompt),
    queryParams,
  );

  try {
    // Fetch the audio from the URL
    const response = await fetch(url);

    if (!response.ok) {
      throw new Error(`Failed to generate audio: ${response.statusText}`);
    }

    // Get the audio data as an ArrayBuffer
    const audioBuffer = await response.arrayBuffer();

    // Convert the ArrayBuffer to a base64 string
    const base64Data = Buffer.from(audioBuffer).toString("base64");

    // Determine the mime type from the format
    const mimeType = `audio/${format === "mp3" ? "mpeg" : format}`;

    // Play the audio if an audio player is provided
    if (audioPlayer) {
      const tempDirPath = tempDir || os.tmpdir();
      await playAudio(
        base64Data,
        mimeType,
        "respond_audio",
        audioPlayer,
        tempDirPath,
      );
    }

    // Return the response in MCP format
    return createMCPResponse([
      {
        type: "audio",
        data: base64Data,
        mimeType,
      },
      createTextContent(
        `Generated audio response for prompt: "${prompt}"\n\nVoice: ${voice}\nFormat: ${format}`,
      ),
    ]);
  } catch (error) {
    console.error("Error generating audio:", error);
    throw error;
  }
}
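
The request-building steps in the handler can be tried in isolation with the sketch below. The project's `buildUrl` helper and `AUDIO_API_BASE_URL` constant are not shown in this reference, so `buildUrlSketch` and the placeholder `BASE_URL` are assumptions made only for illustration. `composePrompt` and `mimeTypeFor` mirror the corresponding lines of the handler.

```javascript
// Hypothetical stand-in for the project's buildUrl helper (its real
// implementation is not shown here): base URL + path + query string.
function buildUrlSketch(baseUrl, encodedPath, queryParams) {
  const query = new URLSearchParams(queryParams).toString();
  return `${baseUrl}/${encodedPath}${query ? `?${query}` : ""}`;
}

// Same composition as the handler: voice instructions, a blank line,
// then the prompt. Without instructions the prompt is used unchanged.
function composePrompt(prompt, voiceInstructions) {
  return voiceInstructions ? `${voiceInstructions}\n\n${prompt}` : prompt;
}

// Same mapping as the handler: "mp3" becomes "audio/mpeg"; any other
// format string is passed through as the MIME subtype.
function mimeTypeFor(format) {
  return `audio/${format === "mp3" ? "mpeg" : format}`;
}

// Placeholder base URL (an assumption, not the real API endpoint).
const BASE_URL = "https://audio.example.invalid";

const finalPrompt = composePrompt("Hello there", "Use a calm tone");
const url = buildUrlSketch(BASE_URL, encodeURIComponent(finalPrompt), {
  model: "openai-audio",
  voice: "alloy",
  format: "mp3",
});

console.log(url);
console.log(mimeTypeFor("mp3"), mimeTypeFor("wav"));
```

Because the voice instructions are simply prepended to the prompt, they travel in the URL path together with the text to be spoken rather than as a separate query parameter.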
src/services/audioService.js:329-354 (registration)
The registration entry for the respondAudio tool within the exported audioTools array, which is used to register the tool with the MCP server. Includes name, description, input schema, and reference to the handler function.
[
  "respondAudio",
  "Generate an audio response to a text prompt",
  {
    prompt: z
      .string()
      .describe("The text prompt to respond to with audio"),
    voice: z
      .string()
      .optional()
      .describe(
        'Voice to use for audio generation (default: "alloy")',
      ),
    format: z
      .string()
      .optional()
      .describe("Format of the audio (mp3, wav, etc.)"),
    voiceInstructions: z
      .string()
      .optional()
      .describe(
        'Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone")',
      ),
  },
  respondAudio,
],
src/services/audioService.js:332-352 (schema)
Zod schema defining the input parameters for the respondAudio tool, including prompt, voice, format, and voiceInstructions.
{
  prompt: z
    .string()
    .describe("The text prompt to respond to with audio"),
  voice: z
    .string()
    .optional()
    .describe(
      'Voice to use for audio generation (default: "alloy")',
    ),
  format: z
    .string()
    .optional()
    .describe("Format of the audio (mp3, wav, etc.)"),
  voiceInstructions: z
    .string()
    .optional()
    .describe(
      'Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone")',
    ),
},
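
For readers without Zod at hand, the following dependency-free sketch approximates the checks this schema performs. `validateRespondAudioArgs` is a hypothetical name and is not part of the server. Note that the schema only checks types: the `"alloy"` and `"mp3"` defaults are applied later by the handler's destructuring, and an empty `prompt` passes the schema but is rejected by the handler's own check.

```javascript
// Hypothetical, dependency-free approximation of the Zod schema:
// `prompt` must be a string; the other fields must be strings if present.
function validateRespondAudioArgs(args) {
  const errors = [];

  if (typeof args.prompt !== "string") {
    errors.push("prompt must be a string");
  }

  for (const key of ["voice", "format", "voiceInstructions"]) {
    // Like z.string().optional(), undefined is allowed but null is not.
    if (args[key] !== undefined && typeof args[key] !== "string") {
      errors.push(`${key} must be a string when provided`);
    }
  }

  return errors;
}

console.log(validateRespondAudioArgs({ prompt: "Read this aloud" }));
console.log(validateRespondAudioArgs({ prompt: 42, voice: null }));
```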
src/services/audioService.js:269-302 (helper)
Helper function used by respondAudio to play the generated audio in the terminal using the provided audioPlayer, saving to a temporary file first.
function playAudio(audioData, mimeType, prefix, audioPlayer, tempDir) {
  if (!audioPlayer || !tempDir) {
    return Promise.resolve();
  }

  return new Promise((resolve, reject) => {
    try {
      const format = getFormatFromMimeType(mimeType);
      const tempFile = path.join(
        tempDir,
        `${prefix}_${Date.now()}.${format}`,
      );

      fs.writeFileSync(tempFile, Buffer.from(audioData, "base64"));

      audioPlayer.play(tempFile, (err) => {
        if (err) {
          console.error("Error playing audio:", err);
        }

        // Clean up temp file after playing
        try {
          fs.unlinkSync(tempFile);
        } catch (e) {
          console.error("Error removing temp file:", e);
        }

        resolve();
      });
    } catch (error) {
      console.error("Error playing audio:", error);
      reject(error);
    }
  });
}
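
The write, play, clean-up cycle of this helper can be exercised without real audio hardware, as in the sketch below. It rests on several assumptions: `getFormatFromMimeType` is not shown in this reference, so `formatFromMimeTypeSketch` is a guess at its behavior (the inverse of the handler's MIME mapping), `stubPlayer` is a hypothetical object that only imitates the `play(file, callback)` shape the helper calls, and `playOnce` is a simplified stand-in for `playAudio`.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Assumed inverse of the handler's MIME mapping; the real
// getFormatFromMimeType is not shown in this reference.
function formatFromMimeTypeSketch(mimeType) {
  const subtype = mimeType.split("/")[1] || "";
  return subtype === "mpeg" ? "mp3" : subtype;
}

// Hypothetical player stub: records each call instead of playing sound.
const stubPlayer = {
  played: [],
  play(file, callback) {
    this.played.push({ file, existedDuringPlay: fs.existsSync(file) });
    callback(null);
  },
};

// Simplified stand-in for playAudio: write a temp file, hand it to the
// player, and delete it once the player's callback fires.
function playOnce(base64Data, mimeType, player, tempDir) {
  return new Promise((resolve, reject) => {
    try {
      const extension = formatFromMimeTypeSketch(mimeType);
      const tempFile = path.join(tempDir, `demo_${Date.now()}.${extension}`);
      fs.writeFileSync(tempFile, Buffer.from(base64Data, "base64"));

      player.play(tempFile, (err) => {
        if (err) console.error("Error playing audio:", err);
        fs.rmSync(tempFile, { force: true });
        resolve(tempFile);
      });
    } catch (error) {
      reject(error);
    }
  });
}

// Placeholder bytes, not real audio; the stub never decodes them.
const sample = Buffer.from("not real audio").toString("base64");
const pending = playOnce(sample, "audio/mpeg", stubPlayer, os.tmpdir());
pending.then((file) => console.log("cleaned up:", !fs.existsSync(file)));
```

As in the original helper, a playback error is logged rather than propagated, so the promise still resolves and the temporary file is still removed.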

Pollinations Multimodal MCP Server
