sayText
Convert text into spoken audio using customizable voices and formats for accessible content creation.
Instructions
Generate speech that says the provided text verbatim
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | The text to speak verbatim | |
| voice | No | Voice to use for audio generation | alloy |
| format | No | Format of the audio (mp3, wav, etc.) | mp3 |
| voiceInstructions | No | Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone") | |
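As an illustration of the table, the sketch below builds two argument objects and applies the same defaults and `text` check that the handler performs. The helper name `normalizeSayTextArgs` is hypothetical and is not part of the original source; it only shows what a valid input looks like.

```javascript
// Hypothetical helper (not in the original source): applies the defaults
// from the table and rejects a missing or non-string `text`.
function normalizeSayTextArgs(args) {
  const { text, voice = "alloy", format = "mp3", voiceInstructions } = args ?? {};

  if (!text || typeof text !== "string") {
    throw new Error("Text is required and must be a string");
  }

  const normalized = { text, voice, format };
  // Only carry the optional style hint when the caller supplied one.
  if (voiceInstructions) {
    normalized.voiceInstructions = voiceInstructions;
  }
  return normalized;
}

// Only the required field: voice and format fall back to their defaults.
const minimal = normalizeSayTextArgs({ text: "Hello, world" });

// Every field supplied explicitly.
const full = normalizeSayTextArgs({
  text: "Welcome back",
  voice: "alloy",
  format: "wav",
  voiceInstructions: "Use a calm tone",
});

console.log(minimal);
console.log(full);
```

Omitting `text`, or passing a non-string value for it, throws before any request would be made.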
Implementation Reference
- src/services/audioService.js:126-208 (handler)
  The handler function that executes the sayText tool: it generates verbatim text-to-speech audio using the Pollinations API and supports voice, format, voice instructions, and optional playback.

      async function sayText(params) {
        const {
          text,
          voice = "alloy",
          format = "mp3",
          voiceInstructions,
          audioPlayer,
          tempDir,
        } = params;

        if (!text || typeof text !== "string") {
          throw new Error("Text is required and must be a string");
        }

        // Prepare the query parameters
        const queryParams = {
          model: "openai-audio",
          voice,
          format,
        };

        // Prepare the prompt with the verbatim instruction
        let finalPrompt = `Say verbatim: ${text}`;

        // Add voice instructions if provided
        if (voiceInstructions) {
          finalPrompt = `${voiceInstructions}\n\n${finalPrompt}`;
        }

        // Build the URL using the utility function
        const url = buildUrl(
          AUDIO_API_BASE_URL,
          encodeURIComponent(finalPrompt),
          queryParams,
        );

        try {
          // Fetch the audio from the URL
          const response = await fetch(url);

          if (!response.ok) {
            throw new Error(
              `Failed to generate speech: ${response.statusText}`,
            );
          }

          // Get the audio data as an ArrayBuffer
          const audioBuffer = await response.arrayBuffer();

          // Convert the ArrayBuffer to a base64 string
          const base64Data = Buffer.from(audioBuffer).toString("base64");

          // Determine the mime type from the format
          const mimeType = `audio/${format === "mp3" ? "mpeg" : format}`;

          // Play the audio if an audio player is provided
          if (audioPlayer) {
            const tempDirPath = tempDir || os.tmpdir();
            await playAudio(
              base64Data,
              mimeType,
              "say_text",
              audioPlayer,
              tempDirPath,
            );
          }

          // Return the response in MCP format
          return createMCPResponse([
            {
              type: "audio",
              data: base64Data,
              mimeType,
            },
            createTextContent(
              `Generated audio for text: "${text}"\n\nVoice: ${voice}\nFormat: ${format}`,
            ),
          ]);
        } catch (error) {
          console.error("Error generating audio:", error);
          throw error;
        }
      }
- src/services/audioService.js:359-378 (schema)
  Input schema using Zod for validating the parameters of the sayText tool: text (required), plus voice, format, and voiceInstructions (all optional).

      {
        text: z.string().describe("The text to speak verbatim"),
        voice: z
          .string()
          .optional()
          .describe(
            'Voice to use for audio generation (default: "alloy")',
          ),
        format: z
          .string()
          .optional()
          .describe("Format of the audio (mp3, wav, etc.)"),
        voiceInstructions: z
          .string()
          .optional()
          .describe(
            'Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone")',
          ),
      },
      sayText,
- src/services/audioService.js:356-379 (registration)
  Registration entry for the sayText tool in the audioTools export array, formatted for MCP server.tool() calls.

      [
        "sayText",
        "Generate speech that says the provided text verbatim",
        {
          text: z.string().describe("The text to speak verbatim"),
          voice: z
            .string()
            .optional()
            .describe(
              'Voice to use for audio generation (default: "alloy")',
            ),
          format: z
            .string()
            .optional()
            .describe("Format of the audio (mp3, wav, etc.)"),
          voiceInstructions: z
            .string()
            .optional()
            .describe(
              'Additional instructions for voice character/style (e.g., "Speak with enthusiasm" or "Use a calm tone")',
            ),
        },
        sayText,
      ],
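Two steps of the handler, prompt assembly and MIME type selection, involve no I/O and can be tried in isolation. The following is a minimal sketch of those two steps only; the function names `composePrompt` and `mimeTypeFor` are hypothetical and do not appear in the original source, which performs both steps inline.

```javascript
// Hypothetical helpers (not in the original source) mirroring two inline
// steps of the sayText handler.

// Prefix the text with the verbatim instruction, and put any voice
// instructions before it, separated by a blank line.
function composePrompt(text, voiceInstructions) {
  const verbatim = `Say verbatim: ${text}`;
  return voiceInstructions ? `${voiceInstructions}\n\n${verbatim}` : verbatim;
}

// Map the requested format to a MIME type: "mp3" becomes "audio/mpeg",
// every other format name is passed through unchanged after "audio/".
function mimeTypeFor(format) {
  return `audio/${format === "mp3" ? "mpeg" : format}`;
}

const prompt = composePrompt("Hello, world", "Use a calm tone");

// The handler URL-encodes the prompt before passing it to its URL builder.
console.log(encodeURIComponent(prompt));
console.log(mimeTypeFor("mp3"), mimeTypeFor("wav"));
```

Because every format other than mp3 is passed through as-is, a caller that requests an unusual format may want to confirm that the resulting MIME type is one its client recognizes.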