vocea_generate_audio
Generate speech from text with a specified voice, language, and emotional tone. Returns an audio URL for immediate use.
Instructions
Convert text to speech using a Vocea voice. Returns an audio URL.
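A minimal sketch of how an MCP client might invoke this tool, assuming the standard JSON-RPC `tools/call` method defined by the Model Context Protocol. The helper name `buildGenerateAudioRequest`, the request id, and the voice UUID are placeholders chosen for illustration; they are not part of the Vocea server.

```typescript
// Arguments accepted by vocea_generate_audio, per the tool's input schema.
interface GenerateAudioArgs {
  voice_id: string;
  text: string;
  language_code: string;
  emotion?: string;
  speaking_rate?: number;
}

// Shape of a JSON-RPC 2.0 request for MCP's tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: GenerateAudioArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call envelope.
function buildGenerateAudioRequest(id: number, args: GenerateAudioArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "vocea_generate_audio", arguments: args },
  };
}

const request = buildGenerateAudioRequest(1, {
  voice_id: "00000000-0000-0000-0000-000000000000", // placeholder UUID
  text: "Hello from Vocea.",
  language_code: "en",
});

console.log(JSON.stringify(request, null, 2));
```

Most MCP client SDKs build this envelope internally, so in practice only the tool name and the arguments object usually need to be supplied.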
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| voice_id | Yes | Voice UUID to use for synthesis | |
| text | Yes | Text to convert to speech (max 10000 chars) | |
| language_code | Yes | Language code, e.g. 'en', 'es', 'fr' | |
| emotion | No | Emotional tone; one of neutral, happy, sad, angry, fearful, surprised, disgusted, whisper | neutral |
| speaking_rate | No | Speaking rate multiplier, 0.5–1.5 | 1.0 |
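The constraints in this table can be checked on the client before a call is made. The sketch below is one possible way to do that; `validateGenerateAudioInput` is a hypothetical helper, the emotion list is copied from the schema definition in src/index.ts, and the length check counts UTF-16 code units, which may or may not match how the server counts characters.

```typescript
// Emotion values taken from the enum in the tool's input schema.
const EMOTIONS: readonly string[] = [
  "neutral", "happy", "sad", "angry", "fearful", "surprised", "disgusted", "whisper",
];

// All fields optional here so that missing required fields can be reported.
interface GenerateAudioInput {
  voice_id?: string;
  text?: string;
  language_code?: string;
  emotion?: string;
  speaking_rate?: number;
}

// Hypothetical client-side check; returns a list of problems (empty when valid).
function validateGenerateAudioInput(input: GenerateAudioInput): string[] {
  const errors: string[] = [];
  if (!input.voice_id) errors.push("voice_id is required");
  if (!input.text) {
    errors.push("text is required");
  } else if (input.text.length > 10000) {
    // Assumption: the 10000-character limit is compared against string length.
    errors.push("text exceeds 10000 characters");
  }
  if (!input.language_code) errors.push("language_code is required");
  if (input.emotion !== undefined && EMOTIONS.indexOf(input.emotion) === -1) {
    errors.push("emotion must be one of: " + EMOTIONS.join(", "));
  }
  if (input.speaking_rate !== undefined) {
    const rate = input.speaking_rate;
    // Written as a negated range test so that NaN is rejected as well.
    if (!(rate >= 0.5 && rate <= 1.5)) errors.push("speaking_rate must be between 0.5 and 1.5");
  }
  return errors;
}

const problems = validateGenerateAudioInput({
  voice_id: "00000000-0000-0000-0000-000000000000", // placeholder UUID
  text: "Hello",
  language_code: "en",
  speaking_rate: 2.0,
});
console.log(problems);
```

Returning a list rather than throwing on the first failure lets a caller report every problem at once.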
Implementation Reference
- src/index.ts:109-133 (handler): Handler for the vocea_generate_audio tool. It extracts the required arguments (voice_id, text, language_code) and the optional arguments (emotion, speaking_rate), calls the Vocea SDK's audios.generate() method, and returns a JSON response containing id, audioUrl, durationSeconds, and characterCount.

      case "vocea_generate_audio": {
        // Narrow the untyped tool arguments to the shape declared in the input schema.
        const a = args as {
          voice_id: string;
          text: string;
          language_code: string;
          emotion?: string;
          speaking_rate?: number;
        };
        // Fall back to the documented defaults when the optional fields are omitted.
        const audio = await vocea.audios.generate({
          voice_id: a.voice_id,
          text: a.text,
          language_code: a.language_code,
          voice_setting: {
            emotion: (a.emotion as never) ?? "neutral",
            speakingRate: a.speaking_rate ?? 1.0,
          },
        });
        // Return the result as a single JSON-encoded text content item.
        return {
          content: [
            {
              type: "text",
              text: JSON.stringify({
                id: audio.id,
                audioUrl: audio.audioUrl,
                durationSeconds: audio.durationSeconds,
                characterCount: audio.characterCount,
              }),
            },
          ],
        };
      }

- src/index.ts:30-47 (schema): Schema definition for vocea_generate_audio as returned in the ListToolsResponse. The inputSchema declares the required properties voice_id (string), text (string, max 10000 chars), and language_code (string), and the optional properties emotion (enum of emotional tones) and speaking_rate (number, 0.5–1.5).

      {
        name: "vocea_generate_audio",
        description: "Convert text to speech using a Vocea voice. Returns an audio URL.",
        inputSchema: {
          type: "object",
          properties: {
            voice_id: { type: "string", description: "Voice UUID to use for synthesis" },
            text: { type: "string", description: "Text to convert to speech (max 10000 chars)" },
            language_code: { type: "string", description: "Language code, e.g. 'en', 'es', 'fr'" },
            emotion: {
              type: "string",
              enum: ["neutral", "happy", "sad", "angry", "fearful", "surprised", "disgusted", "whisper"],
              description: "Emotional tone (default: neutral)",
            },
            speaking_rate: { type: "number", description: "Speaking rate multiplier 0.5–1.5 (default 1.0)" },
          },
          required: ["voice_id", "text", "language_code"],
        },

- src/index.ts:28-100 (registration): The tool is registered with the MCP server through the ListToolsRequestSchema request handler, whose tools array lists all six tools, including vocea_generate_audio.

      server.setRequestHandler(ListToolsRequestSchema, async () => ({
        tools: [
          {
            name: "vocea_generate_audio",
            description: "Convert text to speech using a Vocea voice. Returns an audio URL.",
            inputSchema: {
              type: "object",
              properties: {
                voice_id: { type: "string", description: "Voice UUID to use for synthesis" },
                text: { type: "string", description: "Text to convert to speech (max 10000 chars)" },
                language_code: { type: "string", description: "Language code, e.g. 'en', 'es', 'fr'" },
                emotion: {
                  type: "string",
                  enum: ["neutral", "happy", "sad", "angry", "fearful", "surprised", "disgusted", "whisper"],
                  description: "Emotional tone (default: neutral)",
                },
                speaking_rate: { type: "number", description: "Speaking rate multiplier 0.5–1.5 (default 1.0)" },
              },
              required: ["voice_id", "text", "language_code"],
            },
          },
          {
            name: "vocea_list_voices",
            description: "List the authenticated user's cloned voices.",
            inputSchema: {
              type: "object",
              properties: {
                page: { type: "number", description: "Page number (default 1)" },
                limit: { type: "number", description: "Results per page (default 20)" },
              },
            },
          },
          {
            name: "vocea_list_public_voices",
            description: "List public community voices available for use.",
            inputSchema: {
              type: "object",
              properties: {
                page: { type: "number" },
                limit: { type: "number" },
                ageRange: { type: "string", enum: ["young", "adult", "senior"] },
              },
            },
          },
          {
            name: "vocea_transcribe",
            description: "Transcribe audio from a base64-encoded string to text (STT).",
            inputSchema: {
              type: "object",
              properties: {
                audio_base64: { type: "string", description: "Base64-encoded audio file (mp3, wav, ogg, webm, flac, max 10MB)" },
                mime_type: { type: "string", description: "MIME type, e.g. audio/mpeg", default: "audio/mpeg" },
                language: { type: "string", description: "BCP-47 language code, e.g. en-US (default: es-ES)" },
              },
              required: ["audio_base64"],
            },
          },
          {
            name: "vocea_get_credits",
            description: "Get the current credit balance for the authenticated user.",
            inputSchema: { type: "object", properties: {} },
          },
          {
            name: "vocea_get_audio",
            description: "Get details (including CDN URL) for a previously generated audio.",
            inputSchema: {
              type: "object",
              properties: { audio_id: { type: "string" } },
              required: ["audio_id"],
            },
          },
        ],
      }));
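
The handler returns its result as one text content item whose text is a JSON string, so a client has to parse that string to reach the audio URL. The sketch below shows one way this might be done; `parseGenerateAudioResult`, the field types, and the sample values are assumptions for illustration, since the source only names the four fields.

```typescript
// Fields the handler serializes into the text content item.
// The types are assumed; the source only gives the field names.
interface GeneratedAudio {
  id: string;
  audioUrl: string;
  durationSeconds: number;
  characterCount: number;
}

// Minimal shape of an MCP tool result carrying content items.
interface ToolResult {
  content: { type: string; text?: string }[];
}

// Hypothetical helper: finds the first text item and parses its JSON payload.
function parseGenerateAudioResult(result: ToolResult): GeneratedAudio {
  const item = result.content.find((c) => c.type === "text" && typeof c.text === "string");
  if (!item || item.text === undefined) {
    throw new Error("No text content in tool result");
  }
  const parsed = JSON.parse(item.text) as Partial<GeneratedAudio>;
  if (typeof parsed.id !== "string" || typeof parsed.audioUrl !== "string") {
    throw new Error("Unexpected payload shape");
  }
  return {
    id: parsed.id,
    audioUrl: parsed.audioUrl,
    durationSeconds: Number(parsed.durationSeconds),
    characterCount: Number(parsed.characterCount),
  };
}

// Sample result with made-up values, mirroring the handler's return shape.
const sampleResult: ToolResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify({
        id: "audio-123",
        audioUrl: "https://example.com/audio.mp3",
        durationSeconds: 1.5,
        characterCount: 17,
      }),
    },
  ],
};

const generated = parseGenerateAudioResult(sampleResult);
console.log(generated.audioUrl);
```

The returned id could then be passed as audio_id to vocea_get_audio to look up the same audio later.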