speak
Convert text to speech and mark delivered utterances as responded, so that voice input already handed to the assistant is recorded as answered.
Instructions
Speak text using text-to-speech and mark delivered utterances as responded
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | The text to speak | |
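
As a minimal sketch of how a caller might build a request matching this schema: the helper names below (`isValidSpeakText`, `buildSpeakRequest`) are hypothetical and not part of the server. The envelope follows the JSON-RPC 2.0 `tools/call` shape used by MCP, and the pre-check mirrors the handler's rejection of empty or whitespace-only text.

```typescript
// Hypothetical client-side helper; not part of the server source.
interface SpeakArguments {
  text: string;
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "speak"; arguments: SpeakArguments };
}

// Mirrors the server-side check: text must be a non-empty,
// non-whitespace string.
function isValidSpeakText(text: unknown): text is string {
  return typeof text === "string" && text.trim().length > 0;
}

// Builds a JSON-RPC 2.0 "tools/call" request for the speak tool,
// throwing early if the text would be rejected by the server anyway.
function buildSpeakRequest(id: number, text: unknown): ToolCallRequest {
  if (!isValidSpeakText(text)) {
    throw new Error("Text is required for speak tool");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "speak", arguments: { text } },
  };
}

const exampleRequest = buildSpeakRequest(1, "Build finished successfully.");
console.log(JSON.stringify(exampleRequest, null, 2));
```

Validating on the client is optional, since the server performs the same check, but it avoids a round trip for input that is certain to fail.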
Implementation Reference
- src/unified-server.ts:855-916 (handler): MCP CallToolRequestSchema request handler implementing the 'speak' tool. Proxies the call to the local /api/speak HTTP endpoint and handles success/error responses.

      mcpServer.setRequestHandler(CallToolRequestSchema, async (request) => {
        const { name, arguments: args } = request.params;

        try {
          if (name === 'speak') {
            const text = args?.text as string;

            if (!text || !text.trim()) {
              return {
                content: [
                  {
                    type: 'text',
                    text: 'Error: Text is required for speak tool',
                  },
                ],
                isError: true,
              };
            }

            const response = await fetch(`http://localhost:${HTTP_PORT}/api/speak`, {
              method: 'POST',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify({ text }),
            });

            const data = await response.json() as any;

            if (response.ok) {
              return {
                content: [
                  {
                    type: 'text',
                    text: '', // Return empty string for success
                  },
                ],
              };
            } else {
              return {
                content: [
                  {
                    type: 'text',
                    text: `Error speaking text: ${data.error || 'Unknown error'}`,
                  },
                ],
                isError: true,
              };
            }
          }

          throw new Error(`Unknown tool: ${name}`);
        } catch (error) {
          return {
            content: [
              {
                type: 'text',
                text: `Error: ${error instanceof Error ? error.message : String(error)}`,
              },
            ],
            isError: true,
          };
        }
      });

- src/unified-server.ts:833-854 (registration): MCP ListToolsRequestSchema request handler that registers the 'speak' tool, including its description and input schema.

      mcpServer.setRequestHandler(ListToolsRequestSchema, async () => {
        // Only expose the speak tool - voice input is auto-delivered via hooks
        return {
          tools: [
            {
              name: 'speak',
              description: 'Speak text using text-to-speech and mark delivered utterances as responded',
              inputSchema: {
                type: 'object',
                properties: {
                  text: {
                    type: 'string',
                    description: 'The text to speak',
                  },
                },
                required: ['text'],
              },
            }
          ]
        };
      });

- src/unified-server.ts:840-849 (schema): Input schema (JSON Schema) definition for the 'speak' tool parameters.

      inputSchema: {
        type: 'object',
        properties: {
          text: {
            type: 'string',
            description: 'The text to speak',
          },
        },
        required: ['text'],
      },

- src/unified-server.ts:673-728 (helper): HTTP endpoint /api/speak called by the MCP handler. Notifies connected browser clients via SSE for TTS, adds the assistant message to the conversation history, marks user utterances as responded, and updates the last speak timestamp.

      app.post('/api/speak', async (req: Request, res: Response) => {
        const { text } = req.body;

        if (!text || !text.trim()) {
          res.status(400).json({ error: 'Text is required' });
          return;
        }

        // Check if voice responses are enabled
        if (!voicePreferences.voiceResponsesEnabled) {
          debugLog(`[Speak] Voice responses disabled, returning error`);
          res.status(400).json({
            error: 'Voice responses are disabled',
            message: 'Cannot speak when voice responses are disabled'
          });
          return;
        }

        try {
          // Always notify browser clients - they decide how to speak
          notifyTTSClients(text);
          debugLog(`[Speak] Sent text to browser for TTS: "${text}"`);
          // Note: The browser will decide whether to use system voice or browser voice

          // Store assistant's response in conversation history
          queue.addAssistantMessage(text);

          // Mark all delivered utterances as responded
          const deliveredUtterances = queue.utterances.filter(u => u.status === 'delivered');
          deliveredUtterances.forEach(u => {
            u.status = 'responded';
            debugLog(`[Queue] marked as responded: "${u.text}" [id: ${u.id}]`);
            // Sync status in messages array
            const message = queue.messages.find(m => m.id === u.id && m.role === 'user');
            if (message) {
              message.status = 'responded';
            }
          });

          lastSpeakTimestamp = new Date();

          res.json({
            success: true,
            message: 'Text spoken successfully',
            respondedCount: deliveredUtterances.length
          });
        } catch (error) {
          debugLog(`[Speak] Failed to speak text: ${error}`);
          res.status(500).json({
            error: 'Failed to speak text',
            details: error instanceof Error ? error.message : String(error)
          });
        }
      });
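
The "mark delivered utterances as responded" step in the /api/speak endpoint can be isolated as a small standalone function. The sketch below is illustrative only: the `Utterance` and `Message` types, the `'pending'` status, and the function name `markDeliveredAsResponded` are assumptions, since the excerpt shows only the `'delivered'` and `'responded'` statuses and the `id`, `text`, `role`, and `status` fields.

```typescript
// Assumed shapes; the real queue types are not shown in the excerpts above.
type UtteranceStatus = "pending" | "delivered" | "responded";

interface Utterance {
  id: string;
  text: string;
  status: UtteranceStatus;
}

interface Message {
  id: string;
  role: "user" | "assistant";
  text: string;
  status?: UtteranceStatus;
}

// Flips every 'delivered' utterance to 'responded', mirrors the change
// onto the matching user message, and returns how many were updated.
function markDeliveredAsResponded(
  utterances: Utterance[],
  messages: Message[],
): number {
  const delivered = utterances.filter((u) => u.status === "delivered");
  for (const u of delivered) {
    u.status = "responded";
    const message = messages.find((m) => m.id === u.id && m.role === "user");
    if (message) {
      message.status = "responded";
    }
  }
  return delivered.length;
}

// Example data: one utterance still pending, one already delivered.
const sampleUtterances: Utterance[] = [
  { id: "u1", text: "What time is it?", status: "pending" },
  { id: "u2", text: "Run the tests", status: "delivered" },
];
const sampleMessages: Message[] = [
  { id: "u2", role: "user", text: "Run the tests", status: "delivered" },
];

const respondedCount = markDeliveredAsResponded(sampleUtterances, sampleMessages);
console.log(respondedCount, sampleUtterances.map((u) => u.status));
```

Only utterances in the `'delivered'` state are touched, so input that has not yet reached the assistant stays queued and is not silently marked as answered.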