speak
Convert text to speech and play it through speakers with customizable voice, speed, and volume settings for audio output.
Instructions
Convert text to speech and play it through the speakers.
Args:
text: The text to convert to speech
speaker_id: Voice speaker ID (default: 0)
length_scale: Speech speed control (default: 1.1, lower = faster)
noise_scale: Voice variation control (default: 0.667)
noise_w_scale: Pronunciation variation control (default: 0.333)
volume: Volume level from 0.01 to 1.00 (default: 0.15)
Returns:
Success or error message
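The volume range above is enforced by clamping rather than by rejecting out-of-range values, according to the handler listed under Implementation Reference. A minimal sketch of that behavior follows; the helper name `clamp_volume` is hypothetical and does not appear in the server code, which performs the same expression inline.

```python
def clamp_volume(volume: float, low: float = 0.01, high: float = 1.00) -> float:
    """Clamp a requested volume into the documented 0.01 to 1.00 range.

    Hypothetical helper: the server inlines max(0.01, min(1.00, volume)).
    """
    return max(low, min(high, volume))


if __name__ == "__main__":
    print(clamp_volume(0.15))  # in range, returned unchanged
    print(clamp_volume(5.0))   # above range, raised no error, capped at 1.0
    print(clamp_volume(0.0))   # below range, lifted to 0.01
```

One consequence of clamping is that a caller passing `volume=0` does not get silence; the tool still plays at the minimum level of 0.01.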
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
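| text | Yes | The text to convert to speech | |
| speaker_id | No | Voice speaker ID | 0 |
| length_scale | No | Speech speed control (lower = faster) | 1.1 |
| noise_scale | No | Voice variation control | 0.667 |
| noise_w_scale | No | Pronunciation variation control | 0.333 |
| volume | No | Volume level from 0.01 to 1.00 | 0.15 |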
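To make the schema concrete, the sketch below builds the JSON body that the handler sends to the TTS service from a set of tool arguments. The function name `build_tts_payload` is hypothetical; the field names and defaults are taken from the tool's signature. Note that `volume` is not part of the payload: per the handler, it is applied locally at playback time.

```python
from typing import Any, Dict


def build_tts_payload(
    text: str,
    speaker_id: int = 0,
    length_scale: float = 1.1,
    noise_scale: float = 0.667,
    noise_w_scale: float = 0.333,
) -> Dict[str, Any]:
    """Assemble the request body for the TTS service (hypothetical helper).

    Only `text` is required; every other field falls back to the
    documented default.
    """
    return {
        "text": text,
        "speaker_id": speaker_id,
        "length_scale": length_scale,
        "noise_scale": noise_scale,
        "noise_w_scale": noise_w_scale,
    }


if __name__ == "__main__":
    # Faster speech: a lower length_scale shortens the output.
    print(build_tts_payload("Build finished", length_scale=0.9))
```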
Implementation Reference
- server.py:41-110 (handler) Core execution logic of the 'speak' tool: validates parameters, calls the TTS service at localhost:5000, receives the audio, plays it with pygame (from memory, with a file fallback), and handles errors.

  ```python
  try:
      # Validate volume parameter
      volume = max(0.01, min(1.00, volume))  # Clamp between 0.01 and 1.00

      # Prepare data for TTS API
      data = {
          "text": text,
          "speaker_id": speaker_id,
          "length_scale": length_scale,
          "noise_scale": noise_scale,
          "noise_w_scale": noise_w_scale
      }

      # Make request to TTS service
      response = requests.post(
          "http://localhost:5000",
          headers={"Content-Type": "application/json"},
          json=data,
          timeout=30
      )

      if response.status_code != 200:
          return f"TTS service error: HTTP {response.status_code}"

      # Try to play audio from memory first, fallback to file method
      try:
          # Initialize pygame mixer
          pygame.mixer.init()
          pygame.mixer.music.set_volume(volume)

          # Create in-memory file-like object
          audio_data = io.BytesIO(response.content)

          # Load and play from memory
          pygame.mixer.music.load(audio_data)
          pygame.mixer.music.play()

          # Wait for playback to complete
          while pygame.mixer.music.get_busy():
              pygame.time.wait(100)

      except Exception as memory_error:
          # Fallback to file method if memory method fails
          filename = f"speak_{int(time.time())}.wav"
          with open(filename, "wb") as f:
              f.write(response.content)

          pygame.mixer.init()
          pygame.mixer.music.set_volume(volume)
          pygame.mixer.music.load(filename)
          pygame.mixer.music.play()

          # Wait for playback to complete
          while pygame.mixer.music.get_busy():
              pygame.time.wait(100)

          # Clean up the audio file
          try:
              os.remove(filename)
          except Exception:
              pass  # Ignore cleanup errors

      return f"Successfully spoke: '{text}'"

  except requests.exceptions.ConnectionError:
      return "Error: TTS service not available at localhost:5000"
  except requests.exceptions.Timeout:
      return "Error: TTS service request timed out"
  except Exception as e:
      return f"Error: {str(e)}"
  ```
- server.py:19-40 (schema) Input schema defined by the function parameters: type annotations, default values, and a docstring describing the arguments and return value.

  ```python
  def speak(
      text: str,
      speaker_id: Optional[int] = 0,
      length_scale: Optional[float] = 1.1,
      noise_scale: Optional[float] = 0.667,
      noise_w_scale: Optional[float] = 0.333,
      volume: Optional[float] = 0.15
  ) -> str:
      """
      Convert text to speech and play it through the speakers.

      Args:
          text: The text to convert to speech
          speaker_id: Voice speaker ID (default: 0)
          length_scale: Speech speed control (default: 1.1, lower = faster)
          noise_scale: Voice variation control (default: 0.667)
          noise_w_scale: Pronunciation variation control (default: 0.333)
          volume: Volume level from 0.01 to 1.00 (default: 0.15)

      Returns:
          Success or error message
      """
  ```
- server.py:18-18 (registration) The `@mcp.tool()` decorator registers the `speak` function as an MCP tool on the FastMCP instance.

  ```python
  @mcp.tool()
  ```
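The handler's "play from memory, fall back to a file" strategy can be isolated from pygame to make the control flow easier to follow. The sketch below is not the server's code: `play_with_fallback` and its two callable parameters are hypothetical stand-ins for the pygame calls, and it swaps the handler's timestamp-based filename for `tempfile.mkstemp`, which avoids name collisions when two requests arrive within the same second.

```python
import os
import tempfile
from typing import Callable


def play_with_fallback(
    audio: bytes,
    play_from_memory: Callable[[bytes], None],
    play_from_file: Callable[[str], None],
) -> str:
    """Try in-memory playback first; on any failure, play from a temp file.

    Returns "memory" or "file" to report which path succeeded.
    Both player callables are assumptions standing in for pygame calls.
    """
    try:
        play_from_memory(audio)
        return "memory"
    except Exception:
        # Write the audio to a uniquely named temporary .wav file.
        fd, path = tempfile.mkstemp(suffix=".wav")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(audio)
            play_from_file(path)
        finally:
            # Best-effort cleanup, mirroring the handler's ignored errors.
            try:
                os.remove(path)
            except OSError:
                pass
        return "file"
```

Unlike the handler, this version removes the temporary file even when file playback raises, because the cleanup sits in a `finally` block.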