The Zonos TTS MCP Server enables text-to-speech functionality in Claude through the speak_response
tool, allowing it to generate and play spoken audio from text with the following capabilities:
- Multi-Language Support: Generate speech in different languages (default: en-us)
- Emotion Control: Customize the emotional tone (neutral, happy, sad, angry)
- PulseAudio Integration: Ensures proper audio playback
- MCP Integration: Seamlessly works with Claude's Model Context Protocol
The server requires a running instance of the Zonos API, along with Node.js and a working PulseAudio setup.
Zonos MCP Integration
A Model Context Protocol integration for Zonos TTS, allowing Claude to generate speech directly.
Setup
Installing via Smithery
To install Zonos TTS Integration for Claude Desktop automatically via Smithery:
Manual installation
- Make sure you have Zonos running with our API implementation (PhialsBasement/zonos-api)
- Install dependencies:
- Configure PulseAudio access:
- Build the MCP server:
- Add to Claude's config file:
Edit your Claude config file (usually in ~/.config/claude/config.json) and add this to the mcpServers section:
Replace /path/to/your/zonos-mcp with the actual path where you installed the MCP server.
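As a rough illustration of what an mcpServers entry looks like, the sketch below builds one in TypeScript and prints it as JSON. The `command`/`args` layout follows the usual Claude Desktop convention for MCP servers; the server key `zonos-tts` and the entry point `dist/server.js` are assumptions made for this example, so check the repository for the actual names.

```typescript
// One entry in the "mcpServers" section of Claude's config file.
interface McpServerEntry {
  command: string;
  args: string[];
}

// Build an mcpServers fragment for a given install directory.
// ASSUMPTION: the key "zonos-tts" and the file "dist/server.js" are
// hypothetical names chosen for illustration, not confirmed by this README.
function buildMcpServersFragment(
  installDir: string
): { mcpServers: { "zonos-tts": McpServerEntry } } {
  // Drop trailing slashes so the joined path has exactly one separator.
  const trimmed = installDir.replace(/\/+$/, "");
  return {
    mcpServers: {
      "zonos-tts": {
        command: "node",
        args: [`${trimmed}/dist/server.js`],
      },
    },
  };
}

// Print the fragment so it can be merged into the config file by hand.
const fragment = buildMcpServersFragment("/path/to/your/zonos-mcp");
console.log(JSON.stringify(fragment, null, 2));
```

The printed JSON is meant to be merged into the existing config rather than to replace it, since other MCP servers may already be listed under mcpServers.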
Using with Claude
Once configured, Claude automatically knows how to use the speak_response tool:
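The exact arguments of speak_response are not shown here, so the following is only a sketch of how a call's inputs might be normalized, assuming hypothetical fields `text`, `language`, and `emotion`. The en-us default and the four emotions come from this README; treating `neutral` as the default emotion is an assumption.

```typescript
// Emotions listed in this README.
const EMOTIONS = ["neutral", "happy", "sad", "angry"] as const;
type Emotion = (typeof EMOTIONS)[number];

// ASSUMPTION: hypothetical argument shape; the real tool's field names may differ.
interface SpeakResponseArgs {
  text: string;
  language: string;
  emotion: Emotion;
}

// Type guard: true only for one of the four listed emotions.
function isEmotion(value: string): value is Emotion {
  return (EMOTIONS as readonly string[]).indexOf(value) !== -1;
}

// Fill in defaults and reject inputs the server could not speak.
function normalizeSpeakArgs(input: {
  text: string;
  language?: string;
  emotion?: string;
}): SpeakResponseArgs {
  const text = input.text.trim();
  if (text.length === 0) {
    throw new Error("text must not be empty");
  }
  // ASSUMPTION: "neutral" as the fallback emotion is a guess.
  const emotion = input.emotion ?? "neutral";
  if (!isEmotion(emotion)) {
    throw new Error(`unsupported emotion: ${emotion}`);
  }
  // "en-us" is the default language stated in this README.
  return { text, language: input.language ?? "en-us", emotion };
}

console.log(normalizeSpeakArgs({ text: "Hello there", emotion: "happy" }));
```

Validating before the request is sent keeps a bad emotion or empty text from reaching the Zonos API, where the failure would be harder to trace back to the tool call.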
Features
- Text-to-speech through Claude
- Support for multiple emotions
- Multi-language support
- Proper audio playback through PulseAudio
Requirements
- Node.js
- PulseAudio setup
- Running instance of Zonos API (PhialsBasement/zonos-api)
- Working audio output device
Notes
- Make sure both the Zonos API server and this MCP server are running
- Audio playback requires proper PulseAudio configuration