Kokoro TTS MCP Server
A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.
Prerequisites
Python 3.10 or higher
uv package manager
Installation
First, install the uv package manager:
Clone this repository and install dependencies:
Features
Text-to-speech synthesis with customizable voices
Adjustable speech speed
Support for saving audio to files or direct playback
Cross-platform audio playback support (Windows, macOS, Linux)
Usage
The server provides a single MCP tool, generate_speech, with the following parameters:
text (required): The text to convert to speech
voice (optional): Voice to use for synthesis (default: "af_heart")
speed (optional): Speech speed multiplier (default: 1.0)
save_path (optional): Directory to save audio files
play_audio (optional): Whether to play the audio immediately (default: False)
Example Usage
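A minimal sketch of what a call to generate_speech might look like on the wire, assuming the client talks to the server with MCP's standard JSON-RPC tools/call request. The helper name build_generate_speech_request is hypothetical and not part of this server; only the tool name, parameter names, and defaults come from the list above.

```python
import json


def build_generate_speech_request(text, voice="af_heart", speed=1.0,
                                  save_path=None, play_audio=False,
                                  request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the generate_speech tool.

    Hypothetical helper for illustration; defaults mirror the documented ones.
    """
    if not text:
        raise ValueError("text is required")

    arguments = {
        "text": text,
        "voice": voice,
        "speed": speed,
        "play_audio": play_audio,
    }
    # save_path is optional, so it is only sent when the caller provides it.
    if save_path is not None:
        arguments["save_path"] = save_path

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_speech", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_generate_speech_request(
        "Hello from Kokoro!", speed=1.2, save_path="./audio"
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library or an MCP-aware application would normally construct and send this request for you; the sketch only shows how the documented parameters map onto the call.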
Dependencies
kokoro >= 0.8.4
mcp[cli] >= 1.3.0
soundfile >= 0.13.1
Platform Support
Audio playback is supported on:
Windows (using start)
macOS (using afplay)
Linux (using aplay)
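The per-platform dispatch listed above could be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the function name playback_command is hypothetical, and routing start through cmd /c is an assumption based on start being a cmd.exe builtin rather than a standalone executable.

```python
import platform


def playback_command(audio_path, system=None):
    """Return the argv list that would play audio_path on the given platform.

    Hypothetical helper: `system` defaults to platform.system(), which
    reports "Windows", "Darwin" (macOS), or "Linux".
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title
        # argument, so a quoted path is not mistaken for the title.
        return ["cmd", "/c", "start", "", audio_path]
    if system == "Darwin":
        return ["afplay", audio_path]
    if system == "Linux":
        return ["aplay", audio_path]
    raise RuntimeError(f"Audio playback is not supported on {system!r}")
```

The returned list can be passed to subprocess.run. Keeping the command selection separate from execution makes the mapping easy to check without actually playing audio.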
MCP Configuration
Add the following configuration to your MCP settings file:
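A hedged sketch of what such an entry might look like, generated with Python so the structure is explicit. The mcpServers key follows the convention used by common MCP clients such as Claude Desktop; the server name kokoro-tts, the project path, and the script name server.py are placeholders chosen for illustration and must be replaced with the values from your checkout.

```python
import json


def build_mcp_config(project_dir, script_name, server_name="kokoro-tts"):
    """Build an MCP settings entry that launches the server through uv.

    All three arguments are placeholders: substitute the real checkout
    directory, the real entry-point script, and any server name you like.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "uv",
                # `uv --directory <dir> run <script>` runs the script inside
                # the project's environment regardless of the client's cwd.
                "args": ["--directory", project_dir, "run", script_name],
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical path and script name, shown only to illustrate the shape.
    config = build_mcp_config("/path/to/kokoro-tts-mcp", "server.py")
    print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's MCP settings file, merging it with any servers already listed under mcpServers.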
License
[Add your license information here]