Kokoro TTS MCP Server

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

Prerequisites

Python 3.10 or higher
uv package manager

Installation

First, install the uv package manager:

curl -LsSf https://astral.sh/uv/install.sh | sh

Clone this repository and install dependencies:

uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
uv pip install .

Features

Text-to-speech synthesis with customizable voices
Adjustable speech speed
Support for saving audio to files or direct playback
Cross-platform audio playback support (Windows, macOS, Linux)

Usage

The server provides a single MCP tool generate_speech with the following parameters:

text (required): The text to convert to speech
voice (optional): Voice to use for synthesis (default: "af_heart")
speed (optional): Speech speed multiplier (default: 1.0)
save_path (optional): Directory to save audio files
play_audio (optional): Whether to play the audio immediately (default: False)

Example Usage

from mcp.client import Client

async with Client() as client:
    await client.connect("kokoro-tts")
    
    # Generate and play speech
    result = await client.call_tool(
        "generate_speech",
        {
            "text": "Hello, world!",
            "voice": "af_heart",
            "speed": 1.0,
            "play_audio": True
        }
    )

Dependencies

kokoro >= 0.8.4
mcp[cli] >= 1.3.0
soundfile >= 0.13.1

Platform Support

Audio playback is supported on:

Windows (using start)
macOS (using afplay)
Linux (using aplay)

MCP Configuration

Add the following configuration to your MCP settings file:

{
  "mcpServers": {
    "kokoro-tts": {
      "command": "/Users/giannisan/pinokio/bin/miniconda/bin/uv",
      "args": [
        "--directory",
        "/Users/giannisan/Documents/Cline/MCP/kokoro-tts-mcp",
        "run",
        "tts-mcp.py"
      ]
    }
  }
}

License

[Add your license information here]

This server cannot be installed

security - not tested

license - not found

quality - not tested

How are these scores calculated?

hybrid server

The server is able to function both locally and remotely, depending on the configuration or use case.

Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.

Related Resources

Reddit Discussion about this server

Related MCP Servers

ElevenLabs Text-to-Speech MCP
georgi-io
-
security
F
license
-
quality
Integrates ElevenLabs Text-to-Speech capabilities with Cursor through the Model Context Protocol, allowing users to convert text to speech with selectable voices within the Cursor editor.
Last updated -
1
Python
Speech MCP Server
hammeiam
-
security
F
license
-
quality
A Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model, offering multiple voice options and customizable speech parameters.
Last updated -
239
JavaScript
AivisSpeech MCP Server
kentaro
-
security
F
license
-
quality
A Model Context Protocol server that enables AI assistants to utilize AivisSpeech Engine's high-quality voice synthesis capabilities through a standardized API interface.
Last updated -
TypeScript
TTS-MCP
nakamurau1
-
security
A
license
-
quality
A Model Context Protocol server that integrates high-quality text-to-speech capabilities with Claude Desktop and other MCP-compatible clients, supporting multiple voice options and audio formats.
Last updated -
TypeScript
MIT License

View all related MCP servers

Kokoro TTS MCP Server