Kokoro TTS MCP Server

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

Prerequisites

  • Python 3.10 or higher

  • uv package manager

Installation

  1. First, install the uv package manager:

curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Clone this repository, then create a virtual environment and install the dependencies from the repository root:

uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
uv pip install .

Features

  • Text-to-speech synthesis with customizable voices

  • Adjustable speech speed

  • Support for saving audio to files or direct playback

  • Cross-platform audio playback support (Windows, macOS, Linux)

  • Optional OpenAI-compatible remote backend (e.g. kokoro-fastapi) to offload synthesis to a GPU box

Usage

The server provides a single MCP tool, generate_speech, with the following parameters:

  • text (required): The text to convert to speech

  • voice (optional): Voice to use for synthesis (default: "af_heart")

  • speed (optional): Speech speed multiplier (default: 1.0)

  • save_path (optional): Directory to save audio files

  • play_audio (optional): Whether to play the audio immediately (default: False)

Example Usage

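import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Assumption: the server is launched over stdio from the repository root.
server = StdioServerParameters(command="uv", args=["run", "tts-mcp.py"])


async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()

            # Generate and play speech
            result = await session.call_tool(
                "generate_speech",
                {
                    "text": "Hello, world!",
                    "voice": "af_heart",
                    "speed": 1.0,
                    "play_audio": True,
                },
            )
            print(result.content)


asyncio.run(main())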

Remote backend (OpenAI-compatible)

By default, the server runs Kokoro locally. If you already run an OpenAI-compatible TTS endpoint such as kokoro-fastapi (handy for running on a GPU), point the server at it with the environment variables below. In that mode no local torch/kokoro install is needed:

| Variable | Default | Description |
| --- | --- | --- |
| KOKORO_BASE_URL | (unset) | OpenAI-compatible base URL, e.g. http://localhost:8880/v1. When set, synthesis is sent here instead of running locally. |
| KOKORO_API_KEY | not-needed | Bearer token, if your endpoint requires one. |
| KOKORO_MODEL | kokoro | Model name passed to the endpoint. |

Under the hood this calls POST {KOKORO_BASE_URL}/audio/speech with the standard OpenAI payload (model, input, voice, speed, response_format: wav).
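
To make the request shape concrete, here is a minimal sketch that builds (but does not send) the POST described above from the three environment variables. The function name build_speech_request is hypothetical, and the sketch uses the standard library's urllib rather than the httpx client listed under Dependencies, so treat it as an illustration of the payload and not as the server's actual code.

```python
import json
import os
import urllib.request


def build_speech_request(text, voice="af_heart", speed=1.0, env=None):
    """Build (without sending) the POST request for the remote backend."""
    env = os.environ if env is None else env
    base_url = env.get("KOKORO_BASE_URL")
    if not base_url:
        # Without a base URL the server synthesizes locally instead.
        raise ValueError("KOKORO_BASE_URL is not set")

    # Standard OpenAI-style speech payload, as listed in the README.
    payload = {
        "model": env.get("KOKORO_MODEL", "kokoro"),
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": "wav",
    }
    headers = {
        "Authorization": f"Bearer {env.get('KOKORO_API_KEY', 'not-needed')}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it; against a WAV-producing endpoint the response body should be the audio bytes.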

Docker

docker build -t kokoro-tts-mcp .
docker run --rm -i kokoro-tts-mcp

To use a remote backend instead of bundling Kokoro:

docker run --rm -i -e KOKORO_BASE_URL=http://host.docker.internal:8880/v1 kokoro-tts-mcp

Dependencies

  • kokoro >= 0.8.4

  • mcp[cli] >= 1.3.0

  • soundfile >= 0.13.1

  • httpx >= 0.27.0

Platform Support

Audio playback is supported on:

  • Windows (using start)

  • macOS (using afplay)

  • Linux (using aplay)
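
The platform mapping above can be sketched as a small dispatch function. The name playback_command is hypothetical, and the Windows invocation through cmd /c is an assumption: start is a cmd.exe builtin and not a standalone executable, so it has to be run through the shell in some form.

```python
import platform


def playback_command(path, system=None):
    """Return the argv list that would play `path` on the given platform."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":  # macOS
        return ["afplay", path]
    if system == "Linux":
        return ["aplay", path]
    raise RuntimeError(f"Audio playback is not supported on {system!r}")
```

A caller could then run subprocess.run(playback_command("speech.wav"), check=True). On Linux this requires aplay (part of alsa-utils) to be installed.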

MCP Configuration

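Add the following configuration to your MCP settings file, replacing the two absolute paths with the location of your uv binary and of your checkout of this repository: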

{
  "mcpServers": {
    "kokoro-tts": {
      "command": "/Users/giannisan/pinokio/bin/miniconda/bin/uv",
      "args": [
        "--directory",
        "/Users/giannisan/Documents/Cline/MCP/kokoro-tts-mcp",
        "run",
        "tts-mcp.py"
      ]
    }
  }
}
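
Because the paths above are specific to the author's machine, it can help to generate the same structure for your own setup. The following is a minimal sketch; the helper name mcp_config and the placeholder checkout path are assumptions, and it falls back to the bare command name uv when the binary is not found on PATH.

```python
import json
import shutil


def mcp_config(checkout_dir, uv_path=None):
    """Build the MCP settings entry for a local checkout of this server."""
    # Prefer an explicit path, then whatever `uv` is on PATH, then the bare name.
    uv_path = uv_path or shutil.which("uv") or "uv"
    return {
        "mcpServers": {
            "kokoro-tts": {
                "command": uv_path,
                "args": ["--directory", str(checkout_dir), "run", "tts-mcp.py"],
            }
        }
    }


# Placeholder path: replace with the directory where you cloned the repository.
print(json.dumps(mcp_config("/path/to/kokoro-tts-mcp"), indent=2))
```

The printed JSON can be merged into the mcpServers section of your existing settings file.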

License

MIT © Gianni Sanrochman
