Kokoro TTS MCP Server

by giannisanni

Integrations

  • Linux: plays the generated speech audio files with the 'aplay' command.

  • macOS: plays the generated speech audio files with the 'afplay' command.

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS engine. This server exposes TTS functionality through MCP tools, making it easy to integrate speech synthesis into your applications.

Prerequisites

  • Python 3.10 or higher
  • uv package manager

Installation

  1. First, install the uv package manager:
curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Clone this repository and install dependencies:
uv venv
source .venv/bin/activate  # On Windows, use: .venv\Scripts\activate
uv pip install .

Features

  • Text-to-speech synthesis with customizable voices
  • Adjustable speech speed
  • Support for saving audio to files or direct playback
  • Cross-platform audio playback support (Windows, macOS, Linux)

Usage

The server provides a single MCP tool, generate_speech, with the following parameters:

  • text (required): The text to convert to speech
  • voice (optional): Voice to use for synthesis (default: "af_heart")
  • speed (optional): Speech speed multiplier (default: 1.0)
  • save_path (optional): Directory to save audio files
  • play_audio (optional): Whether to play the audio immediately (default: False)
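
As a rough illustration of the parameter list above, the arguments can be assembled on the client side with the documented defaults. The helper name build_speech_arguments and its validation rules (non-empty text, positive speed) are assumptions made for this sketch; they are not part of the server.

```python
from typing import Any, Optional


def build_speech_arguments(
    text: str,
    voice: str = "af_heart",
    speed: float = 1.0,
    save_path: Optional[str] = None,
    play_audio: bool = False,
) -> dict[str, Any]:
    """Build the arguments dict for generate_speech (hypothetical helper)."""
    # Assumed client-side checks; the server may validate differently.
    if not text.strip():
        raise ValueError("text is required and must not be empty")
    if speed <= 0:
        raise ValueError("speed must be a positive multiplier")

    arguments: dict[str, Any] = {
        "text": text,
        "voice": voice,
        "speed": speed,
        "play_audio": play_audio,
    }
    # save_path is optional, so it is only sent when the caller provides it.
    if save_path is not None:
        arguments["save_path"] = save_path
    return arguments
```

Omitting save_path when it is not set leaves the choice of default to the server, since this README does not state one.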

Example Usage

from mcp.client import Client

async with Client() as client:
    await client.connect("kokoro-tts")

    # Generate and play speech
    result = await client.call_tool(
        "generate_speech",
        {
            "text": "Hello, world!",
            "voice": "af_heart",
            "speed": 1.0,
            "play_audio": True
        }
    )
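
For readers curious about what travels over the wire: MCP tool calls are JSON-RPC 2.0 requests with the method tools/call. The sketch below only builds that message for the example call above and does not open a connection. The function name make_tool_call_request and the request id are illustrative.

```python
import json


def make_tool_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Same arguments as the example call above.
request = make_tool_call_request(
    1,
    "generate_speech",
    {"text": "Hello, world!", "voice": "af_heart", "speed": 1.0, "play_audio": True},
)
print(request)
```

In practice a client library handles this framing, along with the initialization handshake that precedes any tool call.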

Dependencies

  • kokoro >= 0.8.4
  • mcp[cli] >= 1.3.0
  • soundfile >= 0.13.1

Platform Support

Audio playback is supported on:

  • Windows (using start)
  • macOS (using afplay)
  • Linux (using aplay)
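
One way the per-platform choice above could be made is sketched below. This is an assumption about the approach, not the server's actual code. The function names are hypothetical, and the Windows branch routes start through cmd because start is a shell builtin rather than a standalone executable.

```python
import platform
import subprocess
from typing import Optional


def playback_command(audio_path: str, system: Optional[str] = None) -> list[str]:
    """Return the command that plays audio_path on the given platform."""
    system = system or platform.system()
    if system == "Darwin":  # macOS
        return ["afplay", audio_path]
    if system == "Linux":
        return ["aplay", audio_path]
    if system == "Windows":
        # The empty string is the window-title argument that start expects
        # before a path.
        return ["cmd", "/c", "start", "", audio_path]
    raise RuntimeError(f"Audio playback is not supported on {system!r}")


def play(audio_path: str) -> None:
    """Play the file with the platform's command (hypothetical helper)."""
    subprocess.run(playback_command(audio_path), check=True)
```

Passing the command as a list avoids shell quoting problems when the path contains spaces.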

MCP Configuration

Add the following configuration to your MCP settings file, replacing the two absolute paths with the locations of uv and of this repository on your machine:

{
  "mcpServers": {
    "kokoro-tts": {
      "command": "/Users/giannisan/pinokio/bin/miniconda/bin/uv",
      "args": [
        "--directory",
        "/Users/giannisan/Documents/Cline/MCP/kokoro-tts-mcp",
        "run",
        "tts-mcp.py"
      ]
    }
  }
}
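
Because the configuration embeds machine-specific paths, it can be convenient to generate the entry instead of editing it by hand. The following is a minimal sketch: make_mcp_config is a hypothetical helper, and the repository path passed to it is a placeholder.

```python
import json
import shutil


def make_mcp_config(uv_path: str, repo_dir: str) -> str:
    """Render the kokoro-tts server entry for an MCP settings file."""
    config = {
        "mcpServers": {
            "kokoro-tts": {
                "command": uv_path,
                "args": ["--directory", repo_dir, "run", "tts-mcp.py"],
            }
        }
    }
    return json.dumps(config, indent=2)


# Look up uv on PATH; fall back to the bare name if it is not found.
uv_path = shutil.which("uv") or "uv"
# Placeholder path: replace with where you cloned the repository.
print(make_mcp_config(uv_path, "/path/to/kokoro-tts-mcp"))
```

The printed JSON can be pasted into the settings file, or merged into an existing mcpServers object if other servers are already configured.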

License

[Add your license information here]



