Manages configuration through environment variables stored in .env files for customizing TTS settings and service connections.
Converts generated .wav audio files to .mp3 format for storage and distribution.
Provides access to the Kokoro ONNX weights repository for downloading necessary model files.
Utilizes ONNX model files for text-to-speech processing, specifically loading the Kokoro model weights for voice generation.
Offers a Python client interface for sending TTS requests to the server with customizable voice, speed, and file management options.
Kokoro Text to Speech (TTS) MCP Server
A Kokoro Text to Speech MCP server that generates .mp3 files, with an option to upload them to S3.
Uses: https://huggingface.co/spaces/hexgrad/Kokoro-TTS
Configuration
- Clone this repository locally.
- Download the Kokoro ONNX weights (kokoro-v1.0.onnx and voices-v1.0.bin) and store them in the cloned repository.
Add the following to your MCP configs. Update with your own values.
Install ffmpeg
This is needed to convert .wav files to .mp3 files.
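As a rough illustration of the conversion step, the sketch below shells out to ffmpeg from Python. The function names `build_ffmpeg_command` and `wav_to_mp3` are hypothetical, and the encoder and quality settings are assumptions rather than the values this server actually uses.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_ffmpeg_command(wav_path: Path, mp3_path: Path) -> List[str]:
    """Return an ffmpeg command line that encodes a WAV file as MP3."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(wav_path),       # input file
        "-codec:a", "libmp3lame",  # MP3 encoder (assumed choice)
        "-q:a", "2",               # VBR quality level (assumed value)
        str(mp3_path),
    ]


def wav_to_mp3(wav_path: Path) -> Path:
    """Convert wav_path to an .mp3 beside it and return the new path."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    mp3_path = wav_path.with_suffix(".mp3")
    subprocess.run(build_ffmpeg_command(wav_path, mp3_path), check=True)
    return mp3_path
```

Checking for ffmpeg with `shutil.which` before running gives a clearer error than a failed subprocess call when the tool is not installed.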
For mac:
To run locally, copy env.example to .env and set the variables below to your own values.
Supported Environment Variables
- AWS_ACCESS_KEY_ID: Your AWS access key ID
- AWS_SECRET_ACCESS_KEY: Your AWS secret access key
- AWS_S3_BUCKET_NAME: S3 bucket name
- AWS_S3_REGION: S3 region (e.g., us-east-1)
- AWS_S3_FOLDER: Folder path within the S3 bucket
- AWS_S3_ENDPOINT_URL: Optional custom endpoint URL for S3-compatible storage
- MCP_HOST: Host to bind the server to (default: 0.0.0.0)
- MCP_PORT: Port to listen on (default: 9876)
- MCP_CLIENT_HOST: Hostname for client connections to the server (default: localhost)
- DEBUG: Enable debug mode (set to "true" or "1")
- S3_ENABLED: Enable S3 uploads (set to "true" or "1")
- MP3_FOLDER: Path to store MP3 files (default is 'mp3' folder in script directory)
- MP3_RETENTION_DAYS: Number of days to keep MP3 files before automatic deletion
- DELETE_LOCAL_AFTER_S3_UPLOAD: Whether to delete local MP3 files after successful S3 upload (set to "true" or "1")
- TTS_VOICE: Default voice for the TTS client (default: af_heart)
- TTS_SPEED: Default speed for the TTS client (default: 1.0)
- TTS_LANGUAGE: Default language for the TTS client (default: en-us)
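To show how these variables and their defaults might be consumed, here is a minimal sketch that reads a subset of them into a typed object. The `Settings` class and `load_settings` function are hypothetical names, not part of this project, and the sketch assumes the .env file has already been loaded into the process environment. Treating "true" case-insensitively is also an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def env_flag(env: Mapping[str, str], name: str) -> bool:
    """Interpret "true" or "1" as enabled; anything else is disabled."""
    return env.get(name, "").strip().lower() in ("true", "1")


@dataclass
class Settings:
    host: str
    port: int
    client_host: str
    debug: bool
    s3_enabled: bool
    voice: str
    speed: float
    language: str
    retention_days: Optional[int]  # None means no automatic deletion


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    retention = env.get("MP3_RETENTION_DAYS")
    return Settings(
        host=env.get("MCP_HOST", "0.0.0.0"),
        port=int(env.get("MCP_PORT", "9876")),
        client_host=env.get("MCP_CLIENT_HOST", "localhost"),
        debug=env_flag(env, "DEBUG"),
        s3_enabled=env_flag(env, "S3_ENABLED"),
        voice=env.get("TTS_VOICE", "af_heart"),
        speed=float(env.get("TTS_SPEED", "1.0")),
        language=env.get("TTS_LANGUAGE", "en-us"),
        retention_days=int(retention) if retention else None,
    )
```

Passing the environment in as a mapping keeps the function easy to exercise with a plain dictionary instead of the real process environment.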
Running the Server Locally
The preferred method is to use uv.
Using the TTS Client
The mcp_client.py script allows you to send TTS requests to the server. It can be used as follows:
Connection Settings
When running the server and client on the same machine:
- Server should bind to 0.0.0.0 (all interfaces) or 127.0.0.1 (localhost only)
- Client should connect to localhost or 127.0.0.1
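When the client cannot reach the server, a quick TCP-level check can help separate a binding problem from a client problem. The sketch below is only an illustration: `client_target` and `server_is_reachable` are hypothetical helpers, and it assumes the server listens on a plain TCP port given by MCP_PORT.

```python
import os
import socket
from typing import Mapping, Tuple


def client_target(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return the (host, port) a client would connect to, using documented defaults."""
    host = env.get("MCP_CLIENT_HOST", "localhost")
    port = int(env.get("MCP_PORT", "9876"))
    return host, port


def server_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = client_target()
    state = "reachable" if server_is_reachable(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

Note that a client should not use 0.0.0.0 as its target; that address is only meaningful as a bind address on the server side.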
Basic Usage
Reading Text from a File
Customizing Voice and Speed
Disabling S3 Upload
Command-line Options
MP3 File Management
The TTS server generates MP3 files that are stored locally and optionally uploaded to S3. You can configure how these files are managed:
Local Storage
- Set MP3_FOLDER in your .env file to specify where MP3 files are stored
- Files are kept in this folder unless automatically deleted
Automatic Cleanup
- Set MP3_RETENTION_DAYS=30 (or any number) to automatically delete files older than that number of days
- Set DELETE_LOCAL_AFTER_S3_UPLOAD=true to delete local files immediately after successful S3 upload
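The retention rule can be pictured as a sweep over the MP3 folder that removes files older than the configured number of days. The following is a minimal sketch of that idea, not the server's actual cleanup code. `delete_old_mp3s` is a hypothetical name, and using the file modification time as the age is an assumption.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_mp3s(folder: Path, retention_days: int,
                    now: Optional[float] = None) -> List[Path]:
    """Delete .mp3 files in folder older than retention_days; return what was removed."""
    current = time.time() if now is None else now
    cutoff = current - retention_days * 86400  # seconds per day
    deleted = []
    for path in folder.glob("*.mp3"):
        # Age is judged by modification time (an assumption of this sketch).
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path)
    return deleted
```

Restricting the sweep to `*.mp3` keeps unrelated files safe if MP3_FOLDER points at a shared directory.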
S3 Integration
- Enable/disable S3 uploads with S3_ENABLED=true or DISABLE_S3=true
- Configure AWS credentials and bucket settings in the .env file
- S3 uploads can be disabled per-request using the client's --no-s3 option
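The sketch below illustrates one way these settings could combine when deciding whether and where to upload a file. It is built on assumptions. The README does not say which SDK the server uses, so boto3 is an assumed choice. The precedence shown, where the per-request flag and DISABLE_S3 override S3_ENABLED, is a guess. The helper names are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping


def _flag(env: Mapping[str, str], name: str) -> bool:
    return env.get(name, "").strip().lower() in ("true", "1")


def s3_upload_wanted(env: Mapping[str, str], no_s3: bool = False) -> bool:
    """Decide whether to upload; disabling options win (assumed precedence)."""
    if no_s3 or _flag(env, "DISABLE_S3"):
        return False
    return _flag(env, "S3_ENABLED")


def s3_object_key(env: Mapping[str, str], mp3_path: Path) -> str:
    """Build the object key from AWS_S3_FOLDER and the file name."""
    folder = env.get("AWS_S3_FOLDER", "").strip("/")
    return f"{folder}/{mp3_path.name}" if folder else mp3_path.name


def upload_mp3(mp3_path: Path, env: Mapping[str, str] = os.environ) -> str:
    """Upload one file and optionally delete the local copy; returns the key."""
    import boto3  # third-party dependency, assumed for this sketch

    client = boto3.client(
        "s3",
        region_name=env.get("AWS_S3_REGION"),
        endpoint_url=env.get("AWS_S3_ENDPOINT_URL") or None,
        aws_access_key_id=env.get("AWS_ACCESS_KEY_ID"),
        aws_secret_access_key=env.get("AWS_SECRET_ACCESS_KEY"),
    )
    key = s3_object_key(env, mp3_path)
    client.upload_file(str(mp3_path), env["AWS_S3_BUCKET_NAME"], key)
    if _flag(env, "DELETE_LOCAL_AFTER_S3_UPLOAD"):
        mp3_path.unlink()  # only reached after a successful upload
    return key
```

Because `upload_file` raises on failure, the local file is removed only when the upload succeeded, which matches the documented behavior of DELETE_LOCAL_AFTER_S3_UPLOAD.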