Advanced TTS MCP Server

by samihalawa

A high-quality, feature-rich Text-to-Speech MCP server with native TypeScript implementation. Designed for professional applications requiring natural, expressive speech synthesis with advanced controls and zero external dependencies.

✨ Features

🎯 Advanced Voice Control

  • 10 High-Quality Voices - Male and female voices with distinct personalities
  • Emotion Control - Neutral, happy, excited, calm, serious, casual, confident
  • Dynamic Pacing - Natural, conversational, presentation, tutorial, narrative modes
  • Speed & Volume - Precise control from 0.25x to 3.0x speed, 0.1x to 2.0x volume

🚀 Professional Capabilities

  • Streaming Audio - Real-time synthesis and playback
  • Batch Processing - Handle multiple text segments efficiently
  • Multiple Formats - WAV, MP3, FLAC, OGG output support
  • Natural Speech Enhancement - Automatic pause insertion and emotion markers
  • Queue Management - Handle multiple concurrent requests

🔧 MCP Integration

  • 6 Powerful Tools - Complete synthesis, batch processing, voice management
  • 2 Rich Resources - Voice capabilities and usage examples
  • Real-time Status - Track processing progress and manage requests
  • File Management - Save, list, and organize audio outputs

🚀 Quick Start

🎯 Option 1: One-Click Deployment to Smithery Platform

  1. Deploy Now: Visit Smithery.ai and import this repository
  2. Configure: Set your preferred voice and speech settings
  3. Use Instantly: Access via Claude Desktop or any MCP-compatible client

Benefits:

  • ✅ Zero setup required
  • ✅ Automatic scaling and updates
  • ✅ No model downloads needed
  • ✅ Enterprise-grade hosting

📋 Full Smithery Deployment Guide →

Option 2: Local Installation

Prerequisites:

  • Node.js 18+

Installation:

  1. Clone the repository

    git clone https://github.com/samihalawa/advanced-tts-mcp.git
    cd advanced-tts-mcp

  2. Install dependencies

    npm install

  3. Configure Claude Desktop

Add to your claude_desktop_config.json:

    {
      "mcpServers": {
        "advanced-tts": {
          "command": "node",
          "args": ["dist/index.js"],
          "cwd": "/path/to/advanced-tts-mcp"
        }
      }
    }

  4. Start using!

    # Build TypeScript
    npm run build
    # Start server
    npm start

Restart Claude Desktop and start synthesizing with natural, expressive voices.
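
If claude_desktop_config.json already lists other servers, the new entry has to be merged in rather than pasted over the file. The following is a minimal sketch of that merge on an in-memory dictionary. The helper name `add_tts_server` is hypothetical and not part of this project; the entry itself mirrors the JSON shown above.

```python
import json


def add_tts_server(config: dict, repo_path: str) -> dict:
    """Return a copy of a Claude Desktop config with the advanced-tts entry added.

    Hypothetical helper for illustration; existing servers are left untouched.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["advanced-tts"] = {
        "command": "node",
        "args": ["dist/index.js"],
        "cwd": repo_path,
    }
    updated["mcpServers"] = servers
    return updated


if __name__ == "__main__":
    # Example config with one unrelated, made-up server already present.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["other.js"]}}}
    merged = add_tts_server(existing, "/path/to/advanced-tts-mcp")
    print(json.dumps(merged, indent=2))
```

Replace `/path/to/advanced-tts-mcp` with the directory where the repository was cloned before writing the result back to disk.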

🎙️ Available Voices

| Voice ID    | Name     | Gender | Description                    |
|-------------|----------|--------|--------------------------------|
| af_heart    | Heart    | Female | Warm, friendly voice (default) |
| af_sky      | Sky      | Female | Clear, bright voice            |
| af_bella    | Bella    | Female | Elegant, sophisticated voice   |
| af_sarah    | Sarah    | Female | Professional, confident voice  |
| af_nicole   | Nicole   | Female | Gentle, soothing voice         |
| am_adam     | Adam     | Male   | Strong, authoritative voice    |
| am_michael  | Michael  | Male   | Friendly, approachable voice   |
| bf_emma     | Emma     | Female | Young, energetic voice         |
| bf_isabella | Isabella | Female | Mature, expressive voice       |
| bm_lewis    | Lewis    | Male   | Deep, resonant voice           |
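
For client code that needs to pick a voice programmatically, the table can be mirrored as a small lookup. This is only a sketch: the `Voice` type and the helpers `find_voice` and `voices_by_gender` are hypothetical names, and the authoritative list is whatever the `get_voices` tool returns at runtime.

```python
from typing import List, NamedTuple


class Voice(NamedTuple):
    voice_id: str
    name: str
    gender: str
    description: str


# Copied from the voice table; not fetched from the server.
VOICES = [
    Voice("af_heart", "Heart", "Female", "Warm, friendly voice (default)"),
    Voice("af_sky", "Sky", "Female", "Clear, bright voice"),
    Voice("af_bella", "Bella", "Female", "Elegant, sophisticated voice"),
    Voice("af_sarah", "Sarah", "Female", "Professional, confident voice"),
    Voice("af_nicole", "Nicole", "Female", "Gentle, soothing voice"),
    Voice("am_adam", "Adam", "Male", "Strong, authoritative voice"),
    Voice("am_michael", "Michael", "Male", "Friendly, approachable voice"),
    Voice("bf_emma", "Emma", "Female", "Young, energetic voice"),
    Voice("bf_isabella", "Isabella", "Female", "Mature, expressive voice"),
    Voice("bm_lewis", "Lewis", "Male", "Deep, resonant voice"),
]


def find_voice(voice_id: str) -> Voice:
    """Return the voice with the given ID, or raise KeyError if unknown."""
    for voice in VOICES:
        if voice.voice_id == voice_id:
            return voice
    raise KeyError(f"unknown voice_id: {voice_id}")


def voices_by_gender(gender: str) -> List[Voice]:
    """Return all voices whose gender matches, ignoring case."""
    return [v for v in VOICES if v.gender.lower() == gender.lower()]


if __name__ == "__main__":
    print(find_voice("am_adam").description)
    print([v.voice_id for v in voices_by_gender("male")])
```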

📚 Usage Examples

Basic Synthesis

    # Simple text-to-speech
    await synthesize_speech(
        text="Hello! Welcome to Advanced TTS.",
        voice_id="af_heart"
    )

Emotional Expression

    # Excited announcement
    await synthesize_speech(
        text="This is amazing news! You're going to love this new feature!",
        voice_id="af_heart",
        emotion="excited",
        pacing="conversational",
        speed=1.1
    )

Professional Presentation

    # Tutorial narration
    await synthesize_speech(
        text="Step one: Open your browser. Step two: Navigate to the website.",
        voice_id="am_adam",
        emotion="calm",
        pacing="tutorial",
        speed=0.9
    )

Batch Processing

    # Multiple segments with pauses
    await batch_synthesize(
        segments=[
            "Welcome to our presentation.",
            "Today we'll cover three main topics.",
            "Let's begin with the first topic."
        ],
        voice_id="af_sarah",
        emotion="confident",
        pacing="presentation",
        merge_output=True,
        segment_pause=1.0,
        save_file=True
    )

🛠️ Available Tools

synthesize_speech

Convert text to natural speech with full control over voice characteristics.

Parameters:

  • text - Text to synthesize (max 10,000 chars)
  • voice_id - Voice selection (see table above)
  • speed - Speech rate (0.25-3.0)
  • emotion - Voice emotion (neutral, happy, excited, calm, serious, casual, confident)
  • pacing - Speech style (natural, conversational, presentation, tutorial, narrative, fast, slow)
  • volume - Audio volume (0.1-2.0)
  • output_format - File format (wav, mp3, flac, ogg)
  • save_file - Save to file (boolean)
  • filename - Custom filename
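
The limits listed above can be checked on the client side before a request is sent. The sketch below is one way to do that, not the server's own validation: the function name `validate_request` and the default values are assumptions, since the source only documents the allowed ranges and choices. Voice IDs are not checked here.

```python
from typing import List

EMOTIONS = {"neutral", "happy", "excited", "calm", "serious", "casual", "confident"}
PACING_STYLES = {"natural", "conversational", "presentation", "tutorial",
                 "narrative", "fast", "slow"}
OUTPUT_FORMATS = {"wav", "mp3", "flac", "ogg"}
MAX_TEXT_CHARS = 10_000


def validate_request(text: str, speed: float = 1.0, volume: float = 1.0,
                     emotion: str = "neutral", pacing: str = "natural",
                     output_format: str = "wav") -> List[str]:
    """Return a list of problems; an empty list means the request looks valid.

    Defaults are illustrative assumptions, not documented server defaults.
    """
    problems = []
    if not text:
        problems.append("text must not be empty")
    elif len(text) > MAX_TEXT_CHARS:
        problems.append(f"text exceeds {MAX_TEXT_CHARS} characters")
    if not 0.25 <= speed <= 3.0:
        problems.append("speed must be between 0.25 and 3.0")
    if not 0.1 <= volume <= 2.0:
        problems.append("volume must be between 0.1 and 2.0")
    if emotion not in EMOTIONS:
        problems.append(f"unknown emotion: {emotion}")
    if pacing not in PACING_STYLES:
        problems.append(f"unknown pacing: {pacing}")
    if output_format not in OUTPUT_FORMATS:
        problems.append(f"unknown output_format: {output_format}")
    return problems


if __name__ == "__main__":
    print(validate_request("Hello!", speed=3.5, emotion="angry"))
```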

batch_synthesize

Process multiple text segments efficiently with optional merging.

Parameters:

  • segments - List of text segments
  • merge_output - Combine into single file
  • segment_pause - Pause between segments (0.0-5.0s)
  • All synthesis parameters from above
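
To illustrate what `merge_output` together with `segment_pause` implies, here is a minimal sketch that joins already-synthesized segments with silence in between. It assumes each segment is a list of mono float samples at a shared sample rate; the function `merge_segments` is hypothetical, and the server's actual merging code may work differently.

```python
from typing import List, Sequence


def merge_segments(segments: Sequence[Sequence[float]], sample_rate: int,
                   segment_pause: float = 1.0) -> List[float]:
    """Concatenate segments, inserting segment_pause seconds of silence between them."""
    if not 0.0 <= segment_pause <= 5.0:
        raise ValueError("segment_pause must be between 0.0 and 5.0 seconds")
    # Silence is represented as zero-valued samples.
    silence = [0.0] * int(round(segment_pause * sample_rate))
    merged: List[float] = []
    for index, samples in enumerate(segments):
        if index > 0:
            merged.extend(silence)  # pause only between segments, not at the edges
        merged.extend(samples)
    return merged


if __name__ == "__main__":
    # Tiny made-up segments at an unrealistically low sample rate, for readability.
    merged = merge_segments([[0.1, 0.2], [0.3]], sample_rate=4, segment_pause=0.5)
    print(merged)
```

With n segments the merged output contains n - 1 pauses, so three segments and a 1.0 s pause add two seconds to the total duration.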

get_voices

Retrieve complete voice information and capabilities.

get_status

Check processing status for synthesis requests.

cancel_request

Cancel active synthesis operations.

list_output_files

Browse saved audio files with metadata.

🎛️ Voice Controls

Emotions

  • Neutral - Standard, professional tone
  • Happy - Upbeat, cheerful expression
  • Excited - Enthusiastic, energetic delivery
  • Calm - Relaxed, soothing tone
  • Serious - Formal, authoritative delivery
  • Casual - Relaxed, conversational style
  • Confident - Assured, professional tone

Pacing Styles

  • Natural - Balanced, human-like rhythm
  • Conversational - Casual discussion pace
  • Presentation - Professional speaking rhythm
  • Tutorial - Educational, clear delivery
  • Narrative - Storytelling pace
  • Fast - Quick delivery (1.2x base speed)
  • Slow - Deliberate delivery (0.8x base speed)
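
The source does not say how the Fast and Slow multipliers interact with an explicit `speed` value. One plausible reading, sketched below, is that the pacing multiplier scales the requested speed and the result is clamped to the documented 0.25 to 3.0 range. Both the combination rule and the function name `effective_speed` are assumptions.

```python
# Multipliers taken from the pacing list; all other styles are assumed to leave
# the requested speed unchanged.
PACING_MULTIPLIER = {"fast": 1.2, "slow": 0.8}

MIN_SPEED = 0.25
MAX_SPEED = 3.0


def effective_speed(speed: float, pacing: str) -> float:
    """Scale speed by the pacing multiplier and clamp it to the allowed range."""
    multiplier = PACING_MULTIPLIER.get(pacing, 1.0)
    return min(MAX_SPEED, max(MIN_SPEED, speed * multiplier))


if __name__ == "__main__":
    for pacing in ("natural", "fast", "slow"):
        print(pacing, effective_speed(1.0, pacing))
```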

🎵 Audio Formats

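| Format | Quality      | Use Case                       |
|--------|--------------|--------------------------------|
| WAV    | Uncompressed | Highest quality, editing       |
| MP3    | Compressed   | Web, streaming, sharing        |
| FLAC   | Lossless     | Archival, high-quality storage |
| OGG    | Compressed   | Open source alternative        |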
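
When saved files are served or organized by type, it helps to derive the file extension and MIME type from the chosen format. The sketch below shows one way to do so; `output_path` is a hypothetical helper, and how the server actually names its files is not documented here.

```python
from pathlib import Path

# Standard MIME types for the four supported formats.
MIME_TYPES = {
    "wav": "audio/wav",
    "mp3": "audio/mpeg",
    "flac": "audio/flac",
    "ogg": "audio/ogg",
}


def output_path(output_dir: str, filename: str, output_format: str) -> Path:
    """Build an output path whose extension matches the requested format."""
    fmt = output_format.lower()
    if fmt not in MIME_TYPES:
        raise ValueError(f"unsupported output_format: {output_format}")
    # Drop any extension the caller supplied so the suffix always matches fmt.
    return Path(output_dir) / f"{Path(filename).stem}.{fmt}"


if __name__ == "__main__":
    path = output_path("./audio_output", "intro.wav", "MP3")
    print(path.name, MIME_TYPES[path.suffix.lstrip(".")])
```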

🔧 Configuration

Environment Variables

    # Model paths (optional)
    KOKORO_MODEL_PATH=./kokoro-v1.0.onnx
    KOKORO_VOICES_PATH=./voices-v1.0.bin

    # Output settings
    TTS_OUTPUT_DIR=./audio_output
    TTS_MAX_QUEUE_SIZE=100

    # Audio settings
    TTS_DEFAULT_VOICE=af_heart
    TTS_ENABLE_STREAMING=true

Server Configuration

    config = ServerConfig(
        model_path="./kokoro-v1.0.onnx",
        voices_path="./voices-v1.0.bin",
        output_dir="./audio_output",
        max_queue_size=100,
        enable_streaming=True,
        default_voice="af_heart"
    )
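
The environment variables and the `ServerConfig` fields line up one to one, so a configuration object can be built from the environment. The sketch below shows that mapping under stated assumptions: the `ServerConfig` dataclass here is a stand-in that only mirrors the field names shown above, and `config_from_env` is a hypothetical helper, not the project's loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class ServerConfig:
    """Stand-in with the same field names and values as the example above."""
    model_path: str = "./kokoro-v1.0.onnx"
    voices_path: str = "./voices-v1.0.bin"
    output_dir: str = "./audio_output"
    max_queue_size: int = 100
    enable_streaming: bool = True
    default_voice: str = "af_heart"


def config_from_env(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Build a ServerConfig from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    defaults = ServerConfig()
    raw_streaming = env.get("TTS_ENABLE_STREAMING")
    if raw_streaming is None:
        streaming = defaults.enable_streaming
    else:
        # Treat common truthy spellings as True; anything else disables streaming.
        streaming = raw_streaming.strip().lower() in {"1", "true", "yes", "on"}
    return ServerConfig(
        model_path=env.get("KOKORO_MODEL_PATH", defaults.model_path),
        voices_path=env.get("KOKORO_VOICES_PATH", defaults.voices_path),
        output_dir=env.get("TTS_OUTPUT_DIR", defaults.output_dir),
        max_queue_size=int(env.get("TTS_MAX_QUEUE_SIZE", defaults.max_queue_size)),
        enable_streaming=streaming,
        default_voice=env.get("TTS_DEFAULT_VOICE", defaults.default_voice),
    )


if __name__ == "__main__":
    print(config_from_env({"TTS_MAX_QUEUE_SIZE": "25", "TTS_ENABLE_STREAMING": "false"}))
```

Passing a plain dictionary instead of reading `os.environ` directly keeps the helper easy to test.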

🏗️ Architecture

    ├── src/advanced_tts/
    │   ├── __init__.py      # Package initialization
    │   ├── server.py        # MCP server implementation
    │   ├── engine.py        # Kokoro TTS engine wrapper
    │   ├── models.py        # Data models and validation
    │   └── utils.py         # Utility functions
    ├── pyproject.toml       # Project configuration
    ├── README.md            # Documentation
    └── LICENSE              # MIT License

🤝 Contributing

Contributions welcome! Areas for improvement:

  • Additional voice models
  • Real-time streaming synthesis
  • Advanced audio effects
  • Multi-language support
  • Performance optimizations

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

  • Kokoro TTS - High-quality neural voice synthesis
  • MCP Protocol - Seamless AI model integration
  • FastMCP - Efficient server framework

Developed by Sami Halawa

Transform your text into natural, expressive speech with Advanced TTS MCP Server.
