This server enables AI assistants to convert text to speech using VOICEVOX Engine with multi-character conversations and advanced playback control.
Core Capabilities:
Text-to-Speech Playback - Convert and play text with multi-line support, per-line speaker assignment (e.g., "1:Hello\n2:World"), configurable speed, queue management, and synchronous/asynchronous playback options
Speaker Management - List available speakers, get detailed speaker information by UUID, and configure default speakers with per-project overrides
Audio File Generation - Create and save audio files from text to specified paths
Playback Control - Stop current audio and clear queue, with immediate or queued playback modes
Voice Synthesis Query Generation - Generate intermediate query objects for advanced synthesis use cases
Health Check - Verify VOICEVOX Engine connectivity and status
Key Features:
Cross-platform compatibility (Windows, macOS, Linux, WSL)
Streaming playback with ffplay for low latency
HTTP mode for remote connections
Configurable defaults via environment variables, command-line arguments, and custom HTTP headers in .mcp.json
Tool management with ability to disable specific tools and restrict options
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@VOICEVOX TTS MCP Say 'Welcome to the stream!' using speaker 3"
That's it! The server will respond to your query, and you can continue using it as needed.
VOICEVOX TTS MCP
A text-to-speech MCP server using VOICEVOX
🎮 Try the Browser Demo — Test VoicevoxClient directly in your browser
What You Can Do
Make your AI assistant speak — Text-to-speech from MCP clients like Claude Desktop
UI Audio Player (MCP Apps) — Play audio directly in the chat with an interactive player
Multi-character conversations — Switch speakers per segment in a single call
Smooth playback — Queue management, immediate playback, prefetching, streaming
Cross-platform — Works on Windows, macOS, Linux (including WSL)
UI Audio Player (MCP Apps)

The voicevox_speak_player tool uses MCP Apps to render an interactive audio player directly inside the chat. Unlike the standard voicevox_speak tool which plays audio on the server, audio is played on the client side (in the browser/app) — no audio device needed on the server.
Features
Client-side playback — Audio plays in Claude Desktop's chat, not on the server. Works even over remote connections.
Play/Pause controls — Full playback controls embedded in the conversation
Multi-speaker dialogue — Sequential playback of multiple speakers in one player with track navigation
Speaker switching — Change the voice of any segment directly from the player UI
Screenshots: Multi-speaker playback | Track list | Speaker selection
Note: voicevox_speak_player requires a host that supports MCP Apps (e.g., Claude Desktop). In hosts without MCP Apps support, the tool is not available and voicevox_speak (server-side playback) can be used instead.
Quick Start
Requirements
Node.js 18.0.0 or higher (or Bun) or Docker
VOICEVOX Engine (must be running; included in Docker Compose)
ffplay (optional, recommended — not needed with Docker)
Installing FFplay
ffplay is a lightweight player included with FFmpeg that supports playback from stdin. When available, it automatically enables low-latency streaming playback.
💡 FFplay is optional. Without it, playback falls back to temp file-based playback (Windows: PowerShell, macOS: afplay, Linux: aplay, etc.).
Easy setup: One-liner installation for each OS (see steps below)
Required: ffplay must be in PATH (restart terminal/apps after installation)
Installation examples:
Windows (any of these)
Winget: winget install --id=Gyan.FFmpeg -e
Chocolatey: choco install ffmpeg
Scoop: scoop install ffmpeg
Official builds: Download from https://www.gyan.dev/ffmpeg/builds/ or https://github.com/BtbN/FFmpeg-Builds and add the bin folder to PATH
macOS
Homebrew: brew install ffmpeg
Linux
Debian/Ubuntu: sudo apt-get update && sudo apt-get install -y ffmpeg
Fedora: sudo dnf install -y ffmpeg
Arch: sudo pacman -S ffmpeg
PATH Setup:
Windows: Add ...\ffmpeg\bin to environment variables, then restart PowerShell/terminal and editor (Claude/VS Code, etc.)
Verify: powershell -c "$env:Path" should include the ffmpeg path
macOS/Linux: Usually auto-detected. Check with echo $PATH if needed, and restart the shell.
MCP clients (Claude Desktop/Code): Restart the app to reload PATH.
Verification:
If version info is displayed, installation is complete. CLI/MCP will automatically detect ffplay and use stdin streaming playback.
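A quick way to run this check from a POSIX shell is sketched below. `ffplay -version` is the standard FFmpeg version flag; the surrounding guard and the `ffplay_status` variable are illustrative additions, not part of this project.

```shell
# Check that ffplay is reachable through PATH and print its version line.
if command -v ffplay >/dev/null 2>&1; then
  ffplay_status="found"
  ffplay -version | head -n 1
else
  ffplay_status="missing"
  echo "ffplay not found in PATH; temp-file playback will be used instead"
fi
```

On Windows PowerShell, running `ffplay -version` directly serves the same purpose.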
3 Steps to Get Started
1. Start VOICEVOX Engine
2. Add to Claude Desktop config file
Config file location:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
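A minimal sketch of such an entry, assuming the package is launched through npx, might look like the following. The server key `tts-mcp` is a hypothetical label chosen for illustration, and the `-y` flag only tells npx to skip its install prompt.

```json
{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-y", "@kajidog/mcp-tts-voicevox"]
    }
  }
}
```

If the file already contains an `mcpServers` object, add the entry to it rather than replacing the whole file.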
💡 If you use Bun, just replace npx with bunx: "command": "bunx", "args": ["@kajidog/mcp-tts-voicevox"]
3. Restart Claude Desktop
That's it! Ask Claude to "say hello" and it will speak!
Quick Start with Docker
You can run both the MCP server and VOICEVOX Engine with a single command using Docker Compose. No Node.js or VOICEVOX installation required.
1. Start the containers
This starts the VOICEVOX Engine and the MCP server (HTTP mode on port 3000).
2. Add to Claude Desktop config file (using mcp-remote)
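As a rough sketch, an mcp-remote entry pointing at the containerized server could look like this. Port 3000 comes from the step above; the `/mcp` endpoint path and the server key `tts-mcp` are assumptions made for illustration, so check the server's startup output for the actual URL.

```json
{
  "mcpServers": {
    "tts-mcp": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "http://localhost:3000/mcp"]
    }
  }
}
```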
3. Restart Claude Desktop
Limitations (Docker): The Docker container has no audio device, so the voicevox_speak tool (server-side playback) is disabled by default. Use voicevox_speak_player instead. It plays audio on the client side (in Claude Desktop) and works without any audio device on the server. See UI Audio Player for details.
MCP Tools
voicevox_speak — Text-to-Speech
The main feature callable from Claude.
Parameter | Description | Default |
| Text to speak (multiple segments separated by newlines) | Required |
| Speaker ID | 1 |
| Playback speed | 1.0 |
| Immediate playback (clears queue) | true |
| Wait for playback completion | false |
Examples:
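To illustrate the per-line speaker prefix (e.g., "1:Hello\n2:World"), the sketch below splits such a text into speaker and text pairs in a POSIX shell. It only demonstrates the shape of the input; it is not the server's own parser, and the variable names are hypothetical.

```shell
# Two segments, each prefixed with a speaker ID and a colon.
text='1:Hello
2:World'

# Split each line at the first colon: the left part is the speaker ID,
# the remainder (which may itself contain colons) is the text to speak.
parsed="$(printf '%s\n' "$text" | while IFS=: read -r speaker line; do
  printf 'speaker=%s text=%s\n' "$speaker" "$line"
done)"
printf '%s\n' "$parsed"
```

Each output line corresponds to one segment that would be spoken by the given speaker.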
Tool | Description |
| Speak with UI audio player (disable with |
| Check VOICEVOX Engine connection |
| Get list of available speakers |
| Stop playback and clear queue |
| Generate audio file |
Configuration
VOICEVOX Settings
Variable | Description | Default |
| Engine URL | |
VOICEVOX_DEFAULT_SPEAKER | Default speaker ID | |
| Playback speed | |
Playback Options
Variable | Description | Default |
| Streaming playback (requires | |
| Immediate playback | |
| Wait for playback start | |
| Wait for playback end | |
Restriction Settings
Restrict AI from specifying certain options.
Variable | Description |
| Restrict |
| Restrict |
| Restrict |
Disable Tools
UI Player Settings
Variable | Description | Default |
| Auto-play audio in UI player | |
Server Settings
Variable | Description | Default |
| Enable HTTP mode | |
| HTTP port | |
| HTTP host | |
MCP_ALLOWED_HOSTS | Allowed hosts (comma-separated) | |
| Allowed origins (comma-separated) | |
Command line arguments take priority over environment variables.
Argument | Description |
| Show help |
| Show version |
| VOICEVOX Engine URL |
| Default speaker ID |
| Playback speed |
| Streaming playback |
| Immediate playback |
| Wait for start |
| Wait for end |
| Restrict immediate |
| Restrict waitForStart |
| Restrict waitForEnd |
| Disable tools |
| Auto-play in UI player |
| HTTP mode |
| HTTP port |
| HTTP host |
| Allowed hosts (comma-separated) |
| Allowed origins (comma-separated) |
For remote connections:
Start Server:
Claude Desktop Config (using mcp-remote):
Per-Project Speaker Settings
With Claude Code, you can configure different default speakers per project using custom headers in .mcp.json:
Header | Description |
X-Voicevox-Speaker | Default speaker ID for this project |
Example
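A possible .mcp.json entry is sketched below, assuming the server runs in HTTP mode and is reachable at the URL shown. The server key `tts-mcp`, the URL (including the `/mcp` path), and the speaker value `3` are illustrative assumptions; only the `X-Voicevox-Speaker` header name comes from this document.

```json
{
  "mcpServers": {
    "tts-mcp": {
      "type": "http",
      "url": "http://localhost:3000/mcp",
      "headers": {
        "X-Voicevox-Speaker": "3"
      }
    }
  }
}
```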
This allows each project to use a different voice character automatically.
Priority order:
1. Explicit speaker parameter in tool call (highest)
2. Project default from X-Voicevox-Speaker header
3. Global VOICEVOX_DEFAULT_SPEAKER setting (lowest)
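The precedence can be pictured with a small shell sketch. `resolve_speaker` is a hypothetical helper written for illustration, not a function from this package, and the final fallback of 1 is an assumption based on the default speaker ID listed for voicevox_speak.

```shell
# resolve_speaker EXPLICIT HEADER
# Pass an empty string for any value that was not supplied.
resolve_speaker() {
  explicit="$1"
  header="$2"
  if [ -n "$explicit" ]; then
    echo "$explicit"                       # speaker parameter in the tool call
  elif [ -n "$header" ]; then
    echo "$header"                         # X-Voicevox-Speaker header
  else
    echo "${VOICEVOX_DEFAULT_SPEAKER:-1}"  # global setting, else assumed 1
  fi
}

VOICEVOX_DEFAULT_SPEAKER=8
resolve_speaker 3 5     # explicit parameter wins
resolve_speaker "" 5    # header is used when no parameter is given
resolve_speaker "" ""   # falls back to the global setting
```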
Connecting from WSL to an MCP server running on Windows:
1. Get Windows Host IP from WSL
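One common way to find this address, assuming WSL 2 with its default NAT networking, is to read the gateway of the default route. The variable name `WSL_GATEWAY_IP` is chosen here for illustration.

```shell
# The default route's gateway is the Windows host as seen from WSL 2 (NAT mode).
WSL_GATEWAY_IP="$(ip route show default 2>/dev/null | awk '{print $3; exit}')"
echo "Windows host IP: ${WSL_GATEWAY_IP:-unknown}"
```

With other networking modes the address may differ, so treat the result as a starting point.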
2. Start Server on Windows
Add the WSL gateway IP to MCP_ALLOWED_HOSTS to allow access from WSL:
Or with CLI arguments:
3. WSL Configuration (.mcp.json)
⚠️ Within WSL, localhost refers to WSL itself. Use the WSL gateway IP to access the Windows host.
Troubleshooting
1. Check if VOICEVOX Engine is running
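A simple connectivity check, assuming the engine listens on VOICEVOX's usual address of http://localhost:50021, is to request its `/version` endpoint. `ENGINE_URL` and `engine_status` are local variable names used only in this sketch; adjust the URL if your engine runs elsewhere.

```shell
ENGINE_URL="http://localhost:50021"   # assumed default engine address

# -f makes curl fail on HTTP errors; --max-time avoids hanging if nothing listens.
if curl -fsS --max-time 3 "$ENGINE_URL/version" >/dev/null 2>&1; then
  engine_status="reachable"
else
  engine_status="unreachable"
fi
echo "VOICEVOX Engine at $ENGINE_URL is $engine_status"
```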
2. Check platform-specific playback tools
OS | Required Tool |
Linux | One of |
macOS | afplay |
Windows | PowerShell (pre-installed) |
Check package installation: npm list -g @kajidog/mcp-tts-voicevox
Verify JSON syntax in config file
Restart the client
Package Structure
Package | Description |
| MCP server |
| General-purpose VOICEVOX client library (can be used independently) |
| React-based audio player UI for browser playback |
Setup
Commands
Command | Description |
| Build all packages |
| Run tests |
| Run lint |
| Start dev server |
| Dev with stdio mode |
| Start dev server with Bun |
| Start HTTP dev server with Bun |
License
ISC


