# voiceroid_daemon-mcp
MCP (Model Context Protocol) server for VOICEROID2 text-to-speech via [voiceroid_daemon](https://github.com/Nkyoku/voiceroid_daemon).
## Features
- Text-to-speech generation with VOICEROID2 voices
- Text-to-kana phonetic conversion
- Customizable voice parameters (volume, speed, pitch, emphasis)
- Audio playback support for macOS, Windows, and Linux
- Basic authentication support
## Prerequisites
- Node.js 18 or higher
- [voiceroid_daemon](https://github.com/Nkyoku/voiceroid_daemon) running on your system
- VOICEROID2 installed (required by voiceroid_daemon)
## Installation
```bash
# Clone the repository
git clone https://github.com/mohemohe/voiceroid_daemon-mcp.git
cd voiceroid_daemon-mcp
# Install dependencies
npm install
```
## Configuration
Create a `.env` file in the project root (optional):
```env
# voiceroid_daemon server URL (default: http://127.0.0.1:8080)
VOICEROID_DAEMON_URL=http://127.0.0.1:8080
# Basic authentication (if required)
VOICEROID_DAEMON_USERNAME=your_username
VOICEROID_DAEMON_PASSWORD=your_password
# Default voice parameters (optional)
VOICEROID_DEFAULT_VOLUME=1.0 # 0-2
VOICEROID_DEFAULT_SPEED=1.3 # 0.5-4
VOICEROID_DEFAULT_PITCH=1.0 # 0.5-2
VOICEROID_DEFAULT_EMPHASIS=1.1 # 0-2
VOICEROID_DEFAULT_PAUSE_MIDDLE=150 # 80-500
VOICEROID_DEFAULT_PAUSE_LONG=370 # 100-2000
VOICEROID_DEFAULT_PAUSE_SENTENCE=800 # 0-10000
```
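As a rough illustration of how the ranges in the comments above could be enforced, the sketch below reads each numeric setting from an environment-like object and clamps it to its documented range. The helper names `readNumber` and `loadVoiceDefaults` are hypothetical, and the clamping behavior is an assumption; the server itself may reject or handle out-of-range values differently.

```typescript
// Hypothetical helpers for illustration; not taken from this repository's source.
type Env = Record<string, string | undefined>;

// Returns the value clamped to [min, max], or undefined when the
// variable is unset, empty, or not a finite number.
function readNumber(env: Env, key: string, min: number, max: number): number | undefined {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  if (!Number.isFinite(value)) return undefined;
  return Math.min(max, Math.max(min, value));
}

// Ranges are copied from the comments in the .env example above.
function loadVoiceDefaults(env: Env) {
  return {
    volume: readNumber(env, "VOICEROID_DEFAULT_VOLUME", 0, 2),
    speed: readNumber(env, "VOICEROID_DEFAULT_SPEED", 0.5, 4),
    pitch: readNumber(env, "VOICEROID_DEFAULT_PITCH", 0.5, 2),
    emphasis: readNumber(env, "VOICEROID_DEFAULT_EMPHASIS", 0, 2),
    pauseMiddle: readNumber(env, "VOICEROID_DEFAULT_PAUSE_MIDDLE", 80, 500),
    pauseLong: readNumber(env, "VOICEROID_DEFAULT_PAUSE_LONG", 100, 2000),
    pauseSentence: readNumber(env, "VOICEROID_DEFAULT_PAUSE_SENTENCE", 0, 10000),
  };
}
```

Calling `loadVoiceDefaults(process.env)` after the `.env` file has been loaded would yield an object in which unset variables stay `undefined`, so no default is invented on the client side.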
## Usage
### Running the MCP Server
```bash
# Run directly with tsx (no build required)
npm start
# Or for development with auto-reload
npm run dev
```
### Configuring with Claude Desktop
Add the following to your Claude Desktop configuration file:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "voiceroid-daemon": {
      "command": "npx",
      "args": ["tsx", "/path/to/voiceroid_daemon-mcp/src/index.ts"],
      "env": {
        "VOICEROID_DAEMON_URL": "http://127.0.0.1:8080"
      }
    }
  }
}
```
## Available Tools
### test_connection
Tests the connection to the voiceroid_daemon server.
Parameters: none
### convert_text
Convert Japanese text to phonetic kana reading.
Parameters:
- `text` (string, required): Text to convert to kana
### speak_text
Generate and play speech audio from text.
Parameters:
- `text` (string, required): Text to speak
- `kana` (string, optional): Phonetic reading in kana
- `volume` (number, optional): Voice volume (0-2, default: 1)
- `speed` (number, optional): Speech speed (0.5-4, default: 1)
- `pitch` (number, optional): Voice pitch (0.5-2, default: 1)
- `emphasis` (number, optional): Emphasis level (0-2, default: 1)
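
For readers calling the server from their own MCP client rather than Claude Desktop, the sketch below shows roughly what a `speak_text` invocation looks like on the wire. MCP tools are invoked with a JSON-RPC `tools/call` request; the parameter names come from the list above, while `SpeakTextArgs` and `buildSpeakTextCall` are hypothetical names introduced only for this example.

```typescript
// Argument shape mirroring the speak_text parameter list above.
interface SpeakTextArgs {
  text: string;
  kana?: string;
  volume?: number; // 0-2
  speed?: number; // 0.5-4
  pitch?: number; // 0.5-2
  emphasis?: number; // 0-2
}

// Hypothetical helper: builds an MCP "tools/call" JSON-RPC request object.
function buildSpeakTextCall(id: number, args: SpeakTextArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "speak_text", arguments: args },
  };
}

const request = buildSpeakTextCall(1, {
  text: "こんにちは、今日はいい天気ですね",
  speed: 1.2,
});
console.log(JSON.stringify(request, null, 2));
```

Claude Desktop builds this request automatically, so the sketch matters only when scripting against the server directly.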
## Example Usage in Claude
Once configured, you can ask Claude to use the tools with prompts such as:
```
Use the test_connection tool to check if voiceroid_daemon is running.
Convert "こんにちは" to kana using the convert_text tool.
Use speak_text to say "こんにちは、今日はいい天気ですね" with speed 1.2.
```
## Troubleshooting
### Connection Failed
1. Ensure voiceroid_daemon is running
2. Check the URL in your configuration
3. Verify firewall settings allow connections
4. Test with curl: `curl http://127.0.0.1:8080/`
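
The curl check in step 4 can also be scripted. The following is a minimal sketch, assuming Node.js 18's built-in `fetch`; `probeDaemon` and `FetchLike` are hypothetical names, and the sketch only checks that some HTTP response comes back, without assuming anything about what voiceroid_daemon returns at `/`.

```typescript
// Minimal fetch-like signature so that a stub can be injected for testing.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ status: number }>;

// Hypothetical helper: resolves to the HTTP status code if anything answers
// at baseUrl, or to null if the request fails or times out.
async function probeDaemon(
  baseUrl: string,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 3000,
): Promise<number | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(baseUrl, { signal: controller.signal });
    return response.status;
  } catch {
    return null; // connection refused, timeout, DNS failure, etc.
  } finally {
    clearTimeout(timer);
  }
}
```

A `null` result from `probeDaemon("http://127.0.0.1:8080")` points at steps 1 to 3. Any numeric status means something is listening at that URL; if Basic authentication is enabled, a 401 would be the expected answer to an unauthenticated probe, assuming the daemon follows standard HTTP behavior.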
### Audio Playback Issues
- **macOS**: Uses `afplay` (built-in)
- **Windows**: Uses PowerShell's `Media.SoundPlayer`
- **Linux**: Requires `aplay` (usually part of alsa-utils)
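
To make the platform mapping above concrete, here is a small sketch of how a player command could be chosen from `process.platform`. The function name `playbackCommand` and the exact PowerShell invocation are assumptions for illustration; the server's actual command lines may differ.

```typescript
// Hypothetical helper: maps a Node.js platform string to a player command.
function playbackCommand(
  platform: string,
  wavPath: string,
): { command: string; args: string[] } {
  switch (platform) {
    case "darwin":
      return { command: "afplay", args: [wavPath] };
    case "win32": {
      // Single quotes are doubled to escape them inside a PowerShell string.
      const quoted = wavPath.replace(/'/g, "''");
      return {
        command: "powershell",
        args: [
          "-NoProfile",
          "-Command",
          `(New-Object Media.SoundPlayer '${quoted}').PlaySync()`,
        ],
      };
    }
    case "linux":
      return { command: "aplay", args: [wavPath] };
    default:
      throw new Error(`Unsupported platform: ${platform}`);
  }
}
```

The result can be passed to `spawn(command, args)` from `node:child_process`. Running the printed command by hand is a quick way to tell a missing player (for example, `aplay` not installed) apart from a problem in the server.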
### Authentication Errors
If voiceroid_daemon requires authentication, ensure both of the following are set, either in `.env` or in the `env` block of your Claude Desktop configuration:
- `VOICEROID_DAEMON_USERNAME`
- `VOICEROID_DAEMON_PASSWORD`
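
For reference when debugging, HTTP Basic authentication sends the two values as a base64-encoded `username:password` pair in the `Authorization` header. The sketch below shows that encoding; `basicAuthHeader` is a hypothetical helper name, and how the server actually attaches the header is not shown here.

```typescript
// Hypothetical helper: builds an HTTP Basic Authorization header (RFC 7617).
// Returns an empty object when either credential is missing.
function basicAuthHeader(
  username?: string,
  password?: string,
): Record<string, string> {
  if (username === undefined || password === undefined) return {};
  const token = Buffer.from(`${username}:${password}`, "utf8").toString("base64");
  return { Authorization: `Basic ${token}` };
}

console.log(basicAuthHeader("user", "pass"));
// prints { Authorization: 'Basic dXNlcjpwYXNz' }
```

If only one of the two variables is set, this sketch sends no credentials at all, which would surface as an authentication error from the daemon.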
## Development
```bash
# Type checking
npm run typecheck
# Linting
npm run lint
# Run in development mode
npm run dev
```
## License
MIT