MCP Simple AivisSpeech
🙏 Special Thanks
This project is based on mcp-simple-voicevox by @t09tanaka.
We deeply appreciate their excellent work in creating the original MCP server for VOICEVOX, which served as the foundation for this AivisSpeech adaptation.
A Model Context Protocol (MCP) server for integrating with the AivisSpeech text-to-speech engine. It lets AI assistants and applications convert text to natural-sounding Japanese speech with customizable voice parameters.
✨ Features
- 🎙️ Text-to-Speech Conversion: High-quality Japanese speech synthesis using AivisSpeech
- 👥 Multiple Voice Characters: Support for various speakers and voice styles (default: Anneli ノーマル)
- ⚙️ Configurable Parameters: Adjust speed, pitch, volume, and intonation
- 🔊 Cross-Platform Audio: Automatic audio playback on macOS, Windows, and Linux
- 🔔 Task Notifications: Voice notifications for process completion
- 🚀 Easy Integration: Simple MCP protocol for AI assistant integration
- 📊 Engine Status Monitoring: Real-time status checking of AivisSpeech engine
- 🛡️ Smart Error Handling: Helpful error messages with speaker suggestions
📋 Prerequisites
- Node.js: Version 18.0.0 or higher
- AivisSpeech Engine: Running on http://127.0.0.1:10101 (default port)
- Audio System: System audio capabilities for playback
MCP Simple AivisSpeech Configuration
Using Claude Code
The easiest way to add this MCP server is using Claude Code:
✨ Using npx ensures you always get the latest version automatically - no manual updates needed!
By default, the server is added to the local scope (current project only). To make it available across all projects, use the -s user option:
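As a rough sketch, the two command lines described above could be assembled as follows. The server name aivisspeech and the helper buildClaudeMcpAdd are assumptions made for illustration; the package name and the -s user flag come from this README, and the overall `claude mcp add <name> -- <command>` layout reflects my understanding of the Claude Code CLI rather than anything stated here.

```typescript
// Hypothetical helper: builds a `claude mcp add` command line as a string.
// "aivisspeech" (used below) is an assumed server name, not taken from this README.
type Scope = "local" | "user";

function buildClaudeMcpAdd(serverName: string, scope: Scope = "local"): string {
  const pkg = "@shinshin86/mcp-simple-aivisspeech@latest";
  // Local scope is the default, so the flag is only emitted for user scope.
  const scopeFlag: string[] = scope === "user" ? ["-s", "user"] : [];
  return ["claude", "mcp", "add", serverName]
    .concat(scopeFlag, ["--", "npx", pkg])
    .join(" ");
}

// Current project only (local scope):
console.log(buildClaudeMcpAdd("aivisspeech"));
// claude mcp add aivisspeech -- npx @shinshin86/mcp-simple-aivisspeech@latest

// All projects (user scope):
console.log(buildClaudeMcpAdd("aivisspeech", "user"));
// claude mcp add aivisspeech -s user -- npx @shinshin86/mcp-simple-aivisspeech@latest
```

Either printed line can then be pasted into a terminal where the claude CLI is installed.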
You can also add voice notifications to your CLAUDE.md file to automate task completion notifications:
Using Claude Desktop
For manual configuration with Claude Desktop:
✨ Using npx ensures you always get the latest version automatically - no manual updates needed!
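A minimal sketch of what the Claude Desktop entry might look like, written as a TypeScript object that is serialized to JSON. The key aivisspeech is an assumed server name; the mcpServers / command / args layout is the usual shape of claude_desktop_config.json, and the npx package name comes from this README.

```typescript
// Shape of one entry under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
}

// "aivisspeech" is an assumed server name; any unique key should work.
const desktopConfig: { mcpServers: Record<string, McpServerEntry> } = {
  mcpServers: {
    aivisspeech: {
      command: "npx",
      args: ["@shinshin86/mcp-simple-aivisspeech@latest"],
    },
  },
};

// Print the JSON so it can be merged into the Claude Desktop configuration file.
console.log(JSON.stringify(desktopConfig, null, 2));
```

Restarting Claude Desktop after editing the configuration file is typically needed for the new server to be picked up.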
⚙️ AivisSpeech Engine Setup
Before using this MCP server, you need to have AivisSpeech running locally:
- Download AivisSpeech from https://aivis-project.com/
- Launch AivisSpeech on your local machine
- The engine will start on the default port 10101
- Verify the engine is running by visiting http://127.0.0.1:10101/docs
📖 Other Usage Methods
For Local Development
For cloning the repository, installing dependencies, and building:
🛠️ Available Tools
🎤 speak
Convert text to speech and play audio with customizable voice parameters.
Parameters:
- text (required): Text to convert to speech
- speaker (optional): Speaker/voice ID (default: 888753760 - Anneli ノーマル)
- speedScale (optional): Speech speed multiplier (0.5 to 2.0, default: 1.0)
- pitchScale (optional): Pitch adjustment (-0.15 to 0.15, default: 0.0)
- volumeScale (optional): Volume level (0.0 to 2.0, default: 1.0)
- playAudio (optional): Whether to play the generated audio (default: true)
Example:
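The following sketch shows one way the speak arguments could be typed and normalized, using the defaults and ranges listed above. The helper normalizeSpeakArgs is hypothetical, and clamping out-of-range values is an assumption: the actual server may reject them instead.

```typescript
// Argument shape for the speak tool, following the documented parameter list.
interface SpeakArgs {
  text: string;
  speaker?: number;
  speedScale?: number;
  pitchScale?: number;
  volumeScale?: number;
  playAudio?: boolean;
}

// Restrict a value to the inclusive range [lo, hi].
function clamp(value: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, value));
}

// Hypothetical helper: fills in documented defaults and clamps to documented ranges.
function normalizeSpeakArgs(args: SpeakArgs): Required<SpeakArgs> {
  return {
    text: args.text,
    speaker: args.speaker ?? 888753760, // Anneli ノーマル
    speedScale: clamp(args.speedScale ?? 1.0, 0.5, 2.0),
    pitchScale: clamp(args.pitchScale ?? 0.0, -0.15, 0.15),
    volumeScale: clamp(args.volumeScale ?? 1.0, 0.0, 2.0),
    playAudio: args.playAudio ?? true,
  };
}

// Slightly faster speech with everything else left at its default.
const speakExample = normalizeSpeakArgs({ text: "こんにちは", speedScale: 1.2 });
console.log(speakExample);
```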
👥 get_speakers
Retrieve a list of all available voice characters and their styles.
Returns: List of speakers with their IDs, names, and available voice styles.
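To illustrate how a speaker list could drive the speaker suggestions mentioned under Features, here is a small sketch. The Speaker and SpeakerStyle shapes are assumptions modeled on the VOICEVOX-compatible speakers response, the function names are hypothetical, and the second style in the sample data is invented purely for the example.

```typescript
// Assumed response shape, modeled on the VOICEVOX-compatible speakers endpoint.
interface SpeakerStyle {
  id: number;
  name: string;
}

interface Speaker {
  name: string;
  styles: SpeakerStyle[];
}

// Flatten speakers into "ID: Name (Style)" lines.
function listStyles(speakers: Speaker[]): string[] {
  const lines: string[] = [];
  for (const speaker of speakers) {
    for (const style of speaker.styles) {
      lines.push(`${style.id}: ${speaker.name} (${style.name})`);
    }
  }
  return lines;
}

// Build an error message that suggests valid IDs when the requested one is unknown.
function unknownSpeakerMessage(speakers: Speaker[], requestedId: number): string {
  return `Speaker ${requestedId} not found. Available: ${listStyles(speakers).join(", ")}`;
}

// Sample data: the first style comes from this README, the second is hypothetical.
const sampleSpeakers: Speaker[] = [
  {
    name: "Anneli",
    styles: [
      { id: 888753760, name: "ノーマル" },
      { id: 1, name: "Example style" },
    ],
  },
];

console.log(unknownSpeakerMessage(sampleSpeakers, 42));
```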
🔔 notify_completion
Play a voice notification when tasks are completed.
Parameters:
- message (optional): Completion message to announce (default: "処理が完了しました", meaning "Processing is complete")
- speaker (optional): Speaker ID for the notification voice (default: 888753760 - Anneli ノーマル)
Example:
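As a hedged sketch, an MCP client would reach this tool through a JSON-RPC tools/call request. The helper notifyCompletionRequest below is hypothetical; it omits any argument that is not supplied so that the server-side defaults listed above apply.

```typescript
// JSON-RPC request shape used by MCP clients to invoke a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds a notify_completion call, leaving out unset arguments.
function notifyCompletionRequest(id: number, message?: string, speaker?: number): ToolCallRequest {
  const args: Record<string, unknown> = {};
  if (message !== undefined) args["message"] = message;
  if (speaker !== undefined) args["speaker"] = speaker;
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "notify_completion", arguments: args },
  };
}

// Announce a custom message with the default speaker.
console.log(JSON.stringify(notifyCompletionRequest(1, "ビルドが完了しました"), null, 2));
```

In practice an AI assistant issues this request on the user's behalf, so the JSON is rarely written by hand.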
📊 check_engine_status
Check the current status and version of the AivisSpeech engine.
Returns: Engine status, version information, and connectivity details.
🖥️ Platform Support
Audio Playback Systems
| Platform | Audio Command | Requirements |
|---|---|---|
| macOS | afplay | Built-in (no additional setup) |
| Windows | PowerShell Media.SoundPlayer | Windows PowerShell |
| Linux | aplay | ALSA utils (sudo apt install alsa-utils) |
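The table above can be read as a mapping from platform to playback command. The sketch below expresses that mapping as a pure function; the function name, the exact PowerShell invocation, and the use of Node.js process.platform values (darwin, win32, linux) as keys are assumptions, not the project's actual implementation.

```typescript
// Command plus arguments, in the form expected by process-spawning APIs.
interface PlaybackCommand {
  command: string;
  args: string[];
}

// Hypothetical helper: choose a playback command for a WAV file.
// `platform` is expected to be a Node.js process.platform value.
function playbackCommand(platform: string, wavPath: string): PlaybackCommand {
  switch (platform) {
    case "darwin":
      return { command: "afplay", args: [wavPath] };
    case "linux":
      return { command: "aplay", args: [wavPath] };
    case "win32":
      // Paths containing a single quote would need escaping before being embedded here.
      return {
        command: "powershell",
        args: ["-NoProfile", "-Command", `(New-Object Media.SoundPlayer '${wavPath}').PlaySync()`],
      };
    default:
      throw new Error(`Unsupported platform: ${platform}`);
  }
}

console.log(playbackCommand("linux", "/tmp/speech.wav"));
```

The returned command and args could then be handed to something like Node's child_process.spawn.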
Tested Environments
- ✅ macOS 12+ (Intel & Apple Silicon)
- ✅ Windows 10/11
- ✅ Ubuntu 20.04+
- ✅ Node.js 18.x, 20.x, 21.x
🧪 Development
Available Scripts
Local vs NPX Usage
For MCP clients (Production):
- Use npx @shinshin86/mcp-simple-aivisspeech@latest in your MCP configuration
- No local setup required, always gets latest version
For development:
- Clone repository and use npm run dev for hot reload
- Use npm run build && npm start for testing production builds
Project Architecture
API Client Architecture
The AivisSpeechClient class provides:
- HTTP Client: Axios-based API communication
- Error Handling: Comprehensive error catching and reporting
- Type Safety: Full TypeScript interfaces for all API responses
- Connection Management: Health checks and status monitoring
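The sketch below illustrates the health-check and error-handling responsibilities in a dependency-free form. It is not the real AivisSpeechClient: the class name EngineClientSketch, the injected getJson transport (used here in place of Axios), and the GET /version endpoint (assumed from the VOICEVOX-compatible API) are all assumptions.

```typescript
// Minimal transport signature so the sketch does not depend on Axios or fetch typings.
type GetJson = (url: string) => Promise<unknown>;

// Typed result of a health check.
interface EngineStatus {
  running: boolean;
  version?: string;
  error?: string;
}

class EngineClientSketch {
  private readonly baseUrl: string;
  private readonly getJson: GetJson;

  constructor(baseUrl: string, getJson: GetJson) {
    this.baseUrl = baseUrl;
    this.getJson = getJson;
  }

  // Ask the engine for its version; any transport failure is reported as "not running".
  async checkStatus(): Promise<EngineStatus> {
    try {
      const version = await this.getJson(`${this.baseUrl}/version`);
      return { running: true, version: String(version) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { running: false, error: message };
    }
  }
}

// Example with a stub transport that always fails, as if the engine were down.
const offlineClient = new EngineClientSketch("http://127.0.0.1:10101", () =>
  Promise.reject(new Error("connect ECONNREFUSED 127.0.0.1:10101"))
);
offlineClient.checkStatus().then((status) => console.log(status));
```

Injecting the transport keeps the sketch testable without a running engine; a real client could pass a function wrapping Axios or fetch instead.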
Adding New Features
- New Tool: Add handler in src/index.ts CallToolRequestSchema
- API Methods: Extend AivisSpeechClient class
- Types: Update interfaces in aivisspeech-client.ts
- Tests: Add corresponding test cases
🔧 Troubleshooting
Common Issues
AivisSpeech Engine Not Found
Solution: Ensure AivisSpeech Engine is running on the correct port.
Audio Playback Fails
Solutions:
- macOS: Check if afplay is available
- Linux: Install ALSA utils: sudo apt install alsa-utils
- Windows: Ensure PowerShell execution policy allows scripts
Permission Denied
Solution: Check file permissions and system audio settings.
Debug Mode
Enable verbose logging:
📄 License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
🤝 Contributing
We welcome contributions! Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Development Guidelines
- Follow existing TypeScript/ESLint configurations
- Add tests for new functionality
- Update documentation for API changes
- Ensure cross-platform compatibility
🙏 Acknowledgments
- AivisSpeech Project for the excellent TTS engine
- Model Context Protocol for the integration framework
- VOICEVOX MCP for inspiration and reference
📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: AivisSpeech API Docs
Made with ❤️ for the Japanese TTS community