Maid-MCP
A full-featured MCP (Model Context Protocol) server that gives Claude Desktop a maid personality, codenamed Mimi, with a Japanese-accented voice, a visual avatar presence, and speech recognition. Best used with a Claude Max plan: Opus 4 is very good at managing all the maid tools while coding things for you. This project is meant for fun, not for productivity. There are already a million productivity MCP servers.
Features
Japanese-accented voice - Character voice using ja-JP neural voices. The voice being hard to understand is part of her charm. You can also have her change her voice at any time.
Visual avatar system - Interactive Mimi sprite with 16+ poses and animations
Speech recognition - Talk to Mimi naturally with voice input
Hidden audio playback - Voice plays without any windows appearing
Audio queue system - Allows Mimi to speak multiple times rapidly without conflicts
Interactive controls - Drag, hide, show, and animate the avatar
Full MCP integration - Voice and avatar tools work seamlessly with Claude Desktop
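The audio queue idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: it assumes clips are played one at a time by a single worker thread, and `play_clip` is a hypothetical blocking callable that plays one audio file.

```python
import queue
import threading


class AudioQueue:
    """Plays queued clips one at a time on a single worker thread."""

    def __init__(self, play_clip):
        # play_clip is a hypothetical blocking callable that plays one clip.
        self._play_clip = play_clip
        self._clips = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, clip_path):
        """Return immediately; the clip plays when its turn comes."""
        self._clips.put(clip_path)

    def wait_until_done(self):
        """Block until every queued clip has been handled."""
        self._clips.join()

    def _run(self):
        while True:
            clip_path = self._clips.get()
            try:
                self._play_clip(clip_path)
            except Exception:
                pass  # A failed clip should not stall the clips behind it.
            finally:
                self._clips.task_done()
```

Because only the worker thread ever calls `play_clip`, several rapid speech requests cannot overlap; they simply play in the order they were queued.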
Quick Start
1. Install Dependencies
2. Configure Claude Desktop
Add to your %APPDATA%\Claude\claude_desktop_config.json:
Replace path/to/maid-mcp with the actual path where you cloned this repository.
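If you would rather script this step than edit the file by hand, something like the sketch below may help. It assumes the standard Claude Desktop layout, where servers live under an `mcpServers` key with a `command` and `args`. The server name, command, and entry script are arguments you supply yourself; nothing here is taken from this repository, so check the actual config snippet before using it.

```python
import json


def add_mcp_server(config_text, name, command, args):
    """Return config JSON text with one MCP server entry merged in."""
    # An empty or missing file is treated as an empty config.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    # Existing servers are kept; only the entry called `name` is replaced.
    servers[name] = {"command": command, "args": list(args)}
    return json.dumps(config, indent=2)


# Hypothetical values: substitute the real command and entry script.
example = add_mcp_server("", "maid-mcp", "node", ["path/to/maid-mcp/ENTRY_SCRIPT"])
```

Merging into the parsed JSON, rather than overwriting the file, keeps any other MCP servers you already have configured.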
3. Launch Everything
This automatically:
Cleans up any existing processes
Launches avatar display window
Starts avatar state server (port 3338)
Opens voice input listener
4. Stop Everything
Voice Loop
You speak → Microphone picks up voice
Speech recognition → Converts to text
Ultra fast sender → Sends to Claude Desktop
Claude (Mimi) processes → Understands and responds
Voice synthesis → Mimi speaks with Japanese accent
Avatar reacts → Visual feedback with animations
Available MCP Tools
Voice Tools
Tool | Description | Parameters |
| Convert text to speech | |
| Get available voices | None |
| Change current voice | |
Emotions: neutral, happy, sad, excited, angry, shy
Avatar Tools
Tool | Description | Parameters |
| Display avatar on screen | |
| Hide avatar (keeps running) | None |
| Play animation or pose | |
| Stop current animation | None |
| Reposition avatar | |
| Create custom sequence | |
| List all animations | None |
| List available sprites | None |
Avatar Interaction
Action | Result |
Right-click | Hide avatar (stays running) |
Double-click | Close avatar permanently |
Left-click | Cancel animation |
Drag | Move avatar (shows pick_up pose) |
ESC key | Close avatar permanently |
Project Structure
Voice Configuration
Adjust Microphone Sensitivity
Recommended sensitivity values:
Very Quiet Room: 1000-2000
Normal Room: 2000-4000
Office: 4000-6000
Noisy: 6000-10000
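To get a feel for what these numbers mean, here is a small sketch. It assumes the sensitivity value is an energy threshold compared against the RMS loudness of each chunk of 16-bit audio, which is how the Python SpeechRecognition library treats `energy_threshold`. Whether this project computes it exactly this way is an assumption, and both function names are made up for illustration.

```python
import math


def rms_energy(samples):
    """Root-mean-square level of a chunk of signed 16-bit samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_speech(samples, energy_threshold):
    """True when the chunk is louder than the configured threshold."""
    return rms_energy(samples) > energy_threshold


chunk = [3000, -3000, 3000, -3000]  # made-up samples with an RMS of 3000
print(is_speech(chunk, 2000))  # louder than a quiet-room threshold
print(is_speech(chunk, 4000))  # too quiet for an office threshold
```

A higher threshold means more background noise gets ignored, but also that you have to speak louder before Mimi starts listening.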
Calibrate Microphone
Voice Settings
Edit voice/incoming/voice_config.ini:
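As a rough sketch of how such a file could be read, the snippet below uses Python's `configparser`. The section name `voice` and the default of 3000 are assumptions made for illustration; only the `energy_threshold` key is mentioned elsewhere in this README, so check the real file for its actual layout.

```python
import configparser


def load_energy_threshold(ini_text, section="voice", default=3000):
    """Read energy_threshold from INI text, falling back to a default."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback covers both a missing section and a missing key.
    return parser.getint(section, "energy_threshold", fallback=default)


# Hypothetical file contents, not copied from the repository.
sample = "[voice]\nenergy_threshold = 4500\n"
print(load_energy_threshold(sample))
```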
Troubleshooting
Voice Input Not Working
Check microphone permissions in Windows
Run calibration to verify microphone levels
Adjust energy_threshold if needed
Ensure Python dependencies are installed
Multiple Avatar Windows
Use start_all_python.bat for better process management
Run stop_all.bat before starting again
Check Task Manager for lingering Python processes
Audio Playback Issues
Check temp_voice/ folder for audio files
Verify Windows Media Player is installed
Restart Claude Desktop if the audio queue is stuck
Avatar Not Appearing
Verify port 3338 is free
Check if sprites exist in avatar/library/
Look for avatar window behind other windows
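One quick way to check whether port 3338 is free is to try binding to it yourself. The sketch below assumes the avatar state server listens on localhost; `is_port_free` is a helper name invented for this example.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding fails when another process already holds the port.
            return False
    return True


if __name__ == "__main__":
    print("port 3338 free:", is_port_free(3338))
```

If this reports the port as busy while nothing is supposed to be running, a leftover process from a previous launch is the likely culprit.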
Development Notes
Adding New Voices
Edit voice/outgoing/voiceConfig.js to add more Edge TTS voices
Creating New Poses
Add PNG file to avatar/library/
Use filename (without .png) as animation ID
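The filename-to-ID rule can be illustrated with a few lines of Python. This is only a sketch of the naming convention described above, not the code the avatar actually uses, and `list_animation_ids` is a hypothetical helper.

```python
from pathlib import Path


def list_animation_ids(library_dir):
    """Return sorted animation IDs: PNG filenames without the extension."""
    return sorted(p.stem for p in Path(library_dir).glob("*.png"))


if __name__ == "__main__":
    # Prints an empty list if the folder does not exist.
    print(list_animation_ids("avatar/library"))
```

So a file saved as `avatar/library/pick_up.png` would be addressed with the animation ID `pick_up`.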
Custom Animations
Recent Updates
v1.0.0 - Released to the world. Oh god, what have I done. I am so sorry, Claude.
Credits
Avatar sprites from ChatGPT 4o
Voice synthesis using Microsoft Edge TTS
Speech recognition via Google Speech API
License
MIT