# Voice Mode Installation Tools

Voice Mode now includes MCP tools to automatically install and configure whisper.cpp and kokoro-fastapi, making it easier to set up free, private, open-source voice services.

## Overview

These tools handle:

- System detection (macOS/Linux)
- Dependency installation
- GPU support configuration
- Model downloads
- Service configuration

## Available Tools

### install_whisper_cpp

Installs [whisper.cpp](https://github.com/ggerganov/whisper.cpp) for speech-to-text (STT) functionality.

#### Features

- Automatic OS detection (macOS/Linux)
- GPU acceleration (Metal on macOS, CUDA on Linux)
- Model download management
- Build optimization
- Service configuration (launchd on macOS, systemd on Linux)
- Environment variable support for model selection

#### Usage

```python
# Basic installation with defaults
result = await install_whisper_cpp()

# Custom installation
result = await install_whisper_cpp(
    install_dir="~/my-whisper",
    model="large-v3",
    use_gpu=True,
    force_reinstall=False
)
```

#### Parameters

- `install_dir` (str, optional): Installation directory (default: `~/.voicemode/whisper.cpp`)
- `model` (str, optional): Whisper model to download (default: `large-v2`)
  - Available models: `tiny`, `tiny.en`, `base`, `base.en`, `small`, `small.en`, `medium`, `medium.en`, `large-v2`, `large-v3`
  - Note: `large-v2` is the default because it offers the best accuracy (requires ~3 GB RAM)
- `use_gpu` (bool, optional): Enable GPU support (default: auto-detect)
- `force_reinstall` (bool, optional): Force reinstallation (default: false)

#### Return Value

```python
{
    "success": True,
    "install_path": "/Users/user/.voicemode/whisper.cpp",
    "model_path": "/Users/user/.voicemode/whisper.cpp/models/ggml-large-v2.bin",
    "gpu_enabled": True,
    "gpu_type": "metal",  # or "cuda" or "cpu"
    "performance_info": {
        "system": "Darwin",
        "gpu_acceleration": "metal",
        "model": "large-v2",
        "binary_path": "/Users/user/.voicemode/whisper.cpp/main",
        "server_port": 2022,
        "server_url": "http://localhost:2022"
    },
    "launchagent": "/Users/user/Library/LaunchAgents/com.voicemode.whisper-server.plist",  # macOS
    "systemd_service": "/home/user/.config/systemd/user/whisper-server.service",  # Linux
    "start_script": "/Users/user/.voicemode/whisper.cpp/start-whisper-server.sh"
}
```

### install_kokoro_fastapi

Installs [kokoro-fastapi](https://github.com/remsky/kokoro-fastapi) for text-to-speech (TTS) functionality.

#### Features

- Python environment management with UV
- Automatic model downloads
- Service configuration (launchd on macOS, systemd on Linux)
- Auto-start capability

#### Usage

```python
# Basic installation with defaults
result = await install_kokoro_fastapi()

# Custom installation
result = await install_kokoro_fastapi(
    install_dir="~/my-kokoro",
    models_dir="~/my-models",
    port=8881,
    auto_start=True,
    install_models=True,
    force_reinstall=False
)
```

#### Parameters

- `install_dir` (str, optional): Installation directory (default: `~/.voicemode/kokoro-fastapi`)
- `models_dir` (str, optional): Models directory (default: `~/.voicemode/kokoro-models`)
- `port` (int, optional): Service port (default: 8880)
- `auto_start` (bool, optional): Start service after installation (default: true)
- `install_models` (bool, optional): Download Kokoro models (default: true)
- `force_reinstall` (bool, optional): Force reinstallation (default: false)

#### Return Value

```python
{
    "success": True,
    "install_path": "/home/user/.voicemode/kokoro-fastapi",
    "service_url": "http://127.0.0.1:8880",
    "service_status": "managed_by_systemd",  # "managed_by_launchd" on macOS
    "systemd_service": "/home/user/.config/systemd/user/kokoro-fastapi-8880.service",  # Linux
    "launchagent": "/Users/user/Library/LaunchAgents/com.voicemode.kokoro-8880.plist",  # macOS
    "start_script": "/home/user/.voicemode/kokoro-fastapi/start-cpu.sh",
    "message": "Kokoro-fastapi installed. Run: cd /home/user/.voicemode/kokoro-fastapi && ./start-cpu.sh"
}
```

## System Requirements

### whisper.cpp

#### macOS

- Xcode Command Line Tools
- Homebrew (for cmake)
- Metal support (built-in)

#### Linux

- Build essentials (gcc, g++, make)
- CMake
- CUDA toolkit (optional, for NVIDIA GPU support)

### kokoro-fastapi

#### All Systems

- Python 3.10+
- Git
- ~5 GB disk space for models
- UV package manager (installed automatically if missing)

## Integration with Voice Mode

After installation, the services integrate automatically with Voice Mode:

1. **whisper.cpp**:
   - Runs automatically on boot (port 2022)
   - OpenAI-compatible API endpoint
   - Model selection via the `VOICEMODE_WHISPER_MODEL` environment variable
   - View installed models: `claude resource read whisper://models`
2. **kokoro-fastapi**:
   - Automatically detected by Voice Mode's provider registry when running
   - 67 voices available
   - OpenAI-compatible API endpoint

## Examples

### Complete Setup

```python
# Install both services with defaults
whisper_result = await install_whisper_cpp()  # Uses large-v2 by default
kokoro_result = await install_kokoro_fastapi()

# Check installation status
if whisper_result["success"] and kokoro_result["success"]:
    print("Voice services installed successfully!")
    print(f"Whisper: {whisper_result['install_path']}")
    print(f"Whisper server: {whisper_result['performance_info']['server_url']}")
    print(f"Kokoro API: {kokoro_result['service_url']}")
```

### Upgrade Existing Installation

```python
# Force reinstall with a newer model
result = await install_whisper_cpp(
    model="large-v3",
    force_reinstall=True
)
```

### Custom Configuration

```python
# Install kokoro-fastapi on a different port
result = await install_kokoro_fastapi(
    port=9000,
    models_dir="/opt/models/kokoro"
)
```

## Troubleshooting

### Common Issues

1. **Missing Dependencies**
   - The tools report missing dependencies along with installation instructions
   - Follow the provided commands to install the required packages
2. **Port Conflicts**
   - If port 8880 is in use, specify a different port for kokoro-fastapi
   - Check running services: `lsof -i :8880`
3. **GPU Not Detected**
   - On Linux, ensure NVIDIA drivers and CUDA are installed
   - Use `nvidia-smi` to verify GPU availability
   - Force CPU mode with `use_gpu=False` if needed
4. **Model Download Failures**
   - Check your internet connection
   - Verify sufficient disk space
   - Try smaller models first (`tiny`, `base`)

### Manual Service Management

#### macOS (launchd)

```bash
# Whisper
launchctl load ~/Library/LaunchAgents/com.voicemode.whisper-server.plist
launchctl unload ~/Library/LaunchAgents/com.voicemode.whisper-server.plist

# Kokoro
launchctl load ~/Library/LaunchAgents/com.voicemode.kokoro-8880.plist
launchctl unload ~/Library/LaunchAgents/com.voicemode.kokoro-8880.plist
```

#### Linux (systemd)

```bash
# Whisper
systemctl --user start whisper-server
systemctl --user stop whisper-server
systemctl --user status whisper-server

# Kokoro
systemctl --user start kokoro-fastapi-8880
systemctl --user stop kokoro-fastapi-8880
systemctl --user status kokoro-fastapi-8880
```

#### Change Whisper Model

```bash
# Set the environment variable before restarting
export VOICEMODE_WHISPER_MODEL=base.en  # or tiny, small, medium, large-v2, large-v3

# Restart the service to apply the change
# macOS
launchctl unload ~/Library/LaunchAgents/com.voicemode.whisper-server.plist
launchctl load ~/Library/LaunchAgents/com.voicemode.whisper-server.plist

# Linux
systemctl --user restart whisper-server
```

## Testing

Run the test suite to verify the installation tools:

```bash
cd /path/to/voicemode
python -m pytest tests/test_installers.py -v

# Skip integration tests (no actual installation)
SKIP_INTEGRATION_TESTS=1 python -m pytest tests/test_installers.py -v
```

## Contributing

When adding new installation tools:

1. Create a new function in `voice_mode/tools/installers.py`
2. Use the `@mcp.tool()` decorator
3. Follow the existing pattern for error handling and return values
4. Add comprehensive tests in `tests/test_installers.py`
5. Update this documentation

## Security Notes

- All installations are performed in user space (no sudo required)
- Models are downloaded from official sources
- Services bind to localhost only by default
- No external network access without explicit configuration
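Because both services bind to localhost, a quick local connectivity check confirms they are running after installation. The sketch below assumes only the default endpoints documented above (whisper.cpp on port 2022, kokoro-fastapi on port 8880); `check_service` is a hypothetical helper, not part of Voice Mode, and it only verifies that an HTTP server answers on each port.

```python
import urllib.error
import urllib.request

def check_service(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at url (any status code)."""
    try:
        urllib.request.urlopen(url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        return True  # the server responded, even if with an error status
    except (urllib.error.URLError, OSError):
        return False  # connection refused / nothing listening

# Default local endpoints used by the installers above
services = {
    "whisper.cpp": "http://localhost:2022",
    "kokoro-fastapi": "http://127.0.0.1:8880",
}

for name, url in services.items():
    state = "up" if check_service(url) else "not reachable"
    print(f"{name}: {state}")
```

If either service reports `not reachable`, start it with the manual service management commands above.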