# CLAUDE.md
This file provides guidance to Claude Code when working with the Claude Voice Commands project.
## Project Overview
This project enables voice command interaction with Claude Code through an MCP (Model Context Protocol) server. Users can speak commands naturally, and Claude Code will receive and execute them.
## Quick Start Commands
### Setup & Installation
```bash
# Create and activate virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Test microphone setup
python tests/test_microphone.py
# Run the MCP server
python -m mcp_server.voice_server
```
### Testing Voice Recognition
```bash
# Test basic voice capture
python tests/test_voice.py
# Test with different backends
python tests/test_voice.py --backend google
python tests/test_voice.py --backend whisper
# Calibrate microphone
python tests/calibrate_mic.py
```
### MCP Server Operations
```bash
# Start MCP server (for Claude Desktop integration)
python -m mcp_server.voice_server
# Test MCP tools directly
python -m mcp_server.test_tools
# Check server health
python -m mcp_server.voice_server --health
```
## Architecture
### Directory Structure
```
claude-voice-commands/
├── mcp_server/              # MCP server implementation
│   ├── voice_server.py      # Main MCP server with tool definitions
│   ├── speech_engines.py    # Recognition backend implementations
│   ├── command_queue.py     # Command queuing system
│   └── wake_word.py         # Wake word detection
├── config/                  # Configuration files
├── tests/                   # Test scripts
└── logs/                    # Runtime logs
```
### Core Components
1. **MCP Server** (`voice_server.py`)
   - FastMCP server exposing voice tools
   - Tools: start_listening, stop_listening, get_next_command, test_microphone
2. **Speech Engines** (`speech_engines.py`)
   - Google Speech Recognition (primary)
   - Whisper Local (fallback)
   - OpenAI Realtime (future premium option)
3. **Command Queue** (`command_queue.py`)
   - Thread-safe command queuing (see the sketch after this list)
   - Priority handling for urgent commands
   - Command history tracking
4. **Wake Word Detection** (`wake_word.py`)
   - Detects "Hey Claude", "Claude", etc.
   - Customizable wake words
   - Low CPU overhead while listening for the wake word
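A minimal sketch of the thread-safe priority queue described above, built on the standard library's `queue.PriorityQueue`. Class and method names are illustrative; the real `command_queue.py` may differ:
```python
# Illustrative only -- uses nothing beyond the Python standard library.
import itertools
import queue
import time


class CommandQueue:
    """Thread-safe command queue with priority handling and history tracking."""

    def __init__(self):
        self._queue = queue.PriorityQueue()   # handles its own locking
        self._counter = itertools.count()     # tie-breaker keeps equal priorities FIFO
        self.history = []

    def put(self, text, priority=10):
        # Lower number = more urgent; e.g. "stop listening" might use priority=0.
        self._queue.put((priority, next(self._counter), {"text": text, "time": time.time()}))

    def get(self, timeout=5.0):
        try:
            _, _, command = self._queue.get(timeout=timeout)
        except queue.Empty:
            return None
        self.history.append(command)          # command history tracking
        return command
```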
## MCP Tools Available
### Core Tools
```python
# Start voice listening
await start_listening(mode="wake_word") # or "continuous"
# Stop voice listening
await stop_listening()
# Get next voice command
command = await get_next_command(timeout=5.0)
# Test microphone setup
status = await test_microphone()
```
### Utility Tools
```python
# Record a voice note
note = await transcribe_voice_note(duration=30)
# Get listening status
status = await get_voice_status()
# Clear command queue
await clear_commands()
```
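For context, `voice_server.py` would register tools like these with FastMCP roughly as follows. This is a hedged sketch using the MCP Python SDK's `FastMCP` class; the tool bodies are placeholders, not the project's actual implementation:
```python
# Sketch of server-side tool registration (placeholder bodies, illustrative only).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("voice-commands")


@mcp.tool()
async def start_listening(mode: str = "wake_word") -> str:
    """Begin capturing audio in 'wake_word' or 'continuous' mode."""
    return f"Listening started in {mode} mode"


@mcp.tool()
async def get_next_command(timeout: float = 5.0) -> dict:
    """Return the next queued voice command, or an empty result on timeout."""
    return {"text": "", "confidence": 0.0}  # placeholder result


if __name__ == "__main__":
    mcp.run()  # stdio transport, matching the Claude Desktop config later in this file
```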
## Development Workflow
### Adding New Features
1. Check ROADMAP.md for current sprint
2. Create feature branch: `git checkout -b sprint-X-feature`
3. Implement with tests
4. Update ROADMAP.md progress
5. Test with Claude Desktop integration
### Testing Checklist
```bash
# Before committing
python -m pytest tests/ # Run all tests
python tests/test_microphone.py # Test audio input
python tests/test_mcp_tools.py # Test MCP integration
python tests/test_recognition.py # Test speech recognition
```
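As an illustration, a unit test in `tests/` might exercise the command queue like this. This is a hypothetical test that assumes the `CommandQueue` API sketched earlier; the project's real test names and assertions will differ:
```python
# tests/test_command_queue.py -- hypothetical example, not the project's actual tests.
from mcp_server.command_queue import CommandQueue


def test_urgent_commands_jump_the_queue():
    q = CommandQueue()
    q.put("create a fibonacci function", priority=10)
    q.put("stop listening", priority=0)  # urgent
    assert q.get(timeout=1)["text"] == "stop listening"


def test_history_is_tracked():
    q = CommandQueue()
    q.put("run the tests")
    q.get(timeout=1)
    assert len(q.history) == 1
```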
### Debugging
#### Common Issues & Solutions
**Issue: "No module named 'pyaudio'"**
```bash
# macOS
brew install portaudio
pip install pyaudio
# Linux
sudo apt-get install portaudio19-dev
pip install pyaudio
# Windows
pip install pipwin
pipwin install pyaudio
```
**Issue: "Microphone not found"**
```bash
# Test available microphones
python -c "import speech_recognition as sr; print(sr.Microphone.list_microphone_names())"
# Set specific microphone
export VOICE_MIC_INDEX=1 # Use the index from list above
```
**Issue: "Recognition fails"**
```bash
# Check ambient noise level
python tests/calibrate_mic.py
# Test with different sensitivity
export VOICE_ENERGY_THRESHOLD=4000 # Higher = less sensitive
# Force specific backend
export VOICE_BACKEND=whisper # or 'google'
```
**Issue: "MCP connection failed"**
```bash
# Check MCP server is running
ps aux | grep voice_server
# Test MCP tools directly
python -c "import asyncio; from mcp_server.voice_server import test_microphone; print(asyncio.run(test_microphone()))"
# Check Claude Desktop config
cat ~/Library/Application\ Support/Claude/claude_desktop_config.json
```
### Performance Monitoring
```bash
# Monitor CPU usage during listening
python tests/benchmark_cpu.py
# Test recognition latency
python tests/benchmark_latency.py
# Check memory usage
python tests/profile_memory.py
```
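For reference, a latency benchmark along the lines of `tests/benchmark_latency.py` could look like this. It is only an illustrative sketch: it assumes the `speech_recognition` package and a working default microphone, and the real script may measure things differently:
```python
# Rough latency measurement: time from captured audio to recognized text.
import time

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source, duration=1)
    print("Say a short command...")
    audio = recognizer.listen(source, timeout=5, phrase_time_limit=5)

start = time.perf_counter()
try:
    text = recognizer.recognize_google(audio)
    print(f"Recognized {text!r} in {time.perf_counter() - start:.2f}s")
except sr.UnknownValueError:
    print("Recognition failed; nothing to time")
```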
## Configuration
### Environment Variables (.env)
```bash
# Recognition Settings
VOICE_BACKEND=google # Primary backend: google, whisper, openai
VOICE_FALLBACK=whisper # Fallback backend
VOICE_ENERGY_THRESHOLD=4000 # Microphone sensitivity
VOICE_PAUSE_THRESHOLD=0.8 # Seconds of silence that ends a phrase
VOICE_TIMEOUT=5 # Recognition timeout (seconds)
# Wake Word Settings
WAKE_WORDS=hey claude,claude,ok claude
WAKE_WORD_SENSITIVITY=0.5 # 0-1, higher = more strict
# API Keys (optional)
OPENAI_API_KEY=sk-... # For OpenAI Realtime (Sprint 7)
ELEVENLABS_API_KEY=... # Alternative option
# Logging
LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR
LOG_FILE=logs/voice.log
```
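These settings might be loaded at server startup roughly like this (a sketch assuming `python-dotenv`; the project's actual config loading may differ):
```python
# Sketch of reading the .env values above; assumes python-dotenv is installed.
import os

from dotenv import load_dotenv

load_dotenv()  # loads variables from a .env file into the process environment

VOICE_BACKEND = os.getenv("VOICE_BACKEND", "google")
VOICE_FALLBACK = os.getenv("VOICE_FALLBACK", "whisper")
ENERGY_THRESHOLD = int(os.getenv("VOICE_ENERGY_THRESHOLD", "4000"))
PAUSE_THRESHOLD = float(os.getenv("VOICE_PAUSE_THRESHOLD", "0.8"))
WAKE_WORDS = [w.strip() for w in os.getenv("WAKE_WORDS", "hey claude,claude").split(",")]
```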
### Claude Desktop Configuration
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "voice-commands": {
      "command": "python",
      "args": ["-m", "mcp_server.voice_server"],
      "cwd": "/path/to/claude-voice-commands",
      "env": {
        "PYTHONPATH": "/path/to/claude-voice-commands",
        "VOICE_BACKEND": "google"
      }
    }
  }
}
```
## Sprint Progress Tracking
Current Sprint: **Sprint 0 - Project Setup**
When working on sprints:
1. Update task checkboxes in ROADMAP.md
2. Add completion notes to sprint section
3. Update progress table with dates
4. Commit with message: "Sprint X: [description]"
## Usage Examples
### Basic Voice Command Flow
```python
# User says: "Activate voice mode"
# Claude Code calls:
await start_listening(mode="continuous")
# User says: "Create a Python function to calculate fibonacci"
# Claude Code calls:
command = await get_next_command()
# Returns: {"text": "Create a Python function to calculate fibonacci", "confidence": 0.95}
# User says: "Stop voice mode"
# Claude Code calls:
await stop_listening()
```
### Wake Word Flow
```python
# Claude Code calls:
await start_listening(mode="wake_word")
# User says: "Hey Claude, run the tests"
# Claude Code polls:
command = await get_next_command(timeout=1.0)
# Returns: {"text": "run the tests", "confidence": 0.92}
```
## Best Practices
1. **Always test microphone first**: Run `test_microphone()` before starting
2. **Handle errors gracefully**: Recognition can fail; always have a fallback (see the sketch after this list)
3. **Respect user privacy**: Only listen when explicitly activated
4. **Optimize for latency**: Use Google first, Whisper as fallback
5. **Log everything**: Helps debug recognition issues
6. **Update documentation**: Keep ROADMAP.md current with progress
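
A sketch of the Google-first, Whisper-fallback pattern from practices 2 and 4. It assumes the `speech_recognition` package; the helper name is illustrative, not part of the project:
```python
from typing import Optional

import speech_recognition as sr


def recognize_with_fallback(recognizer: sr.Recognizer, audio: sr.AudioData) -> Optional[str]:
    """Try the fast online backend first, then fall back to local Whisper."""
    try:
        return recognizer.recognize_google(audio)                  # fast, needs network
    except (sr.UnknownValueError, sr.RequestError):
        pass
    try:
        return recognizer.recognize_whisper(audio, model="base")   # slower, fully local
    except sr.UnknownValueError:
        return None
```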
## Future Enhancements (Post-Sprint 10)
- [ ] Voice synthesis responses (Claude speaks back)
- [ ] Multi-user voice profiles
- [ ] Cloud command sync
- [ ] Mobile app companion
- [ ] Voice command marketplace
---
*Last updated: 2025-01-03 - Sprint 0*