Skip to main content
Glama

Voice Mode

by mbailey
voicemode.env.example•8.56 kB
#!/usr/bin/env bash ########################################################## # Voice Mode Environment Configuration # # Copy this file to ~/.voicemode/voicemode.env and uncomment/modify as needed # Source it from your shell: source ~/.voicemode/voicemode.env # Or add to your shell RC file: echo "source ~/.voicemode/voicemode.env" >> ~/.bashrc ########################################################## ############# # API Keys ############# ## OpenAI API key (required for cloud STT/TTS services) # export OPENAI_API_KEY="sk-..." ############# # Audio Settings ############# ## Play audio feedback when recording starts/stops (default: true) # export VOICEMODE_AUDIO_FEEDBACK=true ## Default listening duration in seconds for the converse tool (default: 120.0) # export VOICEMODE_DEFAULT_LISTEN_DURATION=120.0 ## Disable silence detection globally - useful for noisy environments (default: false) # export VOICEMODE_DISABLE_SILENCE_DETECTION=false ## Voice Activity Detection aggressiveness 0-3 (default: 2) ## Controls how strictly WebRTC VAD filters out non-speech audio ## 0 = least aggressive filtering (includes more audio), 3 = most aggressive (strict) # export VOICEMODE_VAD_AGGRESSIVENESS=2 ## Milliseconds of silence before stopping recording (default: 1000) # export VOICEMODE_SILENCE_THRESHOLD_MS=1000 ## Minimum recording duration in seconds (default: 0.5) # export VOICEMODE_MIN_RECORDING_DURATION=0.5 ## Seconds to wait for user to start speaking (default: 4.0) # export VOICEMODE_INITIAL_SILENCE_GRACE_PERIOD=4.0 ############# # Think Out Loud Mode ############# ## Enable Think Out Loud mode - AI voices reasoning with multiple personas (default: false) # export VOICEMODE_THINK_OUT_LOUD=false ## Voice persona mappings for thinking roles (default: analytical:am_adam,creative:af_sarah,critical:af_bella,synthesis:af_nova) ## Format: role:voice pairs separated by commas ## Note: af_sky reserved for main conversation voice, not inner thinking # export VOICEMODE_THINKING_VOICES="analytical:am_adam,creative:af_sarah,critical:af_bella,synthesis:af_nova" ## Think Out Loud presentation style (default: sequential) ## Options: sequential (one after another), debate (back-and-forth), chorus (rapid perspectives) # export VOICEMODE_THINKING_STYLE=sequential ## Announce which voice is speaking (default: true) ## When enabled, voices introduce themselves (e.g., "Analytical perspective:") # export VOICEMODE_THINKING_ANNOUNCE_VOICE=true ############# # Audio Format Settings ############# ## Primary audio format: pcm, opus, mp3, wav, flac, aac (default: pcm) # export VOICEMODE_AUDIO_FORMAT=pcm ## Override format for TTS - defaults to primary (default: pcm) # export VOICEMODE_TTS_AUDIO_FORMAT=pcm ## Override format for STT - auto-selects mp3 if primary is pcm (default: mp3) # export VOICEMODE_STT_AUDIO_FORMAT=mp3 ## Audio format quality settings # export VOICEMODE_MP3_BITRATE=64k # MP3 bitrate # export VOICEMODE_AAC_BITRATE=64k # AAC bitrate # export VOICEMODE_OPUS_BITRATE=32000 # Opus bitrate in bps ############# # Provider Settings ############# ## Prefer local providers (Kokoro/Whisper) when available (default: true) # export VOICEMODE_PREFER_LOCAL=true ## Always try local providers even if they previously failed (default: true) ## When true, local providers are never marked as permanently unavailable # export VOICEMODE_ALWAYS_TRY_LOCAL=true ## Use simple failover without health checks (default: true) ## Tries each endpoint in order until one works - connection failures are instant # export VOICEMODE_SIMPLE_FAILOVER=true ## Automatically start Kokoro TTS on startup (default: false) # export VOICEMODE_AUTO_START_KOKORO=false ############# # TTS (Text-to-Speech) Settings ############# ## Comma-separated list of TTS endpoints in priority order ## Default: local Kokoro (port 8880), then OpenAI # export VOICEMODE_TTS_BASE_URLS="http://127.0.0.1:8880/v1,https://api.openai.com/v1" ## Change port if running Kokoro on non-default port: # export VOICEMODE_TTS_BASE_URLS="http://127.0.0.1:8888/v1,https://api.openai.com/v1" ## Comma-separated list of TTS models in priority order # export VOICEMODE_TTS_MODELS="gpt-4o-mini-tts,tts-1-hd,tts-1" ## Comma-separated list of preferred voices in priority order # export VOICEMODE_VOICES="af_sky,alloy" ############# # STT (Speech-to-Text) Settings ############# ## Comma-separated list of STT endpoints in priority order ## Default: local Whisper (port 2022), then OpenAI # export VOICEMODE_STT_BASE_URLS="http://127.0.0.1:2022/v1,https://api.openai.com/v1" ## Change port if running Whisper on non-default port: # export VOICEMODE_STT_BASE_URLS="http://127.0.0.1:2023/v1,https://api.openai.com/v1" ############# # LiveKit Settings ############# ## LiveKit server WebSocket URL (default: ws://127.0.0.1:7880) # export LIVEKIT_URL="ws://127.0.0.1:7880" ## For LiveKit Cloud: # export LIVEKIT_URL="wss://your-app.livekit.cloud" ## LiveKit API credentials # export LIVEKIT_API_KEY="devkey" # export LIVEKIT_API_SECRET="secret" ############# # Streaming Settings ############# ## Enable streaming audio playback (default: true) # export VOICEMODE_STREAMING_ENABLED=true ## Initial buffer before playback in milliseconds (default: 150) # export VOICEMODE_STREAM_BUFFER_MS=150 ## Download chunk size in bytes (default: 4096) # export VOICEMODE_STREAM_CHUNK_SIZE=4096 ## Maximum buffer size in seconds (default: 2.0) # export VOICEMODE_STREAM_MAX_BUFFER=2.0 ############# # Storage Settings ############# ## Base directory for all voicemode data (default: ~/.voicemode) # export VOICEMODE_BASE_DIR=~/.voicemode ## Enable all data saving - audio, transcriptions, event logs (default: false) # export VOICEMODE_SAVE_ALL=false ## Save audio recordings to ~/.voicemode/audio/ (default: false) # export VOICEMODE_SAVE_AUDIO=false ## Save transcriptions to ~/.voicemode/transcriptions/ (default: false) # export VOICEMODE_SAVE_TRANSCRIPTIONS=false ############# # Logging Settings ############# ## Enable debug logging - automatically enables all saving (default: false) ## Set to "trace" for verbose logging # export VOICEMODE_DEBUG=false ## Directory for event log files (default: ~/.voicemode/logs/events) # export VOICEMODE_EVENT_LOG_DIR=~/.voicemode/logs/events ## Enable structured event logging (default: true) # export VOICEMODE_EVENT_LOG_ENABLED=true ## Log rotation frequency (default: daily) # export VOICEMODE_EVENT_LOG_ROTATION=daily ############# # Whisper Configuration ############# ## Whisper model to use (default: large-v2) ## Options: tiny, base, small, medium, large, large-v2, large-v3 # export VOICEMODE_WHISPER_MODEL=large-v2 ## Whisper server port (default: 2022) # export VOICEMODE_WHISPER_PORT=2022 ## Language for transcription (default: auto) ## Use ISO 639-1 codes: en, es, fr, de, it, pt, ru, zh, ja, ko, etc. ## Use "auto" for automatic language detection # export VOICEMODE_WHISPER_LANGUAGE=auto ## Path to Whisper models directory (default: ~/.voicemode/models/whisper) # export VOICEMODE_WHISPER_MODEL_PATH=~/.voicemode/models/whisper ############# # Kokoro Configuration ############# ## Kokoro server port (default: 8880) # export VOICEMODE_KOKORO_PORT=8880 ## Directory containing Kokoro models (default: ~/.voicemode/models/kokoro) # export VOICEMODE_KOKORO_MODELS_DIR=~/.voicemode/models/kokoro ## Kokoro cache directory (default: ~/.voicemode/cache/kokoro) # export VOICEMODE_KOKORO_CACHE_DIR=~/.voicemode/cache/kokoro ## Default Kokoro voice when not specified (default: af_sky) ## See voice list: af_*, am_*, bf_*, bm_*, ef_*, em_*, etc. # export VOICEMODE_KOKORO_DEFAULT_VOICE=af_sky ############# # Tool Management ############# ## Enable only specific MCP tools (whitelist mode) ## Comma-separated list of tool names to enable ## If set, ONLY these tools will be loaded ## Example: converse,service,voice_status # export VOICEMODE_TOOLS_ENABLED="" ## Disable specific MCP tools (blacklist mode) ## Comma-separated list of tool names to exclude ## If set, all tools EXCEPT these will be loaded ## Example: kokoro_install,whisper_install,livekit_install # export VOICEMODE_TOOLS_DISABLED="" ## Note: If both are set, ENABLED takes precedence (whitelist wins) ## Common tool names: ## Core: converse, voice_status, voice_registry, pronounce ## Services: service, check_audio_devices, check_audio_dependencies ## Installers: kokoro_install, whisper_install, livekit_install, livekit_frontend_install ## Config: update_config, list_config_keys, config_reload

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mbailey/voicemode'

If you have feedback or need assistance with the MCP directory API, please join our Discord server