LocalVoiceMode

README.md•2.9 KiB

# LocalVoiceMode v2

Hybrid voice interface with Rust GUI and Python ML backend.

## Architecture

```
┌─────────────────────────────┐         ┌─────────────────────────────┐
│ voicemode-ui (Rust)         │◄───────►│ ml-server (Python)          │
│ - egui + wgpu GUI           │ JSON-RPC│ - TTS/ASR/SER models        │
│ - Bloom shader effects      │         │ - LLM client                │
│ - Audio I/O (cpal)          │         │ - Skill loader              │
│ - VAD (Silero ONNX)         │         │ - Model registry            │
└─────────────────────────────┘         └─────────────────────────────┘
```

## Features

- **Lightsaber Waveform Visualizer** - GPU bloom shaders with customizable colors
- **Model Hot-Swap** - Plugin system for TTS/ASR/SER models
- **Emotion Tracking** - SenseVoice SER feeds emotion to LLM context
- **Hybrid App** - Full settings GUI + minimal overlay mode

## Color Palette

| Name | Hex | Use |
|------|-----|-----|
| Lapis Lazuli | `#2659A5` | Primary - listening |
| Mace Windu Purple | `#8033CC` | Accent - processing |
| Emerald | `#33B24D` | Success - speaking |
| Burnt Orange | `#CC661A` | Warning |
| Sith Red | `#E61A1A` | Error |

## Quick Start

### Prerequisites

- Rust 1.75+ with cargo
- Python 3.11+ with uv or pip
- NVIDIA GPU (optional, for CUDA acceleration)

### Build & Run

```bash
# Terminal 1: Start ML server
cd ml-server
uv venv && uv pip install -e .
python -m ml_server

# Terminal 2: Start GUI
cd voicemode-ui
cargo run --release
```

## Project Structure

```
v2/
├── voicemode-ui/          # Rust GUI application
│   ├── src/
│   │   ├── ui/            # egui components + shaders
│   │   ├── audio/         # cpal capture/playback
│   │   ├── ipc/           # JSON-RPC client
│   │   └── state/         # App state management
│   └── assets/            # Textures, shaders
│
├── ml-server/             # Python ML backend
│   └── ml_server/
│       ├── models/        # TTS, ASR, SER adapters
│       ├── registry/      # Model plugin system
│       ├── ipc/           # JSON-RPC server
│       ├── llm/           # LLM client
│       └── skills/        # Character skills
│
├── PROMPT.md              # Autonomous agent instructions
└── CLAUDE.md              # Claude Code instructions
```

## Supported Models

### TTS (Text-to-Speech)

- [x] Pocket TTS (default)
- [ ] IndexTTS2 (planned)
- [ ] Qwen3-TTS (planned)

### ASR (Speech Recognition)

- [x] Parakeet TDT 0.6B v3

### SER (Speech Emotion Recognition)

- [ ] SenseVoice-Small (planned)

## License

MIT
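
## Example: JSON-RPC Exchange (Sketch)

The Architecture section shows `voicemode-ui` and `ml-server` talking over JSON-RPC but does not document the message format. The sketch below shows what one request/response round trip could look like, assuming JSON-RPC 2.0 framing. The method name `tts.synthesize`, its parameters, and the `synthesize` handler are hypothetical and are not taken from the project; the transport (socket, pipe, stdio) is left out because the README does not specify it.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids


def make_request(method, params):
    """Build a JSON-RPC 2.0 request object (assumed protocol version)."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def handle_request(raw, handlers):
    """Dispatch one serialized request and return the serialized response."""
    req = json.loads(raw)
    handler = handlers.get(req["method"])
    if handler is None:
        # -32601 is the "Method not found" code defined by JSON-RPC 2.0.
        resp = {
            "jsonrpc": "2.0",
            "id": req["id"],
            "error": {"code": -32601, "message": "Method not found"},
        }
    else:
        resp = {"jsonrpc": "2.0", "id": req["id"], "result": handler(**req["params"])}
    return json.dumps(resp)


def synthesize(text, voice="default"):
    """Hypothetical handler: a stand-in that reports what it would synthesize."""
    return {"voice": voice, "chars": len(text)}


# Hypothetical method table; the real server's method names are not documented here.
HANDLERS = {"tts.synthesize": synthesize}

request = make_request("tts.synthesize", {"text": "Hello there", "voice": "default"})
response = json.loads(handle_request(json.dumps(request), HANDLERS))

unknown = make_request("asr.unknown", {})
unknown_response = json.loads(handle_request(json.dumps(unknown), HANDLERS))
```

Matching `response["id"]` against `request["id"]` is what would let the GUI pair replies with in-flight requests if several are outstanding at once.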

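
## Example: Model Hot-Swap Registry (Sketch)

The Features list describes model hot-swap as a plugin system for TTS/ASR/SER models, and the project tree has a `registry/` package for it. One common way to structure such a registry is sketched below, under the assumption that each model kind has one active instance at a time. `ModelRegistry`, `EchoTTS`, and every method shown are hypothetical illustrations, not the project's actual API, and the model names are used only as labels for stand-in objects.

```python
from typing import Any, Callable, Dict, Tuple


class ModelRegistry:
    """Maps (kind, name) pairs to factories and tracks one active model per kind."""

    def __init__(self) -> None:
        self._factories: Dict[Tuple[str, str], Callable[[], Any]] = {}
        self._active: Dict[str, Any] = {}

    def register(self, kind: str, name: str, factory: Callable[[], Any]) -> None:
        """Record how to build a model without loading it yet."""
        self._factories[(kind, name)] = factory

    def activate(self, kind: str, name: str) -> Any:
        """Build the named model and make it the active one for its kind."""
        try:
            factory = self._factories[(kind, name)]
        except KeyError:
            raise KeyError(f"no {kind} model registered as {name!r}") from None
        model = factory()  # build the replacement before dropping the old model
        self._active[kind] = model
        return model

    def get(self, kind: str) -> Any:
        """Return the currently active model for a kind."""
        return self._active[kind]


class EchoTTS:
    """Stand-in for a real TTS adapter; it only tags the text with its label."""

    def __init__(self, label: str) -> None:
        self.label = label

    def synthesize(self, text: str) -> str:
        return f"[{self.label}] {text}"


registry = ModelRegistry()
registry.register("tts", "pocket-tts", lambda: EchoTTS("pocket-tts"))
registry.register("tts", "indextts2", lambda: EchoTTS("indextts2"))

registry.activate("tts", "pocket-tts")
first = registry.get("tts").synthesize("hello")

registry.activate("tts", "indextts2")  # swap without restarting the process
second = registry.get("tts").synthesize("hello")
```

Registering factories rather than instances keeps unused models out of memory until they are activated, which matters when the models are large.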


