LocalVoiceMode

VOICES_AND_SKILLS.md•4.26 KiB

# Voices and Skills Guide

## Quick Start

### Switching Voices

1. Place your voice file as `C:\AI\localvoicemode\voice_references\default.wav`
2. Restart the app - it will use your new voice

### Switching Skills/Personalities

1. Edit `C:\AI\localvoicemode\v2\ml-server\.env`
2. Add: `DEFAULT_SKILL=hermione-companion`
3. Restart the app

---

## Voice Requirements

Pocket TTS clones voices from reference audio. For best results:

| Requirement | Optimal | Acceptable |
|-------------|---------|------------|
| **Duration** | 10 seconds | 5-30 seconds |
| **Format** | WAV (16-bit PCM) | WAV (any bit depth) |
| **Sample Rate** | 24000 Hz | 16000-48000 Hz |
| **Channels** | Mono | Stereo (will be converted) |
| **Content** | Clear speech | Minimal background noise |

### Recording Tips

1. Record in a quiet environment
2. Speak naturally (avoid reading robotically)
3. Include varied intonation (questions, statements)
4. Keep consistent microphone distance
5. Use [ai-coustics](https://ai-coustics.com/) to enhance recordings (optional)

### Voice File Locations

Priority order (first found is used):

1. `voice_references/default.wav` - Default voice override
2. `skills/<skill-id>/reference.wav` - Per-skill voice
3. `voice_references/<skill-id>.wav` - Global skill voice
4. Built-in voices: `alba`, `marius`, `javert`, `fantine`

### Swapping Voices

**Method 1: Replace default.wav**

```
C:\AI\localvoicemode\voice_references\default.wav
```

**Method 2: Rename your file**

```
# Your voice file -> default.wav
my_voice.wav -> default.wav
```

**Method 3: Per-skill voice**

```
C:\AI\localvoicemode\skills\hermione-companion\reference.wav
```

---

## Skills/Personalities

Skills define character personalities, system prompts, and optional voice files.
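The lookup order listed under Voice File Locations, including the optional per-skill `reference.wav`, can be sketched in Python as follows. This is a minimal illustration, not the app's actual loader: the function name `resolve_voice`, the `root` argument, and the choice of `alba` as the built-in fallback are assumptions made for the example.

```python
from pathlib import Path

# Built-in voice names listed in this guide.
BUILTIN_VOICES = ("alba", "marius", "javert", "fantine")


def resolve_voice(root: Path, skill_id: str) -> str:
    """Return the first existing voice file in priority order, else a built-in name."""
    candidates = [
        root / "voice_references" / "default.wav",      # 1. default voice override
        root / "skills" / skill_id / "reference.wav",   # 2. per-skill voice
        root / "voice_references" / f"{skill_id}.wav",  # 3. global skill voice
    ]
    for path in candidates:
        if path.is_file():
            return str(path)
    # 4. Nothing found on disk: fall back to a built-in voice.
    #    Picking the first name ("alba") is an assumption of this sketch.
    return BUILTIN_VOICES[0]
```

For example, `resolve_voice(Path(r"C:\AI\localvoicemode"), "hermione-companion")` returns the path to `default.wav` whenever that file exists, which is why replacing `default.wav` overrides any per-skill voice.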

### Available Skills

| Skill ID | Character | Description |
|----------|-----------|-------------|
| `assistant-default` | Default Assistant | General-purpose voice assistant |
| `hermione-companion` | Hermione Granger | Roleplay companion at the Hog's Head pub |

### Skill Structure

```
skills/<skill-id>/
├── SKILL.md          # Character definition (required)
├── reference.wav     # Custom voice (optional)
├── avatar.png        # Character image (optional)
└── references/       # Lore/knowledge files (optional)
```

### Creating a New Skill

1. Create folder: `skills/my-character/`
2. Create `SKILL.md`:

   ```yaml
   ---
   id: my-character
   name: My Character
   display_name: "My Character"
   description: A custom character
   voice: reference.wav
   metadata:
     setting: "Where the character is"
     greeting: "Hello! How can I help you?"
   ---

   # My Character

   ## System Prompt

   You are My Character. [Full personality description here...]
   ```

3. Add voice file (optional): `reference.wav`
4. Restart and select the skill in Settings

### SKILL.md Format

The file uses YAML frontmatter + Markdown body:

```yaml
---
id: skill-id              # Unique identifier
name: Character Name      # Display name
display_name: "Name"      # Optional display override
description: Brief desc   # For skill listing
voice: reference.wav      # Voice file (optional)
avatar: avatar.png        # Avatar image (optional)
metadata:
  setting: "Scene description"
  greeting: "First message"
  personality_traits:
    - Trait 1
    - Trait 2
  speech_patterns:
    - "Tends to say..."
---

# Character Name

## System Prompt

Full character instructions here...
```

---

## Configuration

### Environment Variables (.env)

Edit `C:\AI\localvoicemode\v2\ml-server\.env`:

```bash
# LLM Provider
OPENROUTER_API_KEY=sk-or-v1-...
OPENROUTER_MODEL=deepseek/deepseek-chat-v3-0324

# Default skill to load
DEFAULT_SKILL=assistant-default

# Force LLM provider: lm_studio, openrouter, openai
# VOICE_PROVIDER=openrouter
```

### System Prompt Override

To use a custom system prompt without creating a skill:

1. Edit `.env`
2. Add: `SYSTEM_PROMPT="You are a helpful assistant..."`

---

## Troubleshooting

### Voice not changing

- Ensure file is named exactly `default.wav`
- Check file format (must be WAV)
- Restart the ML server

### Skill not loading

- Check SKILL.md YAML syntax
- Verify skill folder exists in `skills/`
- Check ML server logs for errors

### Voice quality issues

- Use higher quality source audio
- Ensure 10+ seconds of speech
- Remove background noise
- Try different reference audio
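
When a voice does not change or sounds poor, it can help to check the reference file against the ranges in the Voice Requirements table before restarting the server. The sketch below uses only Python's standard `wave` module. The function name `check_reference_wav` and the wording of its messages are assumptions for illustration, and it is not part of the app.

```python
import wave
from pathlib import Path


def check_reference_wav(path):
    """Return a list of problems; an empty list means the file is in the acceptable ranges."""
    try:
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            frames = wav.getnframes()
    except (wave.Error, EOFError, OSError) as exc:
        # The wave module only reads PCM WAV data, so other formats end up here.
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    duration = frames / rate if rate else 0.0
    # Acceptable ranges are taken from the Voice Requirements table.
    if not 5 <= duration <= 30:
        problems.append(f"duration {duration:.1f} s is outside 5-30 seconds")
    if not 16000 <= rate <= 48000:
        problems.append(f"sample rate {rate} Hz is outside 16000-48000 Hz")
    # Stereo input and other bit depths are listed as acceptable, so they are not flagged.
    return problems
```

Calling `check_reference_wav(Path(r"C:\AI\localvoicemode\voice_references\default.wav"))` and printing the result gives a quick first pass. It does not assess background noise or speech clarity, which still need to be judged by ear.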
