Risky Business MCP Server

by khizar-anjum
elevenlabs_integration_plan.md (8.75 kB)
# ElevenLabs Text-to-Speech Integration Plan

## Overview

This document outlines the plan to replace the current `espeak` voice generation with the ElevenLabs API, giving the CVE Threat Assessment system higher-quality, more natural-sounding voice briefings.

## Current Implementation

- **Technology**: espeak (local TTS engine)
- **Method**: Subprocess call to the `espeak` command
- **Voice**: Robotic, synthetic voice
- **Latency**: Instant (local processing)
- **Cost**: Free

## Proposed ElevenLabs Implementation

### 1. Prerequisites

#### Account Setup

1. Sign up for an ElevenLabs account at https://elevenlabs.io
2. Navigate to Profile Settings
3. Copy your API key (it is sent in the `xi-api-key` request header)
4. Add the key to the `.env` file (already created) as `ELEVENLABS_API_KEY`

#### Dependencies

```bash
# Python packages
pip install requests python-dotenv

# Audio playback (choose based on OS)
# Linux:
sudo apt-get install mpg123
# macOS:
brew install mpg123

# Alternative Python-based playback:
pip install playsound  # or pygame
```

### 2. API Endpoint Details

**Endpoint**: `POST https://api.elevenlabs.io/v1/text-to-speech/{voice_id}`

**Headers**:

```json
{
  "Accept": "audio/mpeg",
  "Content-Type": "application/json",
  "xi-api-key": "YOUR_API_KEY"
}
```

**Request Body**:

```json
{
  "text": "Your text to convert",
  "model_id": "eleven_multilingual_v2",
  "voice_settings": {
    "stability": 0.5,
    "similarity_boost": 0.75
  }
}
```

### 3. Implementation Code

```python
import os
import requests
import subprocess
from dotenv import load_dotenv

# Load environment variables
load_dotenv()


def generate_elevenlabs_voice_briefing(summary_text):
    """
    Generate voice briefing using ElevenLabs API

    Args:
        summary_text (str): The text to convert to speech

    Returns:
        bool: True if the audio was generated (even if playback failed),
            False otherwise
    """
    # Get configuration from environment
    api_key = os.getenv('ELEVENLABS_API_KEY')
    voice_id = os.getenv('ELEVENLABS_VOICE_ID', '21m00Tcm4TlvDq8ikWAM')
    model_id = os.getenv('ELEVENLABS_MODEL_ID', 'eleven_multilingual_v2')
    stability = float(os.getenv('ELEVENLABS_STABILITY', '0.5'))
    similarity = float(os.getenv('ELEVENLABS_SIMILARITY_BOOST', '0.75'))

    if not api_key:
        print("Error: ELEVENLABS_API_KEY not found in environment")
        return False

    # API endpoint
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"

    # Request headers
    headers = {
        "Accept": "audio/mpeg",
        "Content-Type": "application/json",
        "xi-api-key": api_key
    }

    # Request body
    data = {
        "text": summary_text,
        "model_id": model_id,
        "voice_settings": {
            "stability": stability,
            "similarity_boost": similarity
        }
    }

    try:
        # Make API request; stream=True lets iter_content() below write the
        # audio in chunks instead of buffering the whole body first
        response = requests.post(url, json=data, headers=headers,
                                 timeout=30, stream=True)

        if response.status_code == 200:
            # Save audio file
            audio_path = '/tmp/threat_briefing.mp3'
            with open(audio_path, 'wb') as f:
                for chunk in response.iter_content(chunk_size=1024):
                    if chunk:
                        f.write(chunk)

            print(f"[+] Voice briefing generated successfully: {audio_path}")

            # Play the audio file
            try:
                # Try mpg123 first (better quality)
                subprocess.run(['mpg123', '-q', audio_path], check=True)
            except (subprocess.CalledProcessError, FileNotFoundError):
                try:
                    # Fallback to afplay on macOS
                    subprocess.run(['afplay', audio_path], check=True)
                except (subprocess.CalledProcessError, FileNotFoundError):
                    print("[-] Could not play audio. Please install mpg123 or use manual playback")
                    print(f"    Audio saved at: {audio_path}")

            return True
        else:
            print(f"[-] ElevenLabs API error: {response.status_code}")
            print(f"    Response: {response.text}")
            return False

    except requests.exceptions.RequestException as e:
        print(f"[-] Network error calling ElevenLabs API: {e}")
        return False
    except Exception as e:
        print(f"[-] Unexpected error: {e}")
        return False


def generate_voice_with_fallback(summary_text):
    """
    Generate voice briefing with fallback to espeak if ElevenLabs fails
    """
    # Try ElevenLabs first
    if generate_elevenlabs_voice_briefing(summary_text):
        return True

    # Fallback to espeak
    print("[!] Falling back to espeak...")
    try:
        subprocess.run(['espeak', '-v', 'en+m3', '-s', '140', summary_text], check=True)
        return True
    except (subprocess.CalledProcessError, FileNotFoundError):
        print("[-] Both ElevenLabs and espeak failed")
        return False
```

### 4. Integration with CLAUDE.md

Replace the current Step 6 implementation. The import below assumes the code from section 3 is saved as `voice_briefing.py`.

**Current**:

```python
subprocess.run(['espeak', '-v', 'en+m3', '-s', '140', summary_text])
```

**New**:

```python
# Import at the top of the file
from voice_briefing import generate_voice_with_fallback

# In Step 6
generate_voice_with_fallback(summary_text)
```

### 5. Available Voice Options

#### Professional Voices

- **Rachel** (`21m00Tcm4TlvDq8ikWAM`) - Calm, professional, news anchor style
- **Adam** (`pNInz6obpgDQGcFmaJgB`) - Deep, authoritative, masculine
- **Sarah** (`EXAVITQu4vr4xnSDxMaL`) - Young, friendly, energetic
- **George** (`JBFqnCBsd6RMkjVDRZzb`) - British accent, professional

#### Model Options

1. **eleven_multilingual_v2** (Default)
   - Highest quality
   - Supports 29+ languages
   - ~500ms latency
2. **eleven_turbo_v2_5**
   - Ultra-fast generation
   - 75ms latency
   - Ideal for real-time applications
3. **eleven_monolingual_v1**
   - English-optimized
   - Good balance of quality and speed

### 6. Voice Settings Explained

- **Stability** (0.0 - 1.0)
  - Lower values: More emotional, varied delivery
  - Higher values: More consistent, stable tone
  - Recommended: 0.5 for briefings
- **Similarity Boost** (0.0 - 1.0)
  - Lower values: More creative interpretation
  - Higher values: Closer to original voice profile
  - Recommended: 0.75 for clarity

### 7. Cost Considerations

#### Free Tier

- 10,000 characters/month
- ~3 minutes of audio
- Good for testing

#### Starter Plan ($5/month)

- 30,000 characters/month
- ~10 minutes of audio

#### Creator Plan ($22/month)

- 100,000 characters/month
- ~30 minutes of audio

### 8. Error Handling Strategy

1. **Primary**: Try ElevenLabs API
2. **Fallback 1**: If API fails, use espeak
3. **Fallback 2**: If both fail, save text to file
4. **Logging**: Record all failures for debugging

### 9. Security Best Practices

1. **Never commit `.env` file** - Add to `.gitignore`
2. **Use environment variables** for all sensitive data
3. **Implement rate limiting** to avoid exceeding quotas
4. **Cache generated audio** for repeated briefings
5. **Validate API responses** before processing

### 10. Testing Plan

1. **Unit Tests**:
   - Test API connection
   - Test voice generation with sample text
   - Test fallback mechanism
2. **Integration Tests**:
   - Test with actual CVE briefings
   - Test various text lengths
   - Test special characters and formatting
3. **Performance Tests**:
   - Measure API response times
   - Compare with espeak performance
   - Test under network issues

### 11. Migration Steps

1. **Phase 1**: Development
   - Set up ElevenLabs account
   - Add API key to `.env`
   - Implement `generate_elevenlabs_voice_briefing()` function
   - Test with sample texts
2. **Phase 2**: Integration
   - Add fallback mechanism
   - Update CLAUDE.md to use new function
   - Test with actual CVE assessments
3. **Phase 3**: Deployment
   - Update documentation
   - Monitor API usage
   - Gather user feedback

### 12. Monitoring & Maintenance

- **Track API usage** to avoid exceeding limits
- **Monitor audio quality** feedback
- **Update voice models** as new versions release
- **Review costs** monthly
- **Maintain fallback** systems

## Conclusion

The ElevenLabs integration will provide:

- ✅ Natural, professional voice quality
- ✅ Multiple voice options
- ✅ Multilingual support
- ✅ Adjustable voice characteristics
- ✅ Fallback to espeak for reliability

Trade-offs:

- ❌ Requires API key and internet connection
- ❌ Has usage limits and potential costs
- ❌ Slightly higher latency than local TTS
- ❌ Dependency on external service

With error handling and the espeak fallback in place, a voice briefing remains available whenever either ElevenLabs or a local espeak installation is working.
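
## Appendix: Sketch of the Full Fallback Chain

The error handling strategy (section 8) lists a final fallback of saving the text to a file and logging every failure, which the implementation code in section 3 does not cover. The following is a minimal sketch of one way to do that, not part of the original plan's code. The function name `deliver_briefing`, the logger name `voice_briefing`, and the output file name `threat_briefing.txt` are assumptions chosen for illustration.

```python
import logging
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Optional

# Hypothetical logger name; any application logger would work.
logger = logging.getLogger("voice_briefing")


def deliver_briefing(
    summary_text: str,
    providers: Iterable[Callable[[str], bool]],
    text_dir: Optional[Path] = None,
) -> str:
    """Try each voice provider in order; save the text to a file if all fail.

    Each provider is a callable that takes the briefing text and returns
    True on success. Returns the name of the provider that succeeded, or
    "text_file" when the text-file fallback was used.
    """
    for provider in providers:
        name = getattr(provider, "__name__", repr(provider))
        try:
            if provider(summary_text):
                return name
            logger.warning("Voice provider %s reported failure", name)
        except Exception:
            # Record the traceback so the failure can be debugged later.
            logger.exception("Voice provider %s raised an error", name)

    # Final fallback: keep the briefing readable even without audio.
    out_dir = Path(text_dir) if text_dir else Path(tempfile.gettempdir())
    out_path = out_dir / "threat_briefing.txt"
    out_path.write_text(summary_text, encoding="utf-8")
    logger.error("All voice providers failed; text saved to %s", out_path)
    return "text_file"
```

Under these assumptions, Step 6 could call `deliver_briefing(summary_text, [generate_voice_with_fallback])`, since that function already tries ElevenLabs first and espeak second and returns a boolean. Passing providers as a list keeps the ordering explicit and makes the chain easy to exercise in unit tests with stub functions.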
