VOICEVOX TTS MCP

mcp-tts-voicevox
packages
voicevox-client

README.md•6.9 KiB

# @kajidog/voicevox-client A TypeScript client library for VOICEVOX text-to-speech synthesis engine. ## Installation ```bash npm install @kajidog/voicevox-client ``` ## Basic Usage ```typescript import { VoicevoxClient } from '@kajidog/voicevox-client'; // Initialize client const client = new VoicevoxClient({ url: 'http://localhost:50021', defaultSpeaker: 1, defaultSpeedScale: 1.0 }); // Simple text-to-speech await client.speak('Hello, world!'); // With options await client.speak('Hello, world!', { speaker: 3, speedScale: 1.2, waitForEnd: true // Wait for playback to complete }); // Generate audio file const filePath = await client.generateAudioFile('Test message', './output.wav'); // Get available speakers const speakers = await client.getSpeakers(); ``` ## Features - **Text-to-Speech Synthesis**: Convert text to speech with multiple speaker voices - **Audio Queue Management**: Efficient queue-based audio processing and playback - **Streaming Playback**: Direct buffer playback via ffplay (no temp files) - **Cross-platform Audio Playback**: Native audio playback support without external dependencies - **macOS**: Uses built-in `afplay` command - **Windows**: Uses PowerShell MediaPlayer with optimized timing - **Linux**: Auto-detects available players (`aplay`, `paplay`, `play`, `ffplay`) - **File Generation**: Generate WAV audio files from text - **Speaker Management**: Get information about available speakers and voices - **Flexible Input**: Support for single text, text arrays, and speech segments - **Advanced Playback Control**: Immediate playback, synchronous/asynchronous control - **Lightweight**: No external audio dependencies - uses platform-native tools ## API Reference ### VoicevoxClient Main client class for interacting with VOICEVOX engine. #### Constructor ```typescript new VoicevoxClient(config: VoicevoxConfig) ``` **VoicevoxConfig:** ```typescript interface VoicevoxConfig { url: string; // VOICEVOX engine URL defaultSpeaker?: number; // Default speaker ID (default: 1) defaultSpeedScale?: number; // Default playback speed (default: 1.0) defaultPlaybackOptions?: PlaybackOptions; // Default playback options } ``` #### Methods ##### speak Convert text to speech and play it. ```typescript speak( input: string | string[] | SpeechSegment[], options?: SpeakOptions ): Promise<string> ``` **SpeakOptions:** ```typescript interface SpeakOptions { speaker?: number; // Speaker ID speedScale?: number; // Playback speed immediate?: boolean; // Start playback immediately (default: true) waitForStart?: boolean; // Wait for playback to start (default: false) waitForEnd?: boolean; // Wait for playback to end (default: false) } ``` **Examples:** ```typescript // Simple text await client.speak('Hello'); // Multiple texts as array await client.speak(['Hello', 'How are you?']); // Speech segments with different speakers await client.speak([ { text: 'Hello', speaker: 1 }, { text: 'Nice to meet you', speaker: 3 } ]); // With options await client.speak('Important message', { speaker: 2, speedScale: 1.5, immediate: true, waitForEnd: true }); ``` ##### generateQuery Generate an AudioQuery for voice synthesis. ```typescript generateQuery( text: string, speaker?: number, speedScale?: number ): Promise<AudioQuery> ``` ##### generateAudioFile Generate an audio file from text or AudioQuery. ```typescript generateAudioFile( textOrQuery: string | AudioQuery, outputPath?: string, speaker?: number, speedScale?: number ): Promise<string> ``` ##### enqueueAudioGeneration Add text or query to the audio generation queue. ```typescript enqueueAudioGeneration( input: string | string[] | SpeechSegment[] | AudioQuery, options?: SpeakOptions ): Promise<string> ``` ##### Other Methods - `getSpeakers(): Promise<Speaker[]>` - Get list of available speakers - `getSpeakerInfo(uuid: string): Promise<SpeakerInfo>` - Get speaker details - `clearQueue(): Promise<void>` - Clear the playback queue - `startPlayback(): void` - Start queue playback - `pausePlayback(): void` - Pause queue playback - `resumePlayback(): void` - Resume queue playback - `getQueueLength(): number` - Get number of items in queue - `isQueueEmpty(): boolean` - Check if queue is empty - `isPlaying(): boolean` - Check if currently playing ## Playback Options ### Immediate Playback (`immediate: true`) Clear existing queue and play audio immediately: ```typescript // Stops current playback, clears queue, and plays new audio await client.speak('Urgent notification', { immediate: true, waitForEnd: true }); ``` ### Synchronous Playback (`waitForEnd: true`) Wait for playback to complete before continuing: ```typescript // Step-by-step audio guide await client.speak('Step 1: Open the file', { waitForEnd: true }); await client.speak('Step 2: Click the button', { waitForEnd: true }); ``` ### Queue-based Playback (`immediate: false`) Add to queue without auto-starting: ```typescript client.speak('First message', { immediate: false }); client.speak('Second message', { immediate: false }); client.startPlayback(); // Start playing queue ``` ## Streaming Playback When `ffplay` is available, the library can play audio directly from memory without creating temporary files: - Faster first audio playback (no disk I/O) - Reduced disk usage - Can be disabled via environment variable: `VOICEVOX_STREAMING_PLAYBACK=false` ## Audio Requirements The package uses platform-native audio tools for playback: - **macOS**: No additional setup required (uses built-in `afplay`) - **Windows**: No additional setup required (uses PowerShell) - **Linux**: Requires one of the following audio players: - `aplay` (ALSA) - `paplay` (PulseAudio) - `play` (SoX) - `ffplay` (FFmpeg) For streaming playback (optional): - Install `ffmpeg` which includes `ffplay` ## Environment Variables - `VOICEVOX_URL`: VOICEVOX engine URL (default: `http://localhost:50021`) - `VOICEVOX_DEFAULT_SPEAKER`: Default speaker ID (default: `1`) - `VOICEVOX_DEFAULT_SPEED_SCALE`: Default playback speed (default: `1.0`) - `VOICEVOX_DEFAULT_IMMEDIATE`: Start playback immediately (default: `true`) - `VOICEVOX_DEFAULT_WAIT_FOR_START`: Wait for playback start (default: `false`) - `VOICEVOX_DEFAULT_WAIT_FOR_END`: Wait for playback end (default: `false`) - `VOICEVOX_STREAMING_PLAYBACK`: Enable streaming playback (default: `true`) ## Development This package is part of the MCP VOICEVOX project. For development: ```bash # Install dependencies npm install # Build the package npm run build # Run tests (includes audio playback mocking) npm test # Type checking and linting npm run lint ``` ## License MIT License

Loading blob content...

Latest Blog Posts

Don't Use Large Strings as Cache Keys
By punkpeye on January 11, 2026.
markdown
node-js
cache
What are Claude Skills?
By punkpeye on January 10, 2026.
mcp
skills
How to Test MCP Streamable HTTP Endpoints Using cURL
By punkpeye on January 2, 2026.
tutorial
bash

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kajidog/mcp-tts-voicevox'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•6.9 KiB