Rime MCP
A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Rime API. This server downloads audio and plays it using the system's native audio player.
Features
- Exposes a speak tool that converts text to speech and plays it through system audio
- Uses Rime's high-quality voice synthesis API
Requirements
- Node.js 16.x or higher
- A working audio output device
- macOS: Uses afplay
The Windows and Linux players below rely on sample code from Claude that has not been tested 🤙✨
- Windows: Built-in Media.SoundPlayer (PowerShell)
- Linux: mpg123, mplayer, aplay, or ffplay
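The per-platform player list above could be resolved roughly as follows. This is a minimal sketch, not the server's actual code: the names `PLAYER_CANDIDATES` and `pickPlayer` are hypothetical, the platform keys follow Node.js `process.platform` values, and using `powershell` as the Windows command is an assumption based on the Media.SoundPlayer note.

```typescript
// Hypothetical lookup table: Node.js platform string -> player commands,
// listed in the order they would be tried.
const PLAYER_CANDIDATES = new Map<string, string[]>([
  ["darwin", ["afplay"]],
  // Assumption: Media.SoundPlayer would be driven through the powershell command.
  ["win32", ["powershell"]],
  ["linux", ["mpg123", "mplayer", "aplay", "ffplay"]],
]);

// Returns the first candidate for the platform that the caller reports as
// installed, or null when the platform is unknown or no player is available.
function pickPlayer(
  platform: string,
  isInstalled: (cmd: string) => boolean
): string | null {
  const candidates = PLAYER_CANDIDATES.get(platform) ?? [];
  return candidates.find(isInstalled) ?? null;
}
```

In a real server the caller would likely pass `process.platform` and an `isInstalled` check that probes the PATH; both are left to the caller here so the function stays pure and easy to test.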
MCP Configuration
All of the optional env vars are part of the tool definition and are prompts to
All voice options are listed here.
You can get your API key from the Rime Dashboard.
The following environment variables can be used to customize the behavior:
- RIME_GUIDANCE: The main description of when and how to use the speak tool
- RIME_WHO_TO_ADDRESS: Who the speech should address (default: "user")
- RIME_WHEN_TO_SPEAK: When the tool should be used (default: "when asked to speak or when finishing a command")
- RIME_VOICE: The default voice to use (default: "cove")
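As an illustration of how these variables and their documented defaults might be read, here is a minimal sketch. The `SpeakConfig` shape and the `loadSpeakConfig` function are hypothetical names, not taken from the server's source. Treating only unset variables (rather than empty strings) as missing is also an assumption.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface SpeakConfig {
  guidance: string | undefined; // no default is documented for RIME_GUIDANCE
  whoToAddress: string;
  whenToSpeak: string;
  voice: string;
}

// Reads the documented environment variables from a plain key/value object,
// falling back to the defaults listed above when a variable is unset.
function loadSpeakConfig(env: Record<string, string | undefined>): SpeakConfig {
  return {
    guidance: env["RIME_GUIDANCE"],
    whoToAddress: env["RIME_WHO_TO_ADDRESS"] ?? "user",
    whenToSpeak:
      env["RIME_WHEN_TO_SPEAK"] ??
      "when asked to speak or when finishing a command",
    voice: env["RIME_VOICE"] ?? "cove",
  };
}
```

Under Node.js, `loadSpeakConfig(process.env)` would pick up whatever the MCP client passes in its env block.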
Example use cases
Example 1: Coding agent announcements
Example 2: Learn how the kids talk these days
Example 3: Different languages based on context
Development
- Install dependencies:
- Build the server:
- Run in development mode with hot reload:
License
MIT