Rime MCP

Integrations

  • Supports various Linux audio players (mpg123, mplayer, aplay, ffplay) for playing synthesized speech

  • Uses macOS native 'afplay' audio player to output synthesized speech

  • Supports context-based voice selection when discussing Python, using specific voices like 'antoine'

A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Rime API. This server downloads audio and plays it using the system's native audio player.

Features

  • Exposes a speak tool that converts text to speech and plays it through system audio
  • Uses Rime's high-quality voice synthesis API

Requirements

  • Node.js 16.x or higher
  • A working audio output device
  • macOS: Uses afplay

There's sample code from Claude for the following that is not tested 🤙✨

  • Windows: Built-in Media.SoundPlayer (PowerShell)
  • Linux: mpg123, mplayer, aplay, or ffplay
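The per-platform players listed above could be selected along these lines. This is a minimal sketch, not the server's actual code: `PlayerCommand` and `playerCandidates` are hypothetical names, and the command-line flags are one reasonable choice rather than what rime-mcp necessarily passes.

```typescript
// Hypothetical helper for illustration; not taken from the rime-mcp source.
type PlayerCommand = { command: string; args: string[] };

// Returns the playback commands to try for a platform, in order of preference.
function playerCandidates(platform: string, file: string): PlayerCommand[] {
  switch (platform) {
    case "darwin":
      // macOS ships afplay.
      return [{ command: "afplay", args: [file] }];
    case "win32": {
      // Media.SoundPlayer handles WAV files only; PlaySync blocks until playback ends.
      const escaped = file.replace(/'/g, "''"); // escape quotes for a PowerShell string
      return [
        {
          command: "powershell",
          args: ["-NoProfile", "-Command", `(New-Object Media.SoundPlayer '${escaped}').PlaySync()`],
        },
      ];
    }
    case "linux":
      // Whichever of these is installed first would be used.
      // Note that aplay plays WAV/raw audio, not MP3.
      return [
        { command: "mpg123", args: ["-q", file] },
        { command: "mplayer", args: ["-really-quiet", file] },
        { command: "aplay", args: ["-q", file] },
        { command: "ffplay", args: ["-nodisp", "-autoexit", "-loglevel", "quiet", file] },
      ];
    default:
      // Unknown platform: no known player.
      return [];
  }
}
```

A caller would pass Node's `process.platform` and the path of the downloaded audio file, then spawn the first candidate that exists on the machine.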

MCP Configuration

"ref": {
  "command": "npx",
  "args": ["rime-mcp"],
  "env": {
    "RIME_API_KEY": "your_api_key_here",
    "RIME_GUIDANCE": "<guide how the agent speaks>",
    "RIME_WHO_TO_ADDRESS": "<your name>",
    "RIME_WHEN_TO_SPEAK": "<tell the agent when to speak>",
    "RIME_VOICE": "cove"
  }
}

Every variable other than RIME_API_KEY is optional configuration.

All of the optional env vars are part of the tool definition and serve as prompts to the agent.

All voice options are listed here.

You can get your API key from the Rime Dashboard.

The following environment variables can be used to customize the behavior:

  • RIME_GUIDANCE: The main description of when and how to use the speak tool
  • RIME_WHO_TO_ADDRESS: Who the speech should address (default: "user")
  • RIME_WHEN_TO_SPEAK: When the tool should be used (default: "when asked to speak or when finishing a command")
  • RIME_VOICE: The default voice to use (default: "cove")
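To make the defaults concrete, here is a minimal sketch of how these variables could be read. `RimeConfig` and `loadRimeConfig` are hypothetical names, and treating `RIME_API_KEY` as required is an assumption based on the other variables being marked optional; the real server may handle this differently.

```typescript
// Hypothetical shape for illustration; not taken from the rime-mcp source.
interface RimeConfig {
  apiKey: string;
  guidance?: string; // no default is documented, so it stays undefined when unset
  whoToAddress: string;
  whenToSpeak: string;
  voice: string;
}

// Reads the documented variables from an env-like object and applies the documented defaults.
function loadRimeConfig(env: Record<string, string | undefined>): RimeConfig {
  const apiKey = env.RIME_API_KEY;
  if (!apiKey) {
    // Assumption: the key is mandatory because only the other variables are called optional.
    throw new Error("RIME_API_KEY is required");
  }
  return {
    apiKey,
    guidance: env.RIME_GUIDANCE,
    whoToAddress: env.RIME_WHO_TO_ADDRESS ?? "user",
    whenToSpeak: env.RIME_WHEN_TO_SPEAK ?? "when asked to speak or when finishing a command",
    voice: env.RIME_VOICE ?? "cove",
  };
}
```

In a Node.js process this would be called as `loadRimeConfig(process.env)`.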

Example use cases

Example 1: Coding agent announcements

"RIME_WHEN_TO_SPEAK": "Always conclude your answers by speaking.",
"RIME_GUIDANCE": "Give a brief overview of the answer. If any files were edited, list them."

Example 2: Learn how the kids talk these days

"RIME_GUIDANCE": "Use phrases and slang common among Gen Alpha.",
"RIME_WHO_TO_ADDRESS": "Matt",
"RIME_WHEN_TO_SPEAK": "when asked to speak"

Example 3: Different languages based on context

"RIME_VOICE": "use 'cove' when talking about TypeScript and 'antoine' when talking about Python"

Development

  1. Install dependencies:
npm install
  2. Build the server:
npm run build
  3. Run in development mode with hot reload:
npm run dev

License

MIT

Badges

  • Security: A (no known vulnerabilities)
  • License: A (permissive license)
  • Quality: A (confirmed to work)

hybrid server

The server is able to function both locally and remotely, depending on the configuration or use case.

Tools

A Model Context Protocol server that enables AI models to generate and play high-quality text-to-speech audio through your device's native audio system using Rime's voice synthesis API.

              Related MCP Servers

              • -
                security
                F
                license
                -
                quality
                Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.
                Last updated -
                2
                Python
              • -
                security
                F
                license
                -
                quality
                A Model Context Protocol server that enables AI assistants to utilize AivisSpeech Engine's high-quality voice synthesis capabilities through a standardized API interface.
                Last updated -
                TypeScript
              • -
                security
                A
                license
                -
                quality
                A server that enables Claude 3.7 and other AI agents to access VOICEVOX-compatible speech synthesis engines (AivisSpeech, VOICEVOX, COEIROINK) through the Model Context Protocol.
                Last updated -
                2
                TypeScript
                MIT License
                • Linux
              • A
                security
                A
                license
                A
                quality
                An official Model Context Protocol (MCP) server that enables AI clients to interact with ElevenLabs' Text to Speech and audio processing APIs, allowing for speech generation, voice cloning, audio transcription, and other audio-related tasks.
                Last updated -
                19
                633
                Python
                MIT License
                • Apple
