Interactive Voice MCP Server (Kokoro TTS + NeMo ASR)
A Model Context Protocol server that provides Text-to-Speech (TTS) capabilities using Kokoro and Speech-to-Text (STT) capabilities using NVIDIA NeMo Parakeet models, enabling interactive voice dialogues.
Available Tools
interactive_voice_dialog
- Synthesizes text to speech, plays it, then listens for user speech input and returns the transcription.
- Required arguments:
  - text_to_speak (string): The text for the assistant to speak.
- Optional arguments:
  - voice (string): The voice to use for TTS (e.g., 'af_heart'). Defaults to 'af_heart'.
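As a rough illustration of how a client would invoke this tool, the sketch below builds the JSON-RPC request body that the Model Context Protocol uses for tool calls (the tools/call method). The helper name build_dialog_request and the request id are hypothetical and exist only for this example; the tool name and its two arguments are taken from the description above.

```python
import json


def build_dialog_request(text_to_speak, voice=None, request_id=1):
    """Build an MCP tools/call request for interactive_voice_dialog.

    Hypothetical helper for illustration; it is not part of this server.
    """
    if not text_to_speak:
        raise ValueError("text_to_speak is required")

    arguments = {"text_to_speak": text_to_speak}
    if voice is not None:
        # Optional argument; per the README the server defaults to 'af_heart'.
        arguments["voice"] = voice

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "interactive_voice_dialog",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    request = build_dialog_request("What should we work on today?")
    print(json.dumps(request, indent=2))
```

In practice an MCP client such as Claude Desktop assembles this request itself; the sketch only shows which fields carry the required and optional arguments.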
Installation
Prerequisites
Some of the underlying TTS models require espeak-ng to be installed on your system.
Windows Installation:
- Go to espeak-ng releases.
- Click on "Latest release".
- Download the appropriate *.msi file (e.g. espeak-ng-20191129-b702b03-x64.msi).
- Run the downloaded installer.
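After running the installer, a quick way to see whether espeak-ng is reachable is to look it up on PATH. The sketch below assumes the executable is named espeak-ng and that the installer (or you) added its folder to PATH; neither is guaranteed on Windows, so a negative result may only mean PATH needs updating or a new terminal needs to be opened.

```python
import shutil


def find_espeak_ng(executable="espeak-ng"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_espeak_ng()
    if path:
        print(f"espeak-ng found at {path}")
    else:
        # Assumption: the TTS backend locates espeak-ng through PATH.
        print("espeak-ng was not found on PATH")
```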
Local Development Installation
To allow Claude Desktop to launch this server using python -m mcp_server_tts, you need to install it as a Python module. Installing in "editable" mode (-e) is recommended for development, as it means changes to the source code are reflected immediately without needing to reinstall.
Navigate to the directory containing the pyproject.toml file (the root of this server project) and run pip install -e . from there.
After installation, you can start the server directly with python -m mcp_server_tts.
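To confirm that the module is visible to the interpreter Claude Desktop will use, a small check like the following can help. It is only a sketch: the function name is_module_installed is made up for this example, and it verifies that the mcp_server_tts module can be located, not that the server's TTS and ASR dependencies load correctly.

```python
import importlib.util


def is_module_installed(module_name="mcp_server_tts"):
    """Return True if the current interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_module_installed():
        print("mcp_server_tts is installed for this interpreter")
    else:
        print("mcp_server_tts was not found; check that the install used this interpreter")
```

Run it with the same python executable that the Claude Desktop configuration points to, since a module installed into one environment is not visible from another.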
Configuration
To use this server with Claude Desktop, you need to add it to your claude_desktop_config.json file.
The location of this file is typically: C:\Users\<YourUsername>\AppData\Roaming\Claude\claude_desktop_config.json
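Add an entry for this server under the mcpServers object in your claude_desktop_config.json. The entry should launch the server with the command python and the arguments -m mcp_server_tts, matching the module invocation described under Local Development Installation.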
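A minimal sketch of such an entry, built in Python so it can be merged with whatever servers are already configured. The server key interactive-voice is a hypothetical name chosen for this example (any unique key should work), and the fallback path assumes the typical Windows location given above. The script only prints the merged configuration; it does not modify the file.

```python
import json
import os
from pathlib import Path

# Hypothetical key for this server inside "mcpServers"; pick any unique name.
SERVER_KEY = "interactive-voice"


def with_voice_server(config, python_command="python"):
    """Return a copy of config with this server added under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[SERVER_KEY] = {
        "command": python_command,
        "args": ["-m", "mcp_server_tts"],
    }
    updated["mcpServers"] = servers
    return updated


def default_config_path():
    """Typical Windows location of the Claude Desktop config file."""
    appdata = os.environ.get("APPDATA", str(Path.home() / "AppData" / "Roaming"))
    return Path(appdata) / "Claude" / "claude_desktop_config.json"


if __name__ == "__main__":
    path = default_config_path()
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    # Print the merged result for review before editing the real file by hand.
    print(json.dumps(with_voice_server(config), indent=2))
```

If python on your PATH is not the interpreter where the module was installed, pass the full path to the correct executable as python_command.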