Speech Processing

MCP Servers for Speech Processing

MCP servers for voice interaction and speech processing, covering speech-to-text conversion, audio commands, and voice generation.


ElevenLabs MCP Server (official)
elevenlabs
Security: A, License: A, Quality: A
An official Model Context Protocol (MCP) server that enables AI clients to interact with ElevenLabs' Text to Speech and audio processing APIs, allowing for speech generation, voice cloning, audio transcription, and other audio-related tasks.
Last updated -
19
843
Python
MIT License
Audio Transcriber MCP Server
Ichigo3766
Security: A, License: A, Quality: A
An MCP server that enables transcription of audio files using OpenAI's Speech-to-Text API, with support for multiple languages and file saving options.
Last updated -
1
4
7
JavaScript
MIT License
MCP Simple AivisSpeech
shinshin86
Security: A, License: A, Quality: A
A Model Context Protocol server that integrates with AivisSpeech to enable AI assistants to convert text to natural-sounding Japanese speech with customizable voice parameters.
Last updated -
1
337
7
JavaScript
Apache 2.0
Rime MCP
MatthewDailey
Security: A, License: A, Quality: A
A Model Context Protocol server that enables AI models to generate and play high-quality text-to-speech audio through your device's native audio system using Rime's voice synthesis API.
Last updated -
1
6
8
JavaScript
The Unlicense
Asterisk S2S MCP Server
gcorroto
Security: A, License: A, Quality: A
MCP server for automated conversational phone calls using Asterisk with Speech-to-Speech capabilities, allowing users to place phone calls as easily as writing a prompt.
Last updated -
9
530
2
TypeScript
MIT License
MCP Make Sound
nocoo
Security: A, License: A, Quality: A
A Model Context Protocol server for macOS that enables AI assistants to play system sounds for audio feedback, offering informational, warning, and error sound options.
Last updated -
3
JavaScript
MIT License
douyin-mcp-server
yzfly
Security: A, License: A, Quality: A
douyin-mcp-server
Last updated -
3
171
Python
MIT License
TranscriptionTools MCP Server
MushroomFleet
Security: A, License: A, Quality: A
Provides intelligent transcript processing capabilities for Claude, featuring natural formatting, contextual repair, and smart summarization powered by Deep Thinking LLMs.
Last updated -
4
15
TypeScript
MIT License
mcp-hfspace MCP Server
xiyuefox
Security: A, License: A, Quality: A
Connects Claude Desktop to Hugging Face Spaces with minimal setup, enabling capabilities like image generation, vision tasks, text-to-speech, and chat with AI models.
Last updated -
3
397
MIT License
Voicevox MCP Server
Dosugamea
Security: A, License: A, Quality: A
A server that enables Claude 3.7 and other AI agents to access VOICEVOX-compatible speech synthesis engines (AivisSpeech, VOICEVOX, COEIROINK) through the Model Context Protocol.
Last updated -
1
10
TypeScript
MIT License
Bouyomi-chan MCP Server
uraoz
Security: A, License: A, Quality: A
A Node.js server that enables AI assistants to interact with Bouyomi-chan's text-to-speech functionality through Model Context Protocol (MCP), allowing for voice reading of text with adjustable parameters.
Last updated -
1
1
JavaScript
MIT License
AivisSpeech MCP Server
kentaro
Security: A, License: F, Quality: A
A Model Context Protocol server that enables AI assistants to utilize AivisSpeech Engine's high-quality voice synthesis capabilities through a standardized API interface.
Last updated -
1
TypeScript
Gladia MCP (official)
gladiaio
Security: -, License: A, Quality: -
Official Model Context Protocol server that enables interaction with powerful Speech-to-Text and Audio Intelligence APIs, allowing clients like Claude Desktop to transcribe audio, analyze speech, translate content, and more.
Last updated -
2
Python
MIT License
Windows TTS MCP Server
balloonf
Security: A, License: F, Quality: A
A PowerShell-based MCP server that enables Claude Desktop to convert text to speech using Windows' built-in Speech API, offering playback control as well as speed and volume adjustment.
Last updated -
10
Python
Whissle MCP Server
WhissleAI
Security: A, License: F, Quality: A
A Python-based server that provides access to Whissle API endpoints for speech-to-text, diarization, translation, and text summarization.
Last updated -
5
1
Python
Flyworks MCP (official)
Flyworks-AI
Security: -, License: A, Quality: -
A Model Context Protocol server that enables fast and free lipsync video creation for a wide range of digital avatars, supporting both audio and text inputs to generate synchronized lip movements.
Last updated -
90
Python
MIT License
Zonos TTS MCP Server
PhialsBasement
Security: A, License: F, Quality: A
Facilitates direct speech generation using Claude for multiple languages and emotions, integrating with a Zonos TTS setup via the Model Context Protocol.
Last updated -
1
5
12
TypeScript
MCP FishAudio Server
da-okazaki
Security: -, License: A, Quality: -
An MCP (Model Context Protocol) server that provides seamless integration between Fish Audio's Text-to-Speech API and LLMs like Claude, enabling natural language-driven speech synthesis.
Last updated -
27
6
TypeScript
MIT License
NijiVoice-MCP
ryoooo
Security: -, License: A, Quality: -
An MCP server that enables LLMs to access the NijiVoice API for text-to-speech generation, supporting features like fetching available voice actors and checking credit balance.
Last updated -
Python
MIT License
Speech MCP
Kvadratni
Security: -, License: A, Quality: -
A Goose MCP extension providing voice interaction with modern audio visualization, allowing users to communicate with Goose through speech rather than text.
Last updated -
57
Python
MIT License
YouTube Transcript MCP Server
Kyungpyo-Kim
Security: -, License: A, Quality: -
Enables AI models like Claude to easily access and utilize subtitle data from YouTube videos by extracting transcripts from video URLs with support for multiple languages.
Last updated -
Python
MIT License
MCP-Audio Plugin
AIO-2030
Security: -, License: A, Quality: -
A voice-to-text transcription service that converts audio files to transcripts using SiliconFlow, supporting both multipart/form-data and base64 formats.
Last updated -
7
Python
Apache 2.0
Blabber-MCP
pinkpixel-dev
Security: -, License: A, Quality: -
An MCP server that enables LLMs to generate spoken audio from text using OpenAI's Text-to-Speech API, supporting various voices, models, and audio formats.
Last updated -
2
1
JavaScript
MIT License
MCP Audio Transcriber
ShreyasTembhare
Security: -, License: A, Quality: -
A portable, Dockerized Python tool that implements Model Context Protocol for audio transcription using Whisper models, featuring both CLI and web UI interfaces for converting audio files to JSON transcriptions.
Last updated -
Python
MIT License
Voice Recognition MCP Service
yangsenessa
Security: -, License: A, Quality: -
Provides voice recognition and text extraction capabilities with support for both stdio and MCP modes, processing audio files or base64 encoded data and returning structured results with language, emotion, and speaker information.
Last updated -
Python
MIT License
Voice Recorder MCP Server
DefiBax
Security: -, License: A, Quality: -
Enables recording audio from a microphone and transcribing it using OpenAI's Whisper model. Works as both a standalone MCP server and a Goose AI agent extension.
Last updated -
6
Python
MIT License
Voice Call MCP Server
popcornspace
Security: -, License: A, Quality: -
A Model Context Protocol server that enables AI assistants like Claude to initiate and manage real-time voice calls using Twilio and OpenAI's voice models.
Last updated -
44
TypeScript
MIT License
Audio MCP Server
GongRzhe
Security: -, License: A, Quality: -
Enables Claude and other AI assistants to interact with your computer's audio system, allowing for recording from microphones and playing audio through speakers.
Last updated -
3
Python
MIT License
MCP Agent Platform
rolenet
Security: -, License: A, Quality: -
A multi-agent human-computer interaction system that enables natural interaction through integrated visual recognition, speech recognition, and speech synthesis capabilities.
Last updated -
15
Python
Apache 2.0
ElevenLabs MCP Server
nguyendinhsinh361
Security: -, License: A, Quality: -
Official ElevenLabs Model Context Protocol server that enables AI assistants like Claude to interact with Text to Speech and audio processing APIs, allowing them to generate speech, clone voices, transcribe audio, and create soundscapes.
Last updated -
1
Python
MIT License
RetellAI MCP Server
abhaybabbar
Security: -, License: F, Quality: -
A Model Context Protocol server implementation that enables AI assistants to interact with RetellAI's voice services for managing calls, agents, phone numbers, and voice options.
Last updated -
597
11
TypeScript
Interactive Voice MCP Server
rungee84
Security: -, License: F, Quality: -
Enables voice-based interactions with Claude by converting text to speech using Kokoro TTS and transcribing user responses using NVIDIA NeMo ASR, creating interactive voice dialogues.
Last updated -
Python
Edge-TTS MCP Server
yuiseki
Security: -, License: F, Quality: -
A Model Context Protocol server that provides text-to-speech functionality for AI agents using Microsoft Edge's text-to-speech technology, supporting multiple voices, languages, and voice customization.
Last updated -
4
Python
Voice Mode
mbailey
Security: -, License: F, Quality: -
Natural voice conversations for AI assistants, bringing human-like voice interactions to Claude, ChatGPT, and other LLMs through the Model Context Protocol (MCP).
Last updated -
141
Python
MS-Lucidia-Voice-Gateway-MCP
ExpressionsBot
Security: -, License: F, Quality: -
A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.
Last updated -
4
JavaScript
Resemble AI Voice Generation MCP Server
obaid
Security: -, License: F, Quality: -
Integrates with Claude and Cursor using the Model Context Protocol to generate voice audio from text using Resemble AI's voices.
Last updated -
Python
Kokoro TTS MCP Server
giannisanni
Security: -, License: F, Quality: -
Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.
Last updated -
7
Python
Votars MCP
scarletlabs-ai
Security: -, License: F, Quality: -
Votars is the world's smartest multilingual meeting assistant, designed for voice recording, transcription, and advanced AI processing. It features real-time translation, intelligent error correction, AI summarization, smart content generation, and AI discussions. The Votars app is available on Web,
Last updated -
27
Go
systemprompt-mcp-interview
Ejb503
Security: -, License: F, Quality: -
A specialized Model Context Protocol (MCP) server that enables AI-powered interview roleplay scenarios for practice with realistic conversational feedback.
Last updated -
10
4
TypeScript
IntelliGlow
shree-bd
Security: -, License: F, Quality: -
A Model Context Protocol (MCP) server that allows AI assistants like Claude to control real smart bulbs via UDP network communication, featuring voice commands, AI reasoning, and direct hardware control.
Last updated -
1
Python
MCP Video & Audio Text Extraction Server
SealinGp
Security: -, License: F, Quality: -
An MCP server that downloads videos or extracts audio from platforms such as YouTube, Bilibili, and TikTok, then transcribes them to text using OpenAI's Whisper model.
Last updated -
5
Python
Systemprompt MCP Gmail Server
Ejb503
Security: -, License: F, Quality: -
Enables users to manage Gmail accounts using AI agent-assisted operations via the Model Context Protocol, supporting email search, reading, deletion, and sending with a voice-powered interface.
Last updated -
22
9
TypeScript
