STT2TTS MCP
Integrates with Ollama to enable local LLM-based speech-to-text and text-to-speech, leveraging locally running models for on-device processing.
Integrates with OpenAI API to provide cloud-based speech-to-text (STT) and text-to-speech (TTS) capabilities, offering high-accuracy transcription and a variety of voices.
STT2TTS MCP Server
Local-first speech-to-text and text-to-speech MCP server. Engines are hot-swappable via config.yaml: no code changes, and no API keys required for the local engines.
┌──────────────┐     stdio      ┌──────────────────┐
│  MCP client  │ ◀────────────▶ │  stt2tts-mcp     │
│              │                │  ├─ STT engine   │ ──▶ faster-whisper
│              │                │  └─ TTS engine   │ ──▶ piper / kokoro / coqui
└──────────────┘                └──────────────────┘
                                         │
                                         ▼
                             config.yaml (hot-reload)

Why
Replaces whisper-mcp. The local engines work offline. Ships with five STT and six TTS engines, and switches between them per task via config.yaml.
Related MCP server: speaches-mcp
Install
pip install stt2tts-mcp
# Add the engines you actually use (quoted so shells such as zsh do not expand the brackets):
pip install "stt2tts-mcp[stt-faster-whisper]"   # local STT
pip install "stt2tts-mcp[tts-piper]"            # local TTS (~50 MB voices)
# Register with your MCP client (consult your client's docs for the exact
# config file location; most use mcp_config.json or a per-client equivalent):
{
  "mcp": {
    "stt2tts": {
      "type": "local",
      "command": ["stt2tts-mcp"],
      "enabled": true
    }
  }
}

Engines
| STT | Size | License | Best for |
|---|---|---|---|
| faster-whisper | 39M – 2.9 GB | MIT | English, INT8 CPU, fastest |
| sherpa-onnx | 39M – large | Apache 2.0 | Multilingual |
| OpenAI API | cloud | Proprietary | Highest accuracy, needs key |
| Ollama | varies | MIT | Local LLM integration |
| LMStudio | varies | MIT | Local model server |

| TTS | Voice size | License | Best for |
|---|---|---|---|
| Piper | 20 – 50 MB | Apache 2.0 | Smallest, 10-20× realtime |
| Kokoro-82M | ~330 MB | Apache 2.0 | Quality/size ratio |
| Coqui XTTS | ~1.5 GB | MPL 2.0 | Voice cloning, needs GPU |
| OpenAI API | cloud | Proprietary | All voices, needs key |
| Ollama | varies | MIT | LLM-based voices |
| LMStudio | varies | MIT | Local model server |
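One way to picture how engines can be swapped by name is a small registry keyed on the `engine:` value from config.yaml. The sketch below is illustrative only: `TTSEngine`, `register`, `build_engine`, and the two stand-in classes are hypothetical names, not taken from the stt2tts-mcp code base. Only the engine keys `piper` and `kokoro` come from this README.

```python
from typing import Callable, Dict, Protocol


class TTSEngine(Protocol):
    """Hypothetical interface: every TTS engine turns text into audio bytes."""

    def synthesize(self, text: str) -> bytes: ...


# Maps the `engine:` value from config.yaml to a factory that takes `params`.
_TTS_REGISTRY: Dict[str, Callable[[dict], TTSEngine]] = {}


def register(name: str):
    """Decorator that records an engine class under its config name."""
    def wrap(cls):
        _TTS_REGISTRY[name] = cls
        return cls
    return wrap


@register("piper")
class FakePiper:
    """Stand-in for a real Piper wrapper; returns placeholder bytes."""

    def __init__(self, params: dict):
        self.voice = params.get("voice", "en_US-lessac-medium")

    def synthesize(self, text: str) -> bytes:
        return f"[piper:{self.voice}] {text}".encode()


@register("kokoro")
class FakeKokoro:
    """Stand-in for a real Kokoro wrapper; returns placeholder bytes."""

    def __init__(self, params: dict):
        self.params = params

    def synthesize(self, text: str) -> bytes:
        return f"[kokoro] {text}".encode()


def build_engine(section: dict) -> TTSEngine:
    """Instantiate the engine named in a config section such as `tts:`."""
    name = section["engine"]
    if name not in _TTS_REGISTRY:
        raise ValueError(f"unknown TTS engine: {name!r}")
    return _TTS_REGISTRY[name](section.get("params", {}))
```

With this shape, switching engines means changing one string in the config section passed to `build_engine`; calling code only ever sees the `synthesize` method.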
Configure
config.yaml:
stt:
  engine: faster_whisper   # sherpa_onnx | openai_api | ollama | lmstudio
  enabled: true
  params:
    model_size: base.en    # tiny.en | base.en | small.en | medium.en
    device: cpu            # cpu | cuda
tts:
  engine: piper            # kokoro | coqui | openai_api | ollama | lmstudio
  enabled: true
  params:
    voice: en_US-lessac-medium
    model_dir: ~/.cache/piper

Reload without restart by calling the reload_config MCP tool.
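Because a hot-reloaded config takes effect without a restart, it can help to check the file before reloading. The following is a minimal sketch, assuming the YAML has already been parsed into a plain dict (for example by a YAML library). `validate_config` is a hypothetical helper, not part of stt2tts-mcp, and the allowed engine names are copied from the comments in the example above.

```python
from typing import List

# Engine names as listed in the comments of the config.yaml example.
STT_ENGINES = {"faster_whisper", "sherpa_onnx", "openai_api", "ollama", "lmstudio"}
TTS_ENGINES = {"piper", "kokoro", "coqui", "openai_api", "ollama", "lmstudio"}


def validate_config(cfg: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for section, allowed in (("stt", STT_ENGINES), ("tts", TTS_ENGINES)):
        block = cfg.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        engine = block.get("engine")
        if engine not in allowed:
            problems.append(f"{section}.engine: unsupported value {engine!r}")
        if not isinstance(block.get("enabled", True), bool):
            problems.append(f"{section}.enabled: expected true or false")
        if not isinstance(block.get("params", {}), dict):
            problems.append(f"{section}.params: expected a mapping")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every mistake in a single pass. How the server itself validates a reloaded file is not documented here.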
MCP Tools
| Tool | What it does |
|---|---|
| | Audio file → text |
| | Text → WAV file |
| | Available STT models |
| | Available TTS voices |
| reload_config | Re-read config.yaml |
| | Engine status |
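MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. As a rough illustration, the sketch below builds such a request for the `reload_config` tool named in the Configure section. `tool_call_request` is a hypothetical helper, and the assumption that `reload_config` takes no arguments is not confirmed by this README. A real session must also complete the MCP `initialize` handshake first, which an MCP client library normally handles.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call_request(name, arguments=None) -> str:
    """Serialize an MCP `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(message)


# Assumption: reload_config needs no arguments.
request_line = tool_call_request("reload_config")
```

Over the stdio transport, each message is written to the server's standard input as one line, which is why the helper returns a single-line string.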
Any audio format ffmpeg can decode (for example wav, mp3, ogg, flac, m4a) is accepted; STT input is converted automatically to 16 kHz mono.
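For reference, the 16 kHz mono conversion can be reproduced by hand with ffmpeg. This is a sketch of an equivalent conversion, not the server's actual code. The helper names `ffmpeg_to_16k_mono` and `convert` and the `.16k.wav` output suffix are made up for illustration; the flags `-i`, `-ac`, `-ar`, and `-y` are standard ffmpeg options.

```python
import subprocess
from pathlib import Path
from typing import List


def ffmpeg_to_16k_mono(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that converts `src` to 16 kHz mono audio."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file in any format ffmpeg can decode
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        dst,
    ]


def convert(src: str) -> Path:
    """Run the conversion next to the source file; requires ffmpeg on PATH."""
    dst = Path(src).with_suffix(".16k.wav")
    subprocess.run(ffmpeg_to_16k_mono(src, str(dst)), check=True)
    return dst
```

Keeping the command builder separate from the `subprocess.run` call makes the argument list easy to inspect or log without ffmpeg installed.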
Develop
git clone https://github.com/your-org/stt2tts-mcp
cd stt2tts-mcp
pip install -e ".[all]"
python -m stt2tts_mcp.server

License
Apache 2.0