
vocald

by danylo-dudok

Python

Remote

Headless text-to-speech / speech-to-text server with a REST API and a native MCP endpoint. No GUI, no GPU, no torch — built to give AI agents and scripts a voice.

TTS: Kokoro-82M (50 preset voices, 8 languages) on onnxruntime, int8
STT: Whisper via faster-whisper (CTranslate2)
API: REST + MCP (Streamable HTTP at /mcp) — same tools either way
Auth: a single bearer key protects every route, including /mcp; the Docker image fails closed (it refuses to serve without a key)
Footprint: 1.2 GB image, ~350 MiB RSS after a generation (the torch-based ancestor was 7.4 GB / 2.2 GiB)

Install

Docker

export VOICEBOX_API_KEY=$(openssl rand -hex 24)
docker compose -f docker-compose.headless.yml up -d --build
curl -H "Authorization: Bearer $VOICEBOX_API_KEY" http://127.0.0.1:17600/health

The Kokoro model is baked into the image (no first-request download); Whisper downloads into a named volume on first transcription. Without a key the container refuses to serve anything — deny by default.
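The health check above can also be run from a script, for example to wait for the container before sending work to it. This is a minimal sketch using only the Python standard library. It assumes the Docker port 17600 shown above and that `/health` answers with HTTP 200 once the server is ready (the response body is not documented here). `health_request` and `wait_until_healthy` are hypothetical helper names, not part of vocald.

```python
import time
import urllib.error
import urllib.request


def health_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated GET /health request (nothing is sent yet)."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/health",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def wait_until_healthy(base_url: str, api_key: str,
                       attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll /health until it answers 200 or the attempts run out."""
    for _ in range(attempts):
        try:
            req = health_request(base_url, api_key)
            with urllib.request.urlopen(req, timeout=5) as resp:
                # ASSUMPTION: a 200 status means the server is ready.
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused (still starting) or an HTTP error such as
            # a rejected key; both are treated as "not ready yet".
            pass
        time.sleep(delay)
    return False
```

A caller would pass the key exported earlier, for example `wait_until_healthy("http://127.0.0.1:17600", os.environ["VOICEBOX_API_KEY"])`.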

Homebrew (macOS, native)

bash verify-cli.sh   # builds + installs the formula from your working tree, tests it end to end
# or step by step — see INSTALL.md
vocald-server        # 127.0.0.1:17493, open loopback dev mode
VOICEBOX_API_KEY=$(openssl rand -hex 24) vocald-server --host 0.0.0.0

Data lives in ~/.voicebox; model files (~115 MB) download on first generation. See INSTALL.md for the full environment reference.

API in 30 seconds

AUTH="Authorization: Bearer $VOICEBOX_API_KEY"

# create a preset voice
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{"name":"Bella","voice_type":"preset","preset_engine":"kokoro","preset_voice_id":"af_bella"}' \
  http://127.0.0.1:17600/profiles

# synthesize, then download the artifact
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{"profile_id":"<id>","text":"Hello from vocald."}' \
  http://127.0.0.1:17600/generate          # -> {"id": ...}
curl -H "$AUTH" -o hello.wav http://127.0.0.1:17600/audio/<id>

# transcribe
curl -X POST -H "$AUTH" -F file=@hello.wav http://127.0.0.1:17600/transcribe

Preset voice ids: GET /profiles/presets/kokoro.
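The curl sequence above leaves `<id>` to be filled in by hand. A script can chain the three calls instead. The sketch below uses only the Python standard library and rests on several assumptions: that `POST /profiles` returns JSON with an `id` field (only the `/generate` response shape is shown above), and that the audio is downloadable as soon as `/generate` returns. `json_post`, `send_json`, and `speak_to_file` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:17600"  # Docker port from the examples above


def json_post(path: str, payload: dict, api_key: str,
              base_url: str = BASE_URL) -> urllib.request.Request:
    """Build an authenticated JSON POST request without sending it."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_json(req: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def speak_to_file(text: str, api_key: str, out_path: str,
                  voice_id: str = "af_bella") -> None:
    """Create a preset profile, synthesize text, and save the audio."""
    # ASSUMPTION: the profile response carries its identifier under "id".
    profile = send_json(json_post("/profiles", {
        "name": "Bella",
        "voice_type": "preset",
        "preset_engine": "kokoro",
        "preset_voice_id": voice_id,
    }, api_key))
    generation = send_json(json_post(
        "/generate", {"profile_id": profile["id"], "text": text}, api_key))
    audio_req = urllib.request.Request(
        f"{BASE_URL}/audio/{generation['id']}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # ASSUMPTION: the artifact is ready immediately after /generate returns.
    with urllib.request.urlopen(audio_req, timeout=60) as resp, \
            open(out_path, "wb") as fh:
        fh.write(resp.read())
```

Keeping request construction (`json_post`) separate from sending makes the payloads easy to inspect or log before anything reaches the server.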

MCP

Point any MCP client (Claude Code, Cursor, …) at http://host:port/mcp with the Authorization: Bearer <key> header. Tools:

| Tool | What it does |
| --- | --- |
| `voicebox.list_profiles` | List voices |
| `voicebox.create_profile` | Create a preset voice (`af_bella`, `am_adam`, …) |
| `voicebox.generate` | Synthesize and return `{generation_id, audio_url}`; download the audio with the same bearer key |
| `voicebox.speak` | Fire-and-forget synthesis (returns a poll URL) |
| `voicebox.transcribe` | Audio (base64) → text |
| `voicebox.list_captures` | Recent recordings with transcripts |

The tool names keep the `voicebox.` prefix for compatibility with existing clients and skills.
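Most MCP clients handle the wire protocol themselves, but it can help to see what travels over `/mcp`. The sketch below builds the JSON-RPC 2.0 messages and HTTP headers that the MCP Streamable HTTP transport defines: an `initialize` request, the `notifications/initialized` follow-up, and a `tools/call`. It only constructs the payloads and sends nothing. The protocol revision string, the client name, and the assumption that `voicebox.list_profiles` takes no arguments are illustrative; the argument schemas of the vocald tools are not documented here.

```python
import json

# A published MCP spec revision; ASSUMPTION that vocald negotiates it.
MCP_PROTOCOL_VERSION = "2025-03-26"


def mcp_headers(api_key, session_id=None):
    """Headers for a Streamable HTTP POST to /mcp."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # The transport requires the client to accept both response styles.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        # Echo the session id if the server issued one on initialize.
        headers["Mcp-Session-Id"] = session_id
    return headers


def initialize_message(request_id=1):
    """First request of every MCP session."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": MCP_PROTOCOL_VERSION,
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        },
    }


def initialized_notification():
    """Notification sent after the initialize response (it has no id)."""
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def tool_call_message(request_id, tool, arguments):
    """A tools/call request for one of the tools listed above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# ASSUMPTION: list_profiles needs no arguments.
print(json.dumps(tool_call_message(2, "voicebox.list_profiles", {})))
```

Each message is sent as the body of an HTTP POST to `/mcp` with the headers from `mcp_headers`.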

Podcast skill

vocald pairs with a Claude Code skill that turns any file, URL, or topic into a NotebookLM-style two-host audio conversation — the skill writes the dialogue, then voices each turn through vocald's API and plays it locally. See examples/podcast-skill/ for the skill definition; drop it into ~/.claude/skills/podcast/ and run /podcast <source>.
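To illustrate the "voices each turn" step, the sketch below maps a two-host dialogue onto request bodies for `/generate`, one per turn, using the `profile_id` and `text` fields from the API example above. It is not the skill's actual implementation: the turn format, the speaker names, the profile ids, and the helper `turns_to_generate_payloads` are all hypothetical.

```python
def turns_to_generate_payloads(turns, voices):
    """Map (speaker, line) turns to /generate request bodies, in order.

    turns  -- list of (speaker, line) tuples
    voices -- dict mapping each speaker to a vocald profile id
    """
    payloads = []
    for speaker, line in turns:
        text = line.strip()
        if not text:
            continue  # skip empty turns instead of synthesizing silence
        if speaker not in voices:
            raise KeyError(f"no voice profile configured for {speaker!r}")
        payloads.append({"profile_id": voices[speaker], "text": text})
    return payloads


# Hypothetical dialogue and profile ids, for illustration only.
dialogue = [
    ("host_a", "Welcome back to the show."),
    ("host_b", "Today we are reading a README."),
    ("host_a", "   "),
]
voices = {"host_a": "profile-a", "host_b": "profile-b"}

for payload in turns_to_generate_payloads(dialogue, voices):
    print(payload)
```

Each payload would then be posted to `/generate` and the resulting audio played back in turn order.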

Limits, by design

Kokoro preset voices only: the voice-cloning engines and the LLM "personality" rewrite from the upstream project are not included in this slim build (requests for them fail with clear errors rather than crashing). If you need cloning, use the upstream desktop app.

Responsible use

Synthetic speech can be misused. See RESPONSIBLE_USE.md. In short: use only voices you own or have permission to use, and do not use them for impersonation or fraud.

Credits & license

vocald is a headless, torch-free hard fork of jamiepine/voicebox (forked at b542768), which provides the desktop app this server grew out of. MIT license — see LICENSE.
