vocald
Headless text-to-speech / speech-to-text server with a REST API and a native MCP endpoint. No GUI, no GPU, no torch — built to give AI agents and scripts a voice.
TTS: Kokoro-82M (50 preset voices, 8 languages) on onnxruntime, int8
STT: Whisper via faster-whisper (CTranslate2)
API: REST + MCP (Streamable HTTP at /mcp), same tools either way
Auth: single bearer key over every route including /mcp; the docker image fails closed
Footprint: 1.2 GB image, ~350 MiB RSS after a generation (the torch-based ancestor was 7.4 GB / 2.2 GiB)
Install
Docker
export VOICEBOX_API_KEY=$(openssl rand -hex 24)
docker compose -f docker-compose.headless.yml up -d --build
curl -H "Authorization: Bearer $VOICEBOX_API_KEY" http://127.0.0.1:17600/health
The Kokoro model is baked into the image (no first-request download); Whisper downloads into a named volume on first transcription. Without a key the container refuses to serve anything: deny by default.
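If a script needs to block until the container is actually serving, a small polling loop helps. The sketch below is an illustration, not part of vocald: `wait_for_health` is a hypothetical helper name, and it assumes that /health answers with a 2xx status once the server is ready.

```shell
# Poll a health URL until it answers with a 2xx status or the retry budget runs out.
# Usage: wait_for_health URL [ATTEMPTS]   (hypothetical helper, one attempt per second)
wait_for_health() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP error statuses; --max-time bounds each attempt.
    if curl -fsS -o /dev/null --max-time 2 \
        -H "Authorization: Bearer ${VOICEBOX_API_KEY:-}" "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example: wait_for_health http://127.0.0.1:17600/health 60 && echo "vocald is up"
```

The function returns 0 as soon as one request succeeds and 1 if every attempt fails, so it can gate later steps with `&&`.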
Homebrew (macOS, native)
bash verify-cli.sh # builds + installs the formula from your working tree, tests it end to end
# or step by step — see INSTALL.md
vocald-server # 127.0.0.1:17493, open loopback dev mode
VOICEBOX_API_KEY=$(openssl rand -hex 24) vocald-server --host 0.0.0.0
Data lives in ~/.voicebox; model files (~115 MB) download on first generation. See INSTALL.md for the full environment reference.
API in 30 seconds
AUTH="Authorization: Bearer $VOICEBOX_API_KEY"
# create a preset voice
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
-d '{"name":"Bella","voice_type":"preset","preset_engine":"kokoro","preset_voice_id":"af_bella"}' \
http://127.0.0.1:17600/profiles
# synthesize, then download the artifact
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
-d '{"profile_id":"<id>","text":"Hello from vocald."}' \
http://127.0.0.1:17600/generate # -> {"id": ...}
curl -H "$AUTH" -o hello.wav http://127.0.0.1:17600/audio/<id>
# transcribe
curl -X POST -H "$AUTH" -F file=@hello.wav http://127.0.0.1:17600/transcribe
Preset voice ids: GET /profiles/presets/kokoro.
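The synthesize and download calls above can be chained into one helper. This is a minimal sketch under assumptions that the text here does not confirm: the /generate response carries a top-level "id" (as the comment above suggests), and the audio is ready as soon as /generate returns. The `VOCALD_URL` variable and all function names are hypothetical. The sed-based JSON handling is deliberately naive (no newlines in the text, no nested "id" fields); if jq is installed, prefer it.

```shell
BASE=${VOCALD_URL:-http://127.0.0.1:17600}   # VOCALD_URL is a made-up override
AUTH="Authorization: Bearer ${VOICEBOX_API_KEY:-}"

# Escape backslashes and double quotes so the text can sit inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the /generate request body. Usage: generate_payload PROFILE_ID TEXT
generate_payload() {
  printf '{"profile_id":"%s","text":"%s"}' "$1" "$(json_escape "$2")"
}

# Read JSON on stdin and print the first "id" value (quoted string or bare number).
extract_id() {
  sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}]*\)"\{0,1\}.*/\1/p' | head -n 1
}

# Synthesize TEXT with PROFILE_ID and save the audio to OUTFILE.
# Usage: speak_to_file PROFILE_ID TEXT OUTFILE
speak_to_file() {
  gen_id=$(curl -fsS -X POST -H "$AUTH" -H 'Content-Type: application/json' \
    -d "$(generate_payload "$1" "$2")" "$BASE/generate" | extract_id)
  # An empty id means the request failed or the response had an unexpected shape.
  [ -n "$gen_id" ] || return 1
  curl -fsS -H "$AUTH" -o "$3" "$BASE/audio/$gen_id"
}

# Example: speak_to_file "<id>" 'Hello from vocald.' hello.wav
```

Splitting payload building and id extraction into their own functions keeps the network call small and lets the string handling be checked without a running server.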
MCP
Point any MCP client (Claude Code, Cursor, …) at http://host:port/mcp with
the Authorization: Bearer <key> header. Tools:
| Tool | What it does |
| --- | --- |
| | List voices |
| | Create a preset voice ( |
| | Synthesize and return |
| | Fire-and-forget synthesis (returns a poll URL) |
| | Audio (base64) → text |
| | Recent recordings with transcripts |
The tool names keep the voicebox. prefix for compatibility with existing
clients and skills.
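To check the endpoint and key without a full MCP client, you can send the JSON-RPC initialize request that an MCP Streamable HTTP session starts with. A hedged sketch: the function names and the clientInfo values are made up for illustration, and the protocolVersion string is an assumption about what this server negotiates.

```shell
# JSON-RPC "initialize" request body, the first message of an MCP session.
# protocolVersion and clientInfo are illustrative values, not taken from vocald.
mcp_initialize_body() {
  printf '%s' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-03-26","capabilities":{},"clientInfo":{"name":"curl-probe","version":"0.0.1"}}}'
}

# POST the request to an MCP endpoint. Usage: mcp_probe http://host:port/mcp
# Streamable HTTP clients declare that they accept both JSON and SSE responses.
mcp_probe() {
  curl -fsS -X POST \
    -H "Authorization: Bearer ${VOICEBOX_API_KEY:-}" \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json, text/event-stream' \
    -d "$(mcp_initialize_body)" "$1"
}

# Example: mcp_probe http://127.0.0.1:17600/mcp
```

A successful call should print the server's initialize result. Because of `-f`, an HTTP error status (for example from a rejected key) makes curl exit non-zero instead of printing the error body.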
Podcast skill
vocald pairs with a Claude Code skill that
turns any file, URL, or topic into a NotebookLM-style two-host audio
conversation — the skill writes the dialogue, then voices each turn through
vocald's API and plays it locally. See
examples/podcast-skill/ for the skill definition;
drop it into ~/.claude/skills/podcast/ and run /podcast <source>.
Limits, by design
Kokoro preset voices only — the voice-cloning engines and the LLM "personality" rewrite from the upstream project are not included in this slim build (requests degrade with clear errors, never crashes). If you need cloning, use the upstream desktop app.
Responsible use
Synthetic speech can be misused. See RESPONSIBLE_USE.md — in short: only voices you own or have permission to use, no impersonation, no fraud.
Credits & license
vocald is a headless, torch-free hard fork of
jamiepine/voicebox (forked at
b542768), which provides the desktop app this server grew out of. MIT
license — see LICENSE.