
vocald

by danylo-dudok



Headless text-to-speech / speech-to-text server with a REST API and a native MCP endpoint. No GUI, no GPU, no torch — built to give AI agents and scripts a voice.

TTS: Kokoro-82M (50 preset voices, 8 languages) on onnxruntime, int8
STT: Whisper via faster-whisper (CTranslate2)
API: REST + MCP (Streamable HTTP at /mcp) — same tools either way
Auth: a single bearer key protects every route, including /mcp; the Docker image fails closed
Footprint: 1.2 GB image, ~350 MiB RSS after a generation (the torch-based ancestor was 7.4 GB / 2.2 GiB)

Install

Docker

export VOICEBOX_API_KEY=$(openssl rand -hex 24)
docker compose -f docker-compose.headless.yml up -d --build
curl -H "Authorization: Bearer $VOICEBOX_API_KEY" http://127.0.0.1:17600/health

The Kokoro model is baked into the image (no first-request download); Whisper downloads into a named volume on first transcription. Without a key the container refuses to serve anything — deny by default.
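The same key generation and health check can be sketched in Python. This is a minimal illustration, not part of vocald: the helper names `new_api_key`, `health_request`, and `check_health` are hypothetical, and the shape of the `/health` response body is not documented here, so only the HTTP status is read.

```python
import secrets
import urllib.request


def new_api_key() -> str:
    # 24 random bytes rendered as 48 hex characters, like `openssl rand -hex 24`.
    return secrets.token_hex(24)


def health_request(base_url: str, api_key: str) -> urllib.request.Request:
    # Build (but do not send) an authenticated GET for the /health route.
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/health",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def check_health(base_url: str, api_key: str, timeout: float = 5.0) -> int:
    # Send the request and return the HTTP status code.
    # urllib raises HTTPError for 4xx/5xx responses, e.g. when the key is wrong.
    with urllib.request.urlopen(health_request(base_url, api_key), timeout=timeout) as resp:
        return resp.status
```

With the container running, `check_health("http://127.0.0.1:17600", os.environ["VOICEBOX_API_KEY"])` would perform the same check as the `curl` line above.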

Homebrew (macOS, native)

bash verify-cli.sh   # builds + installs the formula from your working tree, tests it end to end
# or step by step — see INSTALL.md
vocald-server        # 127.0.0.1:17493, open loopback dev mode
VOICEBOX_API_KEY=$(openssl rand -hex 24) vocald-server --host 0.0.0.0

Data lives in ~/.voicebox; model files (~115 MB) download on first generation. See INSTALL.md for the full environment reference.


API in 30 seconds

AUTH="Authorization: Bearer $VOICEBOX_API_KEY"

# create a preset voice
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{"name":"Bella","voice_type":"preset","preset_engine":"kokoro","preset_voice_id":"af_bella"}' \
  http://127.0.0.1:17600/profiles

# synthesize with the id of the profile created above, then download the artifact by generation id
curl -X POST -H "$AUTH" -H 'Content-Type: application/json' \
  -d '{"profile_id":"<profile_id>","text":"Hello from vocald."}' \
  http://127.0.0.1:17600/generate          # -> {"id": ...} (the generation id)
curl -H "$AUTH" -o hello.wav "http://127.0.0.1:17600/audio/<generation_id>"

# transcribe
curl -X POST -H "$AUTH" -F file=@hello.wav http://127.0.0.1:17600/transcribe

To list the available preset voice ids, call GET /profiles/presets/kokoro.
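For scripts that prefer Python over `curl`, the create, generate, and download steps above might look like the sketch below. It uses only the standard library and the routes and JSON fields shown in the `curl` examples. The function names are hypothetical, and `synthesize_to_file` assumes that the `/generate` response is a JSON object with an `id` key and that the audio is ready to download as soon as `/generate` returns.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:17600"  # Docker port used in the curl examples


def _json_post(path, body, api_key):
    # Build (but do not send) an authenticated JSON POST.
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_profile_request(api_key, name, preset_voice_id):
    # Same body as the "create a preset voice" curl call.
    body = {
        "name": name,
        "voice_type": "preset",
        "preset_engine": "kokoro",
        "preset_voice_id": preset_voice_id,
    }
    return _json_post("/profiles", body, api_key)


def generate_request(api_key, profile_id, text):
    return _json_post("/generate", {"profile_id": profile_id, "text": text}, api_key)


def audio_request(api_key, generation_id):
    return urllib.request.Request(
        f"{BASE_URL}/audio/{generation_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def synthesize_to_file(api_key, profile_id, text, out_path):
    # Assumption: the /generate response is JSON containing an "id" key.
    with urllib.request.urlopen(generate_request(api_key, profile_id, text)) as resp:
        generation_id = json.load(resp)["id"]
    # Assumption: the artifact can be fetched immediately after /generate returns.
    with urllib.request.urlopen(audio_request(api_key, generation_id)) as resp:
        with open(out_path, "wb") as fh:
            fh.write(resp.read())
    return generation_id
```

The request builders are kept separate from the function that sends them so the payloads can be inspected or tested without a running server.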

MCP

Point any MCP client (Claude Code, Cursor, …) at http://host:port/mcp with the Authorization: Bearer <key> header. Tools:

| Tool | What it does |
| --- | --- |
| `voicebox.list_profiles` | List voices |
| `voicebox.create_profile` | Create a preset voice (`af_bella`, `am_adam`, …) |
| `voicebox.generate` | Synthesize and return `{generation_id, audio_url}`; download with the same bearer key |
| `voicebox.speak` | Fire-and-forget synthesis (returns a poll URL) |
| `voicebox.transcribe` | Audio (base64) → text |
| `voicebox.list_captures` | Recent recordings with transcripts |

The tool names keep the `voicebox.` prefix for compatibility with existing clients and skills.
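Most clients reach these tools through an MCP SDK, but it can help to see the message an MCP client sends over Streamable HTTP. The sketch below builds a JSON-RPC 2.0 `tools/call` request for the `/mcp` endpoint with the bearer header. The helper name `tool_call_request` is hypothetical, and passing an empty `arguments` object to `voicebox.list_profiles` is an assumption, since the tool schemas are not listed here. A real session also begins with an `initialize` handshake, which this sketch leaves out.

```python
import json
import urllib.request


def tool_call_request(mcp_url, api_key, tool_name, arguments, request_id=1):
    # JSON-RPC 2.0 envelope for the MCP "tools/call" method.
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # Build (but do not send) the POST; Streamable HTTP clients accept
    # either a plain JSON response or an SSE stream.
    return urllib.request.Request(
        mcp_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Example: the request a client might send to list voices.
# The empty arguments object is an assumption about the tool's schema.
example = tool_call_request(
    "http://127.0.0.1:17600/mcp", "<key>", "voicebox.list_profiles", {}
)
```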

Podcast skill

vocald pairs with a Claude Code skill that turns any file, URL, or topic into a NotebookLM-style two-host audio conversation — the skill writes the dialogue, then voices each turn through vocald's API and plays it locally. See examples/podcast-skill/ for the skill definition; drop it into ~/.claude/skills/podcast/ and run /podcast <source>.
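As a rough illustration of the "voice each turn" step, the sketch below maps a two-host dialogue onto request bodies for the `/generate` route. It is not the skill's actual implementation: the function `turns_to_payloads`, the speaker labels, and the profile ids are all hypothetical. Only the `profile_id` and `text` fields come from the REST example.

```python
from typing import Dict, List, Tuple


def turns_to_payloads(
    dialogue: List[Tuple[str, str]], profile_ids: Dict[str, str]
) -> List[dict]:
    # Turn (speaker, line) pairs into /generate request bodies, in order.
    payloads = []
    for speaker, line in dialogue:
        text = line.strip()
        if not text:
            continue  # skip empty turns instead of synthesizing silence
        payloads.append({"profile_id": profile_ids[speaker], "text": text})
    return payloads


# Hypothetical two-host script and profile ids.
script = [("host_a", "Welcome back."), ("host_b", "Glad to be here.")]
bodies = turns_to_payloads(script, {"host_a": "p1", "host_b": "p2"})
```

Each body could then be posted to `/generate` in sequence, with the resulting audio files played back in the same order.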

Limits, by design

Kokoro preset voices only. The voice-cloning engines and the LLM "personality" rewrite from the upstream project are not included in this slim build; requests for them fail with a clear error rather than crashing the server. If you need cloning, use the upstream desktop app.

Responsible use

Synthetic speech can be misused. See RESPONSIBLE_USE.md — in short: only voices you own or have permission to use, no impersonation, no fraud.

Credits & license

vocald is a headless, torch-free hard fork of jamiepine/voicebox (forked at b542768), which provides the desktop app this server grew out of. MIT license — see LICENSE.
