
speaches-mcp

An MCP (Model Context Protocol) server that exposes speaches as transcribe_audio and text_to_speech tools.

Speaches is a local, OpenAI API-compatible server for speech-to-text (via faster-whisper) and text-to-speech (via Kokoro/Piper). This MCP server lets AI assistants like Claude use it directly.

Tools

transcribe_audio

Transcribe an audio file using your speaches instance.

| Parameter | Required | Description |
| --- | --- | --- |
| file_path | Yes | Absolute path to the audio file |
| language | No | ISO-639-1 language code (e.g. en). Omit for auto-detect. |
| model | No | Whisper model ID. Defaults to SPEACHES_STT_MODEL env var. |
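To illustrate how these parameters fit together, here is a minimal sketch of the validation and defaulting a caller might do before transcribing. It is not the server's actual code: `build_transcription_fields` is a hypothetical helper, and the fallback model is simply the default listed under Environment Variables.

```python
import os
from typing import Optional


# Hypothetical helper, not part of speaches-mcp: shows how the three
# tool parameters could be validated and combined.
def build_transcription_fields(
    file_path: str,
    language: Optional[str] = None,
    model: Optional[str] = None,
    default_model: str = "Systran/faster-whisper-large-v3",
) -> dict:
    # The tool expects an absolute path, so reject relative ones early.
    if not os.path.isabs(file_path):
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    fields = {"file_path": file_path, "model": model or default_model}
    # Leaving language out lets the server auto-detect it.
    if language:
        fields["language"] = language
    return fields
```

Calling `build_transcription_fields("/tmp/meeting.wav")` yields only the path and the default model, which corresponds to the auto-detect case described in the table.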

text_to_speech

Convert text to speech and save to a file.

| Parameter | Required | Description |
| --- | --- | --- |
| text | Yes | Text to convert |
| output_path | Yes | Absolute path for the output audio file (e.g. /tmp/output.mp3) |
| voice | No | Voice ID. Defaults to TTS_VOICE env var. |
| model | No | TTS model ID. Defaults to TTS_MODEL env var. |
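Because speaches is OpenAI API-compatible, a text-to-speech call can be pictured as a POST to the OpenAI-style `/v1/audio/speech` endpoint. The sketch below uses only the Python standard library and assumes that endpoint and its `model`, `input`, and `voice` fields; the function names are hypothetical and this is not necessarily how the MCP server itself issues the request.

```python
import json
import urllib.request


# Hypothetical helper: builds (but does not send) an OpenAI-style
# speech request. Defaults mirror the documented TTS_VOICE/TTS_MODEL.
def build_speech_request(
    base_url: str,
    text: str,
    voice: str = "af_heart",
    model: str = "speaches-ai/Kokoro-82M-v1.0-ONNX",
) -> urllib.request.Request:
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical helper: sends the request and writes the returned
# audio bytes to output_path. Requires a reachable speaches instance.
def save_speech(request: urllib.request.Request, output_path: str) -> None:
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as out:
        out.write(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect without a running speaches instance.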

Usage

With Docker + Supergateway (SSE transport)

This exposes the MCP server over SSE on port 8010, which suits remote clients.

docker compose up --build

Then connect your MCP client to http://localhost:8010/sse.
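For readers curious what a client sees on that URL, the following is a small sketch of a Server-Sent Events parser. It assumes the gateway follows the MCP HTTP+SSE transport, in which the first event is named `endpoint` and carries the URL to POST messages to; `parse_sse` is a hypothetical helper, the sample session ID is made up, and whitespace handling is simplified compared with the full SSE specification.

```python
from typing import Iterable, Iterator, Tuple


# Hypothetical helper: turns raw SSE lines into (event, data) pairs.
def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            # Lines starting with a colon are comments (keep-alives).
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())


# Illustrative input only; a real stream would come from the /sse URL.
sample = ["event: endpoint\n", "data: /message?sessionId=abc\n", "\n"]
events = list(parse_sse(sample))
```

In practice an MCP client library handles this handshake for you; the sketch is only meant to show the shape of the stream.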

Standalone (stdio transport)

Build the image:

docker build -t speaches-mcp .

Run it:

docker run --rm -i \
  -e SPEACHES_URL=http://your-speaches-host:8000 \
  speaches-mcp

For Claude Desktop, add the following to your config:

{
  "mcpServers": {
    "speaches": {
      "command": "docker",
      "args": ["run", "--rm", "-i",
        "-e", "SPEACHES_URL=http://your-speaches-host:8000",
        "speaches-mcp"
      ]
    }
  }
}
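If you already have other servers configured, the entry has to be merged into the existing `mcpServers` object rather than replacing the file. A minimal sketch of that merge, using a hypothetical `add_speaches_server` helper that operates on an already-loaded config dictionary:

```python
import json


# Hypothetical helper: adds the "speaches" entry shown above to a
# config dict while leaving any other configured servers untouched.
def add_speaches_server(config: dict, speaches_url: str) -> dict:
    servers = config.setdefault("mcpServers", {})
    servers["speaches"] = {
        "command": "docker",
        "args": [
            "run", "--rm", "-i",
            "-e", f"SPEACHES_URL={speaches_url}",
            "speaches-mcp",
        ],
    }
    return config


# Example: start from an empty config and render the result as JSON.
merged = add_speaches_server({}, "http://your-speaches-host:8000")
rendered = json.dumps(merged, indent=2)
```

Loading and saving the actual config file is left out because its location depends on your platform.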

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SPEACHES_URL | http://speaches:8000 | Base URL of your speaches instance |
| STT_MODEL | Systran/faster-whisper-large-v3 | Default speech-to-text model |
| TTS_MODEL | speaches-ai/Kokoro-82M-v1.0-ONNX | Default text-to-speech model |
| TTS_VOICE | af_heart | Default TTS voice |
| OPENAI_API_KEY | dummy | Required by the OpenAI SDK but not used by speaches |
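As an illustration of how these variables and defaults could be consumed, here is a minimal sketch that reads them into a settings object. The `Settings` class and `load_settings` function are hypothetical names; the server's real configuration code may be organized differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    speaches_url: str
    stt_model: str
    tts_model: str
    tts_voice: str
    openai_api_key: str


# Hypothetical loader: falls back to the documented default whenever
# a variable is not set in the given environment mapping.
def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    return Settings(
        speaches_url=env.get("SPEACHES_URL", "http://speaches:8000"),
        stt_model=env.get("STT_MODEL", "Systran/faster-whisper-large-v3"),
        tts_model=env.get("TTS_MODEL", "speaches-ai/Kokoro-82M-v1.0-ONNX"),
        tts_voice=env.get("TTS_VOICE", "af_heart"),
        openai_api_key=env.get("OPENAI_API_KEY", "dummy"),
    )
```

Passing a plain dictionary instead of `os.environ` makes it easy to check overrides, for example `load_settings({"TTS_VOICE": "af_heart"})`.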

Downloading Models

Before using either tool, make sure the models you need have been downloaded into speaches:

# Speech-to-text
curl http://your-speaches-host:8000/v1/models/Systran/faster-whisper-large-v3 -X POST

# Text-to-speech
curl http://your-speaches-host:8000/v1/models/speaches-ai/Kokoro-82M-v1.0-ONNX -X POST
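The same two requests can be scripted. The sketch below mirrors the curl commands using only the Python standard library; `build_model_download_request` and `download_models` are hypothetical helper names, and the model IDs are inserted into the URL path unescaped, exactly as in the curl examples.

```python
import urllib.request
from typing import Iterable, List


# Hypothetical helper: builds the POST request for one model ID
# without sending it.
def build_model_download_request(base_url: str, model_id: str) -> urllib.request.Request:
    url = f"{base_url.rstrip('/')}/v1/models/{model_id}"
    return urllib.request.Request(url, method="POST")


# Hypothetical helper: sends one download request per model and
# returns the HTTP status codes. Requires a reachable speaches instance.
def download_models(base_url: str, model_ids: Iterable[str]) -> List[int]:
    statuses = []
    for model_id in model_ids:
        request = build_model_download_request(base_url, model_id)
        with urllib.request.urlopen(request) as response:
            statuses.append(response.status)
    return statuses
```

For example, `download_models("http://your-speaches-host:8000", ["Systran/faster-whisper-large-v3", "speaches-ai/Kokoro-82M-v1.0-ONNX"])` would issue both requests in turn.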

License

MIT
