Gladia MCP

Official
by gladiaio

Features

  • Audio transcription with speaker diarization
  • Real-time speech-to-text
  • Audio intelligence capabilities:
    • Translation
    • Summarization
    • Named Entity Recognition
    • Sentiment Analysis
    • Content Moderation
    • Chapterization
    • Audio to LLM integration
  • Async API with FastAPI
  • Easy-to-use CLI
  • Configurable logging
  • CORS support
  • Health check endpoint

Quickstart with Claude Desktop

  1. Get your API key from Gladia. There is a free tier available.
  2. Install uv (a Python package manager) with curl -LsSf https://astral.sh/uv/install.sh | sh, or see the uv repo for additional install methods.
  3. Go to Claude > Settings > Developer > Edit Config and edit claude_desktop_config.json to include the following:
{
  "mcpServers": {
    "Gladia": {
      "command": "uvx",
      "args": ["gladia-mcp"],
      "env": {
        "GLADIA_API_KEY": "<insert-your-api-key-here>"
      }
    }
  }
}

If you're using Windows, you will have to enable "Developer Mode" in Claude Desktop to use the MCP server. Click "Help" in the hamburger menu at the top left and select "Enable Developer Mode".
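
If claude_desktop_config.json already lists other servers, the entry from step 3 has to be merged into the existing "mcpServers" object rather than replacing the file. The following is a minimal sketch of that merge in Python; the helper names add_gladia_server and update_config_file are hypothetical and not part of gladia-mcp, and the sketch assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path


def add_gladia_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the Gladia entry from step 3 added.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["Gladia"] = {
        "command": "uvx",
        "args": ["gladia-mcp"],
        "env": {"GLADIA_API_KEY": api_key},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, api_key: str) -> None:
    """Read the JSON config at `path` (if it exists), add the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_gladia_server(config, api_key), indent=2))
```

Calling update_config_file with the path that Edit Config opens and your API key would produce the same structure as the hand-edited file. Restart Claude Desktop afterwards so it rereads the configuration.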

Other MCP clients

For other clients like Cursor and Windsurf, run:

  1. pip install gladia-mcp
  2. python -m gladia_mcp --api-key={{PUT_YOUR_API_KEY_HERE}} --print to get the configuration. Paste it into the appropriate configuration directory specified by your MCP client.

Example usage

Try asking Claude:

  • "Transcribe this audio file and identify different speakers"
  • "Convert this recording to text and translate it to Spanish"
  • "Analyze the sentiment and emotions in this speech"
  • "Extract key topics and create chapters from this long audio file"
  • "Transcribe this conversation and summarize the main points"

Optional features

You can add the GLADIA_MCP_BASE_PATH environment variable to claude_desktop_config.json to set the base path that the MCP server uses for relative paths, both when it looks for input files and when it writes output files.
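
To illustrate what a base path means for relative paths, here is a small sketch of the usual resolution rule: absolute paths are left alone and relative ones are joined onto the base. This is an assumption about the behavior, not the server's actual code; resolve_against_base is a hypothetical helper, and PurePosixPath is used only to keep the example deterministic across platforms.

```python
import os
from pathlib import PurePosixPath
from typing import Optional


def resolve_against_base(path: str, base_path: Optional[str]) -> PurePosixPath:
    """Join a relative `path` onto `base_path`; leave absolute paths untouched.

    Illustrative only: the real server may resolve paths differently.
    """
    candidate = PurePosixPath(path)
    if candidate.is_absolute() or not base_path:
        return candidate
    return PurePosixPath(base_path) / candidate


# The variable would be set in the "env" block of the server entry,
# next to GLADIA_API_KEY. Here it is simply read from the environment.
base = os.environ.get("GLADIA_MCP_BASE_PATH")
resolved = resolve_against_base("audio/call.wav", base)
```

With the variable set to /home/me/recordings, a request for audio/call.wav would, under this assumption, refer to /home/me/recordings/audio/call.wav.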

Contributing

If you want to contribute or run from source:

  1. Clone the repository:
git clone https://github.com/gladia/gladia-mcp
cd gladia-mcp
  2. Create a virtual environment and install dependencies using uv:
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
  3. Copy .env.example to .env and add your Gladia API key:
cp .env.example .env
# Edit .env and add your API key
  4. Run the tests to make sure everything is working:
./scripts/test.sh
# Or with options
./scripts/test.sh --verbose --fail-fast
  5. Install the server in Claude Desktop: mcp install gladia_mcp/server.py
  6. Debug and test locally with MCP Inspector: mcp dev gladia_mcp/server.py

API Endpoints

Health Check

GET /health

Transcribe Audio

POST /transcribe

Parameters:

  • file: Audio file (multipart/form-data)
  • diarization: Enable speaker diarization (boolean, optional)
  • language: Language code (string, optional)

Example using curl:

curl -X POST "http://localhost:8000/transcribe" \
  -H "accept: application/json" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@audio.wav" \
  -F "diarization=true"
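
The same request can be built from Python. The sketch below uses only the standard library and constructs the multipart/form-data body by hand; build_transcribe_request is a hypothetical helper, the URL is the one from the curl example, and the field names (file, diarization, language) are taken from the parameter list above. The response format is not documented here, so the sketch stops at building the request.

```python
import urllib.request
import uuid
from typing import Optional


def build_transcribe_request(
    url: str,
    filename: str,
    audio: bytes,
    diarization: Optional[bool] = None,
    language: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request carrying the audio as multipart/form-data."""
    boundary = uuid.uuid4().hex

    # Optional text fields are only sent when the caller sets them.
    fields = {}
    if diarization is not None:
        fields["diarization"] = "true" if diarization else "false"
    if language is not None:
        fields["language"] = language

    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    # The file part: headers, a blank line, the raw bytes, then CRLF.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())

    request = urllib.request.Request(url, data=b"".join(parts), method="POST")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    request.add_header("Accept", "application/json")
    return request
```

To send it, read the audio file as bytes, pass the result to build_transcribe_request together with "http://localhost:8000/transcribe", and hand the returned object to urllib.request.urlopen while the server is running locally.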

Troubleshooting

Logs when running with Claude Desktop can be found at:

  • Windows: %APPDATA%\Claude\logs\mcp-server-gladia.log
  • macOS: ~/Library/Logs/Claude/mcp-server-gladia.log
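
If you want to locate the log from a script, the two locations above can be expressed as follows. This is a sketch only: claude_log_path is a hypothetical helper, the caller supplies the value of %APPDATA% or the home directory, and platforms not listed above are rejected rather than guessed.

```python
from pathlib import PurePosixPath, PureWindowsPath

LOG_NAME = "mcp-server-gladia.log"


def claude_log_path(platform: str, appdata: str = "", home: str = "") -> str:
    """Return the Claude Desktop log path for the Gladia server.

    `platform` follows sys.platform ("win32", "darwin").
    `appdata` is the value of %APPDATA% on Windows; `home` is the
    user's home directory on macOS.
    """
    if platform.startswith("win"):
        return str(PureWindowsPath(appdata) / "Claude" / "logs" / LOG_NAME)
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Logs" / "Claude" / LOG_NAME)
    raise ValueError(f"no documented log location for platform {platform!r}")
```

In practice you would call it with sys.platform, os.environ.get("APPDATA", "") and str(pathlib.Path.home()).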

MCP Gladia: spawn uvx ENOENT

If you encounter the error "MCP Gladia: spawn uvx ENOENT", Claude Desktop could not find the uvx executable. Find its absolute path by running this command in your terminal:

which uvx

Once you obtain the absolute path (e.g., /usr/local/bin/uvx), update your configuration to use that path (e.g., "command": "/usr/local/bin/uvx"). This ensures that the correct executable is referenced.
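
The same fix can be applied programmatically. The sketch below looks up the executable with shutil.which, the Python counterpart of the which command, and rewrites the "command" field of the server entry; pin_command_path is a hypothetical helper, and the lookup function is a parameter so the logic can be exercised without uvx installed.

```python
import shutil
from typing import Callable, Optional


def pin_command_path(
    config: dict,
    server: str = "Gladia",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> dict:
    """Return a copy of `config` whose server command is an absolute path."""
    entry = config["mcpServers"][server]
    resolved = which(entry["command"])
    if resolved is None:
        # Same situation as ENOENT: the executable is not on PATH.
        raise FileNotFoundError(f"{entry['command']} not found on PATH")
    return {
        **config,
        "mcpServers": {
            **config["mcpServers"],
            server: {**entry, "command": resolved},
        },
    }
```

If the lookup itself fails, uvx is not on the PATH of the shell you ran it from either, which usually means uv still needs to be installed.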

Development

Running Tests

pytest

Code Style

The project follows the PEP 8 style guide. Use flake8 for linting:

flake8 gladia_mcp

License

MIT License


Official Model Context Protocol server that enables interaction with powerful Speech-to-Text and Audio Intelligence APIs, allowing clients like Claude Desktop to transcribe audio, analyze speech, translate content, and more.
