
MCP-Audio Plugin

by AIO-2030
Apache 2.0

mcp-audio is an AIO-2030-compliant MCP plugin that performs voice-to-text transcription using the SiliconFlow speech recognition API.

It exposes the identify_voice method via both multipart/form-data and base64 formats, supports the AIO tools.call protocol, and returns JSON-RPC structured outputs.


Features

  • Fully AIO-compliant MCP plugin (/tools.call, /help)
  • Converts .wav/.mp3 audio files to transcripts using SiliconFlow
  • API key managed securely via .env file
  • Docker-compatible, with minimal dependencies
  • Registration-ready for AIO endpoint registry

Setup (Local)

1. Clone and Install

git clone git@github.com:AIO-2030/mcp-audio.git
cd mcp-audio
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

2. Add .env file

cp .env.example .env

Set your audio URL and API key:

AUDIO_URL=https--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
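
As a rough illustration of how these two settings could be read and validated at startup, here is a minimal sketch that uses only the standard library. The `load_env` and `require_settings` helpers are hypothetical names, not taken from the project, which may well rely on a package such as python-dotenv instead.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything without an '=' sign.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_settings(values):
    """Return (AUDIO_URL, API_KEY), raising if either is missing or empty."""
    missing = [k for k in ("AUDIO_URL", "API_KEY") if not values.get(k)]
    if missing:
        raise RuntimeError("Missing settings in .env: " + ", ".join(missing))
    return values["AUDIO_URL"], values["API_KEY"]
```

Calling `require_settings(load_env())` before the server starts would fail fast with a readable message when `.env` is incomplete, rather than surfacing the problem on the first transcription request.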

3. Run the MCP server

python src/mcp_server.py

4. Docker

4.1 Build and Run
docker build -t mcp-audio .
docker run --env-file .env -p 8080:8080 mcp-audio

API Overview

POST /api/v1/mcp/voice_model

Upload an audio file directly as multipart/form-data. Response:

{ "transcript": "hello world", "confidence": 0.91, "audio_hash": "a1b2c3..." }
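
A client for this endpoint might look like the following sketch, written with the standard library only. Several details are assumptions rather than facts from the project: the form field name `file`, the default `audio/wav` part type, and the base URL `http://localhost:8080` (inferred from the Docker port mapping). The helper names are hypothetical.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field, filename, data, content_type="audio/wav"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe_file(path, base_url="http://localhost:8080", field="file"):
    """POST an audio file to /api/v1/mcp/voice_model and return the parsed JSON reply.

    ASSUMPTION: the field name "file" is a guess; check the server code.
    """
    with open(path, "rb") as fh:
        body, ctype = build_multipart(field, os.path.basename(path), fh.read())
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/v1/mcp/voice_model",
        data=body,
        headers={"Content-Type": ctype},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `transcribe_file("sample.wav")["transcript"]` would yield the recognized text, assuming the response keeps the shape shown above.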

POST /api/v1/mcp/tools.call (AIO Protocol)

JSON-RPC format with base64-encoded audio. Request:

{
  "method": "tools.call",
  "params": {
    "method": "identify_voice",
    "inputs": [
      { "type": "audio", "value": "<base64-audio>" }
    ]
  }
}
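
The payload shown above can be assembled from raw audio bytes roughly as follows. This is a sketch under the assumption that the server expects exactly the fields in that example and standard (not URL-safe) base64; `build_tools_call` and `decode_audio` are hypothetical helper names.

```python
import base64


def build_tools_call(audio_bytes):
    """Wrap raw audio bytes in the tools.call payload shape from the example."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "method": "tools.call",
        "params": {
            "method": "identify_voice",
            "inputs": [{"type": "audio", "value": encoded}],
        },
    }


def decode_audio(payload):
    """Recover the raw audio bytes from a tools.call payload (the inverse step)."""
    inputs = payload["params"]["inputs"]
    # Pick the first input tagged as audio; raises StopIteration if none exists.
    audio = next(item for item in inputs if item["type"] == "audio")
    return base64.b64decode(audio["value"])
```

Serializing the result with `json.dumps` and sending it to `/api/v1/mcp/tools.call` with a `Content-Type: application/json` header (an assumption) would complete the call.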

GET /api/v1/mcp/help

Auto-serves contents of mcp_audio_registration.json. Used by Queen AI for MCP discovery and service indexing.
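
For a quick look at what the discovery endpoint returns, a small fetch helper such as the one below may be enough. It assumes the server listens on `http://localhost:8080` (taken from the Docker port mapping) and that the body is valid JSON; both function names are hypothetical.

```python
import json
import urllib.request


def help_url(base_url="http://localhost:8080"):
    """Build the discovery URL, tolerating a trailing slash on the base."""
    return base_url.rstrip("/") + "/api/v1/mcp/help"


def fetch_help(base_url="http://localhost:8080", timeout=5.0):
    """GET the registration document and return it as parsed JSON."""
    with urllib.request.urlopen(help_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```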

Testing Tools

Base64 Voice Test

python test/test_audio_base64.py

Health Check

python health_check.py

MCP Registration (to AIO Endpoint Canister)

./register_mcp.sh

Requires jq, dfx, and a running endpoint_registry canister.
