Supports containerized deployment of the MCP server with Docker build and run capabilities.
Manages API keys and configuration securely through environment variables stored in a .env file.
Allows source code management and versioning for the MCP plugin repository.
Hosts the repository and provides source code access via git clone from the AIO-2030 organization.
Provides the runtime environment for the MCP server implementation.
MCP-Audio Plugin
mcp-audio is an AIO-2030 compliant MCP plugin that performs voice-to-text transcription using the Audio speech recognition API. It exposes the identify_voice method via both multipart/form-data and base64 formats, supports the AIO tools.call protocol, and returns JSON-RPC structured outputs.
Features
- Fully AIO-compliant MCP plugin (/tools.call, /help)
- Converts .wav/.mp3 audio files to transcripts using SiliconFlow
- API key managed securely via .env file
- Docker-compatible and minimal dependencies
- Registration-ready for AIO endpoint registry
Setup (Local)
1. Clone and Install
2. Add .env file
Set your audio URL and API key:
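As a rough illustration of how a .env file of this kind could be read at startup, here is a minimal sketch using only the Python standard library. The variable names AUDIO_URL and API_KEY and both values are assumptions made for the example, not the names the server is confirmed to expect.

```python
import os


def load_env(text: str) -> dict:
    """Parse KEY=VALUE lines from .env-style text, skipping blanks and comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical variable names and placeholder values, for illustration only.
sample = """
# .env
AUDIO_URL=https://example.invalid/v1/audio/transcriptions
API_KEY="replace-me"
"""

config = load_env(sample)

# Export to the process environment without overriding values already set.
for name, value in config.items():
    os.environ.setdefault(name, value)
```

Keeping the key in .env rather than in source code means it can stay out of version control and be supplied separately to a container.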
3. Run the MCP server
4. Docker
4.1 Build and Run
API Overview
POST /api/v1/mcp/voice_model
Upload audio file directly. Response:
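On the request side, a direct upload to this endpoint could be assembled roughly as follows. This is a sketch under assumptions: the form field name "file", the host and port, and the sample bytes are all hypothetical, and the server's actual response body is not shown.

```python
import uuid


def build_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Return (body, content_type_header) for a single-file multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


# "file" is a hypothetical form field name; the bytes stand in for real audio.
body, header = build_multipart("file", "sample.wav", b"RIFF....WAVE", "audio/wav")

# Sending the request is left commented out because it needs a running server.
# The host and port below are assumptions.
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/api/v1/mcp/voice_model",
#     data=body,
#     headers={"Content-Type": header},
#     method="POST",
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```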
POST /api/v1/mcp/tools.call (AIO Protocol)
JSON-RPC format with base64-encoded audio. Response:
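A request body for this endpoint might be built along the following lines. Only the JSON-RPC 2.0 envelope, the identify_voice method name, and the use of base64 come from the description above. The parameter name audio_data is an assumption, and the real schema should be taken from the plugin's /help output.

```python
import base64
import json


def build_tools_call(audio: bytes, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request that carries base64-encoded audio."""
    payload = {
        "jsonrpc": "2.0",
        "method": "identify_voice",
        # Hypothetical parameter name; the real schema may differ.
        "params": {"audio_data": base64.b64encode(audio).decode("ascii")},
        "id": request_id,
    }
    return json.dumps(payload)


# Placeholder bytes stand in for the contents of a .wav or .mp3 file.
request_body = build_tools_call(b"RIFF....WAVE")
```

Base64 lets binary audio travel inside a JSON string, at the cost of roughly one third more payload than the multipart upload.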
GET /api/v1/mcp/help
Auto-serves contents of mcp_audio_registration.json. Used by Queen AI for MCP discovery and service indexing.
Testing Tools
Base64 Voice Test
Health Check
MCP Registration (to AIO Endpoint Canister)
Requires jq, dfx, and a running endpoint_registry canister.
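Before attempting registration, the two command-line prerequisites can be checked with a short script such as the one below. It is a sketch that only verifies jq and dfx are on PATH. It does not check that the endpoint_registry canister is running.

```python
import shutil


def missing_tools(names):
    """Return the subset of command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["jq", "dfx"])
if missing:
    print("Install before registering:", ", ".join(missing))
else:
    print("jq and dfx found on PATH")
```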