MCP Audio Transcriber
A Dockerized Python tool that implements the Model Context Protocol (MCP) via AssemblyAI's API. Upload or point to an audio file, and receive a structured JSON transcription.
Features
AssemblyMCP: a concrete MCP implementation that uses AssemblyAI's REST API
Command-line interface (app.py): python app.py <input_audio> <output_json>
Streamlit web UI (streamlit_app.py):
  Upload local files or paste URLs
  Click Transcribe
  Preview transcript and download JSON
Docker support for environment consistency and portability
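To make the AssemblyMCP item more concrete, here is a minimal sketch of the upload, submit, and poll sequence against AssemblyAI's v2 REST endpoints, using only the standard library. The names `http_json` and `transcribe`, the injectable `request` parameter, and the polling interval are illustrative assumptions; they are not taken from this repository's code.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def http_json(method, url, api_key, body=None, raw=False):
    """Send one request to AssemblyAI and decode the JSON response."""
    headers = {"authorization": api_key}
    data = None
    if body is not None:
        if raw:
            data = body  # raw audio bytes for the /upload endpoint
            headers["content-type"] = "application/octet-stream"
        else:
            data = json.dumps(body).encode("utf-8")
            headers["content-type"] = "application/json"
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def transcribe(source, api_key, request=http_json, poll_seconds=3.0):
    """Return the completed transcript JSON for a URL or a local file path."""
    if source.startswith(("http://", "https://")):
        audio_url = source
    else:
        # Local files are uploaded first; the response carries a private URL.
        with open(source, "rb") as fh:
            uploaded = request("POST", f"{API_BASE}/upload", api_key, fh.read(), True)
        audio_url = uploaded["upload_url"]

    # Submit the job, then poll until it reaches a terminal status.
    job = request("POST", f"{API_BASE}/transcript", api_key, {"audio_url": audio_url})
    while job["status"] not in ("completed", "error"):
        time.sleep(poll_seconds)
        job = request("GET", f"{API_BASE}/transcript/{job['id']}", api_key)

    if job["status"] == "error":
        raise RuntimeError(job.get("error", "transcription failed"))
    return job
```

Passing the transport in as `request` keeps the flow testable without network access; a real tool might prefer AssemblyAI's official SDK instead.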
Prerequisites
Python 3.10+
An AssemblyAI API key
ffmpeg (for local decoding, if using local files)
(Optional) Docker Desktop / Engine
(Optional) Streamlit (pip install streamlit)
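The prerequisites above can be checked before a first run. The following is a small sketch of such a preflight check; the function name `check_prerequisites` and its injectable parameters are hypothetical and exist only to make the check easy to test.

```python
import os
import shutil
import sys


def check_prerequisites(env=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    problems = []
    if version < (3, 10):
        problems.append("Python 3.10+ is required")
    if not env.get("ASSEMBLYAI_API_KEY"):
        problems.append("ASSEMBLYAI_API_KEY is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (needed only for local files)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("warning:", problem)
```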
🔧 Installation
Clone the repo
    git clone https://github.com/ShreyasTembhare/MCP---Audio-Transcriber.git
    cd MCP---Audio-Transcriber
Create a .env file
    ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
Ensure .gitignore contains:
    .env
Install Python dependencies
    pip install --upgrade pip
    pip install -r requirements.txt
Install ffmpeg
    Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg -y
    Windows: download from https://ffmpeg.org and add its bin/ to your PATH
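For readers wondering how the key in .env reaches the program, here is a minimal standard-library sketch. It is an assumption about the approach: the project may well use a package such as python-dotenv instead, and the helper names `load_env_file` and `get_api_key` are hypothetical.

```python
import os


def load_env_file(path=".env", environ=None):
    """Copy KEY=VALUE pairs from a .env file into environ, keeping existing keys."""
    environ = os.environ if environ is None else environ
    if not os.path.exists(path):
        return environ
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip blanks, comments, and lines without an assignment.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))
    return environ


def get_api_key(environ=None):
    """Return the AssemblyAI key or fail with a clear message."""
    environ = os.environ if environ is None else environ
    key = environ.get("ASSEMBLYAI_API_KEY", "")
    if not key:
        raise RuntimeError("ASSEMBLYAI_API_KEY is missing; add it to .env")
    return key
```

Using `setdefault` means a variable already exported in the shell takes precedence over the file, which is a common convention but a design choice rather than a requirement.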
Usage
1. CLI Transcription
python app.py <input_audio> <output_json>
<input_audio>: any file or URL supported by AssemblyAI
<output_json>: path for the generated JSON
Example:
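One hypothetical invocation is `python app.py data/sample.wav data/sample.json`, where both paths are made-up placeholders. The sketch below shows how a script with this interface could parse its two positional arguments and tell a URL from a local path; it is an assumption about the approach, not the contents of app.py.

```python
import argparse
from urllib.parse import urlparse


def is_url(value):
    """True when value looks like an http(s) URL rather than a local path."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def parse_args(argv=None):
    """Parse the two positional arguments described above."""
    parser = argparse.ArgumentParser(description="Transcribe audio to JSON")
    parser.add_argument("input_audio", help="local file path or audio URL")
    parser.add_argument("output_json", help="where to write the transcript JSON")
    return parser.parse_args(argv)


# Hypothetical invocation: python app.py data/sample.wav data/sample.json
args = parse_args(["data/sample.wav", "data/sample.json"])
print("remote" if is_url(args.input_audio) else "local", "->", args.output_json)
```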
2. Streamlit Web UI
Upload or enter an audio URL
Click Transcribe
Download the JSON result
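The three steps above map naturally onto a handful of Streamlit widgets. The following is a rough sketch of such a page, not the project's streamlit_app.py: `fake_transcribe` is a stand-in for the real AssemblyAI call, and the accepted file extensions and widget labels are illustrative assumptions.

```python
import json

try:
    import streamlit as st  # optional dependency: pip install streamlit
except ImportError:
    st = None


def to_download_bytes(transcript):
    """Serialize a transcript dict as pretty-printed UTF-8 JSON for download."""
    return json.dumps(transcript, indent=2, ensure_ascii=False).encode("utf-8")


def fake_transcribe(source_name):
    """Placeholder for the real transcription call; returns a stub result."""
    return {"source": source_name, "text": "(transcript goes here)"}


def render():
    """Draw the upload, transcribe, preview, and download widgets."""
    st.title("MCP Audio Transcriber")
    uploaded = st.file_uploader("Audio file", type=["wav", "mp3", "ogg", "m4a"])
    url = st.text_input("...or paste an audio URL")
    if st.button("Transcribe"):
        source = uploaded.name if uploaded is not None else url
        if not source:
            st.error("Upload a file or enter a URL first.")
            return
        result = fake_transcribe(source)
        st.json(result)  # preview
        st.download_button("Download JSON", to_download_bytes(result),
                           file_name="transcript.json", mime="application/json")


if st is not None:
    render()
```

A Streamlit script is started with `streamlit run <script>.py`; for this project the script named in the Features list is streamlit_app.py.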
3. Docker
Build the image:
Run it (mounting your data/ folder):
Then inspect:
Windows PowerShell:
Project Structure