MCP Audio Transcriber
A Dockerized Python tool that implements the Model Context Protocol (MCP) via AssemblyAI's API. Upload or point to an audio file, and receive a structured JSON transcription.
Features
- AssemblyMCP: a concrete MCP implementation that uses AssemblyAI's REST API
- Command-line interface (app.py)
- Streamlit web UI (streamlit_app.py):
  - Upload local files or paste URLs
  - Click Transcribe
  - Preview transcript and download JSON
- Docker support for environment consistency and portability
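The AssemblyMCP feature above relies on AssemblyAI's REST API. As a rough sketch of that request flow (not the repository's actual code), the snippet below submits an audio URL to the `/v2/transcript` endpoint and polls until the job completes. The function names `http_json` and `transcribe_url`, the injectable `send` parameter, and the polling defaults are all illustrative assumptions.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def http_json(method, url, api_key, payload=None):
    """Send one JSON request to AssemblyAI and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"authorization": api_key, "content-type": "application/json"}
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def transcribe_url(audio_url, api_key, send=http_json, poll_seconds=3.0, max_polls=200):
    """Submit a publicly reachable audio URL and poll until the job finishes.

    `send` is a hypothetical hook so the HTTP layer can be swapped out in tests.
    """
    job = send("POST", f"{API_BASE}/transcript", api_key, {"audio_url": audio_url})
    for _ in range(max_polls):
        result = send("GET", f"{API_BASE}/transcript/{job['id']}", api_key)
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("transcription did not finish in time")
```

Local files would first need to be sent to AssemblyAI's `/v2/upload` endpoint, which returns an `upload_url` that can be passed as `audio_url`; that step is omitted here to keep the sketch short.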
Prerequisites
- Python 3.10+
- An AssemblyAI API key
- ffmpeg (for local decoding, if using local files)
- (Optional) Docker Desktop / Engine
- (Optional) Streamlit (pip install streamlit)
🔧 Installation
- Clone the repo
- Create a .env
- Ensure .gitignore contains:
- Install Python dependencies
- Install ffmpeg
  - Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg -y
  - Windows: download from https://ffmpeg.org and add its bin/ to your PATH
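Once the steps above are done, a quick preflight check can confirm that the API key and ffmpeg are actually visible to Python. This is only a sketch: the variable name `ASSEMBLYAI_API_KEY` and the helper names `parse_env_file` and `missing_prerequisites` are assumptions, and the project may read its .env differently (for example through a dotenv library).

```python
import shutil


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_prerequisites(env, key_name="ASSEMBLYAI_API_KEY", need_ffmpeg=True):
    """Return a list of readable problems; an empty list means ready to run.

    `key_name` is an assumed variable name, not one confirmed by the project.
    """
    problems = []
    if not env.get(key_name):
        problems.append(f"{key_name} is not set")
    if need_ffmpeg and shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

For instance, reading the .env file's text, passing it through `parse_env_file`, and handing the result to `missing_prerequisites` yields an empty list when everything is in place. Set `need_ffmpeg=False` when only URLs will be transcribed.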
Usage
1. CLI Transcription
- <input_audio>: any file or URL supported by AssemblyAI
- <output_json>: path for the generated JSON
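Because `<input_audio>` may be either a local file or a URL, a CLI like this has to tell the two apart. The sketch below shows one plausible way to parse the two positional arguments and classify the input. It is not taken from app.py, and `build_parser` and `is_url` are hypothetical names.

```python
import argparse
from urllib.parse import urlparse


def is_url(value):
    """Treat http(s) inputs as remote audio; everything else as a local path."""
    return urlparse(value).scheme in ("http", "https")


def build_parser():
    """Build a parser with the two positional arguments described above."""
    parser = argparse.ArgumentParser(description="Transcribe audio to JSON")
    parser.add_argument("input_audio", help="local file path or audio URL")
    parser.add_argument("output_json", help="path for the generated JSON")
    return parser
```

Checking the scheme explicitly, rather than just testing for a colon, keeps Windows paths such as `C:\audio\sample.wav` from being mistaken for URLs.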
Example:
2. Streamlit Web UI
- Open http://localhost:8501
- Upload or enter an audio URL
- Click Transcribe
- Download the JSON result
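After downloading the result, a few lines of Python are enough to sanity-check it. The sketch below assumes the JSON mirrors AssemblyAI's transcript response, with top-level `status` and `text` fields. The tool's actual output schema may differ, and `summarize_transcript` is a hypothetical helper.

```python
import json


def summarize_transcript(raw_json, preview_chars=80):
    """Return status, word count, and a short preview of a transcript JSON.

    Assumes top-level "status" and "text" keys; missing keys are tolerated.
    """
    data = json.loads(raw_json)
    text = data.get("text") or ""
    return {
        "status": data.get("status"),
        "word_count": len(text.split()),
        "preview": text[:preview_chars],
    }
```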
3. Docker
Build the image:
Run it (mounting your data/ folder):
Then inspect:
Windows PowerShell:
Project Structure