MCP Audio Transcriber
A Dockerized Python tool that implements the Model Context Protocol (MCP) via AssemblyAI's API. Upload or point to an audio file, and receive a structured JSON transcription.
Features
- AssemblyMCP: a concrete MCP implementation that uses AssemblyAI's REST API
- Command-line interface (app.py)
- Streamlit web UI (streamlit_app.py):
  - Upload local files or paste URLs
  - Click Transcribe
  - Preview transcript and download JSON
- Docker support for environment consistency and portability
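As a rough illustration of what an AssemblyAI-backed implementation might look like, the sketch below submits an audio URL to AssemblyAI's v2 REST endpoints and polls until the job reaches a terminal status. The class body, the `transcribe` method name, and the injectable `http` callable are assumptions made for this example; they are not taken from the repository.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

API_BASE = "https://api.assemblyai.com/v2"

# Transport signature: (method, url, headers, json_body) -> decoded JSON dict.
HttpFn = Callable[[str, str, dict, Optional[dict]], dict]


def urllib_http(method: str, url: str, headers: dict, body: Optional[dict]) -> dict:
    """Default transport built on the standard library."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class AssemblyMCP:
    """Hypothetical sketch; the real class in the repo may differ."""

    def __init__(self, api_key: str, http: HttpFn = urllib_http,
                 poll_seconds: float = 3.0):
        self.headers = {"authorization": api_key,
                        "content-type": "application/json"}
        self.http = http
        self.poll_seconds = poll_seconds

    def transcribe(self, audio_url: str) -> dict:
        # Submit the job, then poll until it is "completed" or "error".
        job = self.http("POST", f"{API_BASE}/transcript", self.headers,
                        {"audio_url": audio_url})
        while job["status"] not in ("completed", "error"):
            time.sleep(self.poll_seconds)
            job = self.http("GET", f"{API_BASE}/transcript/{job['id']}",
                            self.headers, None)
        if job["status"] == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        return job
```

Passing the transport in as a parameter keeps the class testable without network access. Local files would first need to be uploaded to AssemblyAI; that step is left out of this sketch.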
Prerequisites
- Python 3.10+
- An AssemblyAI API key
- ffmpeg (for local decoding, if using local files)
- (Optional) Docker Desktop / Engine
- (Optional) Streamlit (pip install streamlit)
🔧 Installation
- Clone the repo
- Create a .env file holding your AssemblyAI API key
- Ensure .gitignore keeps the .env file out of version control
- Install Python dependencies
- Install ffmpeg
  - Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg -y
  - Windows: download from https://ffmpeg.org and add its bin/ to your PATH
Usage
1. CLI Transcription
- <input_audio>: any file or URL supported by AssemblyAI
- <output_json>: path for the generated JSON
Example:
2. Streamlit Web UI
- Open http://localhost:8501
- Upload or enter an audio URL
- Click Transcribe
- Download the JSON result
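The flow above (upload or URL, Transcribe, download) can be sketched in a few lines of Streamlit. This is an illustrative outline, not the contents of streamlit_app.py: `fake_transcribe` is a stand-in backend and every function name here is an assumption. The result is kept in `st.session_state` so that it survives the rerun Streamlit performs when the download button is clicked.

```python
import json

try:
    import streamlit as st
except ImportError:  # the helpers below still work without Streamlit
    st = None


def to_download_bytes(result: dict) -> bytes:
    """Serialize a transcription result for st.download_button."""
    return json.dumps(result, indent=2, ensure_ascii=False).encode("utf-8")


def fake_transcribe(source) -> dict:
    # Stand-in backend so the sketch runs without an API key.
    if isinstance(source, bytes):
        label = f"uploaded file ({len(source)} bytes)"
    else:
        label = source
    return {"source": label, "text": "(transcript would appear here)"}


def main(transcribe=fake_transcribe) -> None:
    if st is None:
        raise RuntimeError("Streamlit is not installed")
    st.title("MCP Audio Transcriber")
    uploaded = st.file_uploader("Upload an audio file")
    url = st.text_input("...or paste an audio URL")
    if st.button("Transcribe"):
        source = uploaded.getvalue() if uploaded is not None else url.strip()
        if not source:
            st.error("Provide a file or a URL first.")
        else:
            with st.spinner("Transcribing..."):
                st.session_state["result"] = transcribe(source)
    result = st.session_state.get("result")
    if result is not None:
        st.json(result)
        st.download_button("Download JSON", data=to_download_bytes(result),
                           file_name="transcript.json", mime="application/json")


if __name__ == "__main__" and st is not None:
    main()
```

Saved to a file and launched with `streamlit run`, this serves the page on Streamlit's default port, 8501.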
3. Docker
Build the image:
Run it (mounting your data/ folder):
Then inspect:
Windows PowerShell:
Project Structure
A portable, Dockerized Python tool that implements the Model Context Protocol for audio transcription using the Whisper model, with both CLI and web UI interfaces for converting audio files into JSON transcripts.