MCP Audio Transcriber

A Dockerized Python tool that implements the Model Context Protocol (MCP) on top of the AssemblyAI API. Upload an audio file or point to one by URL, and receive a structured JSON transcription.

Features

AssemblyMCP: a concrete MCP implementation that uses the AssemblyAI REST API
Command-line interface (app.py):
python app.py <input_audio> <output_json>
Streamlit web UI (streamlit_app.py):
- Upload local files or paste URLs
- Click Transcribe
- Preview the transcript and download the JSON
Docker support for a consistent, portable environment

Prerequisites

Python 3.10+
An AssemblyAI API key
ffmpeg (for local decoding when working with local files)
(Optional) Docker Desktop / Engine
(Optional) Streamlit (pip install streamlit)

🔧 Installation

Clone the repo
git clone https://github.com/ShreyasTembhare/MCP---Audio-Transcriber.git
cd MCP---Audio-Transcriber
Create a .env file
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
Make sure .gitignore contains:
.env
Install the Python dependencies
pip install --upgrade pip
pip install -r requirements.txt
Install ffmpeg
- Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg -y
- Windows: download it from https://ffmpeg.org and add its bin/ directory to your PATH
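
The .env file from the installation steps is meant to supply ASSEMBLYAI_API_KEY at runtime. requirements.txt lists python-dotenv, which is the usual way to load such a file; the stdlib-only sketch below illustrates the same idea without that dependency. The helper names read_env_file and get_api_key are hypothetical and are not taken from the repository.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict.

    Hypothetical helper: it handles blank lines, '#' comments and optional
    surrounding quotes only. python-dotenv covers many more cases.
    """
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(env_path=".env"):
    """Return the AssemblyAI key, preferring the real environment over .env."""
    key = os.environ.get("ASSEMBLYAI_API_KEY")
    if not key:
        key = read_env_file(env_path).get("ASSEMBLYAI_API_KEY")
    if not key:
        raise RuntimeError("ASSEMBLYAI_API_KEY is not set in the environment or .env")
    return key
```

Preferring the process environment over the file matches how the Docker examples pass the key with -e, where no .env file needs to exist inside the container.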

Usage

1. CLI transcription

python app.py <input_audio> <output_json>

<input_audio>: any file or URL supported by AssemblyAI
<output_json>: path for the generated JSON
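
Because <input_audio> may be either a local file or a URL, the CLI has to tell the two apart before handing the source to AssemblyAI. The sketch below shows one way to parse the two positional arguments and classify the input. It is an illustration only: parse_args and is_url are hypothetical names, and the actual app.py may be organized differently.

```python
import argparse
from urllib.parse import urlparse


def is_url(source):
    """Return True when the input looks like an http(s) URL, not a file path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def parse_args(argv=None):
    """Parse the two positional arguments from the usage line."""
    parser = argparse.ArgumentParser(
        prog="app.py", description="Transcribe audio to JSON"
    )
    parser.add_argument("input_audio", help="local audio file or URL")
    parser.add_argument("output_json", help="path for the generated JSON")
    return parser.parse_args(argv)
```

Checking for an explicit http or https scheme keeps Windows paths such as C:\data\input.ogg from being mistaken for URLs, since urlparse would otherwise report "c" as their scheme.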

Example:

python app.py data/input.ogg data/output.json
cat data/output.json
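
cat prints the raw file; a short script can additionally confirm that the output parses as JSON and show what it contains. This README does not document the output schema, so the sketch below reports only the top-level structure and assumes no particular keys. summarize_json is a hypothetical helper, not part of the project.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level shape without assuming a schema."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {"type": "object", "keys": sorted(data)}
    if isinstance(data, list):
        return {"type": "array", "length": len(data)}
    return {"type": type(data).__name__}
```

Calling summarize_json("data/output.json") after a run raises json.JSONDecodeError if the file is not valid JSON, which makes it usable as a quick sanity check.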

2. Streamlit web UI

streamlit run streamlit_app.py

Open http://localhost:8501
Upload an audio file or enter an audio URL
Click Transcribe
Download the JSON result

3. Docker

Build the image:

docker build -t mcp-transcriber .

Run it (mounting your data/ folder):

docker run --rm \
  -e ASSEMBLYAI_API_KEY="$ASSEMBLYAI_API_KEY" \
  -v "$(pwd)/data:/data" \
  mcp-transcriber:latest \
  /data/input.ogg /data/output.json

Then check the result:

ls data/output.json
cat data/output.json

Windows PowerShell:

docker run --rm `
  -e ASSEMBLYAI_API_KEY=$env:ASSEMBLYAI_API_KEY `
  -v "${PWD}\data:/data" `
  mcp-transcriber:latest `
  /data/input.ogg /data/output.json

Project structure

MCP-Audio-Transcriber/
├── app.py               # CLI entrypoint (AssemblyMCP only)
├── mcp.py               # ModelContextProtocol + AssemblyMCP
├── streamlit_app.py     # Streamlit interface
├── requirements.txt     # assemblyai, python-dotenv, streamlit, etc.
├── Dockerfile           # builds the container
├── .gitignore           # ignores .env, __pycache__, etc.
├── LICENSE              # MIT license
└── data/                # sample input and output
    ├── input.ogg
    └── output.json
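
The tree notes that mcp.py holds both ModelContextProtocol and AssemblyMCP, which suggests a generic interface with one AssemblyAI-backed implementation. The sketch below shows how such a pairing could look. Only the two class names come from this README: the transcribe method, its signature, and the returned fields are assumptions, and the body uses the assemblyai Python SDK listed in requirements.txt rather than reproducing the project's actual code.

```python
from abc import ABC, abstractmethod


class ModelContextProtocol(ABC):
    """Hypothetical base interface: one method that turns audio into a dict."""

    @abstractmethod
    def transcribe(self, source: str) -> dict:
        """Transcribe a local path or URL and return JSON-serializable data."""


class AssemblyMCP(ModelContextProtocol):
    """Sketch of an AssemblyAI-backed implementation (not the project's code)."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def transcribe(self, source: str) -> dict:
        # Imported lazily so this module loads even where the SDK is absent.
        import assemblyai as aai

        aai.settings.api_key = self.api_key
        transcript = aai.Transcriber().transcribe(source)
        if transcript.status == aai.TranscriptStatus.error:
            raise RuntimeError(transcript.error)
        # These output fields are an assumption, not the project's schema.
        return {"id": transcript.id, "text": transcript.text}
```

Keeping the interface abstract means another backend could be added as a second subclass without touching the CLI or the Streamlit app, as long as it returns the same kind of dictionary.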
