MCP Audio Transcriber

A Dockerized Python tool that implements the Model Context Protocol (MCP) via AssemblyAI's API. Upload an audio file or point to one by URL, and receive a structured JSON transcription.

Features

  • AssemblyMCP: a concrete MCP implementation that uses AssemblyAI's REST API
  • Command-line interface (app.py):
    python app.py <input_audio> <output_json>
  • Streamlit web UI (streamlit_app.py):
    • Upload local files or paste URLs
    • Click Transcribe
    • Preview transcript and download JSON
  • Docker support for environment consistency and portability
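To make the first feature concrete, here is a minimal sketch of how a class like AssemblyMCP could drive AssemblyAI's REST API: submit a job to the transcript endpoint, then poll until it finishes. It is not the project's actual mcp.py. The method name transcribe, the helper http_json, and the send and poll_seconds parameters are assumptions made for illustration, and the sketch only covers audio that is already reachable by URL (a local file would first need to be uploaded).

```python
import json
import time
import urllib.request
from abc import ABC, abstractmethod

API_BASE = "https://api.assemblyai.com/v2"


def http_json(method, url, api_key, payload=None):
    """Send a JSON request to AssemblyAI and decode the JSON reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("authorization", api_key)
    if data is not None:
        request.add_header("content-type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


class ModelContextProtocol(ABC):
    """Hypothetical base interface: audio reference in, result dict out."""

    @abstractmethod
    def transcribe(self, audio_url):
        ...


class AssemblyMCP(ModelContextProtocol):
    def __init__(self, api_key, send=http_json, poll_seconds=3.0):
        self.api_key = api_key
        self.send = send  # injectable so the class can be exercised offline
        self.poll_seconds = poll_seconds

    def transcribe(self, audio_url):
        # Submit the job, then poll until it completes or fails.
        job = self.send("POST", f"{API_BASE}/transcript", self.api_key,
                        {"audio_url": audio_url})
        while job["status"] not in ("completed", "error"):
            time.sleep(self.poll_seconds)
            job = self.send("GET", f"{API_BASE}/transcript/{job['id']}",
                            self.api_key)
        if job["status"] == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        return job
```

Passing the HTTP function in as send keeps the polling logic testable without a network connection or an API key.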

Prerequisites

  • Python 3.10+
  • An AssemblyAI API key
  • ffmpeg (for local decoding, if using local files)
  • (Optional) Docker Desktop / Engine
  • (Optional) Streamlit (pip install streamlit)

🔧 Installation

  1. Clone the repo
    git clone https://github.com/ShreyasTembhare/MCP---Audio-Transcriber.git
    cd MCP---Audio-Transcriber
  2. Create a .env
    ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
  3. Ensure .gitignore contains:
    .env
  4. Install Python dependencies
    pip install --upgrade pip
    pip install -r requirements.txt
  5. Install ffmpeg
    • Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg -y
    • Windows: download from https://ffmpeg.org and add its bin/ to your PATH
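Once the steps above are done, a short preflight script can confirm the setup before the first transcription. The sketch below checks the Python version, looks for ASSEMBLYAI_API_KEY in the environment or in .env, and verifies that ffmpeg is on PATH. The function names load_env_file and preflight are hypothetical, and the .env parsing is a simplified standard-library stand-in; the project itself lists python-dotenv in requirements.txt, which would normally handle this.

```python
import os
import shutil
import sys
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def preflight(env_path=".env"):
    """Return a list of human-readable problems; empty means ready."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    # A real environment variable takes precedence over the .env entry.
    key = (os.environ.get("ASSEMBLYAI_API_KEY")
           or load_env_file(env_path).get("ASSEMBLYAI_API_KEY"))
    if not key:
        problems.append("ASSEMBLYAI_API_KEY is not set")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

Calling preflight() from the repository root and printing each returned string gives a quick checklist of what is still missing.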

Usage

1. CLI Transcription

python app.py <input_audio> <output_json>
  • <input_audio>: any file or URL supported by AssemblyAI
  • <output_json>: path for the generated JSON

Example:

python app.py data/input.ogg data/output.json
cat data/output.json
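The two-argument interface above could be wired up roughly as follows. This is a sketch under stated assumptions, not the contents of app.py: the helpers is_url, build_parser, and write_result are hypothetical, and the transcribe parameter is a placeholder hook standing in for whatever call the project makes to AssemblyAI.

```python
import argparse
import json
from pathlib import Path
from urllib.parse import urlparse


def is_url(value):
    """Treat http(s) inputs as remote audio; anything else as a local path."""
    return urlparse(value).scheme in ("http", "https")


def build_parser():
    parser = argparse.ArgumentParser(
        prog="app.py", description="Transcribe audio to JSON")
    parser.add_argument("input_audio", help="local audio file or URL")
    parser.add_argument("output_json", help="path for the generated JSON")
    return parser


def write_result(result, output_path):
    """Write the transcription dict as pretty-printed UTF-8 JSON."""
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result, indent=2, ensure_ascii=False),
                    encoding="utf-8")


def main(argv=None, transcribe=None):
    args = build_parser().parse_args(argv)
    # Fail early on a missing local file instead of after an API round trip.
    if not is_url(args.input_audio) and not Path(args.input_audio).is_file():
        raise SystemExit(f"input not found: {args.input_audio}")
    # `transcribe` is a placeholder hook for the real AssemblyAI call.
    if transcribe is not None:
        result = transcribe(args.input_audio)
    else:
        result = {"source": args.input_audio}
    write_result(result, args.output_json)
```

With this layout, main(["data/input.ogg", "data/output.json"], transcribe=some_function) mirrors the shell example, and keeping the transcription call behind a parameter lets the argument handling be tested on its own.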

2. Streamlit Web UI

streamlit run streamlit_app.py
  • Open http://localhost:8501
  • Upload a file or enter an audio URL
  • Click Transcribe
  • Download the JSON result

3. Docker

Build the image:

docker build -t mcp-transcriber .

Run it (mounting your data/ folder):

docker run --rm \
  -e ASSEMBLYAI_API_KEY="$ASSEMBLYAI_API_KEY" \
  -v "$(pwd)/data:/data" \
  mcp-transcriber:latest \
  /data/input.ogg /data/output.json

Then inspect:

ls data/output.json
cat data/output.json

Windows PowerShell:

docker run --rm `
  -e ASSEMBLYAI_API_KEY=$env:ASSEMBLYAI_API_KEY `
  -v "${PWD}\data:/data" `
  mcp-transcriber:latest `
  /data/input.ogg /data/output.json
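For a shell-independent way to check the file the container wrote, a few lines of Python can load it and report its shape. This is only a sketch: the function name summarize_output is hypothetical, and because the exact JSON layout is not documented here, the "text" field is an assumption that is simply skipped when absent.

```python
import json
from pathlib import Path


def summarize_output(path):
    """Load a transcription JSON file and describe its top-level shape."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    summary = {"type": type(data).__name__}
    if isinstance(data, dict):
        summary["keys"] = sorted(data)
        # "text" is an assumed field name; it is skipped when absent.
        text = data.get("text")
        if isinstance(text, str):
            summary["preview"] = text[:80]
    return summary
```

Running print(summarize_output("data/output.json")) after the Docker command shows which top-level keys are present and, when available, the start of the transcript.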

Project Structure

MCP-Audio-Transcriber/
├── app.py             # CLI entrypoint (AssemblyMCP only)
├── mcp.py             # ModelContextProtocol + AssemblyMCP
├── streamlit_app.py   # Streamlit interface
├── requirements.txt   # assemblyai, python-dotenv, streamlit, etc.
├── Dockerfile         # builds the container
├── .gitignore         # ignores .env, __pycache__, etc.
├── LICENSE            # MIT license
└── data/              # sample input and output
    ├── input.ogg
    └── output.json