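MCP Audio Transcriber
A Dockerized Python tool that implements the Model Context Protocol (MCP) on top of AssemblyAI's API. Upload an audio file or point to one by URL, and receive a structured JSON transcript.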
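Features
AssemblyMCP: a concrete MCP implementation that uses AssemblyAI's REST API
Command-line interface (app.py): python app.py <input_audio> <output_json>
Streamlit web UI (streamlit_app.py):
  Upload a local file or paste a URL
  Click Transcribe
  Preview the transcript and download the JSON
Docker support for a consistent, portable environment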
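To make the AssemblyAI step concrete, here is a minimal sketch of the upload, submit, and poll sequence against AssemblyAI's public REST API, using only the standard library. It is not the repository's AssemblyMCP code: the names `http_json` and `transcribe`, the `send` and `sleep` parameters, and the polling interval are assumptions made for illustration.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def http_json(method, url, api_key, body=None, raw=None):
    """Send one request with urllib and decode the JSON reply."""
    headers = {"authorization": api_key}
    data = None
    if body is not None:
        # JSON payloads are used when submitting a transcription job.
        data = json.dumps(body).encode("utf-8")
        headers["content-type"] = "application/json"
    elif raw is not None:
        # Raw bytes are used when uploading a local audio file.
        data = raw
        headers["content-type"] = "application/octet-stream"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def transcribe(source, api_key, send=http_json, poll_seconds=3.0, sleep=time.sleep):
    """Return the finished transcript JSON for a URL or a local file path."""
    if source.startswith(("http://", "https://")):
        audio_url = source
    else:
        # Local files are uploaded first; the reply carries an upload_url.
        with open(source, "rb") as handle:
            uploaded = send("POST", f"{API_BASE}/upload", api_key, raw=handle.read())
        audio_url = uploaded["upload_url"]

    # Submit the job, then poll until it completes or fails.
    job = send("POST", f"{API_BASE}/transcript", api_key, body={"audio_url": audio_url})
    while True:
        result = send("GET", f"{API_BASE}/transcript/{job['id']}", api_key)
        if result["status"] == "completed":
            return result
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(poll_seconds)
```

Passing `send` and `sleep` as parameters keeps the network and the clock out of the core logic, so the sequence can be exercised with stand-ins before an API key is involved.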
Prerequisites
Python 3.10+
AssemblyAI API key
ffmpeg (for local decoding, if you use local files)
(Optional) Docker Desktop/Engine
(Optional) Streamlit (pip install streamlit)
🔧 Installation
Clone the repo
git clone https://github.com/ShreyasTembhare/MCP---Audio-Transcriber.git
cd MCP---Audio-Transcriber
Create .env
ASSEMBLYAI_API_KEY=your_assemblyai_api_key_here
Make sure .gitignore contains:
.env
Install the Python dependencies
pip install --upgrade pip
pip install -r requirements.txt
Install ffmpeg
Ubuntu/Debian:
sudo apt update && sudo apt install ffmpeg -y
Windows: download it from https://ffmpeg.org and add its bin/ directory to your PATH
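As a quick check that the setup above took effect, a script along the following lines could confirm that the API key is readable and that ffmpeg is on PATH. This is a sketch under assumptions: the helpers `load_env_file`, `get_api_key`, and `ffmpeg_available` are hypothetical names, and the project itself may load `.env` differently (for example, through a third-party package).

```python
import os
import shutil
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into a dict (no shell expansion)."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("ASSEMBLYAI_API_KEY") or load_env_file(path).get("ASSEMBLYAI_API_KEY")
    if not key:
        raise RuntimeError("ASSEMBLYAI_API_KEY is not set")
    return key


def ffmpeg_available():
    """True when an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None
```

Calling `get_api_key()` from the project root raises early with a clear message if the key is missing, which is easier to diagnose than a rejected API request later on.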
Usage
1. CLI Transcription
<input_audio>: any file or URL supported by AssemblyAI
<output_json>: path for the generated JSON
Example:
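A two-argument interface like this one can be sketched with `argparse`. The code below is illustrative rather than the contents of app.py: `parse_args`, `is_url`, and `write_transcript` are hypothetical helpers, and the paths in the usage note are placeholders.

```python
import argparse
import json
from pathlib import Path
from urllib.parse import urlparse


def parse_args(argv=None):
    """Parse the two positional arguments described above."""
    parser = argparse.ArgumentParser(prog="app.py", description="Transcribe audio to JSON.")
    parser.add_argument("input_audio", help="local audio file or http(s) URL")
    parser.add_argument("output_json", help="path for the generated JSON")
    return parser.parse_args(argv)


def is_url(value):
    """Treat only http(s) inputs as URLs; everything else is a local path."""
    return urlparse(value).scheme in ("http", "https")


def write_transcript(transcript, output_json):
    """Write the transcript dict as pretty-printed JSON, creating parent folders."""
    path = Path(output_json)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(transcript, indent=2, ensure_ascii=False), encoding="utf-8")
    return path
```

With placeholder paths, `parse_args(["data/sample.mp3", "data/transcript.json"])` yields `input_audio` and `output_json` attributes, and `is_url` decides whether the input needs to be uploaded before transcription.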
2. Streamlit Web UI
Upload a file or enter an audio URL
Click Transcribe
Download the JSON result
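The three steps above map onto a handful of Streamlit widgets. The sketch below is an assumption about how such a page could be wired, not the actual streamlit_app.py: `render` is a hypothetical function that receives the `streamlit` module and a `transcribe` callable as arguments, so the flow can be read and exercised without Streamlit installed.

```python
import json


def render(st, transcribe):
    """Draw the page; `st` is the streamlit module, `transcribe` a callable."""
    st.title("MCP Audio Transcriber")
    uploaded = st.file_uploader("Upload audio")
    url = st.text_input("...or paste an audio URL")

    # st.button returns True only on the run triggered by the click.
    if not st.button("Transcribe"):
        return None
    if uploaded is None and not url.strip():
        st.warning("Upload a file or enter a URL first.")
        return None

    # Uploaded files arrive as in-memory objects; getvalue() returns their bytes.
    source = uploaded.getvalue() if uploaded is not None else url.strip()
    transcript = transcribe(source)

    st.json(transcript)  # preview the transcript in the page
    st.download_button(
        "Download JSON",
        data=json.dumps(transcript, indent=2),
        file_name="transcript.json",
        mime="application/json",
    )
    return transcript
```

In a real page, the script would `import streamlit as st`, call `render(st, some_transcribe_function)`, and be launched with `streamlit run streamlit_app.py`.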
3. Docker
Build the image:
Run it (mounting your data/ folder):
Then check:
Windows PowerShell:
Project Structure