Integrations
Provides audio transcription capabilities using OpenAI's Speech-to-Text API, allowing conversion of audio files to text with options for language specification and saving transcriptions to files.
OpenAI Speech-to-Text transcriptions MCP Server
An MCP server that provides audio transcription capabilities using OpenAI's API.
Installation
Setup
- Clone the repository:
- Install dependencies:
- Build the server:
- Set up your OpenAI API key in your environment variables.
- Add the server configuration to your environment:
Replace /path/to/audio-transcriber-mcp with the actual path where you cloned the repository.
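Before starting the server, it can help to confirm that the two manual steps above were actually completed. The following is a minimal sketch, not part of the project: `check_setup` is a hypothetical helper, and the variable name `OPENAI_API_KEY` is an assumption (it is the name the official OpenAI SDKs read by default; the source does not say which variable the server expects).

```python
import os
from pathlib import Path

# Assumption: the server reads the key from OPENAI_API_KEY, the name the
# official OpenAI SDKs use by default. The source does not name the variable.
API_KEY_VAR = "OPENAI_API_KEY"

# Placeholder path used in the setup instructions; it must be replaced.
PLACEHOLDER = "/path/to/audio-transcriber-mcp"


def check_setup(repo_path, env=None):
    """Return a list of human-readable setup problems (empty if none found)."""
    env = os.environ if env is None else env
    problems = []
    # The key must be present and non-blank.
    if not env.get(API_KEY_VAR, "").strip():
        problems.append(f"{API_KEY_VAR} is not set")
    # The configured path must be a real directory, not the placeholder.
    if str(repo_path) == PLACEHOLDER:
        problems.append("repository path is still the placeholder")
    elif not Path(repo_path).is_dir():
        problems.append(f"{repo_path} is not a directory")
    return problems
```

Calling `check_setup` with the path you put in the server configuration returns an empty list when both checks pass. It does not verify that the key is valid or that the build succeeded.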
Features
Tools
transcribe_audio
- Transcribe audio files using OpenAI's API
- Takes filepath as a required parameter
- Optional parameters:
  - save_to_file: Boolean to save transcription to a file
  - language: ISO-639-1 language code (e.g., "en", "es")
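To illustrate how the parameters above fit together, here is a minimal sketch of the JSON-RPC message an MCP client would send to invoke the tool. The `tools/call` method and the `name`/`arguments` fields come from the Model Context Protocol; the helper `build_transcribe_request` and the example file path are hypothetical, and the exact response format of this server is not documented in the source.

```python
import json


def build_transcribe_request(filepath, save_to_file=None, language=None, request_id=1):
    """Build an MCP `tools/call` JSON-RPC request for transcribe_audio."""
    if not filepath:
        raise ValueError("filepath is required")
    arguments = {"filepath": filepath}
    # Optional parameters are only included when the caller sets them.
    if save_to_file is not None:
        arguments["save_to_file"] = bool(save_to_file)
    if language is not None:
        arguments["language"] = language  # ISO-639-1 code, e.g. "en"
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "transcribe_audio", "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical audio file path, used for illustration only.
    request = build_transcribe_request("/tmp/meeting.mp3", save_to_file=True, language="en")
    print(json.dumps(request, indent=2))
```

In practice an MCP client application builds and sends this message for you; the sketch only shows which arguments the tool expects and which of them may be omitted.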
License
This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.
remote-capable server
The server can be hosted and run remotely because it primarily relies on remote services or has no dependency on the local environment.
Related Resources
Related MCP Servers
- Enables recording audio from a microphone and transcribing it using OpenAI's Whisper model. Works as both a standalone MCP server and a Goose AI agent extension. (Python, MIT License)
- An MCP server that enables LLMs to generate spoken audio from text using OpenAI's Text-to-Speech API, supporting various voices, models, and audio formats. (JavaScript, MIT License)
- A Model Context Protocol server that enables AI models to generate and play high-quality text-to-speech audio through your device's native audio system using Rime's voice synthesis API. (JavaScript, The Unlicense)
- ElevenLabs MCP Server (official): An official Model Context Protocol (MCP) server that enables AI clients to interact with ElevenLabs' Text to Speech and audio processing APIs, allowing for speech generation, voice cloning, audio transcription, and other audio-related tasks. (Python, MIT License)