# Deepgram MCP Server

A Model Context Protocol (MCP) server that provides access to Deepgram's speech recognition and text-to-speech capabilities.

## Features

- **Audio Transcription**: Convert audio to text with high accuracy
- **Text-to-Speech**: Generate natural-sounding speech from text with automatic compression
- **Audio Analysis**: Extract insights like sentiment, topics, intents, and entities
- **Speaker Diarization**: Identify different speakers in audio
- **Language Detection**: Automatically detect the language of audio
- **Multiple Models**: Support for various Deepgram models optimized for different use cases
- **Smart Audio Compression**: Automatically compresses generated audio files for efficient transfer

## Installation

1. Clone this repository
2. Install dependencies:
   ```bash
   npm install
   ```
3. Copy the environment file and add your Deepgram API key:
   ```bash
   cp env.example .env
   # Edit .env and add your DEEPGRAM_API_KEY, OPENAI_API_KEY or GROQ_API_KEY (whatever you want to use)
   ```
4. Build the project:
   ```bash
   npm run build
   ```

## Usage

### HTTP Transport (Recommended for Production)

```bash
npm start
# or
node dist/index.js
```

The server will start on port 8080 by default. You can specify a different port:

```bash
node dist/index.js --port 8081
```

### STDIO Transport (For Development)

```bash
npm run start:stdio
# or
node dist/index.js --stdio --port 8081
```

## Available Tools

### 1. transcribe_audio

Transcribe audio to text with various options for customization.

**Parameters:**

- `audioUrl` or `audioData`: Audio source (URL or base64)
- `model`: Deepgram model to use (default: "nova-2-general")
- `language`: Language code (default: "en")
- `punctuate`: Add punctuation (default: true)
- `diarize`: Speaker identification (default: false)
- `sentiment`: Sentiment analysis (default: false)
- And many more options...

### 2. text_to_speech

Convert text to speech using Deepgram's TTS models with automatic compression.

**Parameters:**

- `text`: Text to convert to speech (required)
- `model`: TTS model to use (default: "aura-asteria-en")
- `voice`: Voice selection
- `format`: Output format (default: "mp3")
- `speed`: Speech speed (default: 1.0)

**Output:**

- Original audio file saved to `generated_audio/` folder
- Compressed audio data saved to `compressed_audio/` folder
- Response includes file paths and compression metadata

### 3. analyze_audio

Perform advanced audio analysis including sentiment, topics, intents, and entities.

**Parameters:**

- `audioUrl` or `audioData`: Audio source
- `features`: Analysis features to enable
- `model`: Model for analysis

### 4. get_models

Get information about available Deepgram models.

**Parameters:**

- `model_type`: Filter by model type ("transcription", "tts", or "all")

## Client Configuration

For MCP clients, use this configuration:

```json
{
  "mcpServers": {
    "deepgram": {
      "url": "http://localhost:8080/mcp"
    }
  }
}
```

## Development

```bash
# Watch mode for development
npm run watch

# Development with STDIO
npm run dev:stdio

# Development with HTTP
npm run dev
```

## API Key

Get your Deepgram API key from [Deepgram Console](https://console.deepgram.com/).

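
With the server running and an API key configured, a tool such as `transcribe_audio` is invoked through the MCP `tools/call` method. The sketch below only builds the JSON-RPC request body using the parameter names listed under Available Tools; it does not send anything. The helper name `build_tool_call` and the audio URL are illustrative assumptions, and a real client would normally let an MCP client library handle the initialization handshake and transport.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Arguments mirror the parameters documented for transcribe_audio.
request = build_tool_call(
    "transcribe_audio",
    {
        "audioUrl": "https://example.com/sample.wav",  # placeholder URL (assumption)
        "model": "nova-2-general",
        "language": "en",
        "punctuate": True,
        "diarize": True,  # documented default is false; enabled here as an example
    },
)

print(json.dumps(request, indent=2))
```

Keeping the request construction in a small helper makes it easy to reuse for `text_to_speech`, `analyze_audio`, or `get_models` by changing only the tool name and arguments.
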
## Audio Compression System

The TTS functionality includes an intelligent compression system that:

- **Automatically compresses** generated audio files using gzip compression
- **Saves compressed data** to separate files to avoid large agent responses
- **Provides decompression tools** for easy audio file extraction
- **Maintains quality** while reducing file sizes by 2-4x

### File Structure

```
generated_audio/          # Original audio files
├── tts_2025-01-16T...mp3
compressed_audio/         # Compressed audio data
├── compressed_audio_2025-01-16T...json
decompressed_audio/       # Decompressed audio files (after extraction)
├── decompressed_2025-01-16T...mp3
```

### Decompression Tools

**Python Script (Recommended):**

```bash
python decompress_audio.py <response_file_or_compressed_file>
```

**Node.js Script:**

```bash
npm run decompress <compressed_data_file>
```

## Agno Integration

This MCP server also includes integration with [Agno](https://docs.agno.com/introduction), a high-performance runtime for multi-agent systems.

### Agno Tests

```bash
# Text-to-Speech test (saves audio to generated_audio/ and compressed_audio/)
npm run test:agno:tts

# Speech-to-Text test (transcribes sample audio)
npm run test:agno:stt
```

The TTS test will:

1. Generate audio with automatic compression
2. Save the response to `tts_response.json`
3. Decompress the audio file to `generated_audio/`

## License

MIT

## Developer

- Dheeraj Mudireddy (meetdheerajreddy@gmail.com)
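
As a supplement to the Audio Compression System section, the following is a minimal sketch of a gzip round trip of the kind described there: audio bytes are gzip-compressed, base64-encoded, and stored in a JSON envelope, then restored. The envelope layout and field names (`encoding`, `original_size`, `compressed_size`, `data`) are assumptions for illustration; the actual format written to `compressed_audio/` and read by `decompress_audio.py` is not documented here and may differ.

```python
import base64
import gzip
import json


def pack_audio(audio: bytes) -> str:
    """Gzip the audio bytes and wrap them in a small JSON envelope."""
    compressed = gzip.compress(audio)
    envelope = {
        # Hypothetical field names, not the server's documented schema.
        "encoding": "gzip+base64",
        "original_size": len(audio),
        "compressed_size": len(compressed),
        "data": base64.b64encode(compressed).decode("ascii"),
    }
    return json.dumps(envelope)


def unpack_audio(envelope_json: str) -> bytes:
    """Reverse pack_audio and verify the recorded size."""
    envelope = json.loads(envelope_json)
    audio = gzip.decompress(base64.b64decode(envelope["data"]))
    if len(audio) != envelope["original_size"]:
        raise ValueError("decompressed size does not match the recorded size")
    return audio


if __name__ == "__main__":
    sample = b"\x00\x01" * 2048  # stand-in bytes, not real MP3 data
    packed = pack_audio(sample)
    assert unpack_audio(packed) == sample  # gzip is lossless
```

Because gzip is lossless, the restored bytes are identical to the input, so audio quality is unaffected; how much size is saved depends on the input data.
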
