Provides access to Deepgram's speech recognition and text-to-speech capabilities, including audio transcription, natural speech generation, audio analysis with sentiment detection, speaker diarization, and language detection across multiple specialized models.
Deepgram MCP Server
A Model Context Protocol (MCP) server that provides access to Deepgram's speech recognition and text-to-speech capabilities.
Features
- Audio Transcription: Convert audio to text with high accuracy
- Text-to-Speech: Generate natural-sounding speech from text
- Audio Analysis: Extract insights like sentiment, topics, intents, and entities
- Speaker Diarization: Identify different speakers in audio
- Language Detection: Automatically detect the language of audio
- Multiple Models: Support for various Deepgram models optimized for different use cases
Installation
- Clone this repository
- Install dependencies:
- Copy the environment file and add your Deepgram API key:
- Build the project:
Usage
HTTP Transport (Recommended for Production)
The server will start on port 8080 by default. You can specify a different port:
STDIO Transport (For Development)
Available Tools
1. transcribe_audio
Transcribe audio to text with various options for customization.
Parameters:
- audioUrl or audioData: Audio source (URL or base64)
- model: Deepgram model to use (default: "nova-2-general")
- language: Language code (default: "en")
- punctuate: Add punctuation (default: true)
- diarize: Speaker identification (default: false)
- sentiment: Sentiment analysis (default: false)
- And many more options...
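As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for transcribe_audio. The helper name `build_tool_call` and the audio URL are hypothetical; only the tool name and parameter names come from the list above. How the request is delivered (HTTP or STDIO) depends on the transport you run.

```python
import json
from typing import Any


def build_tool_call(name: str, arguments: dict[str, Any], request_id: int = 1) -> dict[str, Any]:
    """Build an MCP `tools/call` request (JSON-RPC 2.0) for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical audio URL; replace it with a file the server can reach.
request = build_tool_call(
    "transcribe_audio",
    {
        "audioUrl": "https://example.com/meeting.wav",
        "model": "nova-2-general",
        "language": "en",
        "punctuate": True,
        "diarize": True,  # documented default is false
        "sentiment": False,
    },
)

print(json.dumps(request, indent=2))
```

Options left out of `arguments` should fall back to the defaults listed above.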
2. text_to_speech
Convert text to speech using Deepgram's TTS models.
Parameters:
- text: Text to convert to speech (required)
- model: TTS model to use (default: "aura-asteria-en")
- voice: Voice selection
- format: Output format (default: "mp3")
- speed: Speech speed (default: 1.0)
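The defaults above can be mirrored on the client side when preparing arguments for text_to_speech. This is a minimal sketch, not the server's own validation: the helper `tts_arguments` is hypothetical, and the server is assumed to apply the same defaults when a field is omitted.

```python
from typing import Any

# Defaults as documented in the parameter list above.
TTS_DEFAULTS: dict[str, Any] = {
    "model": "aura-asteria-en",
    "format": "mp3",
    "speed": 1.0,
}


def tts_arguments(text: str, **overrides: Any) -> dict[str, Any]:
    """Return text_to_speech arguments with the documented defaults filled in."""
    if not text.strip():
        raise ValueError("text is required and must not be empty")
    # Later keys win, so explicit overrides replace the defaults.
    return {"text": text, **TTS_DEFAULTS, **overrides}


args = tts_arguments("Hello from Deepgram.", speed=1.25)
print(args)
```

Spelling out the defaults makes the request self-describing, which can help when comparing output across models or formats.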
3. analyze_audio
Perform advanced audio analysis including sentiment, topics, intents, and entities.
Parameters:
- audioUrl or audioData: Audio source
- features: Analysis features to enable
- model: Model for analysis
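The audio source can be given as a URL or as base64 data. The sketch below shows one way a client might prepare either form and attach the analysis features. Several details are assumptions, not confirmed by this README: that exactly one of audioUrl and audioData should be sent, that audioData is plain base64 without a data-URI prefix, and that features is a list of strings drawn from the capabilities named above (sentiment, topics, intents, entities). Both helper names are hypothetical.

```python
import base64
from typing import Any, Optional

# Assumed feature names, taken from the tool description above.
ALLOWED_FEATURES = {"sentiment", "topics", "intents", "entities"}


def audio_source(url: Optional[str] = None, data: Optional[bytes] = None) -> dict[str, str]:
    """Return either an audioUrl or a base64-encoded audioData field."""
    if (url is None) == (data is None):
        raise ValueError("provide exactly one of url or data")
    if url is not None:
        return {"audioUrl": url}
    return {"audioData": base64.b64encode(data).decode("ascii")}


def analyze_arguments(source: dict[str, str], features: list[str]) -> dict[str, Any]:
    """Combine an audio source with the requested analysis features."""
    unknown = set(features) - ALLOWED_FEATURES
    if unknown:
        raise ValueError(f"unsupported features: {sorted(unknown)}")
    return {**source, "features": list(features)}


# Placeholder bytes standing in for real audio content.
args = analyze_arguments(audio_source(data=b"RIFF"), ["sentiment", "topics"])
print(args)
```

The same `audio_source` helper would apply to transcribe_audio, which accepts the same two fields.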
4. get_models
Get information about available Deepgram models.
Parameters:
- model_type: Filter by model type ("transcription", "tts", or "all")
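Since model_type accepts only three values, a client can check the filter before calling get_models. This is a small sketch with a hypothetical helper name; treating "all" as the default is an assumption, as the README does not state one.

```python
MODEL_TYPES = ("transcription", "tts", "all")


def get_models_arguments(model_type: str = "all") -> dict[str, str]:
    """Validate the filter and return arguments for the get_models tool."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"model_type must be one of {MODEL_TYPES}, got {model_type!r}")
    return {"model_type": model_type}


print(get_models_arguments("tts"))
```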
Client Configuration
For MCP clients, use this configuration:
Development
API Key
Get your Deepgram API key from the Deepgram Console.
Agno Integration
This MCP server also includes integration with Agno, a high-performance runtime for multi-agent systems.
Agno Tests
License
MIT