Which integrations are available for this server?

Uses Google's Gemini AI models to transcribe audio files into text, with multiple model options available for different quality and speed requirements.
Stores audio transcriptions in a Supabase database with pgvector for semantic search capabilities, enabling natural language queries across transcribed audio content.

How do I use MCP Audio RAG Server?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Audio RAG Server What were the main points discussed about the budget in yesterday's meeting recording?"

That's it! The server will respond to your query, and you can continue using it as needed.

MCP Audio RAG Server

Transform your audio files into a searchable knowledge base using AI. Ask Claude questions about your meetings, podcasts, lectures, or any audio content.

What is this?

This is an MCP (Model Context Protocol) server that lets you:

Transcribe any audio file using Google's Gemini AI
Store the transcriptions in a searchable database
Search through all your audio content using natural language

Once set up, you can simply ask Claude things like:

"What did they discuss about the budget in my meeting recording?"
"Find mentions of machine learning in my podcast collection"
"What were the key points from yesterday's lecture?"

How It Works

┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Audio File  │ ──▶ │   Gemini    │ ──▶ │  Chunking   │ ──▶ │  Supabase   │
│ (.mp3, etc) │     │ Transcribe  │     │ + Embedding │     │ (pgvector)  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                   │
┌─────────────┐     ┌─────────────┐     ┌─────────────┐            │
│   Claude    │ ◀── │   Results   │ ◀── │   Search    │ ◀──────────┘
│  Response   │     │ + Snippets  │     │    Query    │
└─────────────┘     └─────────────┘     └─────────────┘
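
The chunking stage in the diagram can be pictured with a minimal sketch like the one below. It assumes word-based chunks with a fixed overlap; the function name `chunkTranscript` and the default sizes are hypothetical, and the server's actual chunking strategy may differ.

```typescript
// Split a transcript into overlapping, word-based chunks.
// chunkSize and overlap are illustrative defaults, not the server's real settings.
function chunkTranscript(text: string, chunkSize = 200, overlap = 40): string[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("expected chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    // Stop once the current window has reached the end of the transcript.
    if (start + chunkSize >= words.length) break;
  }
  return chunks;
}
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.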

Quick Start

Prerequisites

Node.js 18+
Gemini API Key (free from Google AI Studio)
Supabase Account (free to sign up)

Step 1: Clone & Install

git clone https://github.com/matheusslg/mcp-audio-rag.git
cd mcp-audio-rag
npm install

Step 2: Set Up Supabase Database

Create a new project at supabase.com
Go to SQL Editor in your dashboard
Paste and run the contents of supabase/schema.sql

Step 3: Get Your API Keys

Supabase (Settings → API):

Copy Project URL → SUPABASE_URL
Copy service_role key → SUPABASE_SERVICE_KEY

Google AI Studio:

Create key at aistudio.google.com/apikey → GEMINI_API_KEY

Step 4: Configure

cp .env.example .env

Edit .env:

GEMINI_API_KEY=your-key-here
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key
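
As a quick sanity check before wiring the server into Claude, something like the following sketch can confirm that all three variables are set. The helper name `missingEnvVars` is hypothetical and is not part of the project; the server's own startup check may work differently.

```typescript
// The three variables named in .env above.
const REQUIRED_ENV_VARS = ["GEMINI_API_KEY", "SUPABASE_URL", "SUPABASE_SERVICE_KEY"] as const;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_VARS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js script, calling `missingEnvVars(process.env)` after loading `.env` returns an empty array when everything is configured.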

Step 5: Add to Claude

For Claude Code CLI (~/.claude.json):

{
  "mcpServers": {
    "audio-rag": {
      "command": "npx",
      "args": ["tsx", "/full/path/to/mcp-audio-rag/src/server.ts"],
      "env": {
        "GEMINI_API_KEY": "your-key",
        "SUPABASE_URL": "https://your-project.supabase.co",
        "SUPABASE_SERVICE_KEY": "your-service-role-key"
      }
    }
  }
}

For Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on Mac):

Same config as above.
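
Since both clients take the same entry, a small helper can generate it from the clone location, which avoids typos in the absolute path. This is only a sketch: `buildAudioRagConfig` and `McpServerEntry` are hypothetical names, and the output mirrors the JSON shown in this step.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Build the "mcpServers" block for a given clone location and set of keys.
function buildAudioRagConfig(
  repoPath: string,
  env: Record<string, string>,
): { mcpServers: Record<string, McpServerEntry> } {
  const root = repoPath.replace(/\/+$/, ""); // drop any trailing slash
  return {
    mcpServers: {
      "audio-rag": {
        command: "npx",
        args: ["tsx", `${root}/src/server.ts`],
        env,
      },
    },
  };
}
```

Passing the result through `JSON.stringify(config, null, 2)` yields text that can be merged into either config file.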

Usage

Transcribe Audio

Just tell Claude to transcribe a file:

Transcribe /path/to/meeting.mp3

Want to use a specific model? Just ask:

Transcribe /path/to/lecture.m4a using gemini-2.5-pro

Search Your Audio

Ask natural questions:

What did they say about the project timeline?
Search for mentions of "budget" in my recordings
Find discussions about AI in my podcasts
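
Semantic search of this kind usually works by embedding the question and ranking stored chunks by vector similarity. The sketch below shows that ranking in memory with cosine similarity, which is one of the distance measures pgvector supports; whether this server's schema uses cosine distance is an assumption, and `StoredChunk`, `cosineSimilarity`, and `topMatches` are hypothetical names.

```typescript
interface StoredChunk {
  source: string;      // e.g. the audio file the chunk came from
  text: string;        // the transcribed snippet
  embedding: number[]; // vector produced by an embedding model
}

// Cosine similarity: 1 for identical direction, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the `limit` chunks most similar to the query embedding, best first.
function topMatches(query: number[], chunks: StoredChunk[], limit = 3): StoredChunk[] {
  return [...chunks]
    .sort((x, y) => cosineSimilarity(query, y.embedding) - cosineSimilarity(query, x.embedding))
    .slice(0, limit);
}
```

In the real pipeline the database performs this ranking, so only the best-matching snippets travel back to Claude.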

Manage Your Library

List all my transcribed audio files
Delete the recording from last week
Get the full transcript of meeting.mp3
Summarize the podcast episode

Available Models

Model	Best For
`gemini-2.5-flash`	Default - Fast & accurate, great balance
`gemini-2.5-flash-lite`	Fastest, cheapest - good for bulk processing
`gemini-2.5-pro`	Best quality - complex audio, multiple speakers
`gemini-3-pro-preview`	Newest - cutting edge capabilities
`gemini-2.0-flash`	Reliable - previous generation
`gemini-2.0-flash-lite`	Fast - previous generation
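
To make the default and the allowed values concrete, here is a minimal sketch of resolving a requested model name against the table above. The helper `resolveModel` is hypothetical; how the server itself validates the model name is not documented here.

```typescript
// Model identifiers copied from the table above.
const GEMINI_MODELS = [
  "gemini-2.5-flash",
  "gemini-2.5-flash-lite",
  "gemini-2.5-pro",
  "gemini-3-pro-preview",
  "gemini-2.0-flash",
  "gemini-2.0-flash-lite",
] as const;

type GeminiModel = (typeof GEMINI_MODELS)[number];

const DEFAULT_MODEL: GeminiModel = "gemini-2.5-flash";

// Fall back to the default when no model is given; reject unknown names.
function resolveModel(requested?: string): GeminiModel {
  if (requested === undefined || requested.trim() === "") return DEFAULT_MODEL;
  const wanted = requested.trim();
  const match = GEMINI_MODELS.find((m) => m === wanted);
  if (match === undefined) {
    throw new Error(`Unknown model "${wanted}". Expected one of: ${GEMINI_MODELS.join(", ")}`);
  }
  return match;
}
```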

Supported Audio Formats

.mp3 .mp4 .m4a .wav .webm .mpeg .mpga
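
A file can be checked against this list before asking Claude to transcribe it. The following is a minimal sketch; `isSupportedAudioFile` is a hypothetical helper, and the server may detect formats differently (for example, by MIME type).

```typescript
// Extensions copied from the list above.
const SUPPORTED_EXTENSIONS = [".mp3", ".mp4", ".m4a", ".wav", ".webm", ".mpeg", ".mpga"];

// True when the path ends in one of the supported extensions (case-insensitive).
function isSupportedAudioFile(filePath: string): boolean {
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".mp3"
  return SUPPORTED_EXTENSIONS.includes(fileName.slice(dot).toLowerCase());
}
```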

Available Tools

Tool	Description
`ingest_audio`	Transcribe and store an audio file
`search_transcripts`	Search through your audio using natural language
`list_transcripts`	List all transcribed audio files
`get_full_transcript`	Get the complete transcript of a file
`summarize_audio`	Generate an AI summary of a transcript
`delete_transcript`	Remove a transcribed file from the database
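
Under the hood, an MCP client invokes these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `ingest_audio`; the argument names `file_path` and `model` are assumptions for illustration, since the tool's real input schema is not listed here.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wrap a tool name and its arguments in an MCP tools/call envelope.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "file_path" and "model" are hypothetical argument names.
const ingestRequest = buildToolCall(1, "ingest_audio", {
  file_path: "/path/to/meeting.mp3",
  model: "gemini-2.5-flash",
});
```

Claude builds these requests itself, so this is only useful for understanding or debugging the traffic between client and server.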

Troubleshooting

Problem	Solution
"No relevant segments found"	Try rephrasing your search, or check if audio was ingested
"Missing environment variable"	Check that your `.env` file or Claude config has all 3 keys
Supabase errors	Make sure you're using `service_role` key, not `anon` key
Slow transcription	Use `gemini-2.5-flash-lite` for faster processing

Support This Project

If this project saved you time or helped you out, consider buying me a coffee!

License

MIT - Use it however you want!
