mcp-podcast-generator

An MCP (Model Context Protocol) server that generates podcast audio from scripts. It runs in Docker, exposes an HTTP endpoint, and provides a generate_podcast tool that:

  • Converts scripts to speech using Google Gemini TTS (gemini-2.5-flash-preview-tts)

  • Supports single-host monologue and dual-host dialogue formats

  • Optionally adds intro/outro music fetched from any HTTPS URL (Cloudflare R2, S3, etc.)

  • Applies EBU R128 loudness normalization via FFmpeg

  • Outputs the final MP3 to a volume-mapped /output folder

Getting Started

Prerequisites

  • Docker and Docker Compose installed

  • A Google AI Studio API key with access to Gemini models

  • (Optional) An intro/outro MP3 hosted on any HTTPS URL (Cloudflare R2, S3, etc.)

1. Clone and configure

git clone https://github.com/ivo-toby/mcp-podcast-generator.git
cd mcp-podcast-generator

cp .env.example .env

Open .env and set your API key:

GOOGLE_API_KEY=your_api_key_here

2. Create the output directory

MP3 files are written to ./output on the host (mapped to /output inside the container):

mkdir -p output

3. Start the server

docker compose up

The first run builds the Docker image, which takes a few minutes. On success you'll see:

mcp-podcast-generator  | MCP Podcast Generator listening on port 3000

To run in the background (detached):

docker compose up -d

4. Verify it's running

curl http://localhost:3000/health
# → {"status":"ok"}
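If you start the container from a script, it can help to wait for the health check before sending requests. The following is a minimal sketch, not part of the project: `waitForHealthy` and the `HealthFetch` type are hypothetical names, and the retry count and delay are arbitrary defaults.

```typescript
// Minimal shape of the fetch-like function this sketch needs (hypothetical type).
type HealthFetch = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Polls GET /health until it returns {"status":"ok"} or the attempts run out.
async function waitForHealthy(
  baseUrl: string,
  fetchFn: HealthFetch,
  attempts = 10,
  delayMs = 1000,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetchFn(`${baseUrl}/health`);
      if (res.ok) {
        const body = (await res.json()) as { status?: string };
        if (body.status === "ok") return true;
      }
    } catch {
      // Connection errors are expected while the container is still starting.
    }
    // Sleep between attempts, but not after the last one.
    if (i < attempts - 1) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

On Node 18 or later you could pass the global `fetch`, for example `await waitForHealthy("http://localhost:3000", fetch)`.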

Common Docker operations

# View logs (follow mode)
docker compose logs -f

# Stop the server
docker compose down

# Rebuild the image after code changes
docker compose up --build

# Remove containers and Docker-managed volumes (files in ./output stay on the host)
docker compose down -v

5. Generate your first podcast

curl -s -X POST http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "generate_podcast",
      "arguments": {
        "type": "single",
        "hosts": [{ "name": "Host", "voice": "Kore" }],
        "segments": [
          { "text": "Welcome to my first AI-generated podcast episode." },
          { "text": "Today we explore how easy it is to turn a script into audio." },
          { "text": "Thanks for listening. See you next time!" }
        ],
        "outputFilename": "first-episode.mp3"
      }
    }
  }' | jq .

On success you'll get back the output path, the duration in seconds, and a download URL:

{
  "success": true,
  "outputPath": "/output/first-episode.mp3",
  "durationSeconds": 18.4,
  "downloadUrl": "http://localhost:3000/output/first-episode.mp3"
}

Your MP3 is available at http://localhost:3000/output/first-episode.mp3 and on disk at ./output/first-episode.mp3.
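The same request can be sent from code. The sketch below mirrors the curl command, assuming Node 18+ for the global `fetch`. `buildToolCall`, `callGeneratePodcast`, and the `PodcastArgs` interface are hypothetical names for illustration, not part of the server.

```typescript
// Arguments for generate_podcast, limited to the fields used in the curl example.
interface PodcastArgs {
  type: "single" | "dual";
  hosts: { name: string; voice: string }[];
  segments: { speaker?: string; text: string }[];
  outputFilename: string;
}

// Builds the JSON-RPC 2.0 envelope for a tools/call request.
function buildToolCall(id: number, args: PodcastArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_podcast", arguments: args },
  };
}

// Sends the request with the same headers as the curl command.
// Returns the raw body text so the caller can decide how to parse it.
async function callGeneratePodcast(baseUrl: string, args: PodcastArgs): Promise<string> {
  const res = await fetch(`${baseUrl}/mcp`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(buildToolCall(1, args)),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}`);
  return res.text();
}
```

Keeping the envelope builder separate from the network call makes the request shape easy to check without a running container.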

6. Install in Claude Desktop (optional)

With the container running, add the server to Claude Desktop's MCP configuration.

Find your config file:

| OS | Path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

Add the server:

{
  "mcpServers": {
    "podcast-generator": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:3000/mcp"]
    }
  }
}

Claude Desktop doesn't support Streamable HTTP directly, so mcp-remote bridges the connection. npx downloads it automatically on first run.

If you already have other MCP servers configured, add podcast-generator alongside them inside the existing mcpServers object.
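To illustrate what "alongside" means, here is a minimal sketch of merging the entry into an existing configuration object without dropping other servers or unrelated settings. `addPodcastServer` and the two types are hypothetical helpers; reading and writing the config file itself is left out.

```typescript
type ServerEntry = { command: string; args: string[] };
type DesktopConfig = { mcpServers?: Record<string, ServerEntry>; [key: string]: unknown };

// Returns a new config with podcast-generator added; the input is not modified.
function addPodcastServer(
  config: DesktopConfig,
  url = "http://localhost:3000/mcp",
): DesktopConfig {
  return {
    ...config, // keep unrelated top-level settings
    mcpServers: {
      ...(config.mcpServers ?? {}), // keep servers that are already configured
      "podcast-generator": { command: "npx", args: ["mcp-remote", url] },
    },
  };
}
```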

Restart Claude Desktop. The generate_podcast tool will appear in the tools panel. You can now ask Claude to generate a podcast directly:

"Generate a 5-minute dual-host podcast about the future of open source, using Alex (Charon) and Sam (Aoede), with intro music from https://podcast.briefcast.online/assets/music/intro.mp3, and save it as open-source-ep1.mp3"

Claude will call the tool and return the output path when done.

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| GOOGLE_API_KEY | Yes | (none) | Google AI Studio API key |
| OUTPUT_DIR | No | /output | Directory where MP3 files are written |
| TEMP_DIR | No | /tmp/podcast-gen | Temporary processing directory |
| PORT | No | 3000 | HTTP server port |
| PUBLIC_URL | No | http://localhost:3000 | Base URL used to construct download links returned by the tool. Set this to your public hostname when running behind a reverse proxy or on a remote server. |
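As a rough illustration of how these variables and their defaults fit together, the sketch below reads them from an environment map. It is not the server's actual code: `loadConfig`, `buildDownloadUrl`, and `ServerConfig` are hypothetical names, and the `/output/<filename>` URL shape is inferred from the sample response in the Getting Started section.

```typescript
interface ServerConfig {
  googleApiKey: string;
  outputDir: string;
  tempDir: string;
  port: number;
  publicUrl: string;
}

// Applies the documented defaults; GOOGLE_API_KEY has none and must be set.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const googleApiKey = env["GOOGLE_API_KEY"];
  if (!googleApiKey) throw new Error("GOOGLE_API_KEY is required");
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port <= 0) throw new Error(`Invalid PORT: ${env["PORT"]}`);
  return {
    googleApiKey,
    outputDir: env["OUTPUT_DIR"] ?? "/output",
    tempDir: env["TEMP_DIR"] ?? "/tmp/podcast-gen",
    port,
    publicUrl: env["PUBLIC_URL"] ?? "http://localhost:3000",
  };
}

// Builds a download link from PUBLIC_URL (assumption: files are served under /output/).
function buildDownloadUrl(cfg: ServerConfig, filename: string): string {
  const base = cfg.publicUrl.replace(/\/+$/, ""); // drop any trailing slash
  return `${base}/output/${encodeURIComponent(filename)}`;
}
```

In a Node process this would be called as `loadConfig(process.env)`.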

MCP Endpoint

POST /mcp — Streamable HTTP transport (JSON-RPC 2.0)

GET /health — Health check, returns { "status": "ok" }
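With the Streamable HTTP transport, a server may answer a POST either with a single JSON body or with a `text/event-stream` body carrying the JSON-RPC message in `data:` lines. Which one this server uses is not stated here, so a client may want to handle both. The helper below is a hypothetical sketch (`parseMcpBody` is not part of the project).

```typescript
// Returns the JSON-RPC messages contained in a response body.
// Handles a plain JSON body and a Server-Sent Events body.
function parseMcpBody(contentType: string, body: string): unknown[] {
  if (contentType.includes("text/event-stream")) {
    const messages: unknown[] = [];
    // SSE events are separated by a blank line; an event may span several data: lines.
    for (const event of body.split(/\r?\n\r?\n/)) {
      const data = event
        .split(/\r?\n/)
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // strip "data:" and one optional space
        .join("\n");
      if (data) messages.push(JSON.parse(data));
    }
    return messages;
  }
  return [JSON.parse(body)];
}
```

A caller would pass the response's `Content-Type` header and the body text, then pick the message whose `id` matches the request.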

Tool: generate_podcast

Input Schema

{
  // "single" = one host monologue, "dual" = two-host dialogue
  type: "single" | "dual";

  // Host configurations (1 for single, exactly 2 for dual)
  hosts: Array<{
    name: string;   // Speaker label, e.g. "Alex"
    voice: string;  // Gemini prebuilt voice name (see list below)
  }>;

  // Script segments
  segments: Array<{
    speaker?: string;  // Must match a host name (required for dual-host)
    text: string;      // The text to speak
  }>;

  // Output filename (written to OUTPUT_DIR)
  outputFilename: string;  // e.g. "episode-2026-04-02.mp3"

  // Optional music (any HTTPS URL: R2, S3, etc.)
  introMusicUrl?: string;
  outroMusicUrl?: string;

  // Audio processing options
  fadeInDuration?: number;   // seconds, default 2
  fadeOutDuration?: number;  // seconds, default 3
  targetLufs?: number;       // LUFS target, default -16
}
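The constraints in the schema comments (one host for "single", exactly two for "dual", speakers must match a host name) can be checked client-side before calling the tool. This is a minimal sketch of such a check, not the server's own validation; `validateInput` and the interface names are hypothetical.

```typescript
interface Host { name: string; voice: string }
interface Segment { speaker?: string; text: string }
interface GeneratePodcastInput {
  type: "single" | "dual";
  hosts: Host[];
  segments: Segment[];
  outputFilename: string;
}

// Returns a list of problems; an empty list means the input passes these checks.
function validateInput(input: GeneratePodcastInput): string[] {
  const errors: string[] = [];
  const expectedHosts = input.type === "single" ? 1 : 2;
  if (input.hosts.length !== expectedHosts) {
    errors.push(`type "${input.type}" expects ${expectedHosts} host(s), got ${input.hosts.length}`);
  }
  const names = new Set(input.hosts.map((h) => h.name));
  input.segments.forEach((seg, i) => {
    if (seg.speaker === undefined) {
      // speaker is optional for single-host scripts but required for dual-host ones
      if (input.type === "dual") errors.push(`segment ${i}: speaker is required for dual-host`);
    } else if (!names.has(seg.speaker)) {
      errors.push(`segment ${i}: speaker "${seg.speaker}" does not match a host name`);
    }
  });
  return errors;
}
```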

Output

{
  "success": true,
  "outputPath": "/output/episode-2026-04-02.mp3",
  "durationSeconds": 245.3,
  "downloadUrl": "http://localhost:3000/output/episode-2026-04-02.mp3"
}

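Available Gemini Voices

Of the voices listed below, Aoede, Charon, Fenrir, Kore, Puck, and Zubenelgenubi appear in Google's published list of prebuilt Gemini TTS voices. Orbit, Perseus, Tethys, and Vega do not, so confirm those against the current Gemini speech-generation documentation before using them.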

| Voice | Character |
|---|---|
| Aoede | Warm, storytelling |
| Charon | Deep, authoritative |
| Fenrir | Bold, energetic |
| Kore | Clear, professional |
| Puck | Bright, conversational |
| Orbit | Smooth, measured |
| Perseus | Confident, direct |
| Tethys | Calm, thoughtful |
| Vega | Dynamic, expressive |
| Zubenelgenubi | Distinctive, memorable |

Example MCP Calls

Single Host

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "generate_podcast",
    "arguments": {
      "type": "single",
      "hosts": [{ "name": "Alex", "voice": "Charon" }],
      "segments": [
        { "text": "Welcome to Tech Weekly, your weekly dose of AI news." },
        { "text": "Today we're covering the latest developments in language models." },
        { "text": "That's all for today. Thanks for listening!" }
      ],
      "outputFilename": "episode-2026-04-02.mp3",
      "introMusicUrl": "https://your-bucket.r2.dev/intro.mp3",
      "outroMusicUrl": "https://your-bucket.r2.dev/outro.mp3"
    }
  }
}

Dual Host

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "generate_podcast",
    "arguments": {
      "type": "dual",
      "hosts": [
        { "name": "Alex", "voice": "Charon" },
        { "name": "Sam",  "voice": "Puck"   }
      ],
      "segments": [
        { "speaker": "Alex", "text": "Welcome back to the show! I'm Alex." },
        { "speaker": "Sam",  "text": "And I'm Sam. Today we're talking about MCP servers." },
        { "speaker": "Alex", "text": "It's a fascinating topic. Let's dive in." },
        { "speaker": "Sam",  "text": "Absolutely. Thanks everyone for listening!" }
      ],
      "outputFilename": "episode-dual-2026-04-02.mp3",
      "introMusicUrl": "https://your-bucket.r2.dev/intro.mp3",
      "outroMusicUrl": "https://your-bucket.r2.dev/outro.mp3",
      "fadeInDuration": 3,
      "fadeOutDuration": 4,
      "targetLufs": -16
    }
  }
}

curl Example

curl -X POST http://localhost:3000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "generate_podcast",
      "arguments": {
        "type": "single",
        "hosts": [{ "name": "Host", "voice": "Kore" }],
        "segments": [{ "text": "Hello world, this is a test podcast episode." }],
        "outputFilename": "test.mp3"
      }
    }
  }'

Audio Pipeline

Input Script
     │
     ▼
Gemini TTS (gemini-2.5-flash-preview-tts)
     │  PCM 24kHz 16-bit mono (base64)
     ▼
FFmpeg: PCM → MP3 (192kbps libmp3lame)
     │
     ▼
EBU R128 Normalization (two-pass, target: -16 LUFS)
     │
     ├── [intro music URL] → download → fade-in
     │
     ├── TTS audio (normalized)
     │
     └── [outro music URL] → download → fade-out
          │
          ▼
     FFmpeg concat (re-encoded, avoids frame boundary issues)
          │
          ▼
     Final EBU R128 normalization pass
          │
          ▼
     /output/episode.mp3
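The FFmpeg stages in the diagram can be sketched as argument lists. This is an illustration under stated assumptions, not the project's code: the function names are hypothetical, the true-peak (`TP=-1.5`) and loudness-range (`LRA=11`) values are common choices that the README does not specify, and `s16le` assumes the 16-bit PCM is signed little-endian. The `loudnorm` and `afade` filters and their options are standard FFmpeg.

```typescript
const SAMPLE_RATE = 24000; // Hz, from the pipeline diagram
const BYTES_PER_SAMPLE = 2; // 16-bit
const CHANNELS = 1; // mono

// Duration of raw PCM audio: bytes / (rate * bytes per sample * channels).
function pcmDurationSeconds(byteLength: number): number {
  return byteLength / (SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS);
}

// ffmpeg arguments for raw PCM -> 192 kbps MP3.
function pcmToMp3Args(inputPcm: string, outputMp3: string): string[] {
  return ["-y", "-f", "s16le", "-ar", String(SAMPLE_RATE), "-ac", String(CHANNELS),
    "-i", inputPcm, "-c:a", "libmp3lame", "-b:a", "192k", outputMp3];
}

// Pass 1: measure loudness only; ffmpeg prints the stats as JSON on stderr.
function loudnormMeasureArgs(input: string, targetLufs = -16): string[] {
  return ["-i", input, "-af",
    `loudnorm=I=${targetLufs}:TP=-1.5:LRA=11:print_format=json`, "-f", "null", "-"];
}

// Fields of the JSON that pass 1 prints (ffmpeg emits them as strings).
interface LoudnormStats {
  input_i: string;
  input_tp: string;
  input_lra: string;
  input_thresh: string;
  target_offset: string;
}

// Pass 2: apply normalization using the measured values.
function loudnormApplyArgs(input: string, output: string, s: LoudnormStats, targetLufs = -16): string[] {
  const filter = `loudnorm=I=${targetLufs}:TP=-1.5:LRA=11` +
    `:measured_I=${s.input_i}:measured_TP=${s.input_tp}:measured_LRA=${s.input_lra}` +
    `:measured_thresh=${s.input_thresh}:offset=${s.target_offset}:linear=true`;
  return ["-y", "-i", input, "-af", filter, "-c:a", "libmp3lame", "-b:a", "192k", output];
}

// Fade-out filter for the outro: the fade starts fadeSeconds before the end.
function fadeOutFilter(durationSeconds: number, fadeSeconds = 3): string {
  const start = Math.max(0, durationSeconds - fadeSeconds);
  return `afade=t=out:st=${start}:d=${fadeSeconds}`;
}
```

As a worked example of the duration formula, 883,200 bytes of 24 kHz 16-bit mono PCM is 883200 / 48000 = 18.4 seconds.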

Using with an MCP Client

Add to your MCP client configuration:

{
  "mcpServers": {
    "podcast-generator": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:3000/mcp"]
    }
  }
}

Development

npm install
npm run dev   # runs tsx src/index.ts directly

Requires ffmpeg and ffprobe installed locally for development.

  • briefcast — Full podcast pipeline that uses this server for audio generation
