Glama

Open-Source MCP servers

Production-ready MCP servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.

7,731 servers. Last updated -

Matching MCP servers:

  • A
    security
    A
    license
    A
    quality
    An MCP server that enables transcription of audio files using OpenAI's Speech-to-Text API, with support for multiple languages and file saving options.
    Last updated -
    1
    4
    7
    JavaScript
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Enables Claude and other AI assistants to interact with your computer's audio system, allowing for recording from microphones and playing audio through speakers.
    Last updated -
    3
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Provides powerful video and audio editing capabilities through FFmpeg, enabling AI assistants to perform professional-grade operations including format conversion, trimming, overlays, transitions, and advanced audio processing.
    Last updated -
    17
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides tools for image, audio, and video recognition using Google's Gemini AI through the Model Context Protocol.
    Last updated -
    3
    9
    TypeScript
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A portable, Dockerized Python tool that implements the Model Context Protocol for audio transcription using Whisper models, featuring both CLI and web interfaces for converting audio files to JSON transcriptions.
    Last updated -
    Python
    MIT License
    • Linux

  • -
    security
    A
    license
    -
    quality
    A voice-to-text transcription service that converts audio files to transcripts using SiliconFlow, supporting both multipart/form-data and base64 formats.
    Last updated -
    7
    Python
    Apache 2.0
  • A
    security
    F
    license
    A
    quality
    A Node.js server that enables video manipulation through natural language requests, including resizing videos to different resolutions (360p to 1080p) and extracting audio in various formats (MP3, AAC, WAV, OGG).
    Last updated -
    4
    94
    27
    TypeScript
    • Apple
    • Linux
  • -
    security
    A
    license
    -
    quality
    A server that allows Claude to control audio playback on your computer, supporting MP3, WAV, and OGG files with features like play, list, and stop commands.
    Last updated -
    3
    Python
    MIT License
    • Apple
    • Linux
  • A
    security
    F
    license
    A
    quality
    A Model Context Protocol server that enables AI assistants to generate images, text, and audio through the Pollinations APIs without requiring authentication.
    Last updated -
    7
    564
    27
    JavaScript
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A document conversion server that transforms various file formats (PDFs, documents, images, audio, web content) to Markdown with improved multilingual and UTF-8 support.
    Last updated -
    10
    2
    9
    TypeScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A server that generates MP3 audio files from text using Kokoro TTS technology with optional S3 upload capabilities.
    Last updated -
    1
    51
    Python
    Apache 2.0
    • Apple
  • A
    security
    A
    license
    A
    quality
    Converts various file types and web content to Markdown format. It provides a set of tools to transform PDFs, images, audio files, web pages, and more into easily readable and shareable Markdown text.
    Last updated -
    10
    11
    1,966
    TypeScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables real-time interaction with Ableton Live, allowing AI assistants to control song creation, track management, clip operations, and audio recording workflows.
    Last updated -
    23
    11
    33
    TypeScript
    MIT License
    • Linux
    • Apple
  • A
    security
    F
    license
    A
    quality
    A Model Context Protocol server that allows AI assistants to generate music through the Suno API, supporting custom lyrics and style inputs or inspiration-based creation.
    Last updated -
    1
    9
    JavaScript
  • A
    security
    A
    license
    A
    quality
    A Node.js implementation of the Kagi Model Context Protocol server that enables Claude AI to search the web and summarize documents, videos, and audio using Kagi's APIs.
    Last updated -
    2
    1
    JavaScript
    MIT License
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP server that downloads videos or extracts audio from platforms such as YouTube, Bilibili, and TikTok, then transcribes them to text using OpenAI's Whisper model.
    Last updated -
    5
    Python
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Use HuggingFace Spaces directly from Claude for open-source image generation, chat, vision tasks, and more. Supports image, audio, and text uploads and downloads.
    Last updated -
    3
    184
    339
    TypeScript
    MIT License
    • Apple
  • A
    security
    F
    license
    A
    quality
    An MCP server designed to work with FFmpeg for media processing tasks, offering enhanced performance and secure communication for handling media processing requests.
    Last updated -
    2
    14
    12
    TypeScript
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables interaction with Mobvoi's Text to Speech and Voice Clone APIs, allowing MCP clients like Cursor, Claude Desktop, and Cline to generate speech and clone voices.
    Last updated -
    4
    1
    Python
    MIT License
    • Apple
    • Linux
  • -
    security
    A
    license
    -
    quality
    An MCP (Model Context Protocol) server that provides seamless integration between Fish Audio's Text-to-Speech API and LLMs like Claude, enabling natural language-driven speech synthesis.
    Last updated -
    1
    6
    TypeScript
    MIT License
  • A
    security
    F
    license
    A
    quality
    Create videos and images using Luma AI. This MCP server handles all API functionality for Luma Dream Machine from Claude Desktop.
    Last updated -
    10
    3
    Python
    • Apple
  • A
    security
    A
    license
    A
    quality
    An official Model Context Protocol (MCP) server that enables AI clients to interact with ElevenLabs' Text to Speech and audio processing APIs, allowing for speech generation, voice cloning, audio transcription, and other audio-related tasks.
    Last updated -
    19
    843
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    An AI-powered automation bridge for Adobe Premiere Pro that enables controlling video edits with natural language and automating workflows through Claude or other AI agents.
    Last updated -
    37
    3
    TypeScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI agents to join and interact with online meetings (Zoom and Google Meet), capturing transcripts and recordings to generate meeting summaries.
    Last updated -
    3
    6
    TypeScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    An MCP server that enables LLMs to generate spoken audio from text using OpenAI's Text-to-Speech API, supporting various voices, models, and audio formats.
    Last updated -
    0
    1
    JavaScript
    MIT License
  • -
    security
    F
    license
    -
    quality
    Converts various file types (documents, images, audio, web content) to markdown format without requiring Docker, supporting PDF, Word, Excel, PowerPoint, images, audio files, web URLs, and more.
    Last updated -
    96
    6
    JavaScript
    • Apple
    • Linux
  • A
    security
    A
    license
    A
    quality
    A server that enables Claude Desktop to generate images using Google's Gemini AI models through the Model Context Protocol (MCP).
    Last updated -
    7
    13
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    Official Model Context Protocol server that enables interaction with powerful Speech-to-Text and Audio Intelligence APIs, allowing clients like Claude Desktop to transcribe audio, analyze speech, translate content, and more.
    Last updated -
    2
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI models to generate and play high-quality text-to-speech audio through your device's native audio system using Rime's voice synthesis API.
    Last updated -
    1
    2
    8
    JavaScript
    The Unlicense
    • Apple
    • Linux
  • A
    security
    A
    license
    A
    quality
    A server enabling integration between KoboldAI's text generation capabilities and MCP-compatible applications, with features like chat completion, Stable Diffusion, and OpenAI-compatible API endpoints.
    Last updated -
    20
    418
    4
    JavaScript
    MIT License