MCP Servers for Multimedia Processing

Provides tools for handling multimedia, such as audio and video editing, playback, and format conversion, as well as video filters and enhancements.

  • A
    security
    A
    license
    A
    quality
    JavaScript implementation of MiniMax MCP that enables interaction with MiniMax AI services for image generation, video generation, text-to-speech, and voice cloning through MCP-compatible clients.
    Last updated -
    6
    359
    31
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that provides image generation capabilities using Google's Gemini 2 API, allowing users to generate multiple images with customizable parameters like prompts, aspect ratios, and person generation settings.
    Last updated -
    1
    JavaScript
    MIT License
    • Apple
    • Linux
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol (MCP) server that converts Mermaid diagrams to PNG images.
    Last updated -
    1
    170
    13
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI assistants like Claude to use Bouyomichan (a Japanese text-to-speech program) for voice reading with adjustable voice types, volume, speed, and pitch.
    Last updated -
    1
    1
    JavaScript
    MIT License
    • Apple
  • A
    security
    F
    license
    A
    quality
    A Model Context Protocol server that enables Claude to generate and upscale images through the Letz AI API, allowing users to create images directly within Claude conversations.
    Last updated -
    2
    1
    JavaScript
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Generates realistic human face images that don't represent real people, offering various output shapes, configurable dimensions, and batch generation capabilities.
    Last updated -
    1
    3
    1
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    MCP server that exposes Google's Veo2 video generation capabilities, allowing clients to generate videos from text prompts or images.
    Last updated -
    7
    7
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    An MCP server that integrates with Stable Diffusion WebUI to provide text-to-image generation and image upscaling capabilities through simple API calls.
    Last updated -
    5
    4
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    MCP server for Synthesizer V AI Vocal Studio, which allows LLMs to create and edit vocal tracks, e.g., by adding lyrics to a melody.
    Last updated -
    6
    Apache 2.0
    • Apple
  • A
    security
    A
    license
    A
    quality
    Image Tools MCP is a Model Context Protocol (MCP) service that retrieves image dimensions and compresses images from URLs and local files using the TinyPNG API. It supports converting images to formats like webp, jpeg/jpg, and png, providing detailed information on width, height, type, and compressi
    Last updated -
    4
    51
    3
    JavaScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    An MCP server providing video processing capabilities through FFmpeg, enabling dialog-based local video search, trimming, concatenation, and playback functionalities.
    Last updated -
    7
    9
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides an interface between AI assistants and Tripo AI via Model Context Protocol, enabling generation of 3D assets from natural language and importing them to Blender.
    Last updated -
    15
    139
    Python
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    An MCP server that allows Claude to use OpenAI's image generation capabilities (gpt-image-1) to create image assets for users, which is particularly useful for game and web development projects.
    Last updated -
    1
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol (MCP) server for Adobe After Effects that enables AI assistants and other applications to control After Effects through a standardized protocol.
    Last updated -
    13
    16
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    An MCP server implementation that integrates with Minimax API to provide AI-powered image generation and text-to-speech functionality in editors like Windsurf and Cursor.
    Last updated -
    2
    192
    1
    JavaScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    A server for integrating with Placid.app's API, enabling listing templates and generating creatives using the Model Context Protocol with secure API token management.
    Last updated -
    3
    10
    5
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables real-time interaction with Ableton Live, allowing AI assistants to control song creation, track management, clip operations, and audio recording workflows.
    Last updated -
    23
    176
    10
    TypeScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables retrieval of transcripts from YouTube videos. This server provides direct access to video transcripts and subtitles through a simple interface, making it ideal for content analysis and processing.
    Last updated -
    1
    258
    10
    TypeScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    An intelligent MCP server with a fully automated batch pipeline for web-ready images. Features include noise reduction, auto levels/curves, JPEG artifact removal, 4K resizing, smart sharpening with shadow/highlight enhancement, and advanced WebP conversion.
    Last updated -
    1
    5
    JavaScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A powerful MCP tool for parsing and manipulating MIDI files that allows users to read, analyze, and modify MIDI files through natural language commands, supporting operations like reading file information, modifying tracks, adding notes, and setting tempo.
    Last updated -
    11
    23
    1
    JavaScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    MCP (Model Context Protocol) server that utilizes the Google Gemini Vision API to interact with YouTube videos. It allows users to get descriptions, summaries, answers to questions, and extract key moments from YouTube videos.
    Last updated -
    4
    141
    JavaScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI assistants to generate images, text, and audio through the Pollinations APIs without requiring authentication.
    Last updated -
    7
    325
    4
    JavaScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides comprehensive document processing, including reading, converting, and manipulating various document formats with advanced text and HTML processing capabilities.
    Last updated -
    16
    46
    11
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    MCP server for seamless document format conversion using Pandoc, supporting Markdown, HTML, PDF, DOCX (.docx), CSV, and more.
    Last updated -
    1
    104
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    A server that enables generating videos from static images using Vidu's AI models, with features for image-to-video conversion, task monitoring, and image uploading.
    Last updated -
    3
    1
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A TypeScript-based Model Context Protocol (MCP) server enabling integration with PiAPI for media content generation using platforms like Midjourney, Flux, and others through MCP-compatible applications.
    Last updated -
    1
    22
    TypeScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    This server provides tools for uploading images and videos directly to Cloudinary using Claude/Cline, facilitating resource management with customizable options like resource type and public ID.
    Last updated -
    1
    71
    4
    JavaScript
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides tools for image, audio, and video recognition using Google's Gemini AI through the Model Context Protocol.
    Last updated -
    3
    6
    TypeScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Facilitates the creation of DecentSampler drum kit configurations, supporting WAV file analysis and XML generation to ensure accurate sample lengths and well-structured presets.
    Last updated -
    5
    93
    1
    TypeScript
    MIT License
    • Apple
  • A
    security
    F
    license
    A
    quality
    A Model Context Protocol server that converts PDF documents into PNG images through a simple MCP tool call.
    Last updated -
    1
    2
    Python
    • Apple
    • Linux
  • A
    security
    F
    license
    A
    quality
    An MCP server designed to work with FFmpeg for media processing tasks, offering enhanced performance and secure communication for handling media processing requests.
    Last updated -
    2
    4
    TypeScript
  • A
    security
    F
    license
    A
    quality
    A lightweight MCP service that enables programmatic downloading of Instagram videos to a specified local path with progress tracking.
    Last updated -
    1
    JavaScript
  • A
    security
    F
    license
    A
    quality
    A Node.js server that provides advanced video and image processing capabilities through the Model Context Protocol, enabling operations like conversion, compression, editing, and effects application.
    Last updated -
    10
    13
    JavaScript
    • Apple
    • Linux
  • A
    security
    F
    license
    A
    quality
    Drawing Tool for AI Assistants
    Last updated -
    4
    1
    JavaScript
  • -
    security
    A
    license
    -
    quality
    A Model Context Protocol (MCP) server built on Qiniu Cloud products that lets users access Qiniu Cloud Storage, intelligent multimedia services, and more from within AI large-model clients.
    Last updated -
    9
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A Model Context Protocol server that enables fast and free lipsync video creation for a wide range of digital avatars, supporting both audio and text inputs to generate synchronized lip movements.
    Last updated -
    3
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Connects 'yt-dlp' with LLMs via the Model Context Protocol, allowing users to download YouTube content and integrate it with Dive and other MCP-compatible LLMs.
    Last updated -
    2
    125
    16
    TypeScript
    MIT License
    • Apple
    • Linux
  • -
    security
    A
    license
    -
    quality
    Generates animations in the style of 3Blue1Brown from a single prompt.
    Last updated -
    30
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A Model Context Protocol (MCP) server for searching and retrieving Lottie animations from LottieFiles.
    Last updated -
    12
    TypeScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    An MCP server that integrates Shaka Packager with Claude AI applications, enabling Claude to analyze, transcode, and package video files for streaming in formats like HLS and DASH.
    Last updated -
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    A server that enables LLM applications to interact directly with DaVinci Resolve video editing software, allowing AI-assisted capabilities like accessing timeline information and automating editing workflows.
    Last updated -
    72
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A server that integrates ComfyUI with MCP, allowing users to generate images and download them through natural language interactions.
    Last updated -
    1
    Python
    Apache 2.0
  • -
    security
    A
    license
    -
    quality
    A Model Context Protocol server that enables fetching and processing images from URLs, local file paths, and numpy arrays, returning them as base64-encoded strings with proper MIME types.
    Last updated -
    1
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    AI-powered assistant that connects Claude to video encoding workflows, translating cryptic errors into plain English and providing actionable solutions for troubleshooting encoding jobs.
    Last updated -
    1
    Python
    MIT License
  • -
    security
    A
    license
    -
    quality
    A FastMCP-powered server for programmatically creating, editing, and rendering PowerPoint (PPTX) presentations with features for slide creation, content insertion, and PNG rendering.
    Last updated -
    7
    Python
    Apache 2.0
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A service that extracts and transcribes audio content from videos across 1000+ streaming websites including YouTube, Bilibili, TikTok, and Twitter, supporting multiple transcription providers like Deepgram, Gladia, Speechmatics, and AssemblyAI.
    Last updated -
    5
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    A FastAPI-based MCP server that integrates with smartscreen.tv, allowing you to programmatically control web displays by displaying media, sending notifications, and controlling playback via HTTP commands.
    Last updated -
    Python
    MIT License
    • Apple
    • Linux
  • -
    security
    A
    license
    -
    quality
    A server that provides tools to control OBS Studio remotely via the OBS WebSocket protocol, enabling management of scenes, sources, streaming, and recording through an MCP client interface.
    Last updated -
    2
    TypeScript
    GPL 2.0
  • -
    security
    A
    license
    -
    quality
    Uses yt-dlp to download subtitles from YouTube and connects it to claude.ai via Model Context Protocol.
    Last updated -
    1
    868
    201
    JavaScript
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    A Model Context Protocol server that enables AI assistants like Claude to generate lyrics, songs, and background music through Mureka's APIs.
    Last updated -
    4
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    Connects Cinema 4D to Claude, enabling AI-assisted 3D modeling and scene manipulation through natural language commands.
    Last updated -
    10
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    An MCP tool server that enables generating and editing images through OpenAI's image models, supporting text-to-image generation and advanced image editing (inpainting, outpainting) across various MCP-compatible clients.
    Last updated -
    11
    TypeScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    Provides image generation capabilities using the Flux Schnell model on Replicate, allowing users to create images from text prompts.
    Last updated -
    1
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    Connects Ableton Live to AI assistants through Model Context Protocol (MCP), enabling natural language control of music production tasks like track creation, MIDI editing, instrument loading, and playback control.
    Last updated -
    18
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    MCP server for ShaderToy, a site where people share GLSL shaders. This MCP server allows LLMs to create complex shaders they aren't normally capable of producing.
    Last updated -
    17
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    Windows integration MCP server that enables Claude to interact with Windows system features including media playback control, notification management, window operations, screenshots, monitor control, theme settings, file opening, and clipboard access.
    Last updated -
    2
    Python
    MIT License
  • -
    security
    A
    license
    -
    quality
    A server that provides AI-powered image generation, modification, and processing capabilities through the Model Context Protocol, leveraging Google Gemini models and other image services.
    Last updated -
    6
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    -
    license
    -
    quality
    Enables Claude Desktop and Agents to generate AI avatars and videos through the HeyGen API, providing tools to create and manage avatar videos with specified text and voice options.
    Last updated -
    1
    Python
  • -
    security
    F
    license
    -
    quality
    An MCP server that downloads videos or extracts audio from platforms such as YouTube, Bilibili, and TikTok, then transcribes the audio to text using OpenAI's Whisper model.
    Last updated -
    Python
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    Enables AI applications to integrate with YouTube-Summarizer's APIs through the MCP protocol, offering local tool-based interaction for summarizing YouTube content.
    Last updated -
    1
    Python
  • -
    security
    F
    license
    -
    quality
    Enables extraction of transcript text from YouTube videos by providing the video URL, supporting standard, shortened, and embed URL formats.
    Last updated -
    1
    JavaScript
  • -
    security
    -
    license
    -
    quality
    A Model Context Protocol server that provides a convenient interface for creating lipsynced videos by matching digital avatar videos with audio inputs.
    Last updated -
    1
    Python
  • -
    security
    F
    license
    -
    quality
    A Model Context Protocol server that enables AI agents to create fully mixed and mastered tracks in REAPER DAW, supporting project management, MIDI composition, audio recording, and mixing automation.
    Last updated -
    37
    Python
    • Apple
  • -
    security
    F
    license
    -
    quality
    A FastMCP server that creates a virtual MIDI output port, allowing LLMs to generate and send MIDI data to any software that accepts MIDI input.
    Last updated -
    1
    Python
  • -
    security
    F
    license
    -
    quality
    A MIDI composition system that enables AI assistants to create music through FluidSynth, with capabilities for playing notes, creating melodies, managing tracks, and exporting audio.
    Last updated -
    Python
  • -
    security
    F
    license
    -
    quality
    A PDF processing server that extracts text via normal parsing or OCR, and retrieves images from PDF files through the MCP protocol with a built-in web debugger.
    Last updated -
    6
    Python
  • -
    security
    F
    license
    -
    quality
    A server that exposes Luma AI's video generation API through the Model Context Protocol (MCP).
    Last updated -
    2
    TypeScript
  • -
    security
    F
    license
    -
    quality
    Provides tools to interact with RunwayML and Luma AI APIs for video and image generation, including text-to-video, image-to-video, prompt enhancement, and management of generations.
    Last updated -
    1
    TypeScript
  • -
    security
    F
    license
    -
    quality
    A Goose MCP extension providing voice interaction with modern audio visualization, allowing users to communicate with Goose through speech rather than text.
    Last updated -
    26
    Python
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    Simple MCP server that returns the transcription of a YouTube video, given its URL and the desired language.
    Last updated -
    Python
  • -
    security
    F
    license
    -
    quality
    Generates and returns an image using Together.ai.
    Last updated -
    3
    TypeScript
    • Linux
    • Apple
  • -
    security
    -
    license
    -
    quality
    A TypeScript-based server that converts static images into animated videos with Ghibli-style aesthetics, accessible through Claude Desktop.
    Last updated -
    1
    TypeScript
    MIT License
  • -
    security
    F
    license
    -
    quality
    Upload, edit, and generate videos from everyone's favorite LLM and Video Jungle.
    Last updated -
    126
    Python
    • Apple
  • -
    security
    F
    license
    -
    quality
    A Model Context Protocol server that converts SVG code to PNG images, offering two conversion methods (CairoSVG and Inkscape) with support for custom working directories.
    Last updated -
    Python
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    A Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model, offering multiple voice options and customizable speech parameters.
    Last updated -
    239
    JavaScript
    • Apple
    • Linux
  • -
    security
    F
    license
    -
    quality
    Enables video editing using natural language commands powered by FFmpeg, supporting operations like trimming, merging, format conversion, and more with real-time progress tracking and error handling.
    Last updated -
    17
    Python
    • Apple
    • Linux
  • -
    security
    F
    license
    -
    quality
    A server for downloading, processing, and managing YouTube content with features like video quality selection, format conversion, and metadata extraction.
    Last updated -
    JavaScript
  • -
    security
    F
    license
    -
    quality
    Allows AI assistants like Claude to directly interact with and control DaVinci Resolve through the Model Context Protocol, providing capabilities for project management, timeline manipulation, media management, and Fusion integration.
    Last updated -
    5
    Python
    • Apple
    • Linux
  • -
    security
    -
    license
    -
    quality
    Enables interaction with YouTube videos by extracting metadata, captions in multiple languages, and converting content to markdown with various templates.
    Last updated -
    TypeScript
  • -
    security
    F
    license
    -
    quality
    Automatically captures and processes screenshots from YouTube videos and Shorts at specified intervals, supporting customizable screenshot timing and providing API endpoints for image management.
    Last updated -
    • Apple
  • -
    security
    F
    license
    -
    quality
    A Model Context Protocol server that enables AI assistants to extract transcripts from YouTube videos, allowing AI to analyze and work with video content directly.
    Last updated -
    6
    1
    TypeScript
  • -
    security
    -
    license
    -
    quality
    vedit-mcp
    Last updated -
    Python
    MIT License
  • -
    security
    -
    license
    -
    quality
    A lightweight server that exposes FFmpeg's video processing capabilities to AI assistants through the Model Context Protocol (MCP), supporting operations like video format conversion, audio extraction, and adding watermarks.
    Last updated -
    9
    TypeScript
    MIT License
  • -
    security
    F
    license
    -
    quality
    A Node.js server that enables video manipulation through natural language requests, including resizing videos to different resolutions (360p to 1080p) and extracting audio in various formats (MP3, AAC, WAV, OGG).
    Last updated -
    34
    2
    TypeScript
    • Apple
    • Linux
  • -
    security
    -
    license
    -
    quality
    A Model Context Protocol server that enables AI assistants like Claude to interact with DaVinci Resolve Studio, providing advanced control over editing, color grading, audio, and other video production tasks.
    Last updated -
    Python
  • -
    security
    F
    license
    -
    quality
    Creates videos and images using Luma AI. This MCP server handles all API functionality for Luma Dream Machine from Claude Desktop.
    Last updated -
    Python
    • Apple
  • -
    security
    -
    license
    -
    quality
    Model Context Protocol server that enables generating videos from text prompts and/or images using AI models (Luma Ray2 Flash and Kling v1.6 Pro) with configurable parameters like aspect ratio, resolution, and duration.
    Last updated -
    1
    JavaScript
    MIT License