Multimedia Processing
Tools for handling multimedia: audio and video editing, playback, and format conversion, as well as video filters and enhancements.
MCP Servers
- MCP server that fetches YouTube video transcripts and optionally summarizes them. Supports multiple transcript formats (text, JSON, SRT, WebVTT), multi-language retrieval, and flexible YouTube URL parsing. (License: A, Quality: A, Maintenance: B; MIT)

Qiniu MCP Server (official)
A Model Context Protocol (MCP) server built on Qiniu Cloud products that lets users access Qiniu Cloud Storage, intelligent multimedia services, and more from within AI large model clients. (License: A, Quality: B, Maintenance: A; MIT)
- MCP server for video enhancement and SAM3 image segmentation, enabling tasks like upscaling videos and segmenting objects in images via natural language. (License: A, Quality: B, Maintenance: C; MIT)

Cosmic MCP Server (official)
An MCP server that enables AI assistants to manage content, media, and schemas within Cosmic CMS buckets. It allows users to perform CRUD operations on objects and types while providing tools for AI-driven text, image, and video generation. (License: A, Quality: A, Maintenance: C; MIT)
@avclabs.ai/enhance-mcp (official)
Enables video enhancement through MCP tools for creating tasks, querying status, and synchronous enhancement, supporting URL or local file inputs. (License: A, Quality: A, Maintenance: C; Apache 2.0)
- All Voice Lab MCP Server (License: A, Quality: A, Maintenance: C; MIT)
- MCP server for ComfyUI: text-to-image, variations, img2img refine, upscale, image proxy, and workflow runner. (License: A, Quality: A, Maintenance: B; MIT)
- A security-hardened MCP server for generating and editing images using Google Gemini models. It provides tools for text-to-image creation and iterative image editing with strict input validation and secure file handling. (License: A, Quality: A, Maintenance: C; MIT)
- Enables AI assistants to download Instagram content including posts, videos, reels, stories, highlights, and profile pictures using Instaloader, with optional metadata and caption extraction. (License: A, Quality: A, Maintenance: C; MIT)
- A Model Context Protocol server that enables retrieval of transcripts from YouTube videos. This server provides direct access to video transcripts and subtitles through a simple interface, making it ideal for content analysis and processing. (License: A, Quality: A, Maintenance: F; MIT)

MiniMax MCP JS (official)
JavaScript implementation of MiniMax MCP that enables interaction with MiniMax AI services for image generation, video generation, text-to-speech, and voice cloning through MCP-compatible clients. (License: A, Quality: A, Maintenance: D; MIT)
Jsoncut MCP Server (official)
Enables AI agents to generate JSON configurations for creating images and videos programmatically through the Jsoncut API, with support for layers, positioning, transitions, and validation. (License: A, Quality: A, Maintenance: F; MIT)
- Provides programmatic access to Baidu's Xiling Digital Human platform, enabling AI assistants to generate digital human videos, clone voices, and create synthesized speech through 13 standardized MCP protocol interfaces. (License: A, Quality: B, Maintenance: C; MIT)

Runway API MCP Server (official)
Enables AI video and image generation through the Runway API. Supports video generation from images and text prompts, image creation, video upscaling and editing, and task management. (License: A, Quality: A, Maintenance: C; MIT)
flo-plugin (official)
Integrates Flo's AI-powered media automation into Claude, enabling tasks like quality control, content moderation, delivery validation, and asset search via slash commands and MCP tools. (License: A, Quality: B, Maintenance: C; MIT)
- Convert HTML to PDF/PNG/WebP/PPTX slide carousels with 11 themes. For LinkedIn carousels, decks, Instagram posts, and infographics, with Puppeteer-based pixel-perfect rendering. (License: A, Quality: A, Maintenance: B; MIT)

MMAudio MCP (official)
Enables AI-powered video-to-audio and text-to-audio generation using MMAudio's API. Create synchronized audio from video content or generate audio from text descriptions with configurable parameters. (License: A, Quality: B, Maintenance: B; MIT)
- Enables AI-powered image generation using the Ideogram V3 Balanced model via Replicate. Supports text-to-image generation, inpainting, style transfer with 60+ presets, custom resolutions, and reproducible outputs with local image storage. (License: A, Quality: B, Maintenance: -)
- Parse any file or URL into structured text. Extract text from PDF, DOCX, YouTube, web pages, images, and 25+ formats via one API. Tools: parse_url, parse_file, get_youtube_transcript. (License: A, Quality: B, Maintenance: B; MIT)
- Media Execution Control Layer for AI Agents. Reserve-execute-burn/refund pattern. FFmpeg post-processing (format conversion, audio normalization). Supports Flux2 Pro, Veo 3.1, Suno V5. (License: A, Quality: B, Maintenance: C; MIT)
- BlenderMCP enables Claude AI to directly interact with and control Blender for prompt-assisted 3D modeling, scene creation, and manipulation. It supports object and material control, scene inspection, and execution of Python code through natural language commands. (License: A, Quality: B, Maintenance: C; MIT)
- Enables natural language control of Blackmagic ATEM video switchers via the Model Context Protocol. It allows users to manage camera switching, transitions, audio mixing, macros, and streaming operations through AI assistants. (License: A, Quality: A, Maintenance: D; MIT)
- MCP (Model Context Protocol) server that utilizes the Google Gemini Vision API to interact with YouTube videos. It allows users to get descriptions, summaries, answers to questions, and extract key moments from YouTube videos. (License: A, Quality: B, Maintenance: C; MIT)
- A Model Context Protocol server that enables AI assistants like Claude to use Bouyomichan (a Japanese text-to-speech program) for voice reading with adjustable voice types, volume, speed, and pitch. (License: A, Quality: B, Maintenance: C; MIT)
- Enables PTZ camera control with gimbal positioning, snapshots, and AI visual analysis for OBSBOT and UVC cameras. Supports autonomous scanning patterns and integrates with vision-language models for real-time camera analysis. (License: A, Quality: B, Maintenance: C; MIT)
- An MCP server that automatically renames local subtitle files to match corresponding videos using statistical token matching and episode verification. It enables users to scan media directories, preview matches, and manage subtitle configurations through natural language commands. (License: A, Quality: B, Maintenance: -)
- Facilitates the creation of DecentSampler drum kit configurations, supporting WAV file analysis and XML generation to ensure accurate sample lengths and well-structured presets. (License: A, Quality: A, Maintenance: C; MIT)
- Enables batch audio processing and optimization using FFmpeg with preset configurations for game audio, voice processing, and music mastering, including specialized optimization for ElevenLabs AI voice output. (License: A, Quality: B, Maintenance: C; MIT)
MCP Connectors
Background removal, 4x upscaling, and face restoration via GPU
AI audio tools for music producers — stem splitting, vocal removal, BPM & key detection, audio-to-MIDI, format conversion, trimming, video-to-audio extraction and AI song generation.
The first artist-owned MCP server. Discover, narrate, and stream Matthew Hartley's debut album The Time Is Now from any compatible AI client. Exposes 8 tools (list_songs, get_song, list_chapters, get_chapter, get_artist, get_experience, get_experience_prompt, refresh_stream_urls) over a public HTTP endpoint with no auth. Apache 2.0 licensed.
Composable APIs for document extraction, image transformation, and document & sheet generation.
Protect and verify digital content with cryptographic signing and proof of ownership.
Media intelligence analysis for audio, video, and images via the Echosaw MCP server.
AI-powered image processing via GPU. Remove backgrounds and upscale images (2x/4x) directly from any MCP client. OAuth 2.1 authenticated, returns processed images inline with download links. Free credits on signup at maskr.io.
Quiz.Video MCP: list, create, AI-generate, and render quiz and flashcard videos.
Create and track AI music videos and audio-reactive visuals from songs.
Transform and optimize images by resizing, compressing, and converting across multiple formats. Streamline complex editing workflows using a multi-step pipeline for efficient sequential processing.
Capture photos remotely from mobile devices via S3-backed upload URLs
MCP server for meme generation, template search, caption rendering, and AI meme creation.
Image processing for AI agents. Resize, convert, compress, and pipeline images.
Hosted MCP server for meme generation, meme template search, caption rendering, and AI meme creation.
Imgflip MCP — wraps Imgflip API (free, no auth for template listing)
Focused MCP server for OpenAI image/audio generation (v2.0.0). Wraps endpoints via HAPI CLI.
125+ browser tools for PDF, Image, Video, Audio, AI, Scanner. Files never leave your device.
84+ free local-first tools: image, PDF, docs, dev utils. Wasm, zero upload, x402 API.
Analyze images and videos with Gemini to get fast, reliable visual insights. Handle content from U…
Search your Flashback video library with natural language to instantly find relevant moments. Get…