Multimedia Processing
Provides the ability to handle multimedia, such as audio and video editing, playback, and format conversion, as well as video filters, enhancements, and more.
MCP Servers
- A Model Context Protocol server that enables retrieval of transcripts from YouTube videos. This server provides direct access to video transcripts and subtitles through a simple interface, making it ideal for content analysis and processing. (license A, quality A, maintenance D; MIT)
- All Voice Lab MCP Server (license A, quality A, maintenance C; MIT)
- MCP server for video enhancement and SAM3 image segmentation, enabling tasks like upscaling videos and segmenting objects in images via natural language. (license A, quality B, maintenance C; MIT)
- MCP server that fetches YouTube video transcripts and optionally summarizes them. Supports multiple transcript formats (text, JSON, SRT, WebVTT), multi-language retrieval, and flexible YouTube URL parsing. (license A, quality A, maintenance B; MIT)

- Jsoncut MCP Server (official): Enables AI agents to generate JSON configurations for creating images and videos programmatically through the Jsoncut API, with support for layers, positioning, transitions, and validation. (license A, quality A, maintenance F; MIT)
- Convert HTML to PDF/PNG/WebP/PPTX slide carousels with 11 themes. For LinkedIn carousels, decks, Instagram posts, and infographics, with Puppeteer-based pixel-perfect rendering. (license A, quality A, maintenance B; MIT)

- Runway API MCP Server (official): Enables AI video and image generation through the Runway API. Supports video generation from images and text prompts, image creation, video upscaling and editing, and task management. (license A, quality A, maintenance C; MIT)
- Qiniu MCP Server (official): An MCP server built on Qiniu Cloud products that lets users access Qiniu Cloud Storage, intelligent multimedia services, and more from AI large-model clients. (license A, quality B, maintenance A; MIT)

- MiniMax MCP JS (official): JavaScript implementation of MiniMax MCP that enables interaction with MiniMax AI services for image generation, video generation, text-to-speech, and voice cloning through MCP-compatible clients. (license A, quality A, maintenance D; MIT)
- Cosmic MCP Server (official): An MCP server that enables AI assistants to manage content, media, and schemas within Cosmic CMS buckets. It allows users to perform CRUD operations on objects and types while providing tools for AI-driven text, image, and video generation. (license A, quality A, maintenance C; MIT)
- A security-hardened MCP server for generating and editing images using Google Gemini models. It provides tools for text-to-image creation and iterative image editing with strict input validation and secure file handling. (license A, quality A, maintenance C; MIT)

- MMAudio MCP (official): Enables AI-powered video-to-audio and text-to-audio generation using MMAudio's API. Create synchronized audio from video content or generate audio from text descriptions with configurable parameters. (license A, quality B, maintenance B; MIT)
- MCP server for ComfyUI: text-to-image, variations, img2img refine, upscale, image proxy, and workflow runner. (license A, quality A, maintenance B; MIT)
- Provides programmatic access to Baidu's Xiling Digital Human platform, enabling AI assistants to generate digital human videos, clone voices, and create synthesized speech through 13 standardized MCP protocol interfaces. (license A, quality B, maintenance C; MIT)
- Media Execution Control Layer for AI Agents. Uses a reserve-execute-burn/refund pattern and FFmpeg post-processing (format conversion, audio normalization). Supports Flux2 Pro, Veo 3.1, and Suno V5. (license A, quality B, maintenance C; MIT)
- Enables AI-powered image generation using the Ideogram V3 Balanced model via Replicate. Supports text-to-image generation, inpainting, style transfer with 60+ presets, custom resolutions, and reproducible outputs with local image storage. (license A, quality B, maintenance not rated)
- A Model Context Protocol server that enables AI assistants to generate images, text, and audio through the Pollinations APIs without requiring authentication. (license A, quality A, maintenance B; MIT)
- Enables natural language control of Blackmagic ATEM video switchers via the Model Context Protocol. It allows users to manage camera switching, transitions, audio mixing, macros, and streaming operations through AI assistants. (license A, quality A, maintenance D; MIT)
- MCP (Model Context Protocol) server that utilizes the Google Gemini Vision API to interact with YouTube videos. It allows users to get descriptions, summaries, answers to questions, and extract key moments from YouTube videos. (license A, quality B, maintenance C; MIT)
- Enables batch audio processing and optimization using FFmpeg with preset configurations for game audio, voice processing, and music mastering, including specialized optimization for ElevenLabs AI voice output. (license A, quality B, maintenance C; MIT)
- Exposes Google Gemini's Nano Banana image generation models to Claude, enabling text-to-image generation, image editing, and multi-image composition through natural language prompts. (license A, quality A, maintenance C; MIT)
- Enables conversational image generation and editing with Google's Gemini 2.5 Flash Image Preview. Supports text-to-image generation, natural language image editing, multi-image composition, and style transfer with optional file saving. (license A, quality A, maintenance C; MIT)
- Enables PTZ camera control with gimbal positioning, snapshots, and AI visual analysis for OBSBOT and UVC cameras. Supports autonomous scanning patterns and integrates with vision-language models for real-time camera analysis. (license A, quality B, maintenance C; MIT)
- Enables image generation and editing using Google Gemini AI with support for multiple aspect ratios, context images, custom styles, and watermark overlays. Optimized for creating social media content with automatic file saving and flexible output configuration. (license A, quality A, maintenance C; MIT)
- Converts various file types (PDF, images, audio, DOCX, XLSX, PPTX) and web content (YouTube videos, web pages, Bing search results) into Markdown format for easy reading and sharing. (license A, quality A, maintenance C; MIT)
- An MCP server that automatically renames local subtitle files to match corresponding videos using statistical token matching and episode verification. It enables users to scan media directories, preview matches, and manage subtitle configurations through natural language commands. (license A, quality B, maintenance not rated)
- Agent-native media processing: video encoding, image manipulation, document conversion, audio transcription, and more via 86+ cloud Robots. (license A, quality B, maintenance B)
- Enables conversion between multiple image formats including JPG, PNG, WebP, GIF, BMP, TIFF, SVG, ICO, and AVIF with quality control and batch processing capabilities. (license A, quality A, maintenance C; MIT)
MCP Connectors
Background removal, 4x upscaling, and face restoration via GPU
AI audio tools for music producers — stem splitting, vocal removal, BPM & key detection, audio-to-MIDI, format conversion, trimming, video-to-audio extraction and AI song generation.
The first artist-owned MCP server. Discover, narrate, and stream Matthew Hartley's debut album The Time Is Now from any compatible AI client. Exposes 8 tools (list_songs, get_song, list_chapters, get_chapter, get_artist, get_experience, get_experience_prompt, refresh_stream_urls) over a public HTTP endpoint with no auth. Apache 2.0 licensed.
Protect and verify digital content with cryptographic signing and proof of ownership.
Composable APIs for document extraction, image transformation, and document & sheet generation.
Media intelligence analysis for audio, video, and images via the Echosaw MCP server.
AI-powered image processing via GPU. Remove backgrounds and upscale images (2x/4x) directly from any MCP client. OAuth 2.1 authenticated, returns processed images inline with download links. Free credits on signup at maskr.io.
Quiz.Video MCP: list, create, AI-generate, and render quiz and flashcard videos.
Create and track AI music videos and audio-reactive visuals from songs.
Transform and optimize images by resizing, compressing, and converting across multiple formats. Streamline complex editing workflows using a multi-step pipeline for efficient sequential processing.
Capture photos remotely from mobile devices via S3-backed upload URLs.
MCP server for meme generation, template search, caption rendering, and AI meme creation.
Image processing for AI agents. Resize, convert, compress, and pipeline images.
Hosted MCP server for meme generation, meme template search, caption rendering, and AI meme creation.
Imgflip MCP — wraps Imgflip API (free, no auth for template listing)
Focused MCP server for OpenAI image/audio generation (v2.0.0). Wraps endpoints via HAPI CLI.
125+ browser tools for PDF, Image, Video, Audio, AI, Scanner. Files never leave your device.
84+ free local-first tools: image, PDF, docs, dev utils. Wasm, zero upload, x402 API.
Analyze images and videos with Gemini to get fast, reliable visual insights. Handle content from U…
Search your Flashback video library with natural language to instantly find relevant moments. Get…