Multimedia Processing
Tools for handling multimedia: audio and video editing, playback, and format conversion, as well as video filters and enhancements.
MCP Servers
- JavaScript implementation of MiniMax MCP that enables interaction with MiniMax AI services for image generation, video generation, text-to-speech, and voice cloning through MCP-compatible clients. (license: A, quality: A, maintenance: F; MIT)
- One MCP server for music, image, video, and audio generation across Suno, Grok Imagine, Seedance, Kling, Hailuo, Wan, VEO, Ideogram, and GPT Image 2. Generate, edit, upscale, reframe, and master through one API key and one credit pool. (license: A, quality: A, maintenance: B; MIT)

Qiniu MCP Server (official)
A Model Context Protocol (MCP) server built on Qiniu Cloud products that lets users access Qiniu Cloud Storage, intelligent multimedia services, and more from AI large-model clients. (license: A, quality: B, maintenance: D; MIT)
- Remove vocals, extract instrumentals, and split any song into up to six stems, directly from Claude Desktop, Cursor, or any MCP client. Supports local audio files, YouTube URLs, and SoundCloud track (license: A, quality: A, maintenance: A; MIT)

Cosmic MCP Server (official)
An MCP server that enables AI assistants to manage content, media, and schemas within Cosmic CMS buckets. It allows users to perform CRUD operations on objects and types while providing tools for AI-driven text, image, and video generation. (license: A, quality: A, maintenance: C; MIT)
- MCP server for video enhancement and SAM3 image segmentation, enabling tasks like upscaling videos and segmenting objects in images via natural language. (license: A, quality: B, maintenance: B; MIT)
- Enables file conversion between 690+ formats (image, video, audio, document, data, font, ebook, archive) using an MCP server, with no API key or signup required. (license: A, quality: A, maintenance: B; MIT)
- Official MCP server for Rendobar. Lets AI agents run serverless media processing and upload local files. (license: A, quality: A, maintenance: B; MIT)
- MCP server for the Runway Aleph model line. Enables creating video-editing tasks, polling status, and checking pricing through a single RunAPI API key. (license: A, quality: A, maintenance: B; Apache 2.0)
- Render HTML/CSS/JS/SVG animations into high-quality MP4 videos. Exposes render tools to AI agents using Playwright (Chromium) and FFmpeg, running locally or remotely. (license: A, quality: A, maintenance: A; MIT)
- MCP server that fetches YouTube video transcripts and optionally summarizes them. Supports multiple transcript formats (text, JSON, SRT, WebVTT), multi-language retrieval, and flexible YouTube URL parsing. (license: A, quality: A, maintenance: A; MIT)
- Extracts YouTube video metadata, titles, and descriptions along with transcripts generated from subtitles or OpenAI Whisper speech-to-text. This server enables users to retrieve and analyze detailed video content directly within MCP-compatible environments. (license: A, quality: A, maintenance: D; MIT)

Urlbox MCP Server (official)
Enables users to capture high-quality website screenshots, generate PDFs, and convert web content to HTML or markdown via the Urlbox API. It supports advanced features like ad-blocking, cookie banner removal, and metadata extraction directly through natural language prompts. (license: A, quality: B, maintenance: D; MIT)
- Enables language models to generate fun ASCII art featuring cows and other characters saying or thinking custom messages. Provides access to various cow characters including dragons, penguins, and skeletons for creative text art generation. (license: A, quality: A, maintenance: D; MIT)

Jsoncut MCP Server (official)
Enables AI agents to generate JSON configurations for creating images and videos programmatically through the Jsoncut API, with support for layers, positioning, transitions, and validation. (license: A, quality: A, maintenance: F; MIT)
MMAudio MCP (official)
Enables AI-powered video-to-audio and text-to-audio generation using MMAudio's API. Create synchronized audio from video content or generate audio from text descriptions with configurable parameters. (license: A, quality: B, maintenance: C; MIT)
@avclabs.ai/enhance-mcp (official)
Enables video enhancement through MCP tools for creating tasks, querying status, and synchronous enhancement, supporting URL or local file inputs. (license: A, quality: A, maintenance: D; Apache 2.0)
vicsee-mcp-server (official)
Enables AI agents to generate, edit, and upscale videos and images using VicSee's API, with support for multiple models and asynchronous task polling. (license: A, quality: A, maintenance: B; MIT)
- All Voice Lab MCP Server (license: A, quality: A, maintenance: D; MIT)
- Enables creating and managing HappyHorse video generation tasks (edit, image-to-video, text-to-video) via RunAPI, with optional polling for completion and pricing lookup. (license: A, quality: A, maintenance: B; Apache 2.0)

Runway API MCP Server (official)
Enables AI video and image generation through the Runway API. Supports video generation from images and text prompts, image creation, video upscaling and editing, and task management. (license: A, quality: A, maintenance: D; MIT)
- Provides programmatic access to Baidu's Xiling Digital Human platform, enabling AI assistants to generate digital human videos, clone voices, and create synthesized speech through 13 standardized MCP interfaces. (license: A, quality: B, maintenance: F; MIT)

flo-plugin (official)
Integrates Flo's AI-powered media automation into Claude, enabling tasks like quality control, content moderation, delivery validation, and asset search via slash commands and MCP tools. (license: A, quality: B, maintenance: B; MIT)
gurupdf-mcp (official)
Convert, compress, merge, split, and OCR PDFs plus 100+ file formats (Word, Excel, images, ebooks, video) right inside your AI agent. Exposes 126 GuruPDF tools over MCP; works with Claude, Cursor, VS Code, Windsurf, or any MCP client. (license: A, quality: A, maintenance: A; MIT)
- An MCP server that enables AI coding agents to control FMOD Studio for audio import, event creation, and bank building via TCP scripting. (license: A, quality: B, maintenance: C; MIT)
- An AI agent that analyzes and transforms music projects in REAPER, explaining why a project sounds the way it does and reshaping it toward a desired style through natural language interaction. (license: A, quality: B, maintenance: C; MIT)
- Control the mpv media player through AI conversation. Play music and video and manage playlists, all via natural language. Works with opencode and any MCP-compatible AI tool. (license: A, quality: A, maintenance: C; MIT)
- Enables AI assistants to capture screenshots and read clipboard content from Windows applications while operating within a WSL environment. It supports monitor-specific or window-specific targeting and features intelligent image optimization for efficient data transfer. (license: A, quality: A, maintenance: D; MIT)
- ByteDance Seedream AI image generation and editing (style transfer, background change, virtual try-on) with multiple models, multi-resolution up to 4K, and streaming delivery. (license: A, quality: A, maintenance: B; MIT)
MCP Connectors
Media intelligence analysis for audio, video, and images via the Echosaw MCP server.
32 creative AI tools (18 free) for agents: generate, upscale, mockup, print, watermark.
Composable APIs for document extraction, image transformation, and document & sheet generation.
A one-stop creative pipeline for AI agents: generate, upscale, enrich, sign, store, mint. 24 paid MCP tools powered by Stable Diffusion, Imagen 3, ESRGAN, and Gemini — plus 53K+ museum artworks from Alexandria Aeternum. Three payment rails, volume discounts, and a free trial to start.
Social-video URL → transcript, frames & metadata across 6 platforms. MCP, OAuth, free tier.
Music studio: ABC notation composition and Strudel live coding with ext-apps UI.
Download YouTube, TikTok, Vimeo, SoundCloud and 6 more platforms from any MCP AI chatbot.
One tool surface for music, image, video, and audio generation across Suno, Grok Imagine, Seedance, Kling, Hailuo, Wan, VEO, Ideogram, and GPT Image 2. Generate, edit, upscale, reframe, and master through one credit pool. Connect in one click with OAuth, no API key required.
Process video, audio, images, and documents with 86+ cloud media processing robots.
Agent-first image hosting — upload images and get instant CDN URLs.
Render HTML, Markdown, or any URL to images or PDF, plus reader-mode extraction. MCP-native.
Find & cut horizontal and vertical video clips (Shorts/Reels), transcribe & summarize. Pay per job.
Screenshot & HTML-to-PDF rendering API for AI agents — capture any URL or raw HTML as PNG/JPEG/PDF via a managed Chromium fleet. Free tier.
Convert files between 690+ formats: image, video, audio, documents, ebooks, archives. Free, no auth.
Privacy-first audio intelligence: BPM, key, waveform. Audio never stored. Pay per second.
Generate images, GIFs, and PDFs from HTML, URLs, or templates — from your AI agent.
Video transcoding and document conversion for AI agents. Transcode to MP4 (H.264), WebM/VP9, ProRes 422, GIF, or MP3 audio. Convert PDFs, DOCX, PPTX, XLSX, HTML, Markdown, and images. Prepaid wallet with per-job billing — no FFmpeg, no storage, no infrastructure to manage.
Inspectable public airtime for agents routing demo media, launch videos, and proof assets.
AI audio tools for music producers — stem splitting, vocal removal, BPM & key detection, audio-to-MIDI, format conversion, trimming, video-to-audio extraction and AI song generation.
The first artist-owned MCP server. Discover, narrate, and stream Matthew Hartley's debut album The Time Is Now from any compatible AI client. Exposes 8 tools (list_songs, get_song, list_chapters, get_chapter, get_artist, get_experience, get_experience_prompt, refresh_stream_urls) over a public HTTP endpoint with no auth. Apache 2.0 licensed.