Enables downloading videos from platforms like YouTube and converting them to text using OpenAI Whisper and ffmpeg. It supports multiple output formats including TXT, JSON, SRT, and VTT for transcriptions.
Enables AI image generation using Doubao Seedream models and video generation using Doubao Seedance models through Volcano Engine's API, supporting text-to-image, image-to-image, text-to-video, and task status queries.
Enables text-to-image generation using Zhipu AI's CogView-4 API. Supports generating images from text prompts with configurable size and quality parameters through MCP-compatible clients like Claude Desktop and Cline.
Provides powerful video and audio editing capabilities through FFmpeg, enabling AI assistants to perform professional-grade operations including format conversion, trimming, overlays, transitions, and advanced audio processing.
An MCP server that downloads videos/extracts audio from various platforms like YouTube, Bilibili, and TikTok, then transcribes them to text using OpenAI's Whisper model.
Provides image generation capabilities for Claude using the Replicate Flux model, allowing users to create images from text prompts with customizable parameters like aspect ratio and output format.