Integration with ElevenLabs for generating high-quality AI voice narration and text-to-speech for video content.
Provides specialized export and formatting tools to convert video content into Instagram Reels.
Uses OpenAI's APIs for text-to-speech narration and automated video captioning through Whisper.
Enables one-click conversion and formatting of video content for TikTok.
Supports exporting and optimizing video content for YouTube Shorts.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@mcp-videoRecord a cinematic scroll of example.com and add auto-captions."
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
mcp-video
Cinema-grade video production MCP server. 8 tools for recording, editing, effects, captions, TTS, and smart screenshots.
Built on ffmpeg and Playwright. Works with any MCP client (Claude Code, Claude Desktop, custom agents).
Features
Tool | Operations | Description |
| cinema, scroll, multi-device | Record websites at 60fps with frame-by-frame capture |
| speed, crop, reverse, keyframe, pip | Edit clips with zoom/pan, PiP, slow-mo |
| grade, effect, lut, chroma | Color grading, 22 LUT presets, green screen |
| extract, music, ducking, mix, voice | Audio extraction, mixing, 9 voice effects |
| subtitles, caption, overlay, animate | Burn SRT, Whisper auto-caption, 15 text animations |
| concat, intro, social, beat-sync, templates | Join clips, social format conversion, beat sync |
| generate, voices, narrated | ElevenLabs/OpenAI TTS, full narrated videos |
| capture, detect | Element-aware screenshots, page feature detection |
Highlights
60fps frame-by-frame capture — Playwright screenshots every frame, ffmpeg encodes. Zero frame drops.
Cinema easing curves — 16 easing options including
cinematicandshowcasefor buttery smooth scrolling.Smart screenshots — Auto-detects 15+ UI elements (chat widgets, pricing sections, booking forms, etc.).
Narrated videos — Provide a URL + script, get a professional video with synchronized AI voiceover.
22 LUT presets — Film-grade color grading (teal-orange, noir, vintage, cyberpunk, etc.).
Social format export — One-click conversion to Instagram Reel, TikTok, YouTube Short, LinkedIn.
Dual transport — Stdio (default) or HTTP mode for persistent microservice deployment.
Prerequisites
Node.js >= 18
ffmpeg and ffprobe (validated on startup)
Playwright browsers (
npx playwright install chromium)Optional:
ELEVENLABS_API_KEYfor ElevenLabs TTSOptional:
OPENAI_API_KEYfor Whisper captions and OpenAI TTS
Quick Start
With Claude Code (stdio)
{
"mcpServers": {
"video": {
"command": "npx",
"args": ["-y", "mcp-video"]
}
}
}With npx
npx mcp-videoFrom source
git clone https://github.com/studiomeyer-io/mcp-video.git
cd mcp-video
npm install
npx playwright install chromium
npm run build
npm startHTTP mode
# Start as HTTP microservice
npx mcp-video --http --port=9847
# Or via environment variables
MCP_HTTP=1 MCP_PORT=9847 npx mcp-videoConfiguration
Environment Variable | Default | Description |
|
| Directory for generated files |
| — | ElevenLabs TTS API key |
| — | OpenAI API key (Whisper + TTS) |
|
| Enable HTTP transport |
|
| HTTP port |
|
| HTTP bind address |
|
| Enable debug logging |
Usage Examples
Record a website
Use video_record with type "cinema" to record https://example.com
with a smooth scroll and hover over the navbar.Create a narrated explainer video
Use video_speech with type "narrated" to create a narrated video of
https://example.com with these segments:
1. "Welcome to our homepage" — pause on hero section
2. "Check out our features" — scroll to features
3. "Get started today" — hover over CTA buttonAuto-caption a video
Use video_text with type "caption" to add auto-generated captions
to /path/to/video.mp4Export for social media
Use video_compose with type "social-all" to convert
/path/to/video.mp4 to all social media formats.Smart screenshot
Use video_screenshot with type "capture" to screenshot the chat widget
and pricing section on https://example.comArchitecture
src/
server.ts Entry point, 8 consolidated MCP tools
lib/ Logger, types, dual transport
handlers/ Tool handlers (video, editing, post-production, tts, screenshots)
schemas/ JSON Schema definitions for legacy tool format
tools/
engine/ Core engines
capture.ts Frame-by-frame recording (Playwright → PNG → ffmpeg)
encoder.ts ffmpeg encoding pipeline
scenes.ts Scene execution (scroll, hover, click, type, wait)
cursor.ts Visible cursor simulation
smart-screenshot.ts Element-aware screenshot engine
tts.ts ElevenLabs + OpenAI TTS with fallback
narrated-video.ts Full narration pipeline
social-format.ts Social media format conversion
concat.ts Video concatenation with transitions
lut-presets.ts 22 cinema LUT presets
...and moreDevelopment
npm run dev # Start with tsx (hot reload)
npm run typecheck # Type check
npm test # Run tests
npm run check # Verify ffmpeg/ffprobe installedLicense
MIT
Credits
Built by StudioMeyer. Part of our open-source toolkit for AI-powered content creation.
ai-shield — LLM security middleware for TypeScript
agent-fleet — Multi-agent orchestration for Claude Code
darwin-agents — Self-evolving agent framework