gemini-video-mcp-server
Provides video understanding capabilities by leveraging Google's Gemini AI, enabling analysis of video content such as security footage, lectures, tutorials, and meeting recordings with support for long videos up to 6 hours.
Gemini Video Understanding MCP Server
Give your AI coding assistant the power to understand videos!
An MCP (Model Context Protocol) server that enables Claude Code, Cursor, and other AI coding assistants to analyze and understand video content using Google's Gemini AI. Process security footage, lecture recordings, tutorials, and more - directly from your terminal.
Why This Exists
AI coding assistants like Claude Code and Cursor are incredibly powerful, but they can't natively understand video content. This MCP server bridges that gap by:
Using Gemini 3 Flash, the latest model, with a 1M-token context window and 3x faster processing (up to 6 hours of video!)
Providing a standardized MCP interface that works with any MCP-compatible client
Offering smart time estimation so you know what you're getting into before processing
Supporting segment analysis for efficient processing of long videos
Features
Feature | Description |
Long Video Support | Analyze videos up to 6 hours using Gemini 3 Flash's 1M token context |
Smart Estimation | Get accurate time/cost estimates before processing |
Segment Analysis | Analyze specific time ranges for faster results |
Multiple Modes | Summary, detailed analysis, transcript, or timeline |
Q&A Capability | Ask specific questions about video content |
User Prompts | Confirms before processing long videos |
Use Cases
Security & Surveillance
"Analyze this security footage and tell me if anyone approaches the car between 2am and 4am"
"What time does the person in the dark hoodie appear in this footage?"
"Summarize all activity in this 6-hour security recording"
Education & Learning
"Transcribe this 2-hour lecture on machine learning"
"Create a timeline of topics covered in this computer science class"
"What does the professor say about recursion? Include timestamps"
Code Tutorials & Demos
"What VS Code extensions does the instructor install in this tutorial?"
"At what timestamp does the presenter start explaining the API integration?"
"Summarize the debugging techniques shown in this video"
Meeting Recordings
"What action items were discussed in this meeting?"
"Summarize the key decisions made in this product review"
"Who presented the sales figures and what were the highlights?"
Content Analysis
"What products are shown in this unboxing video?"
"Describe the UI/UX changes demonstrated in this app walkthrough"
"What error messages appear in this bug report screen recording?"
Installation
Prerequisites
Python 3.10+
Google AI Studio API Key (free): https://aistudio.google.com/apikey
ffmpeg (optional, for segment analysis):
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows
choco install ffmpeg
Quick Install
# Clone the repository
git clone https://github.com/yourusername/gemini-video-mcp-server.git
cd gemini-video-mcp-server
# Create virtual environment and install
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -e .
Using uv (recommended)
git clone https://github.com/yourusername/gemini-video-mcp-server.git
cd gemini-video-mcp-server
uv venv && source .venv/bin/activate
uv pip install -e .
Configuration
For Claude Code
Add to your Claude Code MCP settings (~/.claude/claude_desktop_config.json or via settings):
{
  "mcpServers": {
    "gemini-video": {
      "command": "python",
      "args": ["/absolute/path/to/gemini-video-mcp-server/server.py"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
For Cursor
Add to your Cursor MCP configuration (Settings > MCP Servers):
{
  "mcpServers": {
    "gemini-video": {
      "command": "python",
      "args": ["/absolute/path/to/gemini-video-mcp-server/server.py"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
For Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "gemini-video": {
      "command": "python",
      "args": ["/absolute/path/to/gemini-video-mcp-server/server.py"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Using with uv (all platforms)
{
  "mcpServers": {
    "gemini-video": {
      "command": "uv",
      "args": ["run", "--directory", "/absolute/path/to/gemini-video-mcp-server", "python", "server.py"],
      "env": {
        "GEMINI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Available Tools
estimate_video_analysis
Always call this first! Get time and resource estimates before processing.
Parameters:
- video_path (required): Path to the video file
- known_duration_seconds (optional): Video duration if known
Returns: File size, duration, upload time, processing time, token estimate, recommendations
analyze_video
Full video analysis with multiple modes.
Parameters:
- video_path (required): Path to the video file
- mode: "summary" | "detailed" | "transcript" | "timeline" (default: "summary")
- custom_prompt (optional): Custom analysis prompt
- confirm_long_video: Set True for videos over 30 minutes
Modes:
- summary: Quick 2-3 paragraph overview
- detailed: Comprehensive scene-by-scene analysis
- transcript: Extract and transcribe speech
- timeline: Timestamped list of events
analyze_video_segment
Analyze a specific time range (requires ffmpeg).
Parameters:
- video_path (required): Path to the video file
- start_time (required): Start time ("HH:MM:SS", "MM:SS", or seconds)
- end_time (required): End time (same formats)
- prompt (optional): What to analyze
ask_video_question
Ask specific questions about video content.
Parameters:
- video_path (required): Path to the video file
- question (required): Your question
- provide_timestamps: Include timestamps in answer (default: true)
list_supported_formats
List all supported video formats and limits.
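The start_time and end_time formats accepted by analyze_video_segment ("HH:MM:SS", "MM:SS", or seconds) could be normalized to seconds along these lines. This is a minimal sketch for illustration, not the server's actual parser; the function name parse_timestamp is hypothetical.

```python
def parse_timestamp(value):
    """Convert "HH:MM:SS", "MM:SS", or plain seconds into seconds (float).

    Hypothetical helper for illustration; the server's own parsing may differ.
    """
    if isinstance(value, (int, float)):
        seconds = float(value)
    else:
        parts = str(value).strip().split(":")
        if not 1 <= len(parts) <= 3:
            raise ValueError(f"Unrecognized time format: {value!r}")
        seconds = 0.0
        # Fold left to right: each colon multiplies the running total by 60.
        for part in parts:
            seconds = seconds * 60 + float(part)
    if seconds < 0:
        raise ValueError(f"Time must be non-negative: {value!r}")
    return seconds


print(parse_timestamp("1:30:00"))  # 5400.0
print(parse_timestamp("02:30"))    # 150.0
print(parse_timestamp(90))         # 90.0
```

Normalizing both bounds first makes it easy to reject a range whose end is not after its start before any video is cut or uploaded.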
Example Workflow
You: Analyze this security camera footage: /path/to/footage.mp4
Claude: Let me first estimate the analysis time...
[Calls estimate_video_analysis]
This video is 2 hours and 15 minutes long. Full analysis will take approximately 25 minutes.
Would you like to:
1. Analyze the entire video
2. Analyze specific time ranges (faster)
3. Get a quick summary only
You: Just analyze from 1:30:00 to 2:00:00
Claude: [Calls analyze_video_segment with start_time="1:30:00", end_time="2:00:00"]
Here's what I found in that 30-minute segment...
Processing Time Estimates
Video Length | Upload Time | Processing | Total |
5 minutes | ~30s | ~1 min | ~1.5 min |
30 minutes | ~2 min | ~3 min | ~5 min |
1 hour | ~5 min | ~6 min | ~11 min |
3 hours | ~15 min | ~18 min | ~33 min |
6 hours | ~30 min | ~36 min | ~66 min |
Actual times vary based on file size and network speed.
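From one hour upward, the rows in the table scale roughly linearly: about 5 minutes of upload and 6 minutes of processing per hour of video. The sketch below reproduces that rule of thumb. The rates are read off the table and the function name is hypothetical; this is not the logic used by estimate_video_analysis.

```python
UPLOAD_MIN_PER_VIDEO_HOUR = 5.0      # read off the table above (assumed linear)
PROCESSING_MIN_PER_VIDEO_HOUR = 6.0  # read off the table above (assumed linear)


def rough_estimate_minutes(video_hours):
    """Return (upload, processing, total) in minutes for a video length in hours."""
    upload = video_hours * UPLOAD_MIN_PER_VIDEO_HOUR
    processing = video_hours * PROCESSING_MIN_PER_VIDEO_HOUR
    return upload, processing, upload + processing


for hours in (1, 3, 6):
    upload, processing, total = rough_estimate_minutes(hours)
    print(f"{hours} h video: ~{upload:.0f} min upload, "
          f"~{processing:.0f} min processing, ~{total:.0f} min total")
```

For a video of 2 hours 15 minutes (2.25 hours) this gives about 25 minutes in total. The shorter rows in the table sit above the linear rule, which suggests some fixed overhead per request.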
Supported Formats
Video: MP4 (recommended), MPEG, MOV, AVI, FLV, WebM, WMV, 3GP, MPG
Max Duration: 6 hours (Gemini 3 Flash)
Max File Size: 2GB per file
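A local pre-flight check against these limits might look like the sketch below. The extension set is inferred from the format list above, "2GB" is interpreted as 2 GiB, and the function name preflight_problems is hypothetical; list_supported_formats remains the authoritative source.

```python
from pathlib import Path

# Extensions inferred from the format list above; the exact set the server
# accepts is an assumption.
SUPPORTED_EXTENSIONS = {
    ".mp4", ".mpeg", ".mpg", ".mov", ".avi", ".flv", ".webm", ".wmv", ".3gp",
}
MAX_FILE_BYTES = 2 * 1024**3  # "2GB" interpreted as 2 GiB (assumption)
MAX_DURATION_SECONDS = 6 * 60 * 60


def preflight_problems(path, size_bytes, duration_seconds=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file is larger than 2GB")
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        problems.append("video is longer than 6 hours")
    return problems


print(preflight_problems("lecture.mp4", 500_000_000, 2 * 3600))
print(preflight_problems("clip.mkv", 3 * 1024**3))
```

For a real file, the size can be taken from Path(path).stat().st_size.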
Troubleshooting
"GEMINI_API_KEY environment variable is not set"
Ensure the API key is in your MCP server configuration's env block.
"ffmpeg not found"
Install ffmpeg for segment analysis. Full video analysis works without it.
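To check whether ffmpeg is visible on your PATH, and to see the kind of command a segment cut involves, a sketch like the following can help. The helper names are hypothetical and the command is only one plausible way to cut a clip; it is not taken from this server's code.

```python
import shutil


def ffmpeg_available():
    """True if an ffmpeg executable can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def build_cut_command(src, dst, start_seconds, end_seconds):
    """Build an ffmpeg command that copies the given range into dst without re-encoding."""
    if end_seconds <= start_seconds:
        raise ValueError("end must be after start")
    return [
        "ffmpeg",
        "-ss", str(start_seconds),               # seek to the start position
        "-i", src,
        "-t", str(end_seconds - start_seconds),  # keep this many seconds
        "-c", "copy",                            # stream copy, no re-encode
        dst,
    ]


print("ffmpeg found" if ffmpeg_available() else "ffmpeg not found")
print(build_cut_command("footage.mp4", "segment.mp4", 5400, 7200))
```

The resulting list can be passed to subprocess.run(cmd, check=True). With stream copy, cuts snap to keyframes, so the segment boundaries may be off by a second or two.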
"Video processing failed"
Check the video file isn't corrupted
Ensure format is supported
Try a smaller segment first
Slow processing
Use estimate_video_analysis to set expectations
Use analyze_video_segment for specific sections
Check your internet upload speed
Development
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
GEMINI_API_KEY=your-key python test_server.py
# Run the server directly
GEMINI_API_KEY=your-key python server.py
How It Works
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Claude Code   │     │   MCP Server    │     │   Gemini API    │
│   Cursor/etc    │────▶│   (this repo)   │────▶│    (Google)     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
        │                       │                       │
        │   "Analyze video"     │   Upload & Process    │
        │──────────────────────▶│──────────────────────▶│
        │                       │                       │
        │   Text description    │  Video understanding  │
        │◀──────────────────────│◀──────────────────────│
Your AI assistant receives a request about a video
It calls this MCP server with the video path
The server uploads the video to Gemini API
Gemini processes the video (1 frame/second, ~66 tokens/frame)
The analysis is returned as text your assistant can understand
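The sampling figures in the steps above (1 frame per second, about 66 tokens per frame) give a quick way to approximate how many tokens a video will consume. The sketch below uses only those two numbers; the function name is hypothetical, and audio and prompt tokens are ignored, so treat the result as a lower bound.

```python
FRAMES_PER_SECOND = 1   # sampling rate stated above
TOKENS_PER_FRAME = 66   # approximate figure stated above


def estimate_video_tokens(duration_seconds):
    """Approximate visual tokens for a video; audio and prompt tokens are ignored."""
    return int(duration_seconds * FRAMES_PER_SECOND * TOKENS_PER_FRAME)


for minutes in (5, 30, 60):
    print(f"{minutes} min -> ~{estimate_video_tokens(minutes * 60):,} tokens")
```

Comparing the result against the model's context window before uploading is a cheap way to decide between a full analysis and a segment.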
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details.
Acknowledgments
Built with FastMCP
Powered by Google Gemini
Follows MCP Specification
Made with love to give AI coding assistants superpowers