MCP Video Parser
A powerful video analysis system that uses the Model Context Protocol (MCP) to process, analyze, and query video content using AI vision models.
🎬 Features
- AI-Powered Video Analysis: Automatically extracts and analyzes frames using vision LLMs (Llava)
- Natural Language Queries: Search videos using conversational queries
- Time-Based Search: Query videos by relative time ("last week") or specific dates
- Location-Based Organization: Organize videos by location (shed, garage, etc.)
- Audio Transcription: Extract and search through video transcripts
- Chat Integration: Natural conversations with Mistral/Llama while maintaining video context
- Scene Detection: Intelligent frame extraction based on visual changes
- MCP Protocol: Standards-based integration with Claude and other MCP clients
🚀 Quick Start
Prerequisites
- Python 3.10+
- Ollama installed and running
- ffmpeg (for video processing)
Installation
- Clone the repository:
- Install dependencies:
- Pull required Ollama models:
- Start the MCP server (commands for all four steps are sketched below)
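A minimal sketch of these four steps, assuming a pip-based setup; the repository URL, directory name, and server entry point are placeholders, not confirmed paths:

```bash
# 1. Clone the repository (URL and directory name are placeholders)
git clone <repository-url>
cd mcp-video-parser

# 2. Install dependencies (assumes a requirements.txt at the repo root)
pip install -r requirements.txt

# 3. Pull the Ollama models used for vision and chat
ollama pull llava
ollama pull mistral

# 4. Start the MCP server (entry-point name is an assumption)
python server.py
```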
Basic Usage
- Process a video (see the command sketches after this list):
- Start the chat client:
- Example queries:
- "Show me the latest videos"
- "What happened at the garage yesterday?"
- "Find videos with cars"
- "Give me a summary of all videos from last week"
🏗️ Architecture
🛠️ Configuration
Edit config/default_config.json to customize:
- Frame extraction rate: How many frames to analyze
- Scene detection sensitivity: When to capture scene changes
- Storage settings: Where to store videos and data
- LLM models: Which models to use for vision and chat
See Configuration Guide for details.
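For illustration, a minimal Python sketch of reading this config; the key names shown are hypothetical examples, not taken from the actual default_config.json:

```python
import json
from pathlib import Path

# Load the project configuration file referenced above.
config = json.loads(Path("config/default_config.json").read_text())

# Hypothetical keys -- check default_config.json for the real names.
frame_rate = config.get("frame_extraction_rate", 1)             # frames analyzed per second
scene_threshold = config.get("scene_detection_threshold", 0.3)  # scene-change sensitivity
storage_dir = config.get("storage_dir", "data/videos")          # where videos and data are stored
vision_model = config.get("vision_model", "llava")              # model used for frame analysis
chat_model = config.get("chat_model", "mistral")                # model used for chat responses
```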
🔧 MCP Tools
The server exposes these MCP tools:
- process_video - Process and analyze a video file
- query_location_time - Query videos by location and time
- search_videos - Search video content and transcripts
- get_video_summary - Get AI-generated summary of a video
- ask_video - Ask questions about specific videos
- analyze_moment - Analyze specific timestamp in a video
- get_video_stats - Get system statistics
- get_video_guide - Get usage instructions
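These tools can also be called programmatically with the official MCP Python SDK. A minimal sketch, assuming the server is launched with `python server.py` and that process_video accepts a video_path argument (both assumptions):

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the server over stdio (entry-point name is an assumption).
    server = StdioServerParameters(command="python", args=["server.py"])
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Tool name comes from the list above; the argument name is an assumption.
            result = await session.call_tool(
                "process_video", arguments={"video_path": "/path/to/video.mp4"}
            )
            print(result)

asyncio.run(main())
```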
🛠️ Utility Scripts
Video Cleanup
Clean all videos from the system and reset to a fresh state:
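The invocation might look like the following (the script path is an assumption; the --no-backup flag is described below):

```bash
# Remove all processed data and reset the system (script path is an assumption)
python scripts/clean_videos.py

# Same, but skip the automatic database backup
python scripts/clean_videos.py --no-backup
```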
This script will:
- Remove all video entries from the database
- Delete all processed frames and transcripts
- Delete all videos from the location-based structure
- Optionally delete original video files
- Create a backup of the database before cleaning (unless --no-backup is passed)
Video Processing
Process individual videos:
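A hedged sketch, reusing the hypothetical process_video.py script from Basic Usage; the --location flag is also an assumption, based on the location-based organization feature:

```bash
# Process one video and tag it with a location (script name and flag are assumptions)
python process_video.py /path/to/garage_cam.mp4 --location garage
```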
📖 Documentation
- API Reference - Detailed MCP tool documentation
- Configuration Guide - Customization options
- Video Analysis Info - How video processing works
- Development Guide - Contributing and testing
- Deployment Guide - Production setup
🚦 Development
Running Tests
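Tests are run with pytest; a typical invocation (the tests/ layout is an assumption):

```bash
# Run the full suite from the repository root
pytest

# Run a single test module with verbose output (path is an assumption)
pytest tests/test_video_processor.py -v
```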
Project Structure
🤝 Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
📝 Roadmap
- ✅ Basic video processing and analysis
- ✅ MCP server implementation
- ✅ Natural language queries
- ✅ Chat integration with context
- 🚧 Enhanced time parsing (see INTELLIGENT_QUERY_PLAN.md)
- 🚧 Multi-camera support
- 🚧 Real-time processing
- 🚧 Web interface
🐛 Troubleshooting
Common Issues
- Ollama not running: start the Ollama service and confirm it is reachable.
- Missing models: pull the vision and chat models the server expects (Llava and Mistral).
- Port already in use: stop the conflicting process or change the server port (see the command sketch below).
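Command sketches for the fixes above (the model names match those used elsewhere in this README; the port number is an assumption):

```bash
# Ollama not running: start the local Ollama service
ollama serve

# Missing models: pull the vision and chat models
ollama pull llava
ollama pull mistral

# Port already in use: find the conflicting process (port number is an assumption)
lsof -i :8000
```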
📄 License
MIT License - see LICENSE for details.
🙏 Acknowledgments
- Built on FastMCP framework
- Uses Ollama for local LLM inference
- Inspired by the Model Context Protocol specification
💬 Support
Version: 0.1.1
Author: Michael Baker
Status: Beta - Breaking changes possible