MCP Video Parser

A video analysis system that uses the Model Context Protocol (MCP) to process, analyze, and query video content with AI vision models.

🎬 Features

  • AI-Powered Video Analysis: Automatically extracts and analyzes frames using a vision LLM (LLaVA)

  • Natural Language Queries: Search videos using conversational queries

  • Time-Based Search: Query videos by relative time ("last week") or specific dates

  • Location-Based Organization: Organize videos by location (shed, garage, etc.)

  • Audio Transcription: Extract and search through video transcripts

  • Chat Integration: Natural conversations with Mistral/Llama while maintaining video context

  • Scene Detection: Intelligent frame extraction based on visual changes

  • MCP Protocol: Standards-based integration with Claude and other MCP clients
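The scene detection feature listed above can be pictured with a small sketch. This is not the project's implementation: frames are modelled as flat lists of grayscale pixel values, and the names `mean_abs_diff`, `select_scene_frames`, and `threshold` are hypothetical. A frame is kept only when it differs enough from the last kept frame.

```python
from typing import List, Optional, Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute difference between two equally sized grayscale frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_scene_frames(
    frames: Sequence[Sequence[int]], threshold: float = 30.0
) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame."""
    kept: List[int] = []
    last: Optional[Sequence[int]] = None
    for index, frame in enumerate(frames):
        # Always keep the first frame; afterwards keep only visible changes.
        if last is None or mean_abs_diff(frame, last) > threshold:
            kept.append(index)
            last = frame
    return kept
```

A lower `threshold` keeps more frames (more vision-model calls); a higher one keeps fewer.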

🚀 Quick Start

Prerequisites

  • Python 3.10+

  • Ollama installed and running

  • ffmpeg (for video processing)
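A quick way to confirm the prerequisites above is a small standard-library check. This is only a sketch; `check_prerequisites` is a hypothetical helper, not part of the repository, and it only verifies that the `ollama` and `ffmpeg` executables are on the PATH, not that the Ollama service is running.

```python
import shutil
import sys
from typing import Callable, Dict, Optional, Tuple


def check_prerequisites(
    version: Tuple[int, int] = sys.version_info[:2],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report whether each prerequisite appears to be satisfied."""
    return {
        "python>=3.10": tuple(version) >= (3, 10),
        "ollama": which("ollama") is not None,  # installed, not necessarily running
        "ffmpeg": which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```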

Installation

  1. Clone the repository:

git clone https://github.com/michaelbaker-dev/mcpVideoParser.git
cd mcpVideoParser

  2. Install dependencies:

pip install -r requirements.txt

  3. Pull required Ollama models:

ollama pull llava:latest    # For vision analysis
ollama pull mistral:latest  # For chat interactions

  4. Start the MCP server:

python mcp_video_server.py --http --host localhost --port 8000

Basic Usage

  1. Process a video:

python process_new_video.py /path/to/video.mp4 --location garage

  2. Start the chat client:

python standalone_client/mcp_http_client.py --chat-llm mistral:latest

  3. Example queries:

  • "Show me the latest videos"

  • "What happened at the garage yesterday?"

  • "Find videos with cars"

  • "Give me a summary of all videos from last week"
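Queries such as "yesterday" or "last week" have to be turned into concrete dates before the video index can be searched. The sketch below shows one way to do that. It is an illustration, not the project's parser: `relative_range` is a hypothetical name, and treating "last week" as the seven days ending yesterday is an assumption.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def relative_range(phrase: str, today: date) -> Optional[Tuple[date, date]]:
    """Map a relative time phrase to an inclusive (start, end) date range."""
    text = phrase.lower()
    if "yesterday" in text:
        day = today - timedelta(days=1)
        return day, day
    if "today" in text:
        return today, today
    if "last week" in text:
        # Assumption: "last week" means the seven days ending yesterday.
        return today - timedelta(days=7), today - timedelta(days=1)
    return None  # no recognised time expression in the query
```

Passing `today` explicitly keeps the function deterministic and easy to test.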

๐Ÿ—๏ธ Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ Video Files โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚ Video Processor โ”‚โ”€โ”€โ”€โ”€โ–ถโ”‚ Frame Analysis โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ”‚ โ–ผ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ MCP Server โ”‚โ—€โ”€โ”€โ”€โ”€โ”‚ Storage Manager โ”‚โ—€โ”€โ”€โ”€โ”€โ”‚ Ollama LLM โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚ โ–ผ โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚ HTTP Client โ”‚ โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿ› ๏ธ Configuration

Edit config/default_config.json to customize:

  • Frame extraction rate: How many frames to analyze

  • Scene detection sensitivity: When to capture scene changes

  • Storage settings: Where to store videos and data

  • LLM models: Which models to use for vision and chat

See Configuration Guide for details.
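To illustrate how defaults and local overrides can be combined, here is a minimal sketch of a recursive merge. The key names (`processing`, `frame_sample_rate`, `scene_threshold`, `storage`, `llm`, and so on) are hypothetical placeholders; the real keys are whatever config/default_config.json defines.

```python
import json
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where overrides replace or extend values in base."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested sections
        else:
            merged[key] = value
    return merged


# Hypothetical defaults; the real keys live in config/default_config.json.
DEFAULTS = json.loads("""
{
  "processing": {"frame_sample_rate": 30, "scene_threshold": 0.3},
  "storage": {"base_path": "./video_data"},
  "llm": {"vision_model": "llava:latest", "chat_model": "mistral:latest"}
}
""")

# Override only the chat model; every other setting keeps its default.
config = deep_merge(DEFAULTS, {"llm": {"chat_model": "llama2:latest"}})
```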

🔧 MCP Tools

The server exposes these MCP tools:

  • process_video - Process and analyze a video file

  • query_location_time - Query videos by location and time

  • search_videos - Search video content and transcripts

  • get_video_summary - Get AI-generated summary of a video

  • ask_video - Ask questions about specific videos

  • analyze_moment - Analyze specific timestamp in a video

  • get_video_stats - Get system statistics

  • get_video_guide - Get usage instructions
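MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below only builds such a request body. The argument name `query` is an assumption about the `search_videos` schema, and `build_tool_call` is a hypothetical helper. A real client must also perform the MCP initialization handshake and use the server's transport, which is omitted here.

```python
import json
from typing import Any, Dict


def build_tool_call(name: str, arguments: Dict[str, Any], request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# "query" is an assumed argument name, not confirmed by the tool's schema.
payload = build_tool_call("search_videos", {"query": "cars"})
```

In practice an MCP client library handles this framing, so the helper is mainly useful for understanding what travels over the wire.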

๐Ÿ› ๏ธ Utility Scripts

Video Cleanup

Clean all videos from the system and reset to a fresh state:

# Dry run to see what would be deleted
python clean_videos.py --dry-run

# Clean processed files and database (keeps originals)
python clean_videos.py

# Clean everything including original video files
python clean_videos.py --clean-originals

# Skip confirmation and backup
python clean_videos.py --yes --no-backup

This script will:

  • Remove all video entries from the database

  • Delete all processed frames and transcripts

  • Delete all videos from the location-based structure

  • Optionally delete original video files

  • Create a backup of the database before cleaning (unless --no-backup)
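The dry-run behaviour can be understood through a small sketch of the general pattern: collect the deletion targets first, then delete only when asked to. This is not the code in clean_videos.py; `clean_directory` is a hypothetical function that covers files only and leaves the database and backup steps out.

```python
from pathlib import Path
from typing import List


def clean_directory(root: Path, dry_run: bool = True) -> List[Path]:
    """Delete every file under root; with dry_run=True, only report them."""
    targets = sorted(p for p in root.rglob("*") if p.is_file())
    if not dry_run:
        for path in targets:
            path.unlink()  # directories are left in place
    return targets
```

Defaulting `dry_run` to True means a careless call reports files instead of removing them.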

Video Processing

Process individual videos:

# Process a video with automatic location detection
python process_new_video.py /path/to/video.mp4

# Process with specific location
python process_new_video.py /path/to/video.mp4 --location garage

📖 Documentation

🚦 Development

Running Tests

# All tests
python -m pytest tests/ -v

# Unit tests only
python -m pytest tests/unit/ -v

# Integration tests (requires Ollama)
python -m pytest tests/integration/ -v

Project Structure

mcp-video-server/
├── src/
│   ├── llm/            # LLM client implementations
│   ├── processors/     # Video processing logic
│   ├── storage/        # Database and file management
│   ├── tools/          # MCP tool definitions
│   └── utils/          # Utilities and helpers
├── standalone_client/  # HTTP client implementation
├── config/             # Configuration files
├── tests/              # Test suite
└── video_data/         # Video storage (git-ignored)

๐Ÿค Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

๐Ÿ“ Roadmap

  • โœ… Basic video processing and analysis

  • โœ… MCP server implementation

  • โœ… Natural language queries

  • โœ… Chat integration with context

  • ๐Ÿšง Enhanced time parsing (see INTELLIGENT_QUERY_PLAN.md)

  • ๐Ÿšง Multi-camera support

  • ๐Ÿšง Real-time processing

  • ๐Ÿšง Web interface

๐Ÿ› Troubleshooting

Common Issues

  1. Ollama not running:

ollama serve  # Start Ollama

  2. Missing models:

ollama pull llava:latest
ollama pull mistral:latest

  3. Port already in use:

# Start the server on a different port
python mcp_video_server.py --http --port 8001
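The last two issues can be diagnosed before starting the server. The sketch below is an illustration with hypothetical helper names (`port_is_free`, `missing_models`). The second function expects the JSON that Ollama's `GET /api/tags` endpoint returns, a `models` list whose entries carry a `name` field.

```python
import socket
from typing import Any, Dict, Iterable, List


def port_is_free(host: str, port: int) -> bool:
    """Return True if nothing accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def missing_models(tags: Dict[str, Any], required: Iterable[str]) -> List[str]:
    """List required model names absent from an Ollama /api/tags response."""
    installed = {model.get("name") for model in tags.get("models", [])}
    return [name for name in required if name not in installed]
```

The `tags` argument can be obtained by requesting `http://localhost:11434/api/tags` (Ollama's default address) and decoding the JSON body. If that request fails, Ollama is most likely not running.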

📄 License

MIT License - see LICENSE for details.

๐Ÿ™ Acknowledgments

  • Built on FastMCP framework

  • Uses Ollama for local LLM inference

  • Inspired by the Model Context Protocol specification

💬 Support


Version: 0.1.1
Author: Michael Baker
Status: Beta - Breaking changes possible
