MCP Video Parser

A powerful video analysis system that uses the Model Context Protocol (MCP) to process, analyze, and query video content using AI vision models.

🎬 Features

  • AI-Powered Video Analysis: Automatically extracts and analyzes frames using vision LLMs (LLaVA)

  • Natural Language Queries: Search videos using conversational queries

  • Time-Based Search: Query videos by relative time ("last week") or specific dates

  • Location-Based Organization: Organize videos by location (shed, garage, etc.)

  • Audio Transcription: Extract and search through video transcripts

  • Chat Integration: Natural conversations with Mistral/Llama while maintaining video context

  • Scene Detection: Intelligent frame extraction based on visual changes

  • MCP Protocol: Standards-based integration with Claude and other MCP clients
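As a rough illustration of the scene detection idea above, the sketch below keeps a frame only when it differs enough from the last frame that was kept. It is not taken from this project's code: the function names, the flat-list frame representation (grayscale values 0-255), and the threshold value are all assumptions made for the example.

```python
def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two equal-size frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames, threshold=30.0):
    """Return indices of frames that differ enough from the last kept frame.

    `frames` is a list of flat lists of grayscale values (hypothetical format).
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for index in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[index]) > threshold:
            kept.append(index)
    return kept


if __name__ == "__main__":
    demo = [[0] * 4, [5] * 4, [100] * 4, [102] * 4, [0] * 4]
    print(select_keyframes(demo))  # [0, 2, 4]
```

Comparing against the last kept frame, rather than the immediately preceding one, avoids missing slow changes that never exceed the threshold between neighbors.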

🚀 Quick Start

Prerequisites

  • Python 3.10+

  • Ollama installed and running

  • ffmpeg (for video processing)
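The prerequisites above can be checked before installation with a few lines of Python. This is a minimal sketch, not part of the repository; the function name is an assumption, and it only confirms that the `ollama` and `ffmpeg` executables are on `PATH`, not that the Ollama service is actually running.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("ollama", "ffmpeg")):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ required, found {found}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING:", issue)
    print("OK" if not issues else f"{len(issues)} problem(s) found")
```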

Installation

  1. Clone the repository:

git clone https://github.com/michaelbaker-dev/mcpVideoParser.git
cd mcpVideoParser

  2. Install dependencies:

pip install -r requirements.txt

  3. Pull required Ollama models:

ollama pull llava:latest    # For vision analysis
ollama pull mistral:latest  # For chat interactions

  4. Start the MCP server:

python mcp_video_server.py --http --host localhost --port 8000

Basic Usage

  1. Process a video:

python process_new_video.py /path/to/video.mp4 --location garage

  2. Start the chat client:

python standalone_client/mcp_http_client.py --chat-llm mistral:latest

  3. Example queries:

  • "Show me the latest videos"

  • "What happened at the garage yesterday?"

  • "Find videos with cars"

  • "Give me a summary of all videos from last week"
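Queries like "yesterday" or "last week" presumably have to be turned into concrete date ranges before they can be matched against stored videos. The sketch below shows one way to do that. It is an illustration only: the function name and the set of supported phrases are assumptions, and "last week" is interpreted here as the previous Monday-to-Monday calendar week, which may differ from how the project resolves it.

```python
from datetime import date, datetime, time, timedelta


def relative_range(phrase, today):
    """Map a relative phrase to a half-open (start, end) datetime range."""
    phrase = phrase.lower().strip()
    midnight = datetime.combine(today, time.min)
    if phrase == "today":
        return midnight, midnight + timedelta(days=1)
    if phrase == "yesterday":
        return midnight - timedelta(days=1), midnight
    if phrase == "last week":
        # Monday of the current week, then step back one full week.
        this_monday = midnight - timedelta(days=today.weekday())
        return this_monday - timedelta(days=7), this_monday
    raise ValueError(f"unsupported phrase: {phrase!r}")


if __name__ == "__main__":
    start, end = relative_range("last week", date.today())
    print(f"last week = {start:%Y-%m-%d} up to (not including) {end:%Y-%m-%d}")
```

A half-open range (start included, end excluded) keeps adjacent ranges from overlapping, so a video recorded exactly at midnight belongs to one day only.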

🏗️ Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Video Files   │────▶│ Video Processor │────▶│ Frame Analysis  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   MCP Server    │◀────│ Storage Manager │◀────│   Ollama LLM    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │
         ▼
┌─────────────────┐
│   HTTP Client   │
└─────────────────┘
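To make the data flow in the diagram concrete, here is a toy, in-memory version of the pipeline: a processor yields frames, an analysis step describes each frame, and storage keeps the descriptions so they can be searched later. Every class and function name here is hypothetical and chosen for the example; the real components live under `src/` and are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    timestamp: float  # seconds from the start of the video


@dataclass
class Storage:
    """Stand-in for the Storage Manager: keeps descriptions in memory."""
    records: dict = field(default_factory=dict)

    def save(self, video, timestamp, description):
        self.records.setdefault(video, []).append((timestamp, description))

    def search(self, keyword):
        return [
            (video, ts)
            for video, rows in self.records.items()
            for ts, text in rows
            if keyword.lower() in text.lower()
        ]


def extract_frames(duration, every):
    """Stand-in for the Video Processor: one frame every `every` seconds."""
    count = int(duration // every) + 1
    return [Frame(timestamp=i * every) for i in range(count)]


def process(video, duration, describe, storage, every=10.0):
    """Run processor -> frame analysis -> storage for one video."""
    for frame in extract_frames(duration, every):
        # `describe` plays the role of the vision model call.
        storage.save(video, frame.timestamp, describe(video, frame))


if __name__ == "__main__":
    def fake_vision_model(video, frame):
        return "a red car" if frame.timestamp == 10 else "empty driveway"

    store = Storage()
    process("garage.mp4", 30, fake_vision_model, store)
    print(store.search("car"))  # [('garage.mp4', 10.0)]
```

Passing the model call in as a function keeps the sketch runnable without Ollama and mirrors the separation between processing and analysis shown in the diagram.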

🛠️ Configuration

Edit config/default_config.json to customize:

  • Frame extraction rate: How many frames to analyze

  • Scene detection sensitivity: When to capture scene changes

  • Storage settings: Where to store videos and data

  • LLM models: Which models to use for vision and chat

See Configuration Guide for details.
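If you want to experiment with settings without editing the file in place, one option is to load the JSON and layer overrides on top. The sketch below assumes the config is a nested JSON object; the helper names and any key names you pass in (for example an `llm` section) are assumptions, so check `config/default_config.json` for the real keys.

```python
import json
from pathlib import Path


def deep_merge(base, override):
    """Return a new dict where `override` wins, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_config(path="config/default_config.json", overrides=None):
    """Read the JSON config file and apply optional in-memory overrides."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return deep_merge(config, overrides or {})


if __name__ == "__main__":
    # Hypothetical keys, used only to show how nested overrides combine.
    defaults = {"llm": {"vision_model": "llava:latest", "chat_model": "mistral:latest"}}
    print(deep_merge(defaults, {"llm": {"chat_model": "llama3:latest"}}))
```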

🔧 MCP Tools

The server exposes these MCP tools:

  • process_video - Process and analyze a video file

  • query_location_time - Query videos by location and time

  • search_videos - Search video content and transcripts

  • get_video_summary - Get AI-generated summary of a video

  • ask_video - Ask questions about specific videos

  • analyze_moment - Analyze specific timestamp in a video

  • get_video_stats - Get system statistics

  • get_video_guide - Get usage instructions
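MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that request body; it does not open a session or send anything, and a real client would first complete the MCP `initialize` handshake. The argument names passed to `query_location_time` are hypothetical, since the tools' parameter schemas are not listed here.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call_request(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # "location" and "time_query" are assumed argument names, for illustration.
    request = tool_call_request(
        "query_location_time",
        {"location": "garage", "time_query": "yesterday"},
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing for you; seeing the raw shape is mainly useful when debugging traffic between the client and the server.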

🛠️ Utility Scripts

Video Cleanup

Clean all videos from the system and reset to a fresh state:

# Dry run to see what would be deleted
python clean_videos.py --dry-run

# Clean processed files and database (keeps originals)
python clean_videos.py

# Clean everything including original video files
python clean_videos.py --clean-originals

# Skip confirmation and backup
python clean_videos.py --yes --no-backup

This script will:

  • Remove all video entries from the database

  • Delete all processed frames and transcripts

  • Delete all videos from the location-based structure

  • Optionally delete original video files

  • Create a backup of the database before cleaning (unless --no-backup)
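The dry-run and backup-first behavior listed above follows a common pattern for destructive scripts: compute the list of targets, report it, and only delete after a backup exists. The sketch below illustrates that pattern in isolation. It is not the code of `clean_videos.py`; the function name, its parameters, and the `.bak` suffix are assumptions for the example.

```python
import shutil
from pathlib import Path


def clean(data_dir, db_path, dry_run=True, backup=True):
    """Delete every file under data_dir; optionally back up the database first.

    Returns the list of paths that were (or would be) deleted.
    """
    data_dir, db_path = Path(data_dir), Path(db_path)
    targets = sorted(p for p in data_dir.rglob("*") if p.is_file())
    if dry_run:
        return targets  # report only, touch nothing
    if backup and db_path.exists():
        # Copy the database next to the original before deleting anything.
        shutil.copy2(db_path, db_path.with_name(db_path.name + ".bak"))
    for path in targets:
        path.unlink()
    return targets
```

Defaulting `dry_run` to `True` means a careless call cannot delete anything, which is a reasonable safety choice for a helper like this.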

Video Processing

Process individual videos:

# Process a video with automatic location detection
python process_new_video.py /path/to/video.mp4

# Process with specific location
python process_new_video.py /path/to/video.mp4 --location garage
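To process a whole folder, the single-video command can be wrapped in a small loop. The sketch below only builds the command lines, using the `process_new_video.py` invocation and `--location` flag shown above; the helper name and the `*.mp4` default pattern are assumptions.

```python
import sys
from pathlib import Path


def build_commands(folder, location=None, pattern="*.mp4"):
    """Return one process_new_video.py argv list per matching file."""
    commands = []
    for video in sorted(Path(folder).glob(pattern)):
        argv = [sys.executable, "process_new_video.py", str(video)]
        if location:
            argv += ["--location", location]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for argv in build_commands(".", location="garage"):
        print(" ".join(argv))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` from the repository root to run the videos one after another.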

📖 Documentation

🚦 Development

Running Tests

# All tests
python -m pytest tests/ -v

# Unit tests only
python -m pytest tests/unit/ -v

# Integration tests (requires Ollama)
python -m pytest tests/integration/ -v

Project Structure

mcp-video-server/
├── src/
│   ├── llm/            # LLM client implementations
│   ├── processors/     # Video processing logic
│   ├── storage/        # Database and file management
│   ├── tools/          # MCP tool definitions
│   └── utils/          # Utilities and helpers
├── standalone_client/  # HTTP client implementation
├── config/             # Configuration files
├── tests/              # Test suite
└── video_data/         # Video storage (git-ignored)

🀝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

📝 Roadmap

  • ✅ Basic video processing and analysis

  • ✅ MCP server implementation

  • ✅ Natural language queries

  • ✅ Chat integration with context

  • 🚧 Enhanced time parsing (see INTELLIGENT_QUERY_PLAN.md)

  • 🚧 Multi-camera support

  • 🚧 Real-time processing

  • 🚧 Web interface

🐛 Troubleshooting

Common Issues

  1. Ollama not running:

ollama serve  # Start Ollama

  2. Missing models:

ollama pull llava:latest
ollama pull mistral:latest

  3. Port already in use:

# Change port in command
python mcp_video_server.py --http --port 8001
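When the default port is taken, a quick check can tell you which port to pass to `--port`. This is a minimal sketch using only the standard library; the function names and the scan range are assumptions, and a port reported as free could still be claimed by another process before the server starts.

```python
import socket


def port_is_free(port, host="localhost"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


def first_free_port(start=8000, attempts=20, host="localhost"):
    """Scan upward from `start` and return the first free port."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print("Try: python mcp_video_server.py --http --port", first_free_port())
```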

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

  • Built on FastMCP framework

  • Uses Ollama for local LLM inference

  • Inspired by the Model Context Protocol specification

💬 Support


Version: 0.1.1
Author: Michael Baker
Status: Beta - Breaking changes possible
