MCP Video Parser

A powerful video analysis system that uses the Model Context Protocol (MCP) to process, analyze, and query video content using AI vision models.

🎬 Features

  • AI-Powered Video Analysis: Automatically extracts and analyzes frames using vision LLMs (LLaVA)
  • Natural Language Queries: Search videos using conversational queries
  • Time-Based Search: Query videos by relative time ("last week") or specific dates
  • Location-Based Organization: Organize videos by location (shed, garage, etc.)
  • Audio Transcription: Extract and search through video transcripts
  • Chat Integration: Natural conversations with Mistral/Llama while maintaining video context
  • Scene Detection: Intelligent frame extraction based on visual changes
  • MCP Protocol: Standards-based integration with Claude and other MCP clients
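
To illustrate the idea behind scene detection, here is a minimal sketch that keeps a frame only when it differs enough from the last kept frame. It is not the project's implementation: the function names, the flattened-grayscale frame representation, and the default threshold are all assumptions made for the example.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixels, 0-255 (assumed representation)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if not a or len(a) != len(b):
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], threshold: float = 30.0) -> List[int]:
    """Return indices of frames that differ enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        # Compare against the last *kept* frame so slow drift still triggers a capture.
        if mean_abs_diff(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept
```

A lower threshold keeps more frames (more sensitive), while a higher one keeps only large visual changes.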

🚀 Quick Start

Prerequisites

  • Python 3.10+
  • Ollama installed and running
  • ffmpeg (for video processing)

Installation

  1. Clone the repository:
git clone https://github.com/michaelbaker-dev/mcpVideoParser.git
cd mcpVideoParser
  2. Install dependencies:
pip install -r requirements.txt
  3. Pull required Ollama models:
ollama pull llava:latest    # For vision analysis
ollama pull mistral:latest  # For chat interactions
  4. Start the MCP server:
python mcp_video_server.py --http --host localhost --port 8000

Basic Usage

  1. Process a video:
python process_new_video.py /path/to/video.mp4 --location garage
  2. Start the chat client:
python standalone_client/mcp_http_client.py --chat-llm mistral:latest
  3. Example queries:
  • "Show me the latest videos"
  • "What happened at the garage yesterday?"
  • "Find videos with cars"
  • "Give me a summary of all videos from last week"
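
Queries such as "yesterday" and "last week" have to be turned into concrete date ranges before videos can be filtered. The sketch below shows one way this could work; the function name and the set of supported phrases are assumptions for illustration, and the project's own parser may behave differently (for example, in how it defines "last week").

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def parse_relative_range(phrase: str, now: datetime) -> Optional[Tuple[datetime, datetime]]:
    """Map a few relative phrases to a half-open [start, end) datetime range."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    text = phrase.strip().lower()
    if text == "today":
        return midnight, midnight + timedelta(days=1)
    if text == "yesterday":
        return midnight - timedelta(days=1), midnight
    if text == "last week":
        # Assumed meaning: the previous Monday-to-Monday week (weekday(): Monday == 0).
        this_monday = midnight - timedelta(days=midnight.weekday())
        return this_monday - timedelta(days=7), this_monday
    return None  # phrase not recognized by this sketch
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the behavior easy to test.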

🏗️ Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Video Files   │────▶│ Video Processor │────▶│ Frame Analysis  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   MCP Server    │◀────│ Storage Manager │◀────│   Ollama LLM    │
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │
         ▼
┌─────────────────┐
│   HTTP Client   │
└─────────────────┘
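
As a rough illustration of the Frame Analysis to Ollama LLM step, the sketch below sends one extracted frame to a local Ollama instance through its `/api/generate` endpoint. It assumes Ollama is listening on its default address (`localhost:11434`); the function names and the prompt are hypothetical and are not taken from this project's source.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_vision_request(image_bytes: bytes, prompt: str, model: str = "llava:latest") -> dict:
    """Build the JSON body for a single non-streaming vision request."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_frame(image_bytes: bytes, prompt: str = "Describe this frame.") -> str:
    """Send one frame to Ollama and return the model's text response."""
    payload = json.dumps(build_vision_request(image_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_frame(open("frame.jpg", "rb").read())` requires a running Ollama server with the `llava:latest` model pulled.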

🛠️ Configuration

Edit config/default_config.json to customize:

  • Frame extraction rate: How many frames to analyze
  • Scene detection sensitivity: When to capture scene changes
  • Storage settings: Where to store videos and data
  • LLM models: Which models to use for vision and chat

See Configuration Guide for details.
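
Before changing values, it can help to inspect what the config file currently contains. The helper below is a small sketch for reading nested settings with a fallback. The key names used in the usage line (`llm.vision_model`) are hypothetical; check `config/default_config.json` for the actual structure.

```python
import json
from pathlib import Path
from typing import Any


def load_config(path: str = "config/default_config.json") -> dict:
    """Read the JSON config file into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def get_setting(config: dict, dotted_key: str, default: Any = None) -> Any:
    """Look up a nested value such as 'section.option', returning default if absent."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node
```

For example, `get_setting(load_config(), "llm.vision_model", "llava:latest")` returns the configured value if that key exists and the fallback otherwise.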

🔧 MCP Tools

The server exposes these MCP tools:

  • process_video - Process and analyze a video file
  • query_location_time - Query videos by location and time
  • search_videos - Search video content and transcripts
  • get_video_summary - Get AI-generated summary of a video
  • ask_video - Ask questions about specific videos
  • analyze_moment - Analyze specific timestamp in a video
  • get_video_stats - Get system statistics
  • get_video_guide - Get usage instructions
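
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for one of the tools listed above. The argument names (`query`, `limit`) are assumptions for illustration; the actual parameters are defined by the server's tool schemas, which a client can retrieve with `tools/list`.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def build_tool_call(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that calls an MCP tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical arguments; consult the tool schema for the real ones.
    request = build_tool_call("search_videos", {"query": "cars", "limit": 5})
    print(json.dumps(request, indent=2))
```

How the request is transported (stdio or HTTP) depends on how the server was started and is not shown here.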

🛠️ Utility Scripts

Video Cleanup

Clean all videos from the system and reset to a fresh state:

# Dry run to see what would be deleted
python clean_videos.py --dry-run

# Clean processed files and database (keeps originals)
python clean_videos.py

# Clean everything including original video files
python clean_videos.py --clean-originals

# Skip confirmation and backup
python clean_videos.py --yes --no-backup

This script will:

  • Remove all video entries from the database
  • Delete all processed frames and transcripts
  • Delete all videos from the location-based structure
  • Optionally delete original video files
  • Create a backup of the database before cleaning (unless --no-backup)

Video Processing

Process individual videos:

# Process a video with automatic location detection
python process_new_video.py /path/to/video.mp4

# Process with specific location
python process_new_video.py /path/to/video.mp4 --location garage
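
To process a whole folder, the documented command can be wrapped in a short script. This is a sketch that relies only on the `process_new_video.py` invocation and the `--location` flag shown above; the helper names and the list of file extensions are assumptions.

```python
import subprocess
import sys
from pathlib import Path
from typing import List, Optional

VIDEO_SUFFIXES = {".mp4", ".mov", ".mkv", ".avi"}  # assumed; adjust to your files


def build_commands(folder: str, location: Optional[str] = None) -> List[List[str]]:
    """Build one process_new_video.py command per video file in the folder."""
    commands = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in VIDEO_SUFFIXES:
            command = [sys.executable, "process_new_video.py", str(path)]
            if location:
                command += ["--location", location]
            commands.append(command)
    return commands


def process_folder(folder: str, location: Optional[str] = None) -> None:
    """Run the commands one at a time, stopping at the first failure."""
    for command in build_commands(folder, location):
        subprocess.run(command, check=True)
```

Run it from the repository root, for example `process_folder("/path/to/videos", location="garage")`.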

📖 Documentation

🚦 Development

Running Tests

# All tests
python -m pytest tests/ -v

# Unit tests only
python -m pytest tests/unit/ -v

# Integration tests (requires Ollama)
python -m pytest tests/integration/ -v

Project Structure

mcp-video-server/
├── src/
│   ├── llm/            # LLM client implementations
│   ├── processors/     # Video processing logic
│   ├── storage/        # Database and file management
│   ├── tools/          # MCP tool definitions
│   └── utils/          # Utilities and helpers
├── standalone_client/  # HTTP client implementation
├── config/             # Configuration files
├── tests/              # Test suite
└── video_data/         # Video storage (git-ignored)

🤝 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

📝 Roadmap

  • ✅ Basic video processing and analysis
  • ✅ MCP server implementation
  • ✅ Natural language queries
  • ✅ Chat integration with context
  • 🚧 Enhanced time parsing (see INTELLIGENT_QUERY_PLAN.md)
  • 🚧 Multi-camera support
  • 🚧 Real-time processing
  • 🚧 Web interface

🐛 Troubleshooting

Common Issues

  1. Ollama not running:
ollama serve  # Start Ollama
  2. Missing models:
ollama pull llava:latest
ollama pull mistral:latest
  3. Port already in use:
# Change port in command
python mcp_video_server.py --http --port 8001
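
The first two checks can also be scripted. The sketch below asks Ollama's `/api/tags` endpoint which models are installed and reports any that are missing. It assumes Ollama's default address (`localhost:11434`); the function names are illustrative and not part of this project.

```python
import json
import urllib.error
import urllib.request
from typing import Iterable, List, Optional

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # Ollama's default local address
REQUIRED_MODELS = ("llava:latest", "mistral:latest")


def installed_models(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0) -> Optional[List[str]]:
    """Return installed model names, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            data = json.loads(response.read())
    except (urllib.error.URLError, OSError):
        return None
    return [model["name"] for model in data.get("models", [])]


def missing_models(installed: Iterable[str], required: Iterable[str] = REQUIRED_MODELS) -> List[str]:
    """Return the required models that are not in the installed list."""
    have = set(installed)
    return [name for name in required if name not in have]


if __name__ == "__main__":
    models = installed_models()
    if models is None:
        print("Ollama is not reachable; try: ollama serve")
    else:
        for name in missing_models(models):
            print(f"Missing model; try: ollama pull {name}")
```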

📄 License

MIT License - see LICENSE for details.

🙏 Acknowledgments

  • Built on FastMCP framework
  • Uses Ollama for local LLM inference
  • Inspired by the Model Context Protocol specification

💬 Support


Version: 0.1.1
Author: Michael Baker
Status: Beta - Breaking changes possible
