Video Content Summarization MCP Server

by fakad
MIT License

A Model Context Protocol (MCP) server that extracts content from multiple video platforms and generates intelligent knowledge graphs.

Features

🌐 Multi-Platform Support

  • Douyin (TikTok China) - Short video content extraction
  • Bilibili - Video and live streaming content
  • Xiaohongshu (Little Red Book) - Social media posts with OCR support
  • Zhihu - Q&A platform content

✨ Advanced Capabilities

  • OCR Text Recognition - Extract text from images using PaddleOCR
  • Knowledge Graph Generation - Intelligent content structuring
  • Chinese Content Optimization - Specialized processing for Chinese text
  • Context-Aware Extraction - Smart content understanding and quality control

Installation

Prerequisites

  • Python 3.8 or higher
  • Anaconda (recommended for dependency management)

Setup

  1. Clone the repository:
git clone https://github.com/fakad/video-sum-mcp.git
cd video-sum-mcp
  2. Create and activate conda environment:
conda create -n vsc python=3.8
conda activate vsc
  3. Install dependencies:
pip install -r requirements.txt

Configuration

For Claude Desktop

Add this configuration to your Claude Desktop config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "video-sum-mcp": {
      "command": "python",
      "args": ["/path/to/video-sum-mcp/main.py"],
      "cwd": "/path/to/video-sum-mcp",
      "env": {
        "CONDA_DEFAULT_ENV": "vsc"
      }
    }
  }
}
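
Note that setting CONDA_DEFAULT_ENV does not by itself activate a conda environment; the bare "python" in "command" is still resolved from whatever PATH Claude Desktop sees. One possible workaround, offered here as an assumption rather than the project's documented setup, is to point "command" at the environment's interpreter by absolute path. The sketch below generates such an entry. The helper name build_server_entry is hypothetical, and the script is meant to be run with the vsc environment active so that sys.executable is that environment's interpreter.

```python
import json
import sys
from pathlib import Path


def build_server_entry(project_dir: str, python_path: str = sys.executable) -> dict:
    """Build an mcpServers entry that launches main.py with a specific interpreter.

    Hypothetical helper: it mirrors the keys of the config above, but uses an
    absolute interpreter path instead of a bare "python".
    """
    project = Path(project_dir)
    return {
        "mcpServers": {
            "video-sum-mcp": {
                "command": python_path,
                "args": [str(project / "main.py")],
                "cwd": str(project),
            }
        }
    }


if __name__ == "__main__":
    # Replace the placeholder with the actual clone location.
    print(json.dumps(build_server_entry("/path/to/video-sum-mcp"), indent=2))
```

The printed JSON can be merged into claude_desktop_config.json by hand. The "env" block is omitted in this variant because the interpreter path already selects the environment.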

For Other MCP Clients

The server can be started directly:

python main.py

Usage

Basic Video Processing

# Example: Process a Bilibili video
result = process_video(
    url="https://www.bilibili.com/video/BV1234567890",
    output_format="markdown"
)

Supported URL Formats

  • Douyin: https://v.douyin.com/... or full URLs
  • Bilibili: https://www.bilibili.com/video/...
  • Xiaohongshu: https://www.xiaohongshu.com/discovery/item/...
  • Zhihu: https://www.zhihu.com/question/...
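
As a rough illustration of how the URL formats above could be routed to a platform-specific extractor, the following sketch matches the URL's hostname against the four listed domains. The function detect_platform and the PLATFORM_DOMAINS table are hypothetical names for this example only; the project's own routing logic may differ.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical mapping from domain to platform name, based on the URL formats listed above.
PLATFORM_DOMAINS = {
    "douyin.com": "douyin",
    "bilibili.com": "bilibili",
    "xiaohongshu.com": "xiaohongshu",
    "zhihu.com": "zhihu",
}


def detect_platform(url: str) -> Optional[str]:
    """Return the platform name for a URL, or None if the host is not recognized."""
    host = (urlparse(url).hostname or "").lower()
    for domain, platform in PLATFORM_DOMAINS.items():
        # Match the bare domain or any subdomain (www., v., ...).
        if host == domain or host.endswith("." + domain):
            return platform
    return None
```

Matching on the parsed hostname rather than on a substring of the whole URL avoids false positives when a supported domain name merely appears in a path or query string.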

Context-Enhanced Processing

For platforms with anti-crawling measures, you can provide context:

result = process_video(
    url="https://...",
    context_text="Additional context information..."
)

Features in Detail

OCR Integration

  • Automatic image text extraction from Xiaohongshu posts
  • PaddleOCR for accurate Chinese character recognition
  • Batch processing for multiple images

Knowledge Graph Generation

  • Structured content analysis
  • Intelligent relationship mapping
  • Quality control and validation

Anti-Crawling Strategies

  • Smart fallback mechanisms
  • Context-based extraction
  • User guidance for optimal results
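
The three points above can be pictured with a small fallback chain: try each extractor in turn, use caller-supplied context if all of them fail, and otherwise raise an error that tells the user what to provide. This is a minimal sketch under assumed names (extract_with_fallback and its parameters are hypothetical), not the server's actual implementation.

```python
from typing import Callable, Iterable, Optional


def extract_with_fallback(
    url: str,
    extractors: Iterable[Callable[[str], str]],
    context_text: Optional[str] = None,
) -> str:
    """Try each extractor in order; fall back to user-supplied context."""
    errors = []
    for extractor in extractors:
        try:
            content = extractor(url)
            if content:  # treat empty results as a failed attempt
                return content
        except Exception as exc:  # e.g. a blocked request or a parse failure
            errors.append(f"{getattr(extractor, '__name__', 'extractor')}: {exc}")
    if context_text:
        return context_text
    # User guidance: explain what to provide when nothing worked.
    raise RuntimeError(
        "Could not extract content from " + url
        + "; pass context_text with the post or video description. "
        + "Errors: " + "; ".join(errors)
    )
```

In this arrangement the context text is used only as a last resort, so successful automated extraction always takes precedence.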

Development

Project Structure

video-sum-mcp/
├── core/                  # Core functionality modules
│   ├── extractors/        # Platform-specific extractors
│   ├── processors/        # Content processing logic
│   ├── knowledge_graph/   # Knowledge graph generation
│   └── managers/          # Resource management
├── scripts/               # MCP server implementation
├── main.py                # Main entry point
├── requirements.txt       # Python dependencies
└── pyproject.toml         # Project configuration

Running Tests

python -m pytest

Dependencies

Key dependencies include:

  • bilibili-api-python - Bilibili API integration
  • yt-dlp - Video downloading capabilities
  • PaddleOCR - OCR text recognition
  • beautifulsoup4 - Web scraping
  • requests - HTTP requests

See requirements.txt for the complete list.

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
