Which integrations are available for this server?

Enables Google Search integration with 7 optimized search genres using official Google operators, including tools for single searches, batch searches, and combined search-and-crawl operations with metadata extraction. Extracts video transcripts and summaries from YouTube videos without requiring API keys, supporting multi-language transcript extraction and batch processing of multiple videos.

How do I use Crawl-MCP?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Crawl-MCP summarize this article about AI advancements from https://example.com/ai-news".

That's it! The server will respond to your query, and you can continue using it as needed.

Crawl-MCP: Unofficial MCP Server for crawl4ai

⚠️ Important: This is an unofficial MCP server implementation for the excellent crawl4ai library.
Not affiliated with the original crawl4ai project.

A comprehensive Model Context Protocol (MCP) server that wraps the powerful crawl4ai library with advanced AI capabilities. Extract and analyze content from any source: web pages, PDFs, Office documents, YouTube videos, and more. Features intelligent summarization to dramatically reduce token usage while preserving key information.

🌟 Key Features

🔍 Google Search Integration: 7 optimized search genres using official Google operators
🔍 Advanced Web Crawling: JavaScript support, deep site mapping, entity extraction
🌐 Universal Content Extraction: Web pages, PDFs, Word docs, Excel, PowerPoint, ZIP archives
🤖 AI-Powered Summarization: Smart token reduction (up to 88.5%) while preserving essential information
🎬 YouTube Integration: Extract video transcripts and summaries without API keys
⚡ Production Ready: 13 specialized tools with comprehensive error handling

🚀 Quick Start

Prerequisites (Required First)

Python 3.11 or later (FastMCP requires Python 3.11+)
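
Before installing anything else, the interpreter version can be checked with a few lines of Python. This is a minimal sketch; the helper name `meets_minimum` is illustrative and not part of Crawl-MCP.

```python
import sys

# Minimum version stated in the prerequisites above.
MINIMUM = (3, 11)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) part of version_info is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is new enough")
    else:
        print(f"Python {current} is too old; 3.11 or later is required")
```

Run it with the same `python3` that will later create the virtual environment, since several interpreters can coexist on one machine.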

Install system dependencies for Playwright:

Ubuntu 24.04 LTS (Manual Required):

# Manual setup required due to t64 library transition
sudo apt update && sudo apt install -y \
  libnss3 libatk-bridge2.0-0 libxss1 libasound2t64 \
  libgbm1 libgtk-3-0t64 libxshmfence-dev libxrandr2 \
  libxcomposite1 libxcursor1 libxdamage1 libxi6 \
  fonts-noto-color-emoji fonts-unifont python3-venv python3-pip

python3 -m venv venv && source venv/bin/activate
pip install playwright==1.55.0 && playwright install chromium
sudo playwright install-deps

Other Linux/macOS:

sudo bash scripts/prepare_for_uvx_playwright.sh

Windows (as Administrator):

scripts/prepare_for_uvx_playwright.ps1

Installation

UVX (Recommended - Easiest):

# After system preparation above - that's it!
uvx --from git+https://github.com/walksoda/crawl-mcp crawl-mcp

Docker (Production-Ready):

# Clone the repository
git clone https://github.com/walksoda/crawl-mcp
cd crawl-mcp

# Build and run with Docker Compose (STDIO mode)
docker-compose up --build

# Or build and run HTTP mode on port 8000
docker-compose --profile http up --build crawl4ai-mcp-http

# Or build manually
docker build -t crawl4ai-mcp .
docker run -it crawl4ai-mcp

Docker Features:

🔧 Multi-Browser Support: Chromium, Firefox, Webkit headless browsers
🐧 Google Chrome: Additional Chrome Stable for compatibility
⚡ Optimized Performance: Pre-configured browser flags for Docker
🔒 Security: Non-root user execution
📦 Complete Dependencies: All required libraries included

Claude Desktop Setup

UVX Installation: Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "crawl-mcp": {
      "transport": "stdio",
      "command": "uvx",
      "args": [
        "--from",
        "git+https://github.com/walksoda/crawl-mcp",
        "crawl-mcp"
      ],
      "env": {
        "CRAWL4AI_LANG": "en"
      }
    }
  }
}

Docker HTTP Mode:

{
  "mcpServers": {
    "crawl-mcp": {
      "transport": "http",
      "baseUrl": "http://localhost:8000"
    }
  }
}
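
For the UVX (stdio) setup, the server entry can also be merged into an existing claude_desktop_config.json by a short script instead of by hand, which avoids clobbering other servers already configured there. The sketch below is one possible approach, not something shipped with Crawl-MCP: the function names are hypothetical, and the path to the config file is left to the caller because it differs per platform.

```python
import copy
import json
from pathlib import Path

# Server entry copied from the UVX configuration in this section.
CRAWL_MCP_ENTRY = {
    "transport": "stdio",
    "command": "uvx",
    "args": ["--from", "git+https://github.com/walksoda/crawl-mcp", "crawl-mcp"],
    "env": {"CRAWL4AI_LANG": "en"},
}


def add_crawl_mcp(config: dict, lang: str = "en") -> dict:
    """Return a copy of `config` with the crawl-mcp entry added or replaced."""
    merged = copy.deepcopy(config)
    entry = copy.deepcopy(CRAWL_MCP_ENTRY)
    entry["env"]["CRAWL4AI_LANG"] = lang  # interface language for the server
    merged.setdefault("mcpServers", {})["crawl-mcp"] = entry
    return merged


def update_config_file(path: Path, lang: str = "en") -> None:
    """Read the config at `path` (if present), add the entry, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_crawl_mcp(config, lang)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling `update_config_file(Path("/absolute/path/to/claude_desktop_config.json"))` leaves any other `mcpServers` entries untouched. Claude Desktop needs a restart before it picks up the change.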

For Japanese interface:

"env": { "CRAWL4AI_LANG": "ja" }

📖 Documentation

Topic	Description
Installation Guide	Complete installation instructions for all platforms
API Reference	Full tool documentation and usage examples
Configuration Examples	Platform-specific setup configurations
HTTP Integration	HTTP API access and integration methods
Advanced Usage	Power user techniques and workflows
Development Guide	Contributing and development setup

Language-Specific Documentation

English: docs/ directory
日本語: docs/ja/ directory

🛠️ Tool Overview

Web Crawling

crawl_url - Single page crawling with JavaScript support
deep_crawl_site - Multi-page site mapping and exploration
crawl_url_with_fallback - Robust crawling with retry strategies
batch_crawl - Process multiple URLs (max 5)
multi_url_crawl - Advanced multi-URL configuration

Search Integration

search_google - Genre-filtered Google search
search_and_crawl - Combined search and content extraction
batch_search_google - Multiple search queries (max 3)

Data Extraction

extract_structured_data - CSS/XPath/LLM-based structured extraction

Media Processing

process_file - PDF, Office, ZIP to markdown conversion
extract_youtube_transcript - Video transcript extraction
batch_extract_youtube_transcripts - Multiple videos (max 3)
get_youtube_video_info - Video metadata retrieval
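
Because the batch tools cap the number of items per call (5 for batch_crawl, 3 for batch_search_google and batch_extract_youtube_transcripts), a longer input list has to be split before it is sent. The helper below is a small illustrative sketch of that splitting; `chunk_for_tool` is a hypothetical name and not part of the server.

```python
from typing import Iterator, Sequence

# Per-call limits as listed in the tool overview above.
BATCH_LIMITS = {
    "batch_crawl": 5,
    "batch_search_google": 3,
    "batch_extract_youtube_transcripts": 3,
}


def chunk_for_tool(tool: str, items: Sequence[str]) -> Iterator[list[str]]:
    """Yield consecutive slices of `items`, each no longer than the tool's limit."""
    limit = BATCH_LIMITS[tool]
    for start in range(0, len(items), limit):
        yield list(items[start:start + limit])


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(12)]
    for batch in chunk_for_tool("batch_crawl", urls):
        print(len(batch), batch[0])
```

Each yielded batch can then be passed to one call of the corresponding tool.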

🎯 Common Use Cases

Content Research:

search_and_crawl → extract_structured_data → analysis

Documentation Mining:

deep_crawl_site → batch processing → extraction

Media Analysis:

extract_youtube_transcript → summarization workflow

Site Mapping:

batch_crawl → multi_url_crawl → comprehensive data

🚨 Quick Troubleshooting

Installation Issues:

Re-run the setup scripts with the required privileges (sudo on Linux/macOS, Administrator on Windows)
Try the development installation method
Check that the browser dependencies are installed

Performance Issues:

Use wait_for_js: true for JavaScript-heavy sites
Increase timeout for slow-loading pages
Use extract_structured_data for targeted extraction
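
MCP clients such as Claude Desktop build tool requests themselves, but it can help to see what a call with these options looks like on the wire. The sketch below assembles an MCP `tools/call` JSON-RPC request for crawl_url. The `wait_for_js` option comes from the tip above; the argument names `url` and `timeout`, and the assumption that the timeout is given in seconds, are illustrative guesses that should be checked against the API Reference.

```python
import json
from itertools import count

# Simple incrementing JSON-RPC request ids.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request for the named tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call(
        "crawl_url",
        {
            "url": "https://example.com",  # assumed argument name
            "wait_for_js": True,           # option named in the tip above
            "timeout": 60,                 # assumed argument name and unit
        },
    )
    print(json.dumps(request, indent=2))
```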

Configuration Issues:

Check JSON syntax in claude_desktop_config.json
Verify file paths are absolute
Restart Claude Desktop after configuration changes
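
The first two checks can be automated. The sketch below parses the config text, reports where the JSON breaks, and flags any `command` that contains a path separator but is not absolute. Bare command names such as `uvx` are looked up on PATH instead. The function name `check_config_text` is hypothetical, and the checks reflect only the entries shown in this README, not a full schema.

```python
import json
import os
import shutil


def check_config_text(text: str) -> list[str]:
    """Return a list of human-readable problems found in the config text."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ["missing or malformed 'mcpServers' object"]

    problems = []
    for name, entry in servers.items():
        command = entry.get("command") if isinstance(entry, dict) else None
        if not isinstance(command, str):
            continue  # e.g. HTTP entries have no command to check
        has_separator = os.sep in command or (
            os.altsep is not None and os.altsep in command
        )
        if has_separator and not os.path.isabs(command):
            problems.append(f"{name}: command path is not absolute: {command}")
        elif not has_separator and shutil.which(command) is None:
            problems.append(f"{name}: command not found on PATH: {command}")
    return problems
```

An empty list means none of these particular problems were found; it does not guarantee that the server starts.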

🏗️ Project Structure

Original Library: crawl4ai by unclecode
MCP Wrapper: This repository (walksoda)
Implementation: Unofficial third-party integration

📄 License

This project is an unofficial wrapper around the crawl4ai library. Please refer to the original crawl4ai license for the underlying functionality.

🤝 Contributing

See our Development Guide for contribution guidelines and development setup instructions.

🔗 Related Projects

crawl4ai - The underlying web crawling library
Model Context Protocol - The standard this server implements
Claude Desktop - Primary client for MCP servers

