Enables Google Search integration with 7 optimized search genres using official Google operators, including tools for single searches, batch searches, and combined search-and-crawl operations with metadata extraction.
Extracts video transcripts and summaries from YouTube videos without requiring API keys, supporting multi-language transcript extraction and batch processing of multiple videos.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@Crawl-MCPsummarize this article about AI advancements from https://example.com/ai-news"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Crawl-MCP: Unofficial MCP Server for crawl4ai
β οΈ Important: This is an unofficial MCP server implementation for the excellent crawl4ai library.
Not affiliated with the original crawl4ai project.
A comprehensive Model Context Protocol (MCP) server that wraps the powerful crawl4ai library with advanced AI capabilities. Extract and analyze content from any source: web pages, PDFs, Office documents, YouTube videos, and more. Features intelligent summarization to dramatically reduce token usage while preserving key information.
π Key Features
π Google Search Integration - 7 optimized search genres with Google official operators
π Advanced Web Crawling: JavaScript support, deep site mapping, entity extraction
π Universal Content Extraction: Web pages, PDFs, Word docs, Excel, PowerPoint, ZIP archives
π€ AI-Powered Summarization: Smart token reduction (up to 88.5%) while preserving essential information
π¬ YouTube Integration: Extract video transcripts and summaries without API keys
β‘ Production Ready: 13 specialized tools with comprehensive error handling
π Quick Start
Prerequisites (Required First)
Python 3.11 δ»₯δΈοΌFastMCP γ Python 3.11+ γθ¦ζ±οΌ
Install system dependencies for Playwright:
Ubuntu 24.04 LTS (Manual Required):
Other Linux/macOS:
Windows (as Administrator):
Installation
UVX (Recommended - Easiest):
Docker (Production-Ready):
Docker Features:
π§ Multi-Browser Support: Chromium, Firefox, Webkit headless browsers
π§ Google Chrome: Additional Chrome Stable for compatibility
β‘ Optimized Performance: Pre-configured browser flags for Docker
π Security: Non-root user execution
π¦ Complete Dependencies: All required libraries included
Claude Desktop Setup
UVX Installation:
Add to your claude_desktop_config.json:
Docker HTTP Mode:
For Japanese interface:
π Documentation
Topic | Description |
Complete installation instructions for all platforms | |
Full tool documentation and usage examples | |
Platform-specific setup configurations | |
HTTP API access and integration methods | |
Power user techniques and workflows | |
Contributing and development setup |
Language-Specific Documentation
π οΈ Tool Overview
Web Crawling
crawl_url- Single page crawling with JavaScript supportdeep_crawl_site- Multi-page site mapping and explorationcrawl_url_with_fallback- Robust crawling with retry strategiesbatch_crawl- Process multiple URLs (max 5)multi_url_crawl- Advanced multi-URL configuration
Search Integration
search_google- Genre-filtered Google searchsearch_and_crawl- Combined search and content extractionbatch_search_google- Multiple search queries (max 3)
Data Extraction
extract_structured_data- CSS/XPath/LLM-based structured extraction
Media Processing
process_file- PDF, Office, ZIP to markdown conversionextract_youtube_transcript- Video transcript extractionbatch_extract_youtube_transcripts- Multiple videos (max 3)get_youtube_video_info- Video metadata retrieval
π― Common Use Cases
Content Research:
Documentation Mining:
Media Analysis:
Site Mapping:
π¨ Quick Troubleshooting
Installation Issues:
Re-run setup scripts with proper privileges
Try development installation method
Check browser dependencies are installed
Performance Issues:
Use
wait_for_js: truefor JavaScript-heavy sitesIncrease timeout for slow-loading pages
Use
extract_structured_datafor targeted extraction
Configuration Issues:
Check JSON syntax in
claude_desktop_config.jsonVerify file paths are absolute
Restart Claude Desktop after configuration changes
ποΈ Project Structure
Original Library: crawl4ai by unclecode
MCP Wrapper: This repository (walksoda)
Implementation: Unofficial third-party integration
π License
This project is an unofficial wrapper around the crawl4ai library. Please refer to the original crawl4ai license for the underlying functionality.
π€ Contributing
See our Development Guide for contribution guidelines and development setup instructions.
π Related Projects
crawl4ai - The underlying web crawling library
Model Context Protocol - The standard this server implements
Claude Desktop - Primary client for MCP servers