Crawl4AI MCP Server

A Model Context Protocol (MCP) server that provides web scraping and crawling through Crawl4AI. The server acts as the "hands and eyes" for a client-side AI: the client decides what to fetch and extract, and the server retrieves the content and returns it for analysis.

Features

🔍 Page Structure Analysis: Extract clean HTML or Markdown content from any webpage
🎯 Schema-Based Extraction: Precision data extraction using CSS selectors and AI-generated schemas
📸 Screenshot Capture: Visual webpage representation for analysis
⚡ Async Operations: Non-blocking web crawling with progress reporting
🛡️ Error Handling: Comprehensive error handling and validation
📊 MCP Integration: Full Model Context Protocol compatibility with logging and progress tracking

Architecture

┌─────────────────┐    ┌───────────────────┐    ┌─────────────────┐
│    Client AI    │    │   Crawl4AI MCP    │    │   Web Content   │
│    ("Brain")    │◄──►│      Server       │◄──►│   (Websites)    │
│                 │    │ ("Hands & Eyes")  │    │                 │
└─────────────────┘    └───────────────────┘    └─────────────────┘

FastMCP: Handles MCP protocol and tool registration
AsyncWebCrawler: Provides async web scraping capabilities
Stdio Transport: MCP-compatible communication channel
Error-Safe Logging: All logs directed to stderr to prevent protocol corruption
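
With the stdio transport, stdout carries the protocol messages, so any stray output written there can corrupt the stream. The following is a minimal sketch of stderr-only logging using just the standard library; the logger name and the helper function are hypothetical and are not taken from this server's source.

```python
import logging
import sys


def configure_stderr_logging(level: int = logging.INFO) -> logging.Logger:
    """Send log records to stderr so stdout stays reserved for MCP messages."""
    logger = logging.getLogger("crawl4ai_mcp")  # hypothetical logger name
    logger.setLevel(level)
    logger.propagate = False  # avoid root handlers that might write elsewhere
    if not logger.handlers:  # keep the setup idempotent
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


logger = configure_stderr_logging()
logger.info("server starting")  # written to stderr, never to stdout
```

Calling the helper more than once reuses the same handler, so importing it from several modules does not duplicate log lines.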

Installation

Prerequisites

Python 3.10 or higher
pip package manager

Setup

Clone or download this repository:
git clone <repository-url>
cd crawl4ai-mcp
Create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install dependencies:
pip install -r requirements.txt
Install Playwright browsers (required for screenshots):
playwright install

Usage

Starting the Server

# Activate virtual environment
source venv/bin/activate
# Start the MCP server
python3 crawl4ai_mcp_server.py

Testing with MCP Inspector

For interactive testing and development:

# Start MCP Inspector interface
fastmcp dev crawl4ai_mcp_server.py

This starts a web interface (usually at http://localhost:6274) where you can call each tool interactively.

Available Tools

1. server_status

Purpose: Get server health and capabilities information
Parameters: None

Example Response:

{
  "server_name": "Crawl4AI-MCP-Server",
  "version": "1.0.0",
  "status": "operational",
  "capabilities": [
    "web_crawling",
    "content_extraction",
    "screenshot_capture",
    "schema_based_extraction"
  ]
}
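
A client may want to check the reported capabilities before calling another tool. The sketch below parses the example response shown above with the standard library; the `supports` helper is hypothetical and assumes the response arrives as a JSON string.

```python
import json

# The example response from this section, as a JSON string.
raw_status = json.dumps({
    "server_name": "Crawl4AI-MCP-Server",
    "version": "1.0.0",
    "status": "operational",
    "capabilities": [
        "web_crawling",
        "content_extraction",
        "screenshot_capture",
        "schema_based_extraction",
    ],
})


def supports(status_json: str, capability: str) -> bool:
    """Return True if the server is operational and lists the capability."""
    status = json.loads(status_json)
    return (
        status.get("status") == "operational"
        and capability in status.get("capabilities", [])
    )


if supports(raw_status, "screenshot_capture"):
    print("take_screenshot should be available")
```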

2. get_page_structure

Purpose: Extract webpage content for analysis (the "eyes" function)
Parameters:

url (string): The webpage URL to analyze
format (string, optional): Output format - "html" or "markdown" (default: "html")

Example:

{ "url": "https://example.com", "format": "html" }
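
Validating the arguments on the client side can catch mistakes before a request is sent. This is a sketch only: `build_page_structure_args` is a hypothetical helper, and the http/https restriction is an assumption rather than a documented rule of this server.

```python
from urllib.parse import urlparse

ALLOWED_FORMATS = ("html", "markdown")


def build_page_structure_args(url: str, format: str = "html") -> dict:
    """Build the argument object for get_page_structure after basic checks."""
    parsed = urlparse(url)
    # Assumption: only absolute http(s) URLs are meaningful for a crawler.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {format!r}")
    return {"url": url, "format": format}


print(build_page_structure_args("https://example.com", "markdown"))
```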

3. crawl_with_schema

Purpose: Precision data extraction using CSS selectors (the "hands" function)
Parameters:

url (string): The webpage URL to extract data from
extraction_schema (string): JSON string defining field names and CSS selectors

Example Schema:

{
  "title": "h1",
  "description": "p.description",
  "price": ".price-value",
  "author": ".author-name",
  "tags": ".tag"
}

Example Usage:

{ "url": "https://example.com/product", "extraction_schema": "{\"title\": \"h1\", \"price\": \".price\", \"description\": \"p\"}" }
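
Because extraction_schema is a JSON string nested inside the JSON arguments, escaping it by hand is error-prone. A small sketch of producing it with `json.dumps`; the `build_crawl_args` helper is hypothetical.

```python
import json


def build_crawl_args(url: str, schema: dict) -> dict:
    """Serialize a field-name -> CSS-selector mapping into the string parameter."""
    for field, selector in schema.items():
        if not isinstance(selector, str) or not selector.strip():
            raise ValueError(f"selector for {field!r} must be a non-empty string")
    # json.dumps handles the quote escaping that the raw example shows by hand.
    return {"url": url, "extraction_schema": json.dumps(schema)}


arguments = build_crawl_args(
    "https://example.com/product",
    {"title": "h1", "price": ".price", "description": "p"},
)
print(json.dumps(arguments, indent=2))
```

Serializing the outer object a second time, as in the final `print`, yields the escaped form shown in the example usage.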

4. take_screenshot

Purpose: Capture visual representation of webpage
Parameters:

url (string): The webpage URL to screenshot

Example:

{ "url": "https://example.com" }

Returns: Base64-encoded PNG image data with metadata
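
On the client side, the Base64 payload has to be decoded before it can be saved or viewed. The sketch below decodes a payload and checks the PNG signature; the response field name used in the closing comment is an assumption, since the exact metadata layout is not documented here.

```python
import base64
import binascii

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_screenshot(b64_data: str) -> bytes:
    """Decode a Base64 screenshot payload and confirm it looks like a PNG."""
    try:
        image = base64.b64decode(b64_data, validate=True)
    except binascii.Error as exc:
        raise ValueError("screenshot payload is not valid Base64") from exc
    if not image.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return image


# Hypothetical usage, assuming the result stores the image under "screenshot_data":
#   from pathlib import Path
#   Path("page.png").write_bytes(decode_screenshot(result["screenshot_data"]))
```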

Integration with Claude Desktop

To use this server with Claude Desktop, add this entry to your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "crawl4ai": {
      "command": "python3",
      "args": ["/path/to/crawl4ai-mcp/crawl4ai_mcp_server.py"],
      "env": {}
    }
  }
}

Replace /path/to/crawl4ai-mcp/ with the actual path to your installation directory.
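
The entry can also be generated rather than edited by hand, which avoids path typos. This is a sketch with a hypothetical helper; passing a different interpreter (for example the virtual environment's Python via `sys.executable`) is an option some setups may need, not something this README prescribes.

```python
import json
from pathlib import Path


def claude_desktop_entry(install_dir: str, python: str = "python3") -> dict:
    """Build the mcpServers entry for a given installation directory."""
    script = Path(install_dir) / "crawl4ai_mcp_server.py"
    return {
        "mcpServers": {
            "crawl4ai": {
                "command": python,
                "args": [str(script)],
                "env": {},
            }
        }
    }


# The placeholder path is kept as-is; substitute your own directory.
print(json.dumps(claude_desktop_entry("/path/to/crawl4ai-mcp"), indent=2))
```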

Error Handling

All tools validate their inputs and catch failures. When a call fails, the tool returns a structured JSON error response instead of raising:

{ "error": "Error description", "url": "https://example.com", "success": false }

Common error scenarios:

Invalid URL format
Network connectivity issues
Invalid extraction schemas
Screenshot capture failures
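
A client can turn these error objects into exceptions so that failures are not mistaken for data. The sketch below assumes the error shape shown above; `ToolError` and `unwrap` are hypothetical names.

```python
import json


class ToolError(RuntimeError):
    """Raised when a tool response reports a failure."""


def unwrap(result_json: str) -> dict:
    """Parse a tool response and raise if it matches the error shape."""
    result = json.loads(result_json)
    if result.get("success") is False or "error" in result:
        message = result.get("error", "unknown error")
        raise ToolError(f"{message} (url={result.get('url')})")
    return result


failed = '{"error": "Error description", "url": "https://example.com", "success": false}'
try:
    unwrap(failed)
except ToolError as exc:
    print(f"tool call failed: {exc}")
```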

Development

Project Structure

crawl4ai-mcp/
├── crawl4ai_mcp_server.py   # Main server implementation
├── requirements.txt         # Python dependencies
├── pyproject.toml           # Project configuration
├── USAGE_EXAMPLES.md        # Detailed usage examples
└── README.md                # This file

Dependencies

fastmcp: FastMCP framework for MCP server development
crawl4ai: Core web crawling and extraction library
pydantic: Data validation and parsing
playwright: Browser automation for screenshots
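
A quick way to confirm that these packages are visible to the active interpreter is to look them up without importing them. This is a minimal sketch; it assumes each package's import name matches the name listed above, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

REQUIRED_MODULES = ("fastmcp", "crawl4ai", "pydantic", "playwright")


def missing_modules(names=REQUIRED_MODULES) -> list:
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found")
```

If anything is reported missing, check that the virtual environment is activated and rerun pip install -r requirements.txt.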

Testing

Run the linter to ensure code quality:

ruff check .

Test server startup:

python3 crawl4ai_mcp_server.py

Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly with MCP Inspector
Submit a pull request

License

This project is open source. See the LICENSE file for details.

Support

For issues and questions:

Check the troubleshooting section in USAGE_EXAMPLES.md
Test with MCP Inspector to isolate issues
Verify all dependencies are correctly installed
Ensure virtual environment is activated

Acknowledgments

Crawl4AI: Powerful web crawling and extraction capabilities
FastMCP: Streamlined MCP server development framework
Model Context Protocol: Standardized AI tool integration
