# Crawl4AI MCP Server
A powerful Model Context Protocol (MCP) server that provides web scraping and crawling capabilities using Crawl4AI. This server acts as the "hands and eyes" for client-side AI, enabling intelligent web content analysis and extraction.
## Features

- **Page Structure Analysis**: Extract clean HTML or Markdown content from any webpage
- **Schema-Based Extraction**: Precision data extraction using CSS selectors and AI-generated schemas
- **Screenshot Capture**: Visual webpage representation for analysis
- **Async Operations**: Non-blocking web crawling with progress reporting
- **Error Handling**: Comprehensive error handling and validation
- **MCP Integration**: Full Model Context Protocol compatibility with logging and progress tracking
## Architecture

```
┌─────────────────┐      ┌───────────────────┐      ┌─────────────────┐
│    Client AI    │      │   Crawl4AI MCP    │      │   Web Content   │
│    ("Brain")    │─────▶│      Server       │─────▶│   (Websites)    │
│                 │      │ ("Hands & Eyes")  │      │                 │
└─────────────────┘      └───────────────────┘      └─────────────────┘
```

- **FastMCP**: Handles MCP protocol and tool registration
- **AsyncWebCrawler**: Provides async web scraping capabilities
- **Stdio Transport**: MCP-compatible communication channel
- **Error-Safe Logging**: All logs directed to stderr to prevent protocol corruption
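The last point matters because a stdio-transport MCP server uses stdout for protocol frames, so any stray print there can corrupt the stream. A minimal sketch of stderr-only logging using the standard library (the logger name here is illustrative, not taken from the server code):

```python
import logging
import sys

# Route all log output to stderr so stdout stays reserved for MCP protocol frames.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("crawl4ai-mcp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("server starting")  # emitted on stderr, not stdout
```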
## Installation

### Prerequisites

- Python 3.10 or higher
- pip package manager
### Setup

Clone or download this repository:
```bash
git clone <repository-url>
cd crawl4ai-mcp
```

Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

Install dependencies:
```bash
pip install -r requirements.txt
```

Install the Playwright browsers (required for screenshots):

```bash
playwright install
```
## Usage

### Starting the Server
```bash
# Activate the virtual environment
source venv/bin/activate

# Start the MCP server
python3 crawl4ai_mcp_server.py
```

### Testing with MCP Inspector
For interactive testing and development:
```bash
# Start the MCP Inspector interface
fastmcp dev crawl4ai_mcp_server.py
```

This starts a web interface (usually at http://localhost:6274) where you can test all tools interactively.
## Available Tools

### 1. server_status

**Purpose**: Get server health and capabilities information

**Parameters**: None

**Example Response**:
```json
{
  "server_name": "Crawl4AI-MCP-Server",
  "version": "1.0.0",
  "status": "operational",
  "capabilities": ["web_crawling", "content_extraction", "screenshot_capture", "schema_based_extraction"]
}
```

### 2. get_page_structure
**Purpose**: Extract webpage content for analysis (the "eyes" function)

**Parameters**:

- `url` (string): The webpage URL to analyze
- `format` (string, optional): Output format, `"html"` or `"markdown"` (default: `"html"`)

**Example**:
```json
{
  "url": "https://example.com",
  "format": "html"
}
```

### 3. crawl_with_schema
**Purpose**: Precision data extraction using CSS selectors (the "hands" function)

**Parameters**:

- `url` (string): The webpage URL to extract data from
- `extraction_schema` (string): JSON string defining field names and CSS selectors
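Because `extraction_schema` is itself a JSON *string* embedded inside the tool-call JSON, it is easiest to build it with `json.dumps` rather than escape quotes by hand. A minimal client-side sketch (the field names and selectors here are just illustrations):

```python
import json

# Field names mapped to CSS selectors, serialized into the JSON string
# that crawl_with_schema expects in its extraction_schema parameter.
schema = {
    "title": "h1",
    "price": ".price-value",
    "author": ".author-name",
}
extraction_schema = json.dumps(schema)

# The tool-call payload is then ordinary JSON with the schema embedded as a string.
payload = {
    "url": "https://example.com/product",
    "extraction_schema": extraction_schema,
}
print(json.dumps(payload))
```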
**Example Schema**:
```json
{
  "title": "h1",
  "description": "p.description",
  "price": ".price-value",
  "author": ".author-name",
  "tags": ".tag"
}
```

**Example Usage**:
```json
{
  "url": "https://example.com/product",
  "extraction_schema": "{\"title\": \"h1\", \"price\": \".price\", \"description\": \"p\"}"
}
```

### 4. take_screenshot
**Purpose**: Capture a visual representation of a webpage

**Parameters**:

- `url` (string): The webpage URL to screenshot

**Example**:
```json
{
  "url": "https://example.com"
}
```

**Returns**: Base64-encoded PNG image data with metadata
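To turn that payload back into an image file, a client can decode it with the standard library. A sketch using stand-in data (the exact field carrying the base64 string is whatever the server's response emits):

```python
import base64

# Stand-in for the base64 PNG data carried in a take_screenshot response.
encoded = base64.b64encode(b"\x89PNG\r\n\x1a\n<image bytes>").decode("ascii")

image_bytes = base64.b64decode(encoded)
with open("screenshot.png", "wb") as f:
    f.write(image_bytes)
print(f"wrote {len(image_bytes)} bytes to screenshot.png")
```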
## Integration with Claude Desktop
To use this server with Claude Desktop, add this configuration to your Claude Desktop settings:
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "python3",
      "args": ["/path/to/crawl4ai-mcp/crawl4ai_mcp_server.py"],
      "env": {}
    }
  }
}
```

Replace `/path/to/crawl4ai-mcp/` with the actual path to your installation directory.
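A wrong path in this config is a common source of silent startup failures. A quick sanity check that the file is valid JSON and each configured server script actually exists can save a restart cycle; a minimal sketch (the helper and the demo config are ours, not part of the project):

```python
import json
import os
import tempfile

def check_config(path):
    """Report configured MCP servers whose script path does not exist."""
    with open(path) as f:
        config = json.load(f)  # raises ValueError if the JSON is malformed
    return [
        (name, spec["args"][0])
        for name, spec in config.get("mcpServers", {}).items()
        if spec.get("args") and not os.path.exists(spec["args"][0])
    ]

# Demo with a throwaway config file; in practice, point this at your real
# Claude Desktop config (its location varies by platform).
demo = {
    "mcpServers": {
        "crawl4ai": {
            "command": "python3",
            "args": ["/path/that/does/not/exist/crawl4ai_mcp_server.py"],
            "env": {},
        }
    }
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(demo, f)
    demo_path = f.name

missing = check_config(demo_path)
print("servers with missing scripts:", missing)
```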
## Error Handling
All tools include comprehensive error handling and return structured JSON responses:
```json
{
  "error": "Error description",
  "url": "https://example.com",
  "success": false
}
```

Common error scenarios:

- Invalid URL format
- Network connectivity issues
- Invalid extraction schemas
- Screenshot capture failures
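Because failures share this structured shape, a client can branch on them uniformly instead of parsing error text. A minimal sketch (the helper name is ours, not part of the server):

```python
import json

def handle_tool_response(raw: str):
    """Parse a tool response and separate failures from successful payloads."""
    response = json.loads(raw)
    # Error responses carry success=false plus a human-readable "error" field.
    if response.get("success") is False:
        return ("error", response.get("error", "unknown error"))
    return ("ok", response)

status, detail = handle_tool_response(
    '{"error": "Invalid URL format", "url": "not-a-url", "success": false}'
)
print(status, detail)
```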
## Development

### Project Structure

```
crawl4ai-mcp/
├── crawl4ai_mcp_server.py   # Main server implementation
├── requirements.txt         # Python dependencies
├── pyproject.toml           # Project configuration
├── USAGE_EXAMPLES.md        # Detailed usage examples
└── README.md                # This file
```

### Dependencies
- `fastmcp`: FastMCP framework for MCP server development
- `crawl4ai`: Core web crawling and extraction library
- `pydantic`: Data validation and parsing
- `playwright`: Browser automation for screenshots
### Testing

Run the linter to ensure code quality:

```bash
ruff check .
```

Test server startup:

```bash
python3 crawl4ai_mcp_server.py
```

## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly with MCP Inspector
5. Submit a pull request
## License
This project is open source. See the LICENSE file for details.
## Support

For issues and questions:

1. Check the troubleshooting section in USAGE_EXAMPLES.md
2. Test with MCP Inspector to isolate issues
3. Verify all dependencies are correctly installed
4. Ensure the virtual environment is activated
## Acknowledgments

- **Crawl4AI**: Powerful web crawling and extraction capabilities
- **FastMCP**: Streamlined MCP server development framework
- **Model Context Protocol**: Standardized AI tool integration