Crawl4AI MCP Wrapper

CLAUDE.md•2.48 kB

# CLAUDE.md This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. ## Development Setup ### Prerequisites - Python 3.8+ - Crawl4AI Docker container running on port 11235 - Virtual environment activated ### Common Commands **Start Crawl4AI Docker container:** ```bash docker run -d -p 11235:11235 --name crawl4ai --shm-size=2g unclecode/crawl4ai:latest ``` **Set up virtual environment:** ```bash python3 -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate pip install -r requirements.txt ``` **Run tests:** ```bash python test_wrapper.py ``` **Run MCP server in debug mode:** ```bash export CRAWL4AI_MCP_LOG=DEBUG python crawl4ai_mcp.py ``` **Check container health:** ```bash curl http://localhost:11235/health ``` ## Architecture This is a Model Context Protocol (MCP) wrapper that provides reliable stdio transport to Crawl4AI Docker API: ``` Claude Code → stdio (MCP) → FastMCP Server (crawl4ai_mcp.py) → HTTP/REST → Crawl4AI Docker (port 11235) ``` ### Core Components - **crawl4ai_mcp.py**: Main MCP server using FastMCP library with 7 async tools - **test_wrapper.py**: Comprehensive test suite for all tools - **requirements.txt**: Minimal dependencies (fastmcp, httpx, pydantic) ### MCP Tools Available 1. `scrape_markdown` - Extract clean markdown with content filtering 2. `extract_html` - Get preprocessed HTML structures 3. `capture_screenshot` - Full-page PNG screenshots 4. `generate_pdf` - Create PDF documents 5. `execute_javascript` - Run JS in browser context 6. `crawl_urls` - Multi-URL crawling with config 7. `ask_crawl4ai` - Query Crawl4AI documentation ### Key Implementation Details - All tools use `httpx.AsyncClient` with appropriate timeouts (30-300s) - Error handling returns consistent format with `success: False` and error messages - Base URL configurable via `CRAWL4AI_BASE_URL` environment variable (default: http://localhost:11235) - Tools follow Crawl4AI Docker API endpoints exactly (`/md`, `/html`, `/screenshot`, etc.) ## Testing Always test after changes: ```bash python test_wrapper.py ``` Tests verify container health and all tool functionality. Must pass before MCP server will work properly. ## Debugging - Enable debug logging: `export CRAWL4AI_MCP_LOG=DEBUG` - Check container status: `docker ps | grep crawl4ai` - View Claude Code MCP logs: `tail -f ~/Library/Logs/Claude/mcp*.log` - Verify port 11235 availability: `lsof -i :11235`

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/joedank/mcpcrawl4ai'

If you have feedback or need assistance with the MCP directory API, please join our Discord server