# Integration Guide: Crawl4AI MCP Server
This guide provides detailed instructions for integrating the Crawl4AI MCP Server with various AI clients and development environments.
## Claude Desktop Integration
### Step 1: Locate Claude Desktop Configuration
The Claude Desktop configuration file is located at:
**macOS:**
```
~/Library/Application Support/Claude/claude_desktop_config.json
```
**Windows:**
```
%APPDATA%\Claude\claude_desktop_config.json
```
**Linux:**
```
~/.config/Claude/claude_desktop_config.json
```
### Step 2: Add MCP Server Configuration
Add the following configuration to your `claude_desktop_config.json` file:
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "python3",
      "args": ["/absolute/path/to/crawl4ai-mcp/crawl4ai_mcp_server.py"],
      "env": {
        "PYTHONPATH": "/absolute/path/to/crawl4ai-mcp"
      }
    }
  }
}
```
**Important:** Replace `/absolute/path/to/crawl4ai-mcp/` with the actual full path to your installation directory.
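A malformed `claude_desktop_config.json` is a common cause of silent integration failures. You can check the file's JSON syntax before restarting Claude Desktop with Python's built-in `json.tool` module (the macOS path is shown; substitute your platform's path from Step 1):

```shell
# Check the config's JSON syntax (macOS path shown; adjust for your OS).
CONFIG="$HOME/Library/Application Support/Claude/claude_desktop_config.json"
if python3 -m json.tool "$CONFIG" > /dev/null; then
    echo "config OK"
else
    echo "config missing or malformed"
fi
```

On a syntax error, `json.tool` prints the offending line and column to stderr.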
### Step 3: Virtual Environment Setup
If you're using a virtual environment (recommended), use this configuration instead:
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "/absolute/path/to/crawl4ai-mcp/venv/bin/python3",
      "args": ["/absolute/path/to/crawl4ai-mcp/crawl4ai_mcp_server.py"],
      "env": {}
    }
  }
}
```
### Step 4: Restart Claude Desktop
After updating the configuration:
1. Completely quit Claude Desktop
2. Restart the application
3. The Crawl4AI MCP server should be automatically available
## Verification
### Check Server Connection
Ask Claude Desktop to use the server:
```
Can you check the status of the Crawl4AI server?
```
Claude should be able to call the `server_status` tool and return server information.
### Test Basic Functionality
Try extracting content from a webpage:
```
Can you get the page structure from https://example.com?
```
### Test Schema Extraction
Try precision data extraction:
```
Can you extract the title and description from https://news.ycombinator.com using appropriate CSS selectors?
```
## Alternative Integration Methods
### Direct MCP Client Integration
For custom MCP clients, connect to the server using stdio transport:
```python
import subprocess
import json

# Start the MCP server as a child process speaking JSON-RPC over stdio
process = subprocess.Popen(
    ["python3", "/path/to/crawl4ai_mcp_server.py"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

def send(message):
    """Write one newline-delimited JSON-RPC message and flush it."""
    process.stdin.write(json.dumps(message) + "\n")
    process.stdin.flush()

# The MCP protocol requires an initialize handshake before any other call
send({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "1.0"},
    },
})
print(json.loads(process.stdout.readline()))  # initialize response

# Acknowledge initialization (a notification, so it carries no id)
send({"jsonrpc": "2.0", "method": "notifications/initialized"})

# Now request the list of available tools
send({"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}})
print(json.loads(process.stdout.readline()))
```
### FastMCP Development Server
For development and testing:
```bash
# Start interactive development server
cd /path/to/crawl4ai-mcp
source venv/bin/activate
fastmcp dev crawl4ai_mcp_server.py
```
This provides a web interface for testing tools interactively.
## Configuration Options
### Environment Variables
You can customize server behavior using environment variables:
```json
{
  "mcpServers": {
    "crawl4ai": {
      "command": "python3",
      "args": ["/path/to/crawl4ai_mcp_server.py"],
      "env": {
        "CRAWL4AI_VERBOSE": "false",
        "CRAWL4AI_TIMEOUT": "30",
        "PYTHONPATH": "/path/to/crawl4ai-mcp"
      }
    }
  }
}
```
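Environment variables arrive as strings, so the server has to parse them. Whether your copy of `crawl4ai_mcp_server.py` reads the variables above depends on your installation; this is a minimal sketch of how such parsing could look, with the helper names (`env_bool`, `env_int`) being illustrative, not part of the server:

```python
import os

def env_bool(name, default=False):
    """Parse a boolean environment variable ("1", "true", "yes" are truthy)."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes")

def env_int(name, default):
    """Parse an integer environment variable, falling back on missing/bad input."""
    try:
        return int(os.environ.get(name, ""))
    except ValueError:
        return default

VERBOSE = env_bool("CRAWL4AI_VERBOSE", default=False)
TIMEOUT = env_int("CRAWL4AI_TIMEOUT", default=30)
```

Falling back to a default on bad input keeps a typo in the config from crashing the server at startup.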
### Custom Server Settings
Advanced users can modify server behavior by editing `crawl4ai_mcp_server.py`:
```python
# Example: Increase crawler timeout
async with AsyncWebCrawler(
    verbose=False,
    max_timeout=60  # Increase from default 30 seconds
) as crawler:
    # ... crawler operations
```
## Troubleshooting Integration Issues
### Common Problems
#### 1. Server Not Found
**Error:** "MCP server 'crawl4ai' not found"
**Solutions:**
- Verify the path in `claude_desktop_config.json` is correct and absolute
- Ensure the file `crawl4ai_mcp_server.py` exists at the specified path
- Check file permissions (must be readable and executable)
#### 2. Python Environment Issues
**Error:** "ModuleNotFoundError" or import errors
**Solutions:**
- Use the full path to Python in your virtual environment
- Verify all dependencies are installed: `pip install -r requirements.txt`
- Install Playwright browsers: `playwright install`
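A quick way to confirm that the interpreter Claude Desktop will launch can actually see the required packages is to probe for them with `importlib`. The module names listed are assumptions based on a typical `requirements.txt` for this server; adjust them to match yours:

```python
import importlib.util

def missing_modules(names):
    """Return the subset of module names that cannot be resolved for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Assumed dependency names; edit to match your requirements.txt.
required = ["crawl4ai", "fastmcp", "playwright"]
print("missing:", missing_modules(required))
```

Run this with the exact Python binary from your config's `command` field (e.g. `venv/bin/python3`); an empty list means the environment is complete.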
#### 3. Permission Denied
**Error:** "Permission denied" when starting server
**Solutions:**
```bash
# Make the server file executable
chmod +x /path/to/crawl4ai_mcp_server.py
```
Alternatively, keep `"command": "python3"` in your configuration so the script is run through the interpreter instead of being executed directly.
#### 4. Server Crashes on Startup
**Check server logs by running manually:**
```bash
cd /path/to/crawl4ai-mcp
source venv/bin/activate
python3 crawl4ai_mcp_server.py
```
Look for error messages in the output.
#### 5. Tools Not Available
**Symptoms:** Claude Desktop can't see the crawling tools
**Solutions:**
- Restart Claude Desktop completely
- Check server status with: "Can you call the server_status tool?"
- Verify MCP protocol communication is working
### Debug Mode
Enable verbose logging by modifying the server:
```python
# In crawl4ai_mcp_server.py
logging.basicConfig(
    level=logging.DEBUG,  # Change from INFO to DEBUG
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    stream=sys.stderr,
    force=True
)
```
### Testing Configuration
Test your configuration before using with Claude Desktop:
```bash
# Test server startup
cd /path/to/crawl4ai-mcp
source venv/bin/activate
python3 crawl4ai_mcp_server.py
# Should show FastMCP banner and wait for input
# Press Ctrl+C to exit
```
## Integration Best Practices
### 1. Path Management
- Always use absolute paths in configuration
- Avoid spaces in directory names
- Use forward slashes even on Windows for JSON configuration
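The path rules above can be checked mechanically. This sketch (assuming the standard config layout shown earlier, with `check_mcp_paths` as an illustrative helper) loads `claude_desktop_config.json` and flags any server whose command or script path does not exist:

```python
import json
import os

def check_mcp_paths(config_path):
    """Return warnings for MCP server paths in the config that look wrong."""
    with open(config_path) as f:
        config = json.load(f)
    warnings = []
    for name, server in config.get("mcpServers", {}).items():
        command = server.get("command", "")
        # Absolute commands must exist on disk; bare names resolve via PATH
        if os.path.isabs(command) and not os.path.exists(command):
            warnings.append(f"{name}: command not found: {command}")
        for arg in server.get("args", []):
            if arg.endswith(".py") and not os.path.exists(arg):
                warnings.append(f"{name}: script not found: {arg}")
    return warnings
```

Point it at your platform's config path from Step 1; an empty result means every referenced path resolves.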
### 2. Virtual Environment
- Always use a dedicated virtual environment
- Keep the environment activated when testing
- Document the Python version used
### 3. Error Handling
- Monitor Claude Desktop logs for integration issues
- Test tools individually before complex workflows
- Keep server logs for troubleshooting
### 4. Performance
- Be mindful of web scraping rate limits
- Test with small pages before large ones
- Monitor memory usage during heavy crawling
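One common pattern for respecting rate limits is to bound concurrency with a semaphore and pause after each request. This sketch uses a placeholder `fetch` coroutine rather than the real crawler API, so treat it as a shape to adapt, not the server's implementation:

```python
import asyncio

async def polite_gather(urls, fetch, max_concurrent=2, delay=1.0):
    """Fetch URLs with bounded concurrency and a pause after each request."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(url):
        async with semaphore:
            result = await fetch(url)
            await asyncio.sleep(delay)  # be polite between requests
            return result

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(worker(u) for u in urls))
```

Tuning `max_concurrent` down and `delay` up is the simplest lever when a target site starts throttling or blocking requests.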
## Advanced Integration
### Custom Tool Extensions
You can extend the server with additional tools by modifying `crawl4ai_mcp_server.py`:
```python
@mcp.tool()
async def custom_analysis_tool(
    url: Annotated[str, Field(description="The URL to analyze")],
    ctx: Context = None
) -> str:
    """Custom analysis tool for specific use cases."""
    # Your custom implementation
    pass
```
### Integration with Other Services
The server can be extended to integrate with other services:
```python
# Example: Add database storage
@mcp.tool()
async def store_crawl_result(
    url: str,
    content: str,
    ctx: Context = None
) -> str:
    """Store crawl results in database."""
    # Database integration code
    pass
```
## Support and Maintenance
### Regular Updates
- Keep Crawl4AI dependencies updated
- Monitor FastMCP for updates
- Test integration after Claude Desktop updates
### Backup Configuration
- Keep a backup of your `claude_desktop_config.json`
- Document any custom modifications
- Version control your server extensions
For additional support, refer to the main README.md file and the troubleshooting section.