Skip to main content
Glama
stgmt

crawl4ai-mcp

by stgmt

🕷️ Crawl4AI MCP Server

NPM version Node.js License: MIT Downloads Author

MCP (Model Context Protocol) server for Crawl4AI - Universal web crawling and data extraction for AI agents.

Integrate powerful web scraping capabilities into Claude, ChatGPT, and any MCP-compatible AI assistant.

📑 Table of Contents

🚀 Quick Start

# Install globally
npm install -g crawl4ai-mcp-sse-stdio

# Run in different modes
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com
npx crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com

# With optional bearer token
npx crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com --bearer-token your-token

With Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "crawl4ai": {
      "command": "npx",
      "args": [
        "crawl4ai-mcp-sse-stdio",
        "--stdio",
        "--endpoint", "https://your-crawl4ai-server.com",
        "--bearer-token", "your-optional-token"
      ]
    }
  }
}

🐳 Docker Usage

Docker Hub

Official Docker image available on Docker Hub:

Docker Image

# Pull and run the official Docker image
docker pull stgmt/crawl4ai-mcp:latest
docker run -p 3000:3000 stgmt/crawl4ai-mcp:latest

Docker Compose

Create a docker-compose.yml:

version: '3.8'
services:
  crawl4ai-mcp:
    image: stgmt/crawl4ai-mcp:latest
    ports:
      - "3000:3000"
    environment:
      - CRAWL4AI_ENDPOINT=https://your-crawl4ai-server.com
      - CRAWL4AI_BEARER_TOKEN=your-optional-token
    restart: unless-stopped

Run with:

docker-compose up -d

🛠️ Available Tools

1. crawl - Full Web Crawling

Extract complete content from any webpage.

{
  "name": "crawl",
  "arguments": {
    "url": "https://example.com",
    "wait_for": "css:.content",
    "timeout": 30000
  }
}

2. md - Markdown Extraction

Get clean markdown content from webpages.

{
  "name": "md", 
  "arguments": {
    "url": "https://docs.example.com",
    "clean": true
  }
}

3. html - Raw HTML

Retrieve raw HTML content.

{
  "name": "html",
  "arguments": {
    "url": "https://example.com"
  }
}

4. screenshot - Visual Capture

Take screenshots of webpages.

{
  "name": "screenshot",
  "arguments": {
    "url": "https://example.com",
    "full_page": true
  }
}

5. pdf - PDF Generation

Convert webpages to PDF.

{
  "name": "pdf",
  "arguments": {
    "url": "https://example.com",
    "format": "A4"
  }
}

6. execute_js - JavaScript Execution

Execute JavaScript on webpages.

{
  "name": "execute_js",
  "arguments": {
    "url": "https://example.com",
    "script": "document.title"
  }
}

⚙️ Configuration

Environment Variables

# REQUIRED: Crawl4AI endpoint URL
export CRAWL4AI_ENDPOINT="https://your-crawl4ai-server.com"

# OPTIONAL: Bearer authentication token
export CRAWL4AI_BEARER_TOKEN="your-api-token"

Command Line Options

crawl4ai-mcp --help

Options:
  --stdio              Run in STDIO mode for MCP clients
  --sse                Run in SSE mode for web interfaces
  --http               Run in HTTP mode
  --endpoint ENDPOINT  Crawl4AI API endpoint URL (REQUIRED)
  --bearer-token TOKEN Bearer authentication token (OPTIONAL)
  --port PORT          HTTP server port (default: 3000)
  --sse-port PORT      SSE server port (default: 9001)
  --version, -v        Show version

Basic Commands

# HTTP mode (recommended for testing)
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com

# SSE mode (Server-Sent Events)
crawl4ai-mcp --sse --port 3001 --endpoint https://your-crawl4ai-server.com

# STDIO mode (for MCP clients)
crawl4ai-mcp --stdio --endpoint https://your-crawl4ai-server.com

# With optional bearer token
crawl4ai-mcp --http --port 3000 --endpoint https://your-crawl4ai-server.com --bearer-token your-token

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

📄 License

MIT License - see LICENSE file for details.


Made with ❤️ for the AI community

A
license - permissive license
-
quality - not tested
B
maintenance

Maintenance

Maintainers
Response time
0dRelease cycle
9Releases (12mo)

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/stgmt/crawl4ai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server