# Crawlab MCP Server
This is a Model Context Protocol (MCP) server for Crawlab, allowing AI applications to interact with Crawlab's functionality.
## Overview
The MCP server provides a standardized way for AI applications to access Crawlab's features, including:
- Spider management (create, read, update, delete)
- Task management (run, cancel, restart)
- File management (read, write)
- Resource access (spiders, tasks)
## Architecture
The MCP Server/Client architecture facilitates communication between AI applications and Crawlab:
```mermaid
graph TB
    User[User] --> Client[MCP Client]
    Client --> LLM[LLM Provider]
    Client <--> Server[MCP Server]
    Server <--> Crawlab[Crawlab API]

    subgraph "MCP System"
        Client
        Server
    end

    subgraph "Crawlab System"
        Crawlab
        DB[(Database)]
        Crawlab <--> DB
    end

    class User,LLM,Crawlab,DB external;
    class Client,Server internal;

    %% Flow annotations
    LLM -.-> |Tool calls| Client
    Client -.-> |Executes tool calls| Server
    Server -.-> |API requests| Crawlab
    Crawlab -.-> |API responses| Server
    Server -.-> |Tool results| Client
    Client -.-> |Human-readable response| User

    classDef external fill:#f9f9f9,stroke:#333,stroke-width:1px;
    classDef internal fill:#d9edf7,stroke:#31708f,stroke-width:1px;
```
### Communication Flow
1. **User Query**: The user sends a natural language query to the MCP Client
2. **LLM Processing**: The Client forwards the query to an LLM provider (e.g., Claude, OpenAI)
3. **Tool Selection**: The LLM identifies necessary tools and generates tool calls
5. **Tool Execution**: The Client sends the tool calls to the MCP Server (sketched in code after this list)
5. **API Interaction**: The Server executes the corresponding Crawlab API requests
6. **Response Generation**: Results flow back through the Server to the Client, which forwards them to the LLM
7. **User Response**: The Client delivers the final human-readable response to the user
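
The loop below sketches steps 2–7 from the client's perspective. It is an illustrative sketch, not the actual crawlab-mcp client: the `llm` and `mcp_session` objects and their methods are hypothetical stand-ins for whatever LLM and MCP client libraries you use.

```python
# Illustrative sketch of the client-side loop (steps 2-7 above). The `llm`
# and `mcp_session` objects and their methods are hypothetical stand-ins,
# not the actual crawlab-mcp client API.
async def handle_query(query: str, llm, mcp_session) -> str:
    tools = await mcp_session.list_tools()   # tools advertised by the MCP Server
    response = llm.chat(query, tools=tools)  # steps 2-3: LLM selects tools
    while response.tool_calls:               # step 4: execute each tool call
        results = [
            await mcp_session.call_tool(c.name, c.arguments)  # steps 5-6
            for c in response.tool_calls
        ]
        response = llm.chat(results=results)  # LLM folds tool results back in
    return response.text                      # step 7: human-readable answer
```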
## Installation and Usage
### Option 1: Install as a Python package
You can install the MCP server as a Python package, which provides a convenient CLI:
```bash
# Install from source
pip install -e .
# Or install from GitHub (when available)
# pip install git+https://github.com/crawlab-team/crawlab-mcp-server.git
```
After installation, you can use the CLI:
```bash
# Start the MCP server
crawlab_mcp-mcp server [--spec PATH_TO_SPEC] [--host HOST] [--port PORT]
# Start the MCP client
crawlab_mcp-mcp client SERVER_URL
```
### Option 2: Running Locally
#### Prerequisites
- Python 3.8+
- Crawlab instance running and accessible
- API token from Crawlab
#### Configuration
1. Copy the `.env.example` file to `.env`:

   ```bash
   cp .env.example .env
   ```

2. Edit the `.env` file with your Crawlab API details:

   ```bash
   CRAWLAB_API_BASE_URL=http://your-crawlab-instance:8080/api
   CRAWLAB_API_TOKEN=your_api_token_here
   ```
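
For reference, here is a minimal sketch of how these two settings can be read at startup. It assumes `python-dotenv` is installed and that the token is sent as a bearer token; the actual server may load and use them differently.

```python
# Minimal sketch of reading the .env settings above; assumes python-dotenv.
import os

from dotenv import load_dotenv

load_dotenv()  # loads .env from the current working directory

API_BASE_URL = os.environ["CRAWLAB_API_BASE_URL"]  # e.g. http://your-crawlab-instance:8080/api
API_TOKEN = os.environ["CRAWLAB_API_TOKEN"]

# Hypothetical header construction for requests to the Crawlab API;
# not necessarily how the actual server authenticates.
HEADERS = {"Authorization": f"Bearer {API_TOKEN}"}
```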
#### Running the Server
1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Run the server:

   ```bash
   python server.py
   ```
### Option 3: Running with Docker
1. Build the Docker image:

   ```bash
   docker build -t crawlab-mcp-server .
   ```

2. Run the container:

   ```bash
   docker run -p 8000:8000 --env-file .env crawlab-mcp-server
   ```
## Integration with Docker Compose
To add the MCP server to your existing Crawlab Docker Compose setup, add the following service to your `docker-compose.yml`:
```yaml
services:
  # ... existing Crawlab services

  mcp-server:
    build: ./backend/mcp-server
    ports:
      - "8000:8000"
    environment:
      - CRAWLAB_API_BASE_URL=http://backend:8000/api
      - CRAWLAB_API_TOKEN=your_api_token_here
    depends_on:
      - backend
```
## Using with AI Applications
The MCP server enables AI applications to interact with Crawlab through natural language. Following the architecture diagram above, here's how to use the MCP system:
### Setting Up the Connection
1. **Start the MCP Server**: Make sure your MCP server is running and accessible
2. **Configure the AI Client**: Connect your AI application to the MCP server
### Example: Using with Claude Desktop
1. Open Claude Desktop
2. Go to Settings > MCP Servers
3. Add a new server with the URL of your MCP server (e.g., `http://localhost:8000`)
4. In a conversation with Claude, you can now use Crawlab functionality by describing what you want to do in natural language
### Example Interactions
Based on our architecture, here are example interactions with the system:
**Create a Spider:**
```
User: "Create a new spider named 'Product Scraper' for the e-commerce project"
↓
LLM identifies intent and calls the create_spider tool
↓
MCP Server executes the API call to Crawlab
↓
Spider is created and details are returned to the user
```
**Run a Task:**
```
User: "Run the 'Product Scraper' spider on all available nodes"
↓
LLM calls the run_spider tool with appropriate parameters
↓
MCP Server sends the command to Crawlab API
↓
Task is started and confirmation is returned to the user
```
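
For a concrete sense of the tool calls in these flows, the payloads the Client forwards to the MCP Server might look like the following. The argument names (`name`, `project`, `spider_id`, `mode`) are illustrative guesses, not the server's actual schema; list the tools at runtime to see the real parameters.

```python
# Illustrative tool-call payloads for the two flows above. Argument names
# are guesses for illustration, not the server's actual schema.
create_call = {
    "name": "create_spider",
    "arguments": {"name": "Product Scraper", "project": "e-commerce"},
}

run_call = {
    "name": "run_spider",
    "arguments": {"spider_id": "<product-scraper-id>", "mode": "all-nodes"},
}
```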
### Available Commands
You can interact with the system using natural language commands like:
- "List all my spiders"
- "Create a new spider with these specifications..."
- "Show me the code for the spider named X"
- "Update the file main.py in spider X with this code..."
- "Run spider X and notify me when it's complete"
- "Show me the results of the last run of spider X"
## Available Resources and Tools
These are the underlying tools that power the natural language interactions:
### Resources
- `spiders`: List all spiders
- `tasks`: List all tasks
### Tools
#### Spider Management
- `get_spider`: Get details of a specific spider
- `create_spider`: Create a new spider
- `update_spider`: Update an existing spider
- `delete_spider`: Delete a spider
#### Task Management
- `get_task`: Get details of a specific task
- `run_spider`: Run a spider
- `cancel_task`: Cancel a running task
- `restart_task`: Restart a task
- `get_task_logs`: Get logs for a task
#### File Management
- `get_spider_files`: List files for a spider
- `get_spider_file`: Get content of a specific file
- `save_spider_file`: Save content to a file
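
As a rough sketch, these tools can also be invoked programmatically with the official MCP Python SDK (`pip install mcp`). The SSE endpoint path and the `spider_id` value below are assumptions; adjust them to your deployment.

```python
# Sketch of calling the tools above via the official MCP Python SDK.
# The /sse path and the spider_id value are assumptions for illustration.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover the tools listed above
            print([tool.name for tool in tools.tools])
            result = await session.call_tool("get_spider", {"spider_id": "example-id"})
            print(result)


asyncio.run(main())
```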