Crawlab MCP Server

This is the Model Context Protocol (MCP) server for Crawlab. It allows AI applications to interact with Crawlab's functionality.

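Overview

The MCP server gives AI applications a standardized way to access Crawlab's functionality, including:

Spider management (create, read, update, delete)
Task management (run, cancel, restart)
File management (read, write)
Resource access (spiders, tasks)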


Architecture

The MCP server/client architecture handles communication between AI applications and Crawlab:

graph TB
    User[User] --> Client[MCP Client]
    Client --> LLM[LLM Provider]
    Client <--> Server[MCP Server]
    Server <--> Crawlab[Crawlab API]

    subgraph "MCP System"
        Client
        Server
    end

    subgraph "Crawlab System"
        Crawlab
        DB[(Database)]
        Crawlab <--> DB
    end

    class User,LLM,Crawlab,DB external;
    class Client,Server internal;

    %% Flow annotations
    LLM -.-> |Tool calls| Client
    Client -.-> |Executes tool calls| Server
    Server -.-> |API requests| Crawlab
    Crawlab -.-> |API responses| Server
    Server -.-> |Tool results| Client
    Client -.-> |Human-readable response| User

    classDef external fill:#f9f9f9,stroke:#333,stroke-width:1px;
    classDef internal fill:#d9edf7,stroke:#31708f,stroke-width:1px;

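Communication flow

User query: the user sends a natural-language query to the MCP client
LLM processing: the client forwards the query to an LLM provider (for example, Claude or OpenAI)
Tool selection: the LLM identifies the tools it needs and generates tool calls
Tool execution: the client sends the tool calls to the MCP server
API interaction: the server executes the corresponding Crawlab API requests
Response generation: the results flow back through the server to the client, and then to the LLM
User response: the client delivers the final human-readable response to the user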
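
The flow above can be traced end to end with a small simulation. This is only a sketch: `fake_llm_select_tool` and `FakeMCPServer` are hypothetical stand-ins for the LLM provider and the MCP server, the `spider_id` parameter name is an assumption, and nothing here talks to a real Crawlab instance.

```python
import re
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ToolCall:
    """A tool invocation chosen by the LLM."""
    name: str
    arguments: Dict[str, Any]


def fake_llm_select_tool(query: str) -> ToolCall:
    """Stand-in for the LLM provider: turn a query into a tool call."""
    match = re.search(r"\d+", query)
    if match is None:
        raise ValueError("this toy LLM only understands queries with a spider id")
    # 'spider_id' is a hypothetical parameter name.
    return ToolCall(name="get_spider", arguments={"spider_id": match.group()})


class FakeMCPServer:
    """Stand-in for the MCP server; a real one would call the Crawlab API."""

    def __init__(self) -> None:
        self._tools = {"get_spider": self._get_spider}

    def _get_spider(self, spider_id: str) -> Dict[str, Any]:
        # Canned data instead of a real API request.
        return {"id": spider_id, "name": "Product Scraper"}

    def call_tool(self, call: ToolCall) -> Dict[str, Any]:
        handler = self._tools[call.name]
        return handler(**call.arguments)


def handle_query(query: str, server: FakeMCPServer) -> str:
    """Client-side loop: query -> tool call -> tool result -> readable answer."""
    call = fake_llm_select_tool(query)  # LLM processing and tool selection
    result = server.call_tool(call)     # tool execution and API interaction
    # A real client would hand the result back to the LLM to phrase the reply.
    return f"Spider {result['id']} is named '{result['name']}'."


if __name__ == "__main__":
    print(handle_query("Show me spider 42", FakeMCPServer()))
```

In a real deployment the client would pass the tool result back to the LLM, which writes the final answer; the f-string here only marks where that step happens.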

Installation and usage

Option 1: Install as a Python package

You can install the MCP server as a Python package, which provides a convenient CLI:

# Install from source
pip install -e .

# Or install from GitHub (when available)
# pip install git+https://github.com/crawlab-team/crawlab-mcp-server.git

After installation, you can use the CLI:

# Start the MCP server
crawlab_mcp-mcp server [--spec PATH_TO_SPEC] [--host HOST] [--port PORT]

# Start the MCP client
crawlab_mcp-mcp client SERVER_URL

Option 2: Running locally

Prerequisites

Python 3.8+
A Crawlab instance that is running and reachable
An API token from Crawlab

Configuration

Copy the .env.example file to .env:
```
cp .env.example .env
```

Edit the .env file with your Crawlab API details:

CRAWLAB_API_BASE_URL=http://your-crawlab-instance:8080/api
CRAWLAB_API_TOKEN=your_api_token_here
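
As a sketch of how these two settings might be consumed, the snippet below reads them from the environment and fails early if either is missing. `CrawlabConfig` and `load_config` are hypothetical names, and sending the raw token in the `Authorization` header is an assumption about the Crawlab API, not something this README states. Reading the `.env` file itself is left to whatever loader the server uses.

```python
import os
from dataclasses import dataclass
from typing import Dict, Mapping, Optional


@dataclass(frozen=True)
class CrawlabConfig:
    """Connection settings taken from the two variables in .env."""
    base_url: str
    token: str

    def auth_headers(self) -> Dict[str, str]:
        # Assumption: the token is sent as-is in the Authorization header.
        return {"Authorization": self.token}


def load_config(env: Optional[Mapping[str, str]] = None) -> CrawlabConfig:
    """Build a config from environment variables, failing early if any is unset."""
    env = os.environ if env is None else env
    required = ("CRAWLAB_API_BASE_URL", "CRAWLAB_API_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    # Strip a trailing slash so paths can be appended consistently.
    return CrawlabConfig(
        base_url=env["CRAWLAB_API_BASE_URL"].rstrip("/"),
        token=env["CRAWLAB_API_TOKEN"],
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.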

Running locally

Install the dependencies:
```
pip install -r requirements.txt
```
Run the server:
```
python server.py
```

Running with Docker

Build the Docker image:
```
docker build -t crawlab-mcp-server .
```

Run the container:

docker run -p 8000:8000 --env-file .env crawlab-mcp-server
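
Once the container is up, a quick way to confirm that the published port accepts connections is a plain TCP check. This is a minimal sketch that assumes the server is exposed on localhost:8000 as in the command above; `is_port_open` is a hypothetical helper, and a successful connection only shows that something is listening, not that the MCP server is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    state = "reachable" if is_port_open("localhost", 8000) else "not reachable"
    print(f"MCP server port is {state}")
```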

Integrating with Docker Compose

To add the MCP server to an existing Crawlab Docker Compose setup, add the following service to your docker-compose.yml:

services:
  # ... existing Crawlab services
  
  mcp-server:
    build: ./backend/mcp-server
    ports:
      - "8000:8000"
    environment:
      - CRAWLAB_API_BASE_URL=http://backend:8000/api
      - CRAWLAB_API_TOKEN=your_api_token_here
    depends_on:
      - backend

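Using with AI applications

The MCP server lets AI applications interact with Crawlab through natural language. Following the architecture diagram above, the MCP system is used as follows:

Setting up the connection

Start the MCP server: make sure your MCP server is running and reachable
Configure the AI client: connect your AI application to the MCP server

Example: using Claude Desktop

Open Claude Desktop
Go to Settings > MCP Servers
Add a new server using the URL of your MCP server (for example, http://localhost:8000)
In a conversation with Claude, you can now use Crawlab functionality by describing what you want to do in natural language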

Interaction examples

Based on this architecture, here are example interactions with the system:

Creating a spider:

User: "Create a new spider named 'Product Scraper' for the e-commerce project"
↓
LLM identifies intent and calls the create_spider tool
↓
MCP Server executes the API call to Crawlab
↓
Spider is created and details are returned to the user

Running a task:

User: "Run the 'Product Scraper' spider on all available nodes"
↓
LLM calls the run_spider tool with appropriate parameters
↓
MCP Server sends the command to Crawlab API
↓
Task is started and confirmation is returned to the user
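
On the wire, MCP frames a tool call as a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message for the first example; `make_tool_call` is a hypothetical helper, and the argument names `name` and `project` are assumptions, since the real `create_spider` schema is not shown here.

```python
import json
from itertools import count
from typing import Any, Dict

# Each JSON-RPC request carries an id so the response can be matched to it.
_request_ids = count(1)


def make_tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Argument names are illustrative, not the server's actual schema.
    message = make_tool_call(
        "create_spider",
        {"name": "Product Scraper", "project": "e-commerce"},
    )
    print(json.dumps(message, indent=2))
```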

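Available commands

You can interact with the system using natural-language commands such as:

"List all my spiders"
"Create a new spider with these specifications..."
"Show me the code for the spider named X"
"Update the main.py file in spider X with this code..."
"Run spider X and notify me when it finishes"
"Show me the results of the last run of spider X"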

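Available resources and tools

These are the underlying tools that support the natural-language interactions:

Resources

spiders: list all spiders
tasks: list all tasks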
Tools

Spider management

get_spider: get the details of a specific spider
create_spider: create a new spider
update_spider: update an existing spider
delete_spider: delete a spider

Task management

get_task: get the details of a specific task
run_spider: run a spider
cancel_task: cancel a running task
restart_task: restart a task
get_task_logs: get the logs of a task

File management

get_spider_files: list the files of a spider
get_spider_file: get the contents of a specific file
save_spider_file: save content to a file
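
One way to picture how a tool name could turn into a Crawlab API request is a small routing table. Everything in this sketch is illustrative: the HTTP methods, the endpoint paths, and the `spider_id` and `task_id` parameter names are assumptions, not the actual Crawlab API or this server's implementation.

```python
from typing import Dict, Tuple

# Hypothetical routes: the real Crawlab endpoints may differ.
ROUTES: Dict[str, Tuple[str, str]] = {
    "get_spider": ("GET", "/spiders/{spider_id}"),
    "create_spider": ("POST", "/spiders"),
    "delete_spider": ("DELETE", "/spiders/{spider_id}"),
    "run_spider": ("POST", "/spiders/{spider_id}/run"),
    "cancel_task": ("POST", "/tasks/{task_id}/cancel"),
    "get_task_logs": ("GET", "/tasks/{task_id}/logs"),
}


def build_request(base_url: str, tool: str, **params: str) -> Tuple[str, str]:
    """Resolve a tool name and its parameters to an (HTTP method, URL) pair."""
    try:
        method, template = ROUTES[tool]
    except KeyError:
        raise ValueError(f"unknown tool: {tool}") from None
    # str.format raises KeyError if a required path parameter is missing.
    path = template.format(**params)
    return method, base_url.rstrip("/") + path


if __name__ == "__main__":
    print(build_request("http://your-crawlab-instance:8080/api", "run_spider", spider_id="abc123"))
```

Keeping the mapping in data rather than in one function per tool makes it easy to see which tools exist and to add new ones, at the cost of less room for per-tool validation.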
