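Firecrawl MCP Server

A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.

Big thanks to @vrknetha and @cawstudios for the initial implementation!

Features

Scrape, crawl, search, extract, deep research, and batch scrape support
Web scraping with JS rendering
URL discovery and crawling
Web search with content extraction
Automatic retries with exponential backoff
Efficient batch processing with built-in rate limiting
Credit usage monitoring for the cloud API
Comprehensive logging system
Support for cloud and self-hosted Firecrawl instances
Mobile/desktop viewport support
Smart content filtering with tag inclusion/exclusion
SSE support

Try out our MCP server on MCP.so's playground or on Klavis AI.

Installation

Running with npx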

env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Manual Installation

npm install -g firecrawl-mcp

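Running on Cursor

Configuring Cursor 🖥️
Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide

To configure Firecrawl MCP in Cursor v0.45.6

Open Cursor Settings
Go to Features > MCP Servers
Click "+ Add New MCP Server"
Enter the following:
- Name: "firecrawl-mcp" (or your preferred name)
- Type: "command"
- Command: env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp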

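To configure Firecrawl MCP in Cursor v0.48.6

Open Cursor Settings
Go to Features > MCP Servers
Click "+ Add new global MCP server"
Enter the following code:

{
  "mcpServers": {
    "firecrawl-mcp": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR-API-KEY"
      }
    }
  }
}

If you are using Windows and run into issues, try cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"

Replace your-api-key with your Firecrawl API key. If you don't have one yet, you can create an account and get a key from https://www.firecrawl.dev/app/api-keys

After adding the server, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can also request it explicitly by describing your web scraping needs. Open the Composer with Command+L (Mac), select "Agent" next to the submit button, and enter your query.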

Running on Windsurf

Add this to your ./codeium/windsurf/model_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}

Running with SSE Local Mode

To run the server locally using Server-Sent Events (SSE) instead of the default stdio transport:

env SSE_LOCAL=true FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp

Use the URL: http://localhost:3000/sse

Installing via Smithery (Legacy)

To install Firecrawl for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude

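Running on VS Code

For one-click installation, click one of the install buttons below...

For manual installation, add the following JSON block to your User Settings (JSON) file in VS Code. You can open it by pressing Ctrl + Shift + P and typing Preferences: Open User Settings (JSON).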

{
  "mcp": {
    "inputs": [
      {
        "type": "promptString",
        "id": "apiKey",
        "description": "Firecrawl API Key",
        "password": true
      }
    ],
    "servers": {
      "firecrawl": {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {
          "FIRECRAWL_API_KEY": "${input:apiKey}"
        }
      }
    }
  }
}

Optionally, you can add it to a file called .vscode/mcp.json in your workspace. This lets you share the configuration with others:

{
  "inputs": [
    {
      "type": "promptString",
      "id": "apiKey",
      "description": "Firecrawl API Key",
      "password": true
    }
  ],
  "servers": {
    "firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "${input:apiKey}"
      }
    }
  }
}

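Configuration

Environment Variables

Required for Cloud API

FIRECRAWL_API_KEY: Your Firecrawl API key
- Required when using the cloud API (default)
- Optional when using a self-hosted instance with FIRECRAWL_API_URL
FIRECRAWL_API_URL (optional): Custom API endpoint for self-hosted instances
- Example: https://firecrawl.your-domain.com
- If not provided, the cloud API is used (requires an API key)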
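
The rules above for FIRECRAWL_API_KEY and FIRECRAWL_API_URL can be read as a small decision procedure. The following is a minimal sketch of that logic, not the server's actual code: the helper name resolveTarget and the FirecrawlTarget shape are hypothetical, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper illustrating the documented rules; not taken from the server source.
type Env = Record<string, string | undefined>;

interface FirecrawlTarget {
  mode: "cloud" | "self-hosted";
  apiUrl?: string; // set only for self-hosted instances
  apiKey?: string; // optional when a custom API URL is configured
}

function resolveTarget(env: Env): FirecrawlTarget {
  const apiUrl = env.FIRECRAWL_API_URL;
  const apiKey = env.FIRECRAWL_API_KEY;

  if (apiUrl) {
    // Self-hosted: the key is only passed along if one was provided.
    const target: FirecrawlTarget = { mode: "self-hosted", apiUrl: apiUrl };
    if (apiKey) {
      target.apiKey = apiKey;
    }
    return target;
  }

  if (!apiKey) {
    // The cloud API is the default and cannot be used without a key.
    throw new Error("FIRECRAWL_API_KEY is required when FIRECRAWL_API_URL is not set");
  }
  return { mode: "cloud", apiKey: apiKey };
}
```

In a Node.js process you would typically call this as resolveTarget(process.env).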

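Optional Configuration

Retry Configuration

FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before the first retry (default: 1000)
FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)

Credit Usage Monitoring

FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)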
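
To make the fallback behavior concrete, here is a rough sketch of how the optional variables above could be turned into numeric settings. The variable names and default values come from the lists above; the helper names numberFrom and loadConfig are hypothetical, and the real server may parse or validate these values differently.

```typescript
// Illustrative only: parses the documented variables, falling back to the documented defaults.
type Env = Record<string, string | undefined>;

const DEFAULTS = {
  retry: { maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 },
  credit: { warningThreshold: 1000, criticalThreshold: 100 },
};

// Returns the parsed number, or the fallback when the value is unset, empty, or not numeric.
function numberFrom(value: string | undefined, fallback: number): number {
  if (value === undefined || value.trim() === "") {
    return fallback;
  }
  const parsed = Number(value);
  return isFinite(parsed) ? parsed : fallback;
}

function loadConfig(env: Env) {
  return {
    retry: {
      maxAttempts: numberFrom(env.FIRECRAWL_RETRY_MAX_ATTEMPTS, DEFAULTS.retry.maxAttempts),
      initialDelay: numberFrom(env.FIRECRAWL_RETRY_INITIAL_DELAY, DEFAULTS.retry.initialDelay),
      maxDelay: numberFrom(env.FIRECRAWL_RETRY_MAX_DELAY, DEFAULTS.retry.maxDelay),
      backoffFactor: numberFrom(env.FIRECRAWL_RETRY_BACKOFF_FACTOR, DEFAULTS.retry.backoffFactor),
    },
    credit: {
      warningThreshold: numberFrom(env.FIRECRAWL_CREDIT_WARNING_THRESHOLD, DEFAULTS.credit.warningThreshold),
      criticalThreshold: numberFrom(env.FIRECRAWL_CREDIT_CRITICAL_THRESHOLD, DEFAULTS.credit.criticalThreshold),
    },
  };
}
```

Silently falling back on an unparsable value is a design choice made for this sketch; rejecting the value with an error would be an equally reasonable alternative.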

Configuration Examples

For cloud API usage with custom retry and credit monitoring:

# Required for cloud API
export FIRECRAWL_API_KEY=your-api-key

# Optional retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=5        # Increase max retry attempts
export FIRECRAWL_RETRY_INITIAL_DELAY=2000    # Start with 2s delay
export FIRECRAWL_RETRY_MAX_DELAY=30000       # Maximum 30s delay
export FIRECRAWL_RETRY_BACKOFF_FACTOR=3      # More aggressive backoff

# Optional credit monitoring
export FIRECRAWL_CREDIT_WARNING_THRESHOLD=2000    # Warning at 2000 credits
export FIRECRAWL_CREDIT_CRITICAL_THRESHOLD=500    # Critical at 500 credits

For a self-hosted instance:

# Required for self-hosted
export FIRECRAWL_API_URL=https://firecrawl.your-domain.com

# Optional authentication for self-hosted
export FIRECRAWL_API_KEY=your-api-key  # If your instance requires auth

# Custom retry configuration
export FIRECRAWL_RETRY_MAX_ATTEMPTS=10
export FIRECRAWL_RETRY_INITIAL_DELAY=500     # Start with faster retries

Usage with Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-server-firecrawl": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "YOUR_API_KEY_HERE",

        "FIRECRAWL_RETRY_MAX_ATTEMPTS": "5",
        "FIRECRAWL_RETRY_INITIAL_DELAY": "2000",
        "FIRECRAWL_RETRY_MAX_DELAY": "30000",
        "FIRECRAWL_RETRY_BACKOFF_FACTOR": "3",

        "FIRECRAWL_CREDIT_WARNING_THRESHOLD": "2000",
        "FIRECRAWL_CREDIT_CRITICAL_THRESHOLD": "500"
      }
    }
  }
}

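System Configuration

The server includes several configurable parameters that can be set via environment variables. These are the default values used when nothing is configured: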

const CONFIG = {
  retry: {
    maxAttempts: 3, // Number of retry attempts for rate-limited requests
    initialDelay: 1000, // Initial delay before first retry (in milliseconds)
    maxDelay: 10000, // Maximum delay between retries (in milliseconds)
    backoffFactor: 2, // Multiplier for exponential backoff
  },
  credit: {
    warningThreshold: 1000, // Warn when credit usage reaches this level
    criticalThreshold: 100, // Critical alert when credit usage reaches this level
  },
};

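These configurations control:

Retry Behavior
- Automatically retries requests that fail because of rate limits
- Uses exponential backoff to avoid overwhelming the API
- Example: With default settings, retries will be attempted at:
  - 1st retry: 1 second delay
  - 2nd retry: 2 seconds delay
  - 3rd retry: 4 seconds delay (capped at maxDelay)
Credit Usage Monitoring
- Tracks API credit consumption for cloud API usage
- Provides warnings at specified thresholds
- Helps prevent unexpected service interruption
- Example: With default settings:
  - Warning at 1000 credits remaining
  - Critical alert at 100 credits remaining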
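
The two examples above can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions rather than the server's implementation: it assumes the delay before the n-th retry is initialDelay × backoffFactor^(n−1) capped at maxDelay, and that the credit thresholds are compared against the remaining credit balance, as the example list suggests. The function names retryDelays and creditLevel are hypothetical.

```typescript
interface RetryConfig {
  maxAttempts: number;
  initialDelay: number; // milliseconds
  maxDelay: number; // milliseconds
  backoffFactor: number;
}

// Delay before each retry: initialDelay * backoffFactor^(attempt - 1), capped at maxDelay.
function retryDelays(config: RetryConfig): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= config.maxAttempts; attempt++) {
    const uncapped = config.initialDelay * Math.pow(config.backoffFactor, attempt - 1);
    delays.push(Math.min(uncapped, config.maxDelay));
  }
  return delays;
}

type CreditLevel = "ok" | "warning" | "critical";

// Assumption: thresholds are compared against the number of credits remaining.
function creditLevel(remaining: number, warningThreshold: number, criticalThreshold: number): CreditLevel {
  if (remaining <= criticalThreshold) {
    return "critical";
  }
  if (remaining <= warningThreshold) {
    return "warning";
  }
  return "ok";
}

// Default settings give delays of 1 s, 2 s, and 4 s.
const defaultDelays = retryDelays({ maxAttempts: 3, initialDelay: 1000, maxDelay: 10000, backoffFactor: 2 });
// → [1000, 2000, 4000]
```

With the custom values from the configuration example (5 attempts, 2000 ms initial delay, 30000 ms cap, factor 3), the same formula yields 2 s, 6 s, 18 s, 30 s, and 30 s, since the fourth and fifth delays hit the cap.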

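Rate Limiting and Batch Processing

The server utilizes Firecrawl's built-in rate limiting and batch processing capabilities:

Automatic rate limit handling with exponential backoff
Efficient parallel processing for batch operations
Smart request queuing and throttling
Automatic retries for transient errors

Available Tools

1. Scrape Tool (`firecrawl_scrape`)

Scrape content from a single URL with advanced options.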

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["markdown"],
    "onlyMainContent": true,
    "waitFor": 1000,
    "timeout": 30000,
    "mobile": false,
    "includeTags": ["article", "main"],
    "excludeTags": ["nav", "footer"],
    "skipTlsVerification": false
  }
}
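
The name/arguments objects shown in this section are the payload of an MCP tool call. On the wire, MCP uses JSON-RPC 2.0 and wraps that payload in a tools/call request. Most MCP clients and SDKs build this envelope for you, so the sketch below is only meant to show how the pieces fit together; the helper name toJsonRpc and the interface names are illustrative.

```typescript
// The tool name plus its arguments, exactly as shown in the examples in this section.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 request envelope used by MCP for tool invocations.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

function toJsonRpc(call: ToolCall, id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id: id, method: "tools/call", params: call };
}

const scrapeCall: ToolCall = {
  name: "firecrawl_scrape",
  arguments: {
    url: "https://example.com",
    formats: ["markdown"],
    onlyMainContent: true,
  },
};

// The serialized request a client would send to the server.
const scrapeRequest = toJsonRpc(scrapeCall, 1);
const scrapeRequestJson = JSON.stringify(scrapeRequest, null, 2);
```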

2. Batch Scrape Tool (`firecrawl_batch_scrape`)

Scrape multiple URLs efficiently with built-in rate limiting and parallel processing.

{
  "name": "firecrawl_batch_scrape",
  "arguments": {
    "urls": ["https://example1.com", "https://example2.com"],
    "options": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

The response includes an operation ID for status checking:

{
  "content": [
    {
      "type": "text",
      "text": "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
    }
  ],
  "isError": false
}
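
Because the operation ID arrives inside a human-readable message, a client that wants to poll automatically has to pull it out of the text. The sketch below assumes the message keeps the wording shown in the example response ("queued with ID: ..."); that wording is not a documented contract and may change, and the helper names extractBatchId and statusCallFor are hypothetical.

```typescript
interface TextContent {
  type: string;
  text: string;
}

interface ToolResult {
  content: TextContent[];
  isError: boolean;
}

// Returns the batch ID from a "queued with ID: <id>" message, or null if none is found.
function extractBatchId(result: ToolResult): string | null {
  if (result.isError) {
    return null;
  }
  for (const item of result.content) {
    // Capture everything after "ID: " up to the next whitespace or period.
    const match = /queued with ID: ([^\s.]+)/.exec(item.text);
    if (match) {
      return match[1];
    }
  }
  return null;
}

// Builds the follow-up call for the status tool.
function statusCallFor(batchId: string) {
  return { name: "firecrawl_check_batch_status", arguments: { id: batchId } };
}
```

Applied to the example response above, extractBatchId returns "batch_1", which can then be passed to statusCallFor.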

3. Check Batch Status (`firecrawl_check_batch_status`)

Check the status of a batch operation.

{
  "name": "firecrawl_check_batch_status",
  "arguments": {
    "id": "batch_1"
  }
}

4. Search Tool (`firecrawl_search`)

Search the web and optionally extract content from the search results.

{
  "name": "firecrawl_search",
  "arguments": {
    "query": "your search query",
    "limit": 5,
    "lang": "en",
    "country": "us",
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true
    }
  }
}

5. Crawl Tool (`firecrawl_crawl`)

Start an asynchronous crawl with advanced options.

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}

6. Extract Tool (`firecrawl_extract`)

Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.

{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "systemPrompt": "You are a helpful assistant that extracts product information",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false,
    "includeSubdomains": false
  }
}

Example response:

{
  "content": [
    {
      "type": "text",
      "text": {
        "name": "Example Product",
        "price": 99.99,
        "description": "This is an example product description"
      }
    }
  ],
  "isError": false
}
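
Since the extracted data is produced by an LLM, a caller may want to re-check it against the schema before using it. The sketch below hand-writes such a check for the product schema from the request example (name and price required, description optional). It is only an illustration: the Product interface and isProduct guard are hypothetical, and a JSON Schema validation library would be the more general approach.

```typescript
// Mirrors the example schema: name and price are required, description is optional.
interface Product {
  name: string;
  price: number;
  description?: string;
}

// Type guard that checks the fields named in the example schema.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return (
    typeof record.name === "string" &&
    typeof record.price === "number" &&
    (record.description === undefined || typeof record.description === "string")
  );
}

const extracted: unknown = {
  name: "Example Product",
  price: 99.99,
  description: "This is an example product description",
};

// Narrowing lets the compiler treat `extracted` as a Product inside the branch.
const extractedPrice = isProduct(extracted) ? extracted.price : null;
```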

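Extract Tool Options:

urls: Array of URLs to extract information from
prompt: Custom prompt for the LLM extraction
systemPrompt: System prompt to guide the LLM
schema: JSON schema for structured data extraction
allowExternalLinks: Allow extraction from external links
enableWebSearch: Enable web search for additional context
includeSubdomains: Include subdomains in extraction

When using a self-hosted instance, the extraction uses your configured LLM. For the cloud API, it uses Firecrawl's managed LLM service.

7. Deep Research Tool (`firecrawl_deep_research`)

Conduct deep web research on a query using intelligent crawling, search, and LLM analysis.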

{
  "name": "firecrawl_deep_research",
  "arguments": {
    "query": "how does carbon capture technology work?",
    "maxDepth": 3,
    "timeLimit": 120,
    "maxUrls": 50
  }
}

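Arguments:

query (string, required): The research question or topic to explore.
maxDepth (number, optional): Maximum recursive depth for crawling/searching (default: 3).
timeLimit (number, optional): Time limit in seconds for the research session (default: 120).
maxUrls (number, optional): Maximum number of URLs to analyze (default: 50).

Returns:

Final analysis generated by an LLM based on the research (data.finalAnalysis).
May also include structured activities and sources used in the research process.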
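
Only query is required, so a call can be as short as a single field. The sketch below spells out what such a call amounts to once the documented defaults are filled in. The server presumably applies these defaults itself; the helper withDeepResearchDefaults is hypothetical and exists only to make the defaults explicit.

```typescript
interface DeepResearchArgs {
  query: string;
  maxDepth?: number;
  timeLimit?: number; // seconds
  maxUrls?: number;
}

// Fills in the documented defaults for any omitted optional argument.
function withDeepResearchDefaults(args: DeepResearchArgs): Required<DeepResearchArgs> {
  if (args.query.trim() === "") {
    throw new Error("query is required");
  }
  return {
    query: args.query,
    maxDepth: args.maxDepth ?? 3,
    timeLimit: args.timeLimit ?? 120,
    maxUrls: args.maxUrls ?? 50,
  };
}

const researchArgs = withDeepResearchDefaults({
  query: "how does carbon capture technology work?",
  maxUrls: 20,
});
// researchArgs.maxDepth is 3, researchArgs.timeLimit is 120, researchArgs.maxUrls is 20
```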

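8. Generate LLMs.txt Tool (`firecrawl_generate_llmstxt`)

Generate a standardized llms.txt file (and optionally an llms-full.txt file) for a given domain. This file defines how large language models should interact with the site.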

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

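Arguments:

url (string, required): The base URL of the website to analyze.
maxUrls (number, optional): Maximum number of URLs to include (default: 10).
showFullText (boolean, optional): Whether to include the llms-full.txt contents in the response.

Returns:

Generated llms.txt file contents and, optionally, the llms-full.txt contents (data.llmstxt and/or data.llmsfulltxt).

Logging System

The server includes comprehensive logging:

Operation status and progress
Performance metrics
Credit usage monitoring
Rate limit tracking
Error conditions

Example log messages: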

[INFO] Firecrawl MCP Server initialized successfully
[INFO] Starting scrape for URL: https://example.com
[INFO] Batch operation queued with ID: batch_1
[WARNING] Credit usage has reached warning threshold
[ERROR] Rate limit exceeded, retrying in 2s...

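Error Handling

The server provides robust error handling:

Automatic retries for transient errors
Rate limit handling with backoff
Detailed error messages
Credit usage warnings
Network resilience

Example error response: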

{
  "content": [
    {
      "type": "text",
      "text": "Error: Rate limit exceeded. Retrying in 2 seconds..."
    }
  ],
  "isError": true
}
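
On the client side, the isError flag is what distinguishes a failed call from a successful one, since both carry their message in content. A minimal sketch of turning that flag into an exception, assuming results have the shape shown in the examples in this document (unwrapToolResult is a hypothetical helper):

```typescript
interface TextContent {
  type: string;
  text: string;
}

interface ToolResult {
  content: TextContent[];
  isError: boolean;
}

// Returns the concatenated text of a successful result, or throws with the error text.
function unwrapToolResult(result: ToolResult): string {
  const text = result.content
    .filter((item) => item.type === "text")
    .map((item) => item.text)
    .join("\n");
  if (result.isError) {
    throw new Error(text || "Tool call failed");
  }
  return text;
}
```

Whether to throw or to return the error as a value is a matter of taste; throwing keeps the success path free of checks.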

Development

# Install dependencies
npm install

# Build
npm run build

# Run tests
npm test

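Contributing

Fork the repository
Create your feature branch
Run tests: npm test
Submit a pull request

Thanks to contributors

Thanks to @vrknetha and @cawstudios for the initial implementation!

Thanks to MCP.so and Klavis AI for hosting, and to @gstarwd, @xiangkaiz, and @zihaolin96 for integrating our server.

License

MIT License - see the LICENSE file for details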
