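Browser Automation MCP Server

🤖 Browser Automation Agent

A browser automation tool built on MCP (Model Context Protocol) that combines web scraping with LLM-driven processing. The agent can search Google, navigate to web pages, and scrape content from a range of sites, including GitHub, Stack Overflow, and documentation sites.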

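🚀 Features

🔍 Google search integration: finds and retrieves the top search results for any query
🕸️ Smart web scraping: scraping strategies tailored to each site type:
- 📂 GitHub repositories
- 💬 Stack Overflow questions and answers
- 📚 Documentation pages
- 🌐 Generic websites
🧠 AI processing: uses Mistral AI to interpret and process scraped content
🥷 Stealth mode: applies browser fingerprint protection to reduce the chance of detection
💾 Content saving: automatically saves a screenshot and the text content of each scraped page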

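🏗️ Architecture

The project uses a client-server architecture built on MCP:

🖥️ Server: handles browser automation and web scraping tasks
👤 Client: provides the AI interface using Mistral AI and LangGraph
📡 Communication: the client and server communicate over stdio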
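
To make the stdio link concrete: MCP messages are JSON-RPC 2.0 objects, and the stdio transport sends each one as a single line of JSON. The sketch below builds the kind of tools/call request a client might send to invoke get_top_google_url. The helper name build_tool_call, the argument name query, and the request id are assumptions for illustration; in practice an MCP SDK builds and sends these messages for you.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Return one newline-terminated JSON-RPC 2.0 request for an MCP tool call."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the serialized message
    # must stay on one line; json.dumps escapes newlines inside strings.
    return json.dumps(message) + "\n"


# "query" is an assumed argument name, not confirmed by the project.
line = build_tool_call(1, "get_top_google_url", {"query": "playwright python docs"})
print(line, end="")
```

Because the same pipe carries the protocol, a server using stdio should write its own logs to stderr rather than stdout.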

⚙️ Requirements

🐍 Python 3.8+
🎭 Playwright
🧩 MCP (Model Context Protocol)
🔑 Mistral AI API key

📥 Installation

Clone the repository:

git clone https://github.com/yourusername/browser-automation-agent.git
cd browser-automation-agent

Install the dependencies:

pip install -r requirements.txt

Install the Playwright browsers:

playwright install

Create a .env file in the project root and add your Mistral AI API key:

MISTRAL_API_KEY=your_api_key_here
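
This README does not say how client.py reads the .env file (a package such as python-dotenv is a common choice). As a rough, standard-library-only sketch of the idea, the hypothetical helpers below parse KEY=VALUE lines and fail early when the key is missing or still set to the placeholder:

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ; return the parsed pairs."""
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        loaded[key] = value
        # Values already set in the real environment take precedence.
        os.environ.setdefault(key, value)
    return loaded


def require_api_key():
    """Return the Mistral API key or raise a clear error if it is missing."""
    key = os.environ.get("MISTRAL_API_KEY", "")
    if not key or key == "your_api_key_here":
        raise RuntimeError("MISTRAL_API_KEY is not set; add it to .env")
    return key
```

Calling load_env_file() and then require_api_key() at client startup turns a missing key into an immediate, readable error instead of a failed API request later.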

📋 Usage

Run the server

python main.py

Run the client

python client.py

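Example interaction

Once both the server and the client are running:

Enter your query when prompted
The agent will:
- 🔍 Search Google for relevant results
- 🧭 Navigate to the top result
- 📊 Scrape content according to the site type
- 📸 Save a screenshot and the content to files
- 📤 Return the processed information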

🛠️ Tool functions

`get_top_google_url`

🔍 Searches Google and returns the URL of the top result for the given query.
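
How this tool queries Google is not described here. One plausible shape, shown only as a sketch, is to build a search URL, collect the links found on the results page, and keep the first few external ones. The helper names build_search_url and top_result_urls are hypothetical, and the link-collection step (done by the browser in the real tool) is left out:

```python
from urllib.parse import urlencode, urlparse


def build_search_url(query):
    """Build a Google search URL for the query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def top_result_urls(links, num_results=5):
    """Keep the first num_results distinct http(s) links that are not Google's own."""
    picked = []
    for link in links:
        if len(picked) >= num_results:
            break
        parsed = urlparse(link)
        host = (parsed.hostname or "").lower()
        # Drop relative links, javascript: links, and similar non-web targets.
        if parsed.scheme not in ("http", "https"):
            continue
        # Drop Google's own navigation and settings links.
        if host == "google.com" or host.endswith(".google.com"):
            continue
        if link not in picked:
            picked.append(link)
    return picked


print(build_search_url("python asyncio"))
# https://www.google.com/search?q=python+asyncio
```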

`browse_and_scrape`

🌐 Navigates to a URL and scrapes content according to the site type.

`scrape_github`

📂 Extracts README content and code blocks from GitHub repositories.

`scrape_stackoverflow`

💬 Extracts questions, answers, comments, and code blocks from Stack Overflow pages.

`scrape_documentation`

📚 Optimized for extracting documentation content and code examples.

`scrape_generic`

🌐 Extracts paragraph text and code blocks from generic websites.
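
The four site-specific scrapers imply some routing step inside browse_and_scrape. The sketch below shows one way such routing could work, keyed on the URL alone. The function name classify_site and the documentation heuristics are assumptions; the project may detect site types differently, for example by inspecting page content.

```python
from urllib.parse import urlparse


def classify_site(url):
    """Map a URL to the name of the scraper tool that should handle it."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()
    if host == "github.com" or host.endswith(".github.com"):
        return "scrape_github"
    if host == "stackoverflow.com" or host.endswith(".stackoverflow.com"):
        return "scrape_stackoverflow"
    # Assumed heuristic: documentation sites tend to advertise themselves
    # in the hostname or path. The real project may detect them differently.
    if host.startswith("docs.") or "readthedocs" in host or "/docs" in path:
        return "scrape_documentation"
    return "scrape_generic"


for url in (
    "https://github.com/microsoft/playwright-python",
    "https://docs.python.org/3/library/asyncio.html",
    "https://example.com/blog/post",
):
    print(url, "->", classify_site(url))
```

Falling through to scrape_generic means an unrecognized site still returns paragraph text and code blocks rather than nothing.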

📁 File structure

browser-automation-agent/
├── main.py            # MCP server implementation
├── client.py          # Mistral AI client implementation
├── requirements.txt   # Project dependencies
├── .env               # Environment variables (API keys)
└── README.md          # Project documentation

📤 Output files

The agent produces two kinds of timestamped output files:

📸 final_page_YYYYMMDD_HHMMSS.png: screenshot of the final page state
📄 scraped_content_YYYYMMDD_HHMMSS.txt: text content extracted from the page
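
The naming pattern above maps directly onto a strftime format string. A minimal sketch, where output_paths is a hypothetical helper and not a function from the project:

```python
from datetime import datetime


def output_paths(now=None):
    """Return (screenshot_name, text_name) sharing one timestamp."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return f"final_page_{stamp}.png", f"scraped_content_{stamp}.txt"


print(output_paths(datetime(2025, 1, 31, 14, 5, 9)))
# ('final_page_20250131_140509.png', 'scraped_content_20250131_140509.txt')
```

Computing the timestamp once keeps the screenshot and the text file from the same run paired under the same suffix.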

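⚙️ Customization

You can change the following parameters in the code:

🖥️ Browser window size: adjust width and height in browse_and_scrape
👻 Headless mode: set headless=True to run the browser without a visible window
🔢 Number of Google search results: change num_results in get_top_google_url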
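
For orientation, this is roughly where the headless flag and the window size end up in Playwright's Python API. It is a sketch under assumptions: the helper names and the 1280x800 default are hypothetical, and the project's own code may use Playwright's async API rather than the sync API shown here.

```python
def browser_settings(headless=True, width=1280, height=800):
    """Collect the launch and page options for a browser session."""
    launch_options = {"headless": headless}
    page_options = {"viewport": {"width": width, "height": height}}
    return launch_options, page_options


def open_page(url, headless=True, width=1280, height=800):
    """Open url with the given settings and return the page title."""
    # Imported here so the settings helper works without Playwright installed.
    from playwright.sync_api import sync_playwright

    launch_options, page_options = browser_settings(headless, width, height)
    with sync_playwright() as p:
        browser = p.chromium.launch(**launch_options)
        page = browser.new_page(**page_options)
        page.goto(url)
        title = page.title()
        browser.close()
    return title
```

Setting headless=False while debugging makes the browser window visible, which helps when a scraper returns unexpected content.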

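❓ Troubleshooting

🔌 Connection issues: make sure the server and the client are running in separate terminals
🎭 Playwright errors: make sure the browsers have been installed with playwright install
🔑 API key errors: verify that your Mistral API key is set correctly in the .env file
🛣️ Path errors: if needed, update the path to main.py in client.py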

📜 License

MIT License

🤝 Contributing

Contributions are welcome! Feel free to submit a pull request.

Built with 🧩 MCP, 🎭 Playwright, and 🧠 Mistral AI
