Browser-MCP Server

by Euraxluo
# Session-Based Browser-Use FastMCP Server

[![CI](https://github.com/Euraxluo/browser-mcp/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/Euraxluo/browser-mcp/actions/workflows/ci.yml) [![codecov](https://codecov.io/gh/Euraxluo/browser-mcp/branch/main/graph/badge.svg)](https://codecov.io/gh/Euraxluo/browser-mcp)

A Model Context Protocol (MCP) server that provides browser automation through the FastMCP framework. It offers session-based instance management, TTL-based cleanup, PDF generation, file downloads, cookie management, and configurable browser settings.

**All browser operations are implemented via [browser-use](https://github.com/archipelago-technology/browser-use).**

### 🎯 Key Features

- **Session-Based Management**: Each MCP session automatically gets its own isolated browser instance
- **Advanced Browser Control**: Full browser automation with a Playwright backend (via browser-use)
- **PDF Generation**: Convert web pages to PDF with custom formatting options
- **File Operations**: Download and upload files, manage the file system, and access all temp files
- **Cookie Management**: Set, get, and manage browser cookies for authentication
- **Screenshot Capture**: Take full-page, viewport, or element screenshots
- **Tab Management**: Create, switch, and close browser tabs
- **Content Extraction**: Extract and search page content
- **Session Persistence**: Automatic cleanup with a configurable TTL
- **Multi-Instance Support**: Run multiple isolated browser sessions
- **Configurable Security**: All browser security settings are configurable via the API

### 🚀 Quick Start

1. **Install Dependencies**:

   Using uv (recommended):

   ```bash
   uv sync --all-extras
   ```

2. **Install the Browser**:

   ```bash
   uv run playwright install --with-deps chromium
   ```

3. **Start the Server**:

   Using uv (recommended):

   ```bash
   uv run main.py
   ```

4. **Basic Usage (Direct SessionBrowserManager)**:

   ```python
   # Direct usage without the MCP protocol (for testing/development)
   import asyncio

   from browser_fastmcp_server import BrowserConfig, SessionBrowserManager


   async def main():
       # Create the session manager
       manager = SessionBrowserManager(max_instances=5, default_ttl=300)
       await manager.start_cleanup_task()

       # Create a new browser session
       session_id = "test_session_123"
       instance = await manager.get_or_create_session_instance(
           session_id, BrowserConfig(headless=True)
       )

       # Navigate to a website
       browser_session = instance.browser_session
       await browser_session.navigate("https://example.com")

       # Get page elements
       state_summary = await browser_session.get_state_summary(
           cache_clickable_elements_hashes=True
       )
       print(f"Interactive elements: {len(state_summary.selector_map)}")

       # Take a screenshot
       page = await browser_session.get_current_page()
       screenshot_bytes = await page.screenshot(full_page=True)

       # Close the session when done
       await manager.close_session(session_id)
       await manager.shutdown()


   if __name__ == "__main__":
       asyncio.run(main())
   ```

### 🛠️ Run Tests

Install the test dependencies, then run all tests:

```bash
uv run python -m pytest test_browser_workflow_test.py test_browser_fastmcp_client.py test_browser_test.py -v
```

### 🛠️ Core Tools (API)

#### Session Management

- `create_chrome_instance(headless, viewport_width, viewport_height)` → Create a new browser session; returns `session_id`
- `close_instance(session_id)` → Close a specific session
- `get_instance_info(session_id)` → Get info for a session
- `check_browser_health(session_id)` → Check the health of a browser session and get recovery suggestions
- `get_browser_status()` → List all sessions
- `close_all_instances()` → Close all sessions

#### Browser Configuration

- `set_browser_config(session_id, headless, no_sandbox, user_agent, viewport_width, viewport_height, disable_web_security)` → Set the browser config (the browser restarts automatically if needed)
- `get_browser_config(session_id)` → Get the current config

#### Navigation & Page Control

- `navigate_to(session_id, url, new_tab=False)` → Go to any URL (optionally in a new tab)
- `navigate_back(session_id)` / `navigate_forward(session_id)` → History navigation
- `refresh_page(session_id)` → Refresh the current page
- `get_page_state(session_id)` → List interactive elements with their indices

#### Tab Management

- `get_tabs_info(session_id)` → List all open tabs
- `switch_tab(session_id, page_id)` → Switch between tabs
- `close_tab(session_id, page_id)` → Close a specific tab

#### Element Interaction

- `click_element(session_id, index)` → Click an element by index
- `click_element_by_xpath(session_id, xpath)` → Click an element by XPath
- `input_text(session_id, index, text)` → Type into form fields
- `set_element_value(session_id, index, value)` → Set an input/select value directly
- `get_element_info(session_id, index=None, xpath=None)` → Get element info (by index or XPath)
- `send_keys(session_id, keys)` → Send keyboard shortcuts
- `upload_file(session_id, index, file_path)` → Upload files to forms
- `get_dropdown_options(session_id, index)` → Inspect select elements

#### Media & Files

- `take_screenshot(session_id, target=None, width=None, height=None, full_page=True, quality=90, format="png")` → Capture screenshots
- `generate_pdf(session_id, url=None, html_content=None, output_filename=None, ...)` → Save a page as PDF
- `download_file(session_id, url, output_filename=None, timeout=30)` → Download files from URLs
- `download_image(session_id, image_url, output_filename=None, timeout=30)` → Download images specifically

#### Cookie & Session Management

- `set_cookie(session_id, name, value, domain, path, http_only, secure, same_site, expires, max_age)` → Set browser cookies
- `get_cookies(session_id, domain=None)` → Retrieve current cookies

#### Utilities

- `scroll_page(session_id, direction="down")` → Scroll up/down
- `extract_content(session_id, query)` → Extract text content
- `wait(seconds)` → Pause execution
- `browser_tips()` → Get automation best practices
- `search_bing(session_id, query)` → Bing search

### 📚 Resources (REST-style)

- `browser://status` → Manager and sessions status
- `browser://instances` → All sessions info
- `browser://instance/{id}/page` → Session page info
- `browser://instance/{id}/tabs` → Session tabs
- `browser://instance/{id}/screenshots` → Session screenshots
- `browser://instance/{id}/status` → Session status (detailed)
- `browser://instance/{id}/files` → Session temp files
- `browser://instance/{id}/cookies` → Session cookies
- `browser://instance/{id}/file/{relative_path}` → Read a file in the session temp directory
- `browser://help` → This help

### 🔧 Configuration

Configure the server using environment variables:

```bash
# Maximum number of concurrent browser instances
BROWSER_MAXIMUM_INSTANCES=10

# Session TTL in seconds (default: 30 minutes)
BROWSER_INSTANCE_TTL=1800

# Command execution timeout in seconds
BROWSER_EXECUTE_TIMEOUT=30

# Cleanup interval in seconds
BROWSER_CLEANUP_INTERVAL=60
```

### 📝 Prompts

Built-in prompts for common automation scenarios:

- `web_testing(url, test_scenario)` → Web testing workflows
- `data_extraction(url, data_type)` → Data extraction strategies
- `form_filling(url, form_data)` → Automated form filling (returns a conversation)
- `automation_troubleshooting()` → Debugging help

### 🔌 MCP Integration

#### Using with Claude Desktop

1. **Add to Claude Desktop Configuration**:

   Edit your Claude Desktop configuration file (usually at `~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

   ```json
   {
     "mcpServers": {
       "browser-mcp": {
         "command": "uv",
         "args": ["run", "fastmcp", "run", "/path/to/browser-mcp/browser_fastmcp_server.py"],
         "env": {
           "BROWSER_MAXIMUM_INSTANCES": "5",
           "BROWSER_INSTANCE_TTL": "1800"
         }
       }
     }
   }
   ```

2. **Restart Claude Desktop** to load the MCP server

3. **Start Using**: The browser automation tools are now available in your Claude conversations

#### Using with MCP Client (Two Ways)

**Method 1: Network-based MCP Client (via HTTP/SSE)**

```python
import asyncio
import json

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main():
    # Connect to the running server over the network (SSE transport)
    async with sse_client("http://localhost:8000/sse") as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            # Initialize the MCP session
            await session.initialize()

            # Start a browser
            result = await session.call_tool("create_chrome_instance", {"headless": True})
            # Assumes the tool returns JSON text that contains "session_id"
            session_id = json.loads(result.content[0].text)["session_id"]

            # Navigate to a website
            await session.call_tool(
                "navigate_to", {"session_id": session_id, "url": "https://example.com"}
            )

            # Take a screenshot
            await session.call_tool("take_screenshot", {"session_id": session_id})

            # Close the browser session
            await session.call_tool("close_instance", {"session_id": session_id})


if __name__ == "__main__":
    asyncio.run(main())
```

**Method 2: Direct Client (No Network)**

```python
import asyncio

from fastmcp import Client

from browser_fastmcp_server import mcp as browsers_mcp


async def main():
    # Direct in-process client connection (no network)
    client = Client(browsers_mcp)
    async with client:
        # Start a browser
        result = await client.call_tool("create_chrome_instance", {"headless": True})
        session_id = result.data.session_id

        # Navigate to a website
        await client.call_tool(
            "navigate_to", {"session_id": session_id, "url": "https://example.com"}
        )

        # Take a screenshot
        await client.call_tool("take_screenshot", {"session_id": session_id})

        # Close the browser session
        await client.call_tool("close_instance", {"session_id": session_id})


if __name__ == "__main__":
    asyncio.run(main())
```

### 🔒 Authentication

For server deployments that require authentication, modify `main.py` to set an AuthProvider before startup:

**Basic Authentication:**

```python
from fastmcp.auth import BasicAuth

# Add this before mcp.run()
mcp.auth = BasicAuth(username="admin", password="password")
```

**JWT Authentication (Recommended for Production):**

For more advanced authentication, we recommend using [fastmcp-authentication](https://github.com/Euraxluo/fastmcp-authentication):

```python
from fastmcp_authentication import BearerAuthProvider

JWKS_URI = "http://localhost:8080/.well-known/jwks.json"

auth = BearerAuthProvider(
    jwks_uri=JWKS_URI,
    issuer="http://localhost:8080",
    audience="localhost:8080",
    algorithm="RS256",
)
mcp.auth = auth
```

### 💡 Use Cases

- **Web Testing**: Automated functional, security, and performance testing
- **Data Scraping**: Extract structured data from websites
- **Form Automation**: Fill and submit web forms programmatically
- **Content Monitoring**: Track changes in web content
- **Screenshot Documentation**: Capture visual evidence for reports
- **PDF Generation**: Convert web pages to PDF documents
- **Session Management**: Handle authenticated workflows

### 🔒 Security Features

- Session isolation between MCP clients
- Secure cookie management with HttpOnly and Secure flags
- Configurable browser security settings (CORS, sandbox, etc.)
- Automatic cleanup of temporary files
- TTL-based session expiration

### 🐳 Docker Usage

Build the image:

```bash
docker build -t browser-mcp .
```

Run the server (default: port 8000, SSE transport):

```bash
docker run -p 8000:8000 browser-mcp
```

You can override startup parameters via environment variables. Inside a container the server has to listen on `0.0.0.0`, otherwise the published port is not reachable from the host:

```bash
docker run -e MCP_PORT=9000 -e MCP_TRANSPORT=http -e MCP_HOST=0.0.0.0 -p 9000:9000 browser-mcp
```
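The environment variables listed under Configuration can be read with a small loader like the one below. This is only a sketch of one way to parse them, not the server's actual code: `BrowserServerSettings`, `load_settings`, and `_positive_int` are hypothetical names, and the fallback values simply reuse the numbers shown in the Configuration example (the README only confirms 1800 seconds as the TTL default).

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BrowserServerSettings:
    """Hypothetical container for the four documented environment variables."""

    maximum_instances: int
    instance_ttl: int
    execute_timeout: int
    cleanup_interval: int


def _positive_int(env: Mapping[str, str], name: str, default: int) -> int:
    """Read `name` from `env` as a positive integer, falling back to `default`."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = int(raw)  # raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {raw!r}")
    return value


def load_settings(env: Optional[Mapping[str, str]] = None) -> BrowserServerSettings:
    """Build settings from `env` (defaults to the process environment)."""
    env = os.environ if env is None else env
    # Fallbacks mirror the example values in the README; they are assumptions.
    return BrowserServerSettings(
        maximum_instances=_positive_int(env, "BROWSER_MAXIMUM_INSTANCES", 10),
        instance_ttl=_positive_int(env, "BROWSER_INSTANCE_TTL", 1800),
        execute_timeout=_positive_int(env, "BROWSER_EXECUTE_TIMEOUT", 30),
        cleanup_interval=_positive_int(env, "BROWSER_CLEANUP_INTERVAL", 60),
    )


if __name__ == "__main__":
    # Explicit mapping instead of os.environ, to keep the demo deterministic.
    print(load_settings({"BROWSER_MAXIMUM_INSTANCES": "5"}))
```

Passing an explicit mapping instead of relying on `os.environ` keeps the loader easy to test and makes invalid values fail at startup rather than at first use.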

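To make the TTL-based session expiration and the instance limit more concrete, here is a toy model of the bookkeeping involved. It is a sketch under stated assumptions, not the `SessionBrowserManager` implementation: `TTLSessionRegistry` and its methods are hypothetical, it assumes the TTL is measured from the last use of a session, and it tracks only timestamps rather than real browser instances.

```python
import time
from typing import Callable, Dict, List


class TTLSessionRegistry:
    """Toy registry: tracks last-use timestamps and expires idle sessions."""

    def __init__(
        self,
        max_instances: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_instances = max_instances
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._last_used: Dict[str, float] = {}

    def touch(self, session_id: str) -> None:
        """Register a new session or refresh an existing one's last-use time."""
        is_new = session_id not in self._last_used
        if is_new and len(self._last_used) >= self._max_instances:
            raise RuntimeError("maximum number of instances reached")
        self._last_used[session_id] = self._clock()

    def expire(self) -> List[str]:
        """Remove and return the sessions that have been idle longer than the TTL."""
        now = self._clock()
        expired = [sid for sid, ts in self._last_used.items() if now - ts > self._ttl]
        for sid in expired:
            del self._last_used[sid]
        return expired

    def active(self) -> List[str]:
        """Return the ids of sessions that are still registered."""
        return sorted(self._last_used)


if __name__ == "__main__":
    fake_now = [0.0]
    registry = TTLSessionRegistry(max_instances=2, ttl_seconds=1800, clock=lambda: fake_now[0])
    registry.touch("a")
    fake_now[0] = 1000.0
    registry.touch("b")
    fake_now[0] = 2000.0
    print(registry.expire())  # "a" has been idle for 2000 s, "b" for 1000 s
    print(registry.active())
```

In a real server, a periodic task (running every `BROWSER_CLEANUP_INTERVAL` seconds, for instance) would call something like `expire()` and close the browser behind each returned session id.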