by Euraxluo

Session-Based Browser-Use FastMCP Server



English

A Model Context Protocol (MCP) server that provides browser automation, built on the FastMCP framework. It features session-based instance management, TTL-based cleanup, PDF generation, file downloads, cookie management, and configurable browser options. All browser operations are implemented via browser-use.

🎯 Key Features

  • Session-Based Management: Each MCP session gets its own isolated browser instance automatically

  • Advanced Browser Control: Full browser automation with Playwright backend (via browser-use)

  • PDF Generation: Convert web pages to PDF with custom formatting options

  • File Operations: Download/upload files, manage file system, and access all temp files

  • Cookie Management: Set, get, and manage browser cookies for authentication

  • Screenshot Capture: Take full-page, viewport, or element screenshots

  • Tab Management: Create, switch, and close browser tabs

  • Content Extraction: Extract and search page content

  • Session Persistence: Automatic cleanup with configurable TTL

  • Multi-Instance Support: Run multiple isolated browser sessions

  • Configurable Security: All browser security settings are configurable via API

🚀 Quick Start

  1. Install Dependencies:

    Using uv (recommended):

    uv sync --all-extras
  2. Install the Browser:

    uv run playwright install --with-deps chromium
  3. Start the Server:

    Using uv (recommended):

    uv run main.py
  4. Basic Usage (Direct SessionBrowserManager):

    # Direct usage without MCP protocol (for testing/development)
    import asyncio

    from browser_fastmcp_server import SessionBrowserManager, BrowserConfig

    async def main():
        # Create session manager
        manager = SessionBrowserManager(max_instances=5, default_ttl=300)
        await manager.start_cleanup_task()

        # Create a new browser session
        session_id = "test_session_123"
        instance = await manager.get_or_create_session_instance(
            session_id, BrowserConfig(headless=True)
        )

        # Navigate to a website
        browser_session = instance.browser_session
        await browser_session.navigate("https://example.com")

        # Get page elements
        state_summary = await browser_session.get_state_summary(cache_clickable_elements_hashes=True)
        print(f"Interactive elements: {len(state_summary.selector_map)}")

        # Take a screenshot
        page = await browser_session.get_current_page()
        screenshot_bytes = await page.screenshot(full_page=True)

        # Close session when done
        await manager.close_session(session_id)
        await manager.shutdown()

    if __name__ == "__main__":
        asyncio.run(main())

🛠️ Run Tests

Install test dependencies and run all tests:

uv run python -m pytest test_browser_workflow_test.py test_browser_fastmcp_client.py test_browser_test.py -v

🛠️ Core Tools (API)

Session Management

  • create_chrome_instance(headless, viewport_width, viewport_height) → Create a new browser session, returns session_id

  • close_instance(session_id) → Close a specific session

  • get_instance_info(session_id) → Get info for a session

  • check_browser_health(session_id) → Check the health status of a browser session and provide recovery suggestions

  • get_browser_status() → List all sessions

  • close_all_instances() → Close all sessions

Browser Configuration

  • set_browser_config(session_id, headless, no_sandbox, user_agent, viewport_width, viewport_height, disable_web_security) → Update the browser config (the browser restarts automatically if needed)

  • get_browser_config(session_id) → Get current config

Navigation & Page Control

  • navigate_to(session_id, url, new_tab=False) → Go to any URL (optionally in new tab)

  • navigate_back(session_id) / navigate_forward(session_id) → History navigation

  • refresh_page(session_id) → Refresh the current page

  • get_page_state(session_id) → List interactive elements with indices

Tab Management

  • get_tabs_info(session_id) → List all open tabs

  • switch_tab(session_id, page_id) → Switch between tabs

  • close_tab(session_id, page_id) → Close specific tab

Element Interaction

  • click_element(session_id, index) → Click element by index

  • click_element_by_xpath(session_id, xpath) → Click element by XPath

  • input_text(session_id, index, text) → Type into form fields

  • set_element_value(session_id, index, value) → Set input/select value directly

  • get_element_info(session_id, index=None, xpath=None) → Get element info (by index or xpath)

  • send_keys(session_id, keys) → Send keyboard shortcuts

  • upload_file(session_id, index, file_path) → Upload files to forms

  • get_dropdown_options(session_id, index) → Inspect select elements
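
Because the element tools are index-based, a typical flow reads the page state first and then acts on the reported indices. The following is a minimal sketch of that flow, not the server's own code: `fill_form`, the `CallTool` alias, and the `record` stub are hypothetical names, and only the tool names and parameters come from the list above. Any coroutine with the shape of an MCP client's `call_tool` can be passed in.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical alias: any coroutine taking (tool_name, arguments),
# such as the call_tool method of an MCP client.
CallTool = Callable[[str, dict[str, Any]], Awaitable[Any]]


async def fill_form(call_tool: CallTool, session_id: str,
                    fields: dict[int, str], submit_index: int) -> None:
    """Type text into indexed fields, then click the submit element."""
    # Refresh the element indices before acting on them.
    await call_tool("get_page_state", {"session_id": session_id})
    for index, text in fields.items():
        await call_tool("input_text",
                        {"session_id": session_id, "index": index, "text": text})
    await call_tool("click_element",
                    {"session_id": session_id, "index": submit_index})


# Dry run with a stub that records calls instead of driving a browser.
calls: list[tuple[str, dict[str, Any]]] = []


async def record(tool: str, arguments: dict[str, Any]) -> None:
    calls.append((tool, arguments))


asyncio.run(fill_form(record, "demo", {3: "alice", 4: "secret"}, submit_index=5))
```

Injecting `call_tool` keeps the sequence testable without a browser; in real use the indices would come from the `get_page_state` result rather than being hard-coded.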

Media & Files

  • take_screenshot(session_id, target=None, width=None, height=None, full_page=True, quality=90, format="png") → Capture screenshots

  • generate_pdf(session_id, url=None, html_content=None, output_filename=None, ...) → Save page as PDF

  • download_file(session_id, url, output_filename=None, timeout=30) → Download files from URLs

  • download_image(session_id, image_url, output_filename=None, timeout=30) → Download images specifically

Cookie & Session Management

  • set_cookie(session_id, name, value, domain, path, http_only, secure, same_site, expires, max_age) → Set browser cookies

  • get_cookies(session_id, domain=None) → Retrieve current cookies
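
As an illustration of how the `set_cookie` parameters fit together, here is a small sketch that builds the argument payload for that tool. `cookie_arguments` is a hypothetical helper, and the accepted `same_site` values are assumed to be the standard cookie attribute values ("Strict", "Lax", "None"); the server may accept a different set.

```python
from typing import Any, Optional

# Assumption: the standard SameSite attribute values.
SAME_SITE_VALUES = ("Strict", "Lax", "None")


def cookie_arguments(session_id: str, name: str, value: str, domain: str, *,
                     path: str = "/", http_only: bool = True,
                     secure: bool = True, same_site: str = "Lax",
                     max_age: Optional[int] = None) -> dict[str, Any]:
    """Build the arguments dict for a set_cookie tool call."""
    if same_site not in SAME_SITE_VALUES:
        raise ValueError(f"same_site must be one of {SAME_SITE_VALUES}")
    if same_site == "None" and not secure:
        # Modern browsers reject SameSite=None cookies that lack Secure.
        raise ValueError("same_site='None' requires secure=True")
    arguments: dict[str, Any] = {
        "session_id": session_id,
        "name": name,
        "value": value,
        "domain": domain,
        "path": path,
        "http_only": http_only,
        "secure": secure,
        "same_site": same_site,
    }
    # Optional fields are left out instead of being sent as None.
    if max_age is not None:
        arguments["max_age"] = max_age
    return arguments


args = cookie_arguments("s1", "token", "abc", "example.com", max_age=3600)
```

The resulting dict can be passed as the arguments of a `set_cookie` tool call.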

Utilities

  • scroll_page(session_id, direction="down") → Scroll up/down

  • extract_content(session_id, query) → Extract text content

  • wait(seconds) → Pause execution

  • browser_tips() → Get automation best practices

  • search_bing(session_id, query) → Bing search

📚 Resources (REST-style)

  • browser://status → Manager and sessions status

  • browser://instances → All sessions info

  • browser://instance/{id}/page → Session page info

  • browser://instance/{id}/tabs → Session tabs

  • browser://instance/{id}/screenshots → Session screenshots

  • browser://instance/{id}/status → Session status (detailed)

  • browser://instance/{id}/files → Session temp files

  • browser://instance/{id}/cookies → Session cookies

  • browser://instance/{id}/file/{relative_path} → Read a file in session temp

  • browser://help → This help
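
The resource URIs above follow a regular shape, which a client can split into parts before deciding what to request. The parser below is a sketch for illustration only: `parse_resource` and the regular expression are hypothetical and are not the server's routing code; the URI templates are the ones listed above.

```python
import re
from typing import Optional

# Per-instance resource kinds taken from the list above.
_INSTANCE_KINDS = {"page", "tabs", "screenshots", "status", "files", "cookies", "file"}
_INSTANCE = re.compile(
    r"^browser://instance/(?P<id>[^/]+)/(?P<kind>[^/]+)(?:/(?P<rest>.+))?$"
)


def parse_resource(uri: str) -> Optional[dict]:
    """Split a browser:// URI into its parts, or return None if unrecognized."""
    if uri in ("browser://status", "browser://instances", "browser://help"):
        return {"kind": uri.removeprefix("browser://")}
    match = _INSTANCE.match(uri)
    if match is None or match["kind"] not in _INSTANCE_KINDS:
        return None
    parts = {"id": match["id"], "kind": match["kind"]}
    if match["kind"] == "file":
        # Only the file resource carries a trailing relative path.
        if match["rest"] is None:
            return None
        parts["relative_path"] = match["rest"]
    elif match["rest"] is not None:
        return None
    return parts
```

For example, `parse_resource("browser://instance/abc/file/pdf/report.pdf")` yields the session id `abc` and the relative path `pdf/report.pdf`.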

🔧 Configuration

Configure the server using environment variables:

# Maximum number of concurrent browser instances
BROWSER_MAXIMUM_INSTANCES=10

# Session TTL in seconds (default: 30 minutes)
BROWSER_INSTANCE_TTL=1800

# Command execution timeout in seconds
BROWSER_EXECUTE_TIMEOUT=30

# Cleanup interval in seconds
BROWSER_CLEANUP_INTERVAL=60
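
One way a Python process might read these variables is sketched below. `ServerSettings` and `load_settings` are hypothetical names, not part of the server. The fallback values reuse the example values shown above; only the 1800-second TTL is stated as a default in this README, so the other fallbacks are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ServerSettings:  # hypothetical name, not part of the server's API
    max_instances: int
    instance_ttl: int
    execute_timeout: int
    cleanup_interval: int


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read the BROWSER_* variables, falling back to the example values."""

    def read(name: str, fallback: int) -> int:
        value = int(env.get(name, fallback))
        if value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value}")
        return value

    # Assumption: fallbacks other than the TTL are taken from the example.
    return ServerSettings(
        max_instances=read("BROWSER_MAXIMUM_INSTANCES", 10),
        instance_ttl=read("BROWSER_INSTANCE_TTL", 1800),
        execute_timeout=read("BROWSER_EXECUTE_TIMEOUT", 30),
        cleanup_interval=read("BROWSER_CLEANUP_INTERVAL", 60),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check in isolation.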

📝 Prompts

Built-in prompts for common automation scenarios:

  • web_testing(url, test_scenario) → Web testing workflows

  • data_extraction(url, data_type) → Data extraction strategies

  • form_filling(url, form_data) → Automated form filling (returns conversation)

  • automation_troubleshooting() → Debugging help

🔌 MCP Integration

Using with Claude Desktop

  1. Add to Claude Desktop Configuration:

    Edit your Claude Desktop configuration file (usually at ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

    {
      "mcpServers": {
        "browser-mcp": {
          "command": "uv",
          "args": ["run", "fastmcp", "run", "/path/to/browser-mcp/browser_fastmcp_server.py"],
          "env": {
            "BROWSER_MAXIMUM_INSTANCES": "5",
            "BROWSER_INSTANCE_TTL": "1800"
          }
        }
      }
    }
  2. Restart Claude Desktop to load the MCP server

  3. Start Using: The browser automation tools will now be available in your Claude conversations

Using with MCP Client (Two Ways)

Method 1: Network-based MCP Client (via HTTP/SSE)

import asyncio
import json

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Connect to the running server over SSE
    async with sse_client("http://localhost:8000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            # Initialize session
            await session.initialize()

            # Start browser (assumes the tool returns its result as JSON text)
            result = await session.call_tool("create_chrome_instance", {"headless": True})
            session_id = json.loads(result.content[0].text)["session_id"]

            # Navigate to website
            await session.call_tool("navigate_to", {"session_id": session_id, "url": "https://example.com"})

            # Take screenshot
            await session.call_tool("take_screenshot", {"session_id": session_id})

            # Close session
            await session.call_tool("close_instance", {"session_id": session_id})

if __name__ == "__main__":
    asyncio.run(main())

Method 2: Direct Client (No Network)

import asyncio

from fastmcp import Client
from browser_fastmcp_server import mcp as browsers_mcp

async def main():
    # Direct client connection (no network)
    client = Client(browsers_mcp)
    async with client:
        # Start browser
        session = await client.call_tool("create_chrome_instance", {"headless": True})
        session_id = session.data.session_id

        # Navigate to website
        await client.call_tool("navigate_to", {"session_id": session_id, "url": "https://example.com"})

        # Take screenshot
        await client.call_tool("take_screenshot", {"session_id": session_id})

        # Close session
        await client.call_tool("close_instance", {"session_id": session_id})

if __name__ == "__main__":
    asyncio.run(main())

🔒 Authentication

For server deployments requiring authentication, modify main.py to set an AuthProvider before startup:

Basic Authentication:

from fastmcp.auth import BasicAuth

# Add this before mcp.run()
mcp.auth = BasicAuth(username="admin", password="password")

JWT Authentication (Recommended for Production):

For more advanced authentication, we recommend using fastmcp-authentication:

from fastmcp_authentication import BearerAuthProvider

JWKS_URI = "http://localhost:8080/.well-known/jwks.json"

auth = BearerAuthProvider(
    jwks_uri=JWKS_URI,
    issuer="http://localhost:8080",
    audience="localhost:8080",
    algorithm="RS256"
)
mcp.auth = auth

💡 Use Cases

  • Web Testing: Automated functional, security, and performance testing

  • Data Scraping: Extract structured data from websites

  • Form Automation: Fill and submit web forms programmatically

  • Content Monitoring: Track changes in web content

  • Screenshot Documentation: Capture visual evidence for reports

  • PDF Generation: Convert web pages to PDF documents

  • Session Management: Handle authenticated workflows

🔒 Security Features

  • Session isolation between MCP clients

  • Secure cookie management with HttpOnly and Secure flags

  • Configurable browser security settings (CORS, sandbox, etc.)

  • Automatic cleanup of temporary files

  • TTL-based session expiration
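
TTL-based expiration combined with an instance limit can be pictured with a small in-memory registry. This is a sketch under stated assumptions, not the server's implementation: `TTLRegistry` is a hypothetical class, and it assumes the TTL counts idle time since the last use, which this README does not confirm.

```python
import time
from typing import Callable


class TTLRegistry:
    """Tracks last-use times and reports sessions idle longer than the TTL."""

    def __init__(self, ttl: float, max_instances: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._max = max_instances
        self._clock = clock
        self._last_used: dict[str, float] = {}

    def touch(self, session_id: str) -> None:
        """Register a session or refresh its last-use time."""
        is_new = session_id not in self._last_used
        if is_new and len(self._last_used) >= self._max:
            raise RuntimeError("instance limit reached")
        self._last_used[session_id] = self._clock()

    def expire(self) -> list[str]:
        """Remove and return the ids of sessions idle longer than the TTL."""
        now = self._clock()
        expired = [sid for sid, seen in self._last_used.items()
                   if now - seen > self._ttl]
        for sid in expired:
            del self._last_used[sid]
        return expired
```

A periodic task would call `expire()` once per cleanup interval and close the browser behind each returned id; the injectable `clock` lets the logic be exercised without waiting.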

🐳 Docker Usage

Build the image:

docker build -t browser-mcp .

Run the server (default: port 8000, SSE transport):

docker run -p 8000:8000 browser-mcp

You can override startup parameters via environment variables:

docker run -e MCP_PORT=9000 -e MCP_TRANSPORT=http -e MCP_HOST=127.0.0.1 -p 9000:9000 browser-mcp

