The Browser Agent MCP provides Claude Desktop with autonomous browser automation and API interaction capabilities for complex web tasks and data retrieval.
Browser Automation:
Navigate to URLs with configurable timeouts and wait conditions
Capture full-page or element-specific screenshots with optional masking and custom naming
Interact with web elements through clicks, hovers, form filling, and dropdown selections
Execute arbitrary JavaScript code directly in the browser context
Adjust viewport size, height, width, and device scale factor
API Interaction:
Perform all standard HTTP requests (GET, POST, PUT, PATCH, DELETE) with custom headers
Send and receive JSON data through API endpoints
Advanced Features:
Maintain persistent browser sessions across multiple commands
Chain multiple operations for complex workflows with intelligent error recovery
Access browser console logs via the browser://logs resource
Retrieve screenshots by name using the screenshot://[name] resource
Provide detailed error information for debugging and recovery
Supports browser automation in Firefox, allowing for navigation, DOM manipulation, form filling, and JavaScript execution through Playwright.
Demonstrated capability to navigate to Google, perform searches, and interact with search results through browser automation.
Enables execution of arbitrary JavaScript code in browser context, with the ability to capture console logs and return results.
Provides automation capabilities for Safari's WebKit engine, enabling navigation, screenshots, DOM interactions, and JavaScript execution.
Demonstrated capability to navigate to Wikipedia, perform searches, and interact with content through browser automation.
Referenced in the demo section with timestamps linking to a YouTube demonstration video.
MCP Browser Agent
Features
Advanced Browser Automation
Navigate to any URL with customizable load strategies
Capture full-page or element-specific screenshots
Perform precise DOM interactions (click, fill, select, hover)
Execute arbitrary JavaScript in browser context with console logs capture
Powerful API Client
Execute HTTP requests (GET, POST, PUT, PATCH, DELETE)
Configure request headers and body content
Process response data with JSON formatting
Error handling with detailed feedback
MCP Resource Management
Access browser console logs as resources
Retrieve screenshots through MCP resource interface
Persistent session with headful browser instance
AI Agent Capabilities
Chain multiple browser operations for complex tasks
Follow multi-step instructions with intelligent error recovery
Technical task automation through natural language instructions
Demo
Click on any timestamp to jump to that section of the video
00:00 - Google Search for MCP
Navigation to Google homepage and search for "Model Context Protocol". Demonstration of Claude Desktop using the MCP integration to perform a basic web search and process the results.
00:33 - Screenshot Capture
Taking a screenshot of the search results with a custom filename and showcasing it in Finder. Shows how Claude can capture and save visual content from web pages during browser automation.
01:00 - Wikipedia Search
Navigation to Wikipedia.org and search for "Model Context Protocol". Illustrates Claude's ability to interact with different websites and their search functionality through the MCP integration.
01:38 - Dropdown Menu Interaction I
Navigation to a test website (the-internet.herokuapp.com/dropdown) and selection of "Option 1" from a dropdown menu. Demonstrates Claude's capability to interact with form elements and make selections.
01:56 - Dropdown Menu Interaction II
Changing the selection to "Option 2" from the same dropdown menu. Shows Claude's ability to manipulate the same form element multiple times and make different selections.
02:09 - Login Form Completion
Navigation to a login page (the-internet.herokuapp.com/login) and filling in the username field with "tomsmith" and password field with "SuperSecretPassword!". Demonstrates form filling automation.
02:28 - Login Submission
Submitting the login credentials and completing the authentication process. Shows Claude's ability to trigger form submissions and navigate through multi-step processes.
02:36 - API Request Execution
Performing a GET request to JSONPlaceholder API endpoint. Demonstrates Claude's capability to make direct API calls and process the returned data through the MCP integration.
Requirements
Node.js 16 or higher
Claude Desktop
Playwright dependencies
Browser Support
This package includes Playwright and the necessary dependencies for running browser automation. When you run npm install, the required Playwright dependencies will be installed. The package supports the following browsers:
Chrome (default)
Firefox
Microsoft Edge
WebKit (Safari engine)
When you first use a browser type, Playwright will automatically install the corresponding browser drivers as needed. You can also install them manually with the following commands:
Note about Safari: Playwright doesn't provide direct support for Safari browser. Instead, it uses WebKit, which is the browser engine that powers Safari.
Note about Edge: When selecting Edge as the browser type, the agent will actually launch Microsoft Edge (not Chromium). Technically, in Playwright, Edge is launched using the Chromium browser instance with the 'msedge' channel parameter because Microsoft Edge is based on Chromium.
Installation
Installing Manually
Clone or download this repository:
Install dependencies:
Build the project:
Running the MCP Server
There are two ways to run the MCP server:
Option 1: Running manually
Open a terminal or command prompt
Navigate to the project directory
Run the server directly:
Keep this terminal window open while using Claude Desktop. The server will run until you close the terminal.
Option 2: Auto-starting with Claude Desktop (recommended for regular use)
Claude Desktop can automatically start the MCP server when needed. To set this up:
Configuration
The Claude Desktop configuration file is located at:
macOS:
~/Library/Application Support/Claude/claude_desktop_config.json
Windows:
%APPDATA%\Claude\claude_desktop_config.json
Linux:
~/.config/Claude/claude_desktop_config.json
Edit this file to add the Browser Agent MCP configuration. If the file doesn't exist, create it:
Important: Replace ABSOLUTE_PATH_TO_DIRECTORY with the complete absolute path where you installed the MCP.
macOS/Linux example:
/Users/username/mcp-browser-agent
Windows example:
C:\\Users\\username\\mcp-browser-agent
If you already have other MCPs configured, simply add the "browserAgent" section inside the "mcpServers" object. Here's an example of a configuration with multiple MCPs:
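The merge described above can be sketched in TypeScript. This is only an illustration, not the project's own tooling: the addBrowserAgent helper, the otherServer entry, and the assumption that the server is started with node against dist/index.js inside the install directory are all hypothetical.

```typescript
// Hypothetical shape of one entry under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Returns a copy of `config` with a "browserAgent" entry added.
// Entries for other MCP servers are kept as they are; the input is not mutated.
function addBrowserAgent(config: ClaudeDesktopConfig, installDir: string): ClaudeDesktopConfig {
  // Assumption: the built entry point is dist/index.js inside the install directory.
  const entryPoint = `${installDir.replace(/[\\\/]+$/, "")}/dist/index.js`;
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      browserAgent: { command: "node", args: [entryPoint] },
    },
  };
}

// Example: a config that already lists another (made-up) server.
const existing: ClaudeDesktopConfig = {
  mcpServers: { otherServer: { command: "node", args: ["/path/to/other/server.js"] } },
};
const merged = addBrowserAgent(existing, "/Users/username/mcp-browser-agent");

// Text that could be written back to claude_desktop_config.json.
const mergedJson = JSON.stringify(merged, null, 2);
```

The point of the sketch is that "browserAgent" sits next to any existing keys inside "mcpServers" rather than replacing the object.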
Browser Selection
The MCP Browser Agent supports multiple browser types. By default, it uses Chrome, but you can specify a different browser in several ways:
Option 1: Configuration File
Create or edit the file .mcp_browser_agent_config.json in your home directory:
Supported values for browserType are:
chrome - Uses installed Chrome (default)
firefox - Uses Firefox 'Nightly' browser
webkit - Uses WebKit engine (Note: This is not Safari itself but the WebKit rendering engine that powers Safari)
edge - Uses Microsoft Edge
Note about Safari: Playwright doesn't provide direct support for Safari browser. Instead, it uses WebKit, which is the browser engine that powers Safari. The WebKit implementation in Playwright provides similar functionality but is not identical to the Safari browser experience.
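As a rough sketch of how the four browserType values could map onto Playwright, the function below returns an engine plus an optional channel. The names BrowserType, LaunchTarget, and toLaunchTarget are hypothetical, and the actual mapping inside the agent may differ; the "chrome" and "msedge" channel names are Playwright's own.

```typescript
type BrowserType = "chrome" | "firefox" | "webkit" | "edge";

interface LaunchTarget {
  // Playwright ships three engines; branded browsers are Chromium plus a channel.
  engine: "chromium" | "firefox" | "webkit";
  channel?: "chrome" | "msedge";
}

// Maps a configured browserType to the engine/channel pair used at launch.
function toLaunchTarget(browserType: BrowserType): LaunchTarget {
  switch (browserType) {
    case "chrome":
      return { engine: "chromium", channel: "chrome" }; // installed Google Chrome
    case "edge":
      return { engine: "chromium", channel: "msedge" }; // installed Microsoft Edge
    case "firefox":
      return { engine: "firefox" };
    case "webkit":
      return { engine: "webkit" }; // WebKit engine, not Safari itself
  }
}

const edgeTarget = toLaunchTarget("edge");
```

With Playwright installed, a target like this could be passed along as playwright[target.engine].launch({ channel: target.channel, headless: false }), which is how Edge ends up being launched through the Chromium instance.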
Option 2: Command Line Argument
When starting the MCP server manually, you can specify the browser type:
Option 3: Environment Variable
Set the MCP_BROWSER_TYPE environment variable:
Option 4: Claude Desktop Configuration
When configuring the MCP in Claude Desktop's claude_desktop_config.json, you can specify the browser type:
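With several places to set the browser type, some precedence order is needed. The sketch below assumes command line first, then MCP_BROWSER_TYPE, then the config file, then the documented default of chrome. That order is an assumption made for illustration, as are the names BrowserSources and resolveBrowserType; check the project source for the real behaviour.

```typescript
const SUPPORTED = ["chrome", "firefox", "webkit", "edge"] as const;
type SupportedBrowser = (typeof SUPPORTED)[number];

interface BrowserSources {
  cliValue?: string;  // value taken from a command-line argument, if any
  envValue?: string;  // value of MCP_BROWSER_TYPE, if set
  fileValue?: string; // "browserType" from .mcp_browser_agent_config.json, if present
}

// Returns the matching supported name, or undefined for missing/unsupported input.
function normalize(value: string | undefined): SupportedBrowser | undefined {
  const candidate = value?.trim().toLowerCase();
  return SUPPORTED.find((name) => name === candidate);
}

// Assumed precedence: CLI, then environment, then config file, then "chrome".
// Unsupported values are skipped instead of raising an error.
function resolveBrowserType(sources: BrowserSources): SupportedBrowser {
  return (
    normalize(sources.cliValue) ??
    normalize(sources.envValue) ??
    normalize(sources.fileValue) ??
    "chrome"
  );
}

const chosen = resolveBrowserType({ envValue: "firefox", fileValue: "webkit" });
```

Skipping unsupported values keeps the server usable when one source contains a typo; failing loudly would be an equally reasonable design.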
Technical Implementation
MCP Browser Agent is built on the Model Context Protocol, enabling Claude to interact with a headful browser through Playwright. The implementation consists of four main components:
Server (index.ts)
Initializes the MCP server using the standard Model Context Protocol
Configures server capabilities for tools and resources
Establishes communication with Claude through the stdio transport
Tools Registry (tools.ts)
Defines browser and API tool schemas
Specifies parameters, validation rules, and descriptions
Registers tools with the MCP server for Claude's discovery
Request Handlers (handlers.ts)
Manages MCP protocol requests for tools and resources
Exposes browser logs and screenshots as queryable resources
Routes tool execution requests to the appropriate handlers
Executor (executor.ts)
Manages browser and API client lifecycle
Implements browser automation functions using Playwright
Handles API requests with proper error handling and response parsing
Maintains stateful browser session between commands
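The split between registry, handlers, and executor can be pictured with a small dependency-free sketch. It does not use the MCP SDK or Playwright, and every name in it (ToolRegistry, ToolSchema, example_navigate) is hypothetical; it only shows the register, validate, route, and report-errors flow that the four components describe.

```typescript
interface ToolSchema {
  name: string;
  description: string;
  required: string[]; // names of parameters that must be present
}

type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<unknown>;
type ToolResult = { ok: true; value: unknown } | { ok: false; error: string };

class ToolRegistry {
  private tools = new Map<string, { schema: ToolSchema; run: ToolHandler }>();

  // Registry role: record the schema together with its implementation.
  register(schema: ToolSchema, run: ToolHandler): void {
    this.tools.set(schema.name, { schema, run });
  }

  // What a client would see when it asks which tools exist.
  list(): ToolSchema[] {
    return Array.from(this.tools.values()).map((tool) => tool.schema);
  }

  // Handler role: validate the request, route it, and turn failures into data.
  async call(name: string, args: ToolArgs): Promise<ToolResult> {
    const tool = this.tools.get(name);
    if (!tool) return { ok: false, error: `Unknown tool: ${name}` };
    const missing = tool.schema.required.filter((key) => args[key] === undefined);
    if (missing.length > 0) {
      return { ok: false, error: `Missing required parameter(s): ${missing.join(", ")}` };
    }
    try {
      return { ok: true, value: await tool.run(args) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  }
}

const registry = new ToolRegistry();
registry.register(
  { name: "example_navigate", description: "Navigate to a URL", required: ["url"] },
  // Executor role: a stand-in for real browser automation code.
  async (args) => `navigated to ${String(args.url)}`,
);
```

Returning errors as values instead of throwing is one way to give the model detailed feedback it can act on in the next step.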
Agent Capabilities
Unlike basic integrations, MCP Browser Agent functions as a true AI agent by:
Maintaining persistent browser state across multiple commands
Capturing detailed console logs for debugging
Storing screenshots for reference and review
Managing complex interaction sequences
Providing detailed error information for recovery
Supporting chained operations for complex workflows
Available Tools
Browser Tools
Tool Name | Description | Parameters |
| Navigate to a URL | 1 required, 2 optional |
| Capture screenshot | 1 required, 4 optional |
| Click element | 1 required |
| Fill form input | 2 required |
| Select dropdown option | 2 required |
| Hover over element | 1 required |
| Execute JavaScript | 1 required |
API Tools
Tool Name | Description | Parameters |
| GET request | 1 required, 1 optional |
| POST request | 2 required, 1 optional |
| PUT request | 2 required, 1 optional |
| PATCH request | 2 required, 1 optional |
| DELETE request | 1 required, 1 optional |
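To make the parameter pattern of the API tools concrete, here is a minimal sketch that turns tool arguments into request options. The parameter names url, data, and headers and the helper prepareRequest are hypothetical, chosen to match the required/optional counts in the table; the agent's real parameter names may differ.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "PATCH" | "DELETE";

interface ApiToolArgs {
  url: string;                      // required for every method
  data?: unknown;                   // hypothetical name for the payload
  headers?: Record<string, string>; // optional custom headers
}

interface PreparedRequest {
  url: string;
  method: HttpMethod;
  headers: Record<string, string>;
  body?: string;
}

// Validates the arguments and builds options suitable for an HTTP client.
function prepareRequest(method: HttpMethod, args: ApiToolArgs): PreparedRequest {
  const needsBody = method === "POST" || method === "PUT" || method === "PATCH";
  if (needsBody && args.data === undefined) {
    throw new Error(`${method} requires a request body`);
  }
  const headers: Record<string, string> = { ...(args.headers ?? {}) };
  const request: PreparedRequest = { url: args.url, method, headers };
  if (needsBody) {
    // Default to JSON unless the caller already chose a content type.
    const hasContentType = Object.keys(headers).some(
      (key) => key.toLowerCase() === "content-type",
    );
    if (!hasContentType) headers["Content-Type"] = "application/json";
    request.body = typeof args.data === "string" ? args.data : JSON.stringify(args.data);
  }
  return request;
}

const postRequest = prepareRequest("POST", {
  url: "https://example.com/items",
  data: { title: "demo" },
});
```

On Node.js 18 or later, the result could be sent with the built-in fetch(postRequest.url, postRequest); older Node versions would need an HTTP client library.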
Resource Access
The MCP Browser Agent exposes the following resources:
browser://logs - Access browser console logs
screenshot://[name] - Access screenshots by name
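One way to picture these two resources is an in-memory store that is filled during browser commands and read back by URI. The sketch below is an assumption about the general shape, not the agent's code; ResourceStore, its method names, and the base64 screenshot representation are hypothetical.

```typescript
type Resource =
  | { kind: "logs"; text: string }
  | { kind: "screenshot"; name: string; base64Png: string };

class ResourceStore {
  private logs: string[] = [];
  private screenshots = new Map<string, string>();

  // Called whenever the page emits a console message.
  appendLog(line: string): void {
    this.logs.push(line);
  }

  // Called after a screenshot is captured under a given name.
  saveScreenshot(name: string, base64Png: string): void {
    this.screenshots.set(name, base64Png);
  }

  // Resolves browser://logs and screenshot://[name]; anything else is undefined.
  read(uri: string): Resource | undefined {
    if (uri === "browser://logs") {
      return { kind: "logs", text: this.logs.join("\n") };
    }
    const prefix = "screenshot://";
    if (uri.startsWith(prefix)) {
      const name = uri.slice(prefix.length);
      const base64Png = this.screenshots.get(name);
      return base64Png === undefined ? undefined : { kind: "screenshot", name, base64Png };
    }
    return undefined;
  }
}

const store = new ResourceStore();
store.appendLog("[log] page loaded");
store.saveScreenshot("results", "AAAA"); // placeholder data, not a real PNG
const logsResource = store.read("browser://logs");
```

Because the store lives as long as the server process, logs and screenshots stay available across multiple commands in the same session.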
Example Usage
Here are some realistic examples of how to use the MCP Browser Agent with Claude:
Basic Browser Navigation
Simple Interactions
Basic Form Filling
Simple JavaScript Execution
Basic API Requests
These examples represent the actual capabilities of the MCP Browser Agent and are more realistic about what it can accomplish in its current state.
Troubleshooting
"Server disconnected" error
If you see the error "MCP Browser Agent: Server disconnected" in Claude Desktop:
Verify the server is running:
Open a terminal and manually run node dist/index.js from the project directory
If the server starts successfully, use Claude while keeping this terminal open
Check your configuration:
Ensure the absolute path in claude_desktop_config.json is correct for your system
Double-check that you've used double backslashes (\\) for Windows paths
Verify you're using the complete path from the root of your filesystem
Browser not appearing
If the browser doesn't launch or you don't see it:
Check if the specified browser is installed
Verify that you have the browser (Chrome, Firefox, Edge, or Safari/WebKit) installed on your system
The browser drivers are handled automatically by Playwright
Restart the server and Claude Desktop
Kill any existing node processes that might be running the server
Restart Claude Desktop to establish a fresh connection
Browser process not closing properly
There are known issues with Chromium and Chrome browsers where the process sometimes doesn't terminate properly after use. If you experience this issue:
Manually close the browser process:
Windows: Press Ctrl+Shift+Esc to open Task Manager, find the Chrome/Chromium process and end it
macOS: Open Activity Monitor (Applications > Utilities > Activity Monitor), find the Chrome/Chromium process and click the X to terminate it
Linux: Run ps aux | grep chrome or ps aux | grep chromium to find the process, then kill <PID> to terminate it
Note about browser compatibility:
This issue has been observed primarily with Chromium and Chrome
Firefox and Playwright's built-in browser don't typically experience this problem
This MCP integration is built on Playwright, which has known issues and bugs that may affect its operation. Please report any browser automation issues you encounter to Playwright's GitHub issues. The Playwright team is continuously working to address them; despite these limitations, this agent provides a foundation for browser automation with Claude Desktop.
Development
Project Structure
src/index.ts: Main entry point and MCP server initialization
src/tools.ts: Tool schemas and registration
src/handlers.ts: MCP request handlers for tools and resources
src/executor.ts: Tool implementation logic using Playwright
Building
Watching for Changes
Testing
The project includes tests to verify core functionality and browser handling.
Tests verify configuration integrity, browser automation features, error handling, and process cleanup. The test suite focuses particularly on ensuring proper handling of browser processes due to known issues with Chrome/Chromium termination.
Security Considerations
This MCP integration provides Claude with autonomous browser control capabilities. Please review our Security Policy for important information about prohibited uses, security implications, and best practices.
The MCP Browser Agent is designed for legitimate automation tasks but could potentially be misused. Users are responsible for ensuring their use complies with all applicable laws, terms of service, and ethical guidelines. See our detailed Security Policy for more information.
Contributing
Contributions to the MCP Browser Agent are welcome! Here are some areas where you can help:
Adding new browser automation capabilities
Improving error handling and recovery
Enhancing screenshot and resource management
Creating useful workflows and examples
Optimizing performance for complex operations
License
This project is licensed under the Mozilla Public License 2.0 - see the LICENSE file for details.