BrowserGenie MCP Server
BrowserGenie MCP Server gives AI models full control over a Chrome browser via 50+ tools covering automation, inspection, and auditing.
Navigation: Navigate to URLs, go back/forward, reload pages (with optional cache bypass).
Tab Management: List, open, close, and switch between tabs.
Keyboard Input: Press keys with modifier support (Ctrl, Shift, Alt, Meta); type text character by character with configurable delays.
Mouse & Interaction: Click by CSS selector, XPath, or coordinates (single/double); type into fields; drag and drop; hover to trigger states.
Touch Gestures: Simulate swipes, long presses, pinches, and double taps (requires mobile viewport).
Screenshots: Capture the visible viewport or full scrollable page in PNG or JPEG.
DevTools – Sources: Read full page HTML, CSS, JavaScript sources, and all loaded resources.
DevTools – Modify: Live DOM mutation (set HTML/attributes, remove elements) and inline CSS style changes.
DevTools – Network: Collect and filter request/response logs by URL, method, status, or type; inspect individual requests; clear logs.
DevTools – Storage: CRUD operations for cookies, localStorage, and sessionStorage.
DevTools – Console: Retrieve console messages filtered by level; execute arbitrary JavaScript in the page context.
Accessibility & Auditing: Generate accessibility trees, run WCAG/axe-core audits, check color contrast, tab order, focus paths, performance metrics, font loading, broken resources, security headers, and cookie banner detection.
Element Inspection: Find elements by text, role, CSS, or XPath; get element state (visible, enabled, focused); query Shadow DOM; retrieve computed styles.
QA & Assertions: Assert element conditions, no console/network errors, CSS properties, network requests made, page load times; check form validity; emulate network conditions; intercept requests; snapshot/restore page state; wait for conditions; stress test refreshes; get unified diagnostics.
Interaction Inspection: Hover and inspect DOM/style changes, force pseudo-states (hover, focus, active), extract tooltip text.
Macro Recording: Record and replay user interactions (clicks, typing, and other changes).
Visual Regression: Compare two screenshots pixel-by-pixel for differences.
An MCP (Model Context Protocol) server that gives AI models full control over a Chrome browser. It pairs with the BrowserGenie Extension to expose 50+ browser automation tools over stdio — navigation, clicking, typing, screenshots, touch gestures, macro recording, and complete DevTools access.
Two-repo setup: This is the server half. The Chrome extension lives in a separate repository. Both are required.
How It Works
AI Client (Claude, Cursor, etc.)
│ stdio (JSON-RPC / MCP)
▼
MCP Server ◄── this repo
│ WebSocket ws://localhost:7890
▼
Chrome Extension
├── chrome.tabs → Navigation, tab management
├── chrome.debugger → DevTools Protocol (CDP)
├── chrome.scripting → Content script injection
├── chrome.cookies → Cookie management
└── Content Scripts → Real DOM event simulation
The MCP server bridges your AI client (over stdio) and the Chrome extension (over WebSocket). Every MCP tool call is forwarded to the extension and executed in the browser, and the result is returned to the AI client.
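Because many tool calls can be in flight over one WebSocket, the bridge has to match each reply to the call that caused it. The sketch below shows one common way to do that with numeric ids. It is an illustration only: `BridgeRequest`, `BridgeResponse`, and `PendingRequests` are hypothetical names, and the real message shapes in `src/types.ts` and `src/websocket-bridge.ts` may differ.

```typescript
// Hypothetical message shapes, assumed for illustration only.
interface BridgeRequest {
  id: number;
  tool: string;
  params: Record<string, unknown>;
}

interface BridgeResponse {
  id: number;
  result?: unknown;
  error?: string;
}

type Settle = (response: BridgeResponse) => void;

class PendingRequests {
  private nextId = 1;
  private readonly pending = new Map<number, Settle>();

  // Build the outgoing message and remember who is waiting for its reply.
  create(tool: string, params: Record<string, unknown>, onSettle: Settle): BridgeRequest {
    const id = this.nextId++;
    this.pending.set(id, onSettle);
    return { id, tool, params };
  }

  // Route an incoming reply to its waiter. Returns false for unknown ids.
  settle(response: BridgeResponse): boolean {
    const onSettle = this.pending.get(response.id);
    if (onSettle === undefined) return false;
    this.pending.delete(response.id);
    onSettle(response);
    return true;
  }

  // Number of calls still waiting for a reply from the extension.
  pendingCount(): number {
    return this.pending.size;
  }
}
```

In a setup like this, the object returned by `create` would be serialized and sent to the extension, and every message coming back would be passed to `settle`.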
Requirements
Node.js 18+
npm
The companion BrowserGenie Extension installed in Chrome
Installation
Option A: npx (recommended)
No installation required. Run directly from npm:
npx browser-genie-mcp-server
Or install globally:
npm install -g browser-genie-mcp-server
browser-genie-mcp-server
Option B: Run from source
For development or to use the latest unreleased changes:
git clone https://github.com/BrowserGenie/mcp.git
cd mcp
npm install
npm run build
node dist/index.js
Configuration
Add the server to your AI client's MCP configuration.
Claude Desktop
File: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
File: %APPDATA%\Claude\claude_desktop_config.json (Windows)
{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}
Claude Code
File: ~/.claude/settings.json or project-level .mcp.json:
{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}
Cursor / other MCP clients
{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"]
    }
  }
}
If you installed the package globally with npm install -g browser-genie-mcp-server, you can use "command": "browser-genie-mcp-server" with no args instead.
After saving the config, restart your AI client.
Environment Variables
Variable | Default | Description |
WEBSOCKET_PORT | 7890 | Port the WebSocket server listens on. The extension must use the same port. |
Pass via the MCP config env block:
{
  "mcpServers": {
    "browser-genie": {
      "command": "npx",
      "args": ["browser-genie-mcp-server"],
      "env": { "WEBSOCKET_PORT": "8080" }
    }
  }
}
If you change the port, also update WEBSOCKET_URL in constants.ts inside the extension repo.
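As a rough sketch of how a server could turn this variable into a port number, the function below falls back to 7890 (the port shown in the How It Works diagram) when `WEBSOCKET_PORT` is unset. The function name and the validation rules are assumptions for illustration, not the server's actual code.

```typescript
// Default taken from ws://localhost:7890 in the architecture diagram.
const DEFAULT_WEBSOCKET_PORT = 7890;

// Hypothetical helper: reads WEBSOCKET_PORT from an env-like object.
function resolveWebSocketPort(env: Record<string, string | undefined>): number {
  const raw = env["WEBSOCKET_PORT"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_WEBSOCKET_PORT;

  const port = Number(raw);
  // Rejecting out-of-range values is an assumption; the real server may behave differently.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid WEBSOCKET_PORT: ${raw}`);
  }
  return port;
}
```

With the config above, `resolveWebSocketPort({ WEBSOCKET_PORT: "8080" })` yields 8080, and an empty environment yields 7890.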
Verifying the Connection
Once the server is running and the extension is loaded in Chrome, click the extension icon. The popup should show a green Connected indicator. If it shows Disconnected, ensure the MCP server process is running.
MCP Tools Reference
Every tool accepts an optional apiKey (string) when authentication is enabled in the extension popup, and an optional tabId (number) to target a specific tab (defaults to the active tab).
Tip: If you see "No active tab found" or "Cannot access a chrome:// URL", use list_tabs to find a valid tab ID, or navigate to a regular web page first.
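To make the shared parameters concrete, here is a minimal sketch of building MCP `tools/call` requests that carry the optional `apiKey` and `tabId`. The JSON-RPC envelope follows the MCP specification. The `buildToolCall` helper is hypothetical, and most AI clients construct these requests for you.

```typescript
interface CommonArgs {
  apiKey?: string; // only needed when authentication is enabled in the extension popup
  tabId?: number;  // omit to target the active tab
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: merges tool-specific arguments with the common ones.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
  common: CommonArgs = {},
): ToolCallRequest {
  const merged: Record<string, unknown> = { ...args };
  if (common.apiKey !== undefined) merged.apiKey = common.apiKey;
  if (common.tabId !== undefined) merged.tabId = common.tabId;
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: merged } };
}

// List tabs first, then target one of them by id (42 is a made-up tab id).
const listTabsCall = buildToolCall(1, "list_tabs");
const tabOrderCall = buildToolCall(2, "get_tab_order", {}, { tabId: 42 });
```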
Navigation
Tool | Description | Key Parameters |
| Navigate to a URL |
|
| Go back in history | — |
| Go forward in history | — |
| Reload the page |
|
Tab Management
Tool | Description | Key Parameters |
| List all open tabs | — |
| Focus a tab |
|
| Open a new tab |
|
| Close a tab |
|
| Capture URL, title, and DOM hash for state comparison | — |
| Verify two tabs have identical state |
|
| Test cross-tab localStorage sync |
|
Keyboard
Tool | Description | Key Parameters |
| Press a key with optional modifiers |
|
| Type text character by character |
|
Mouse & Interaction
Tool | Description | Key Parameters |
| Click via coordinates, CSS, or XPath with human-like Bézier curve movement and randomized delays |
|
| Click a field, optionally clear it, then type text with per-character jitter |
|
| Drag from one point to another |
|
| Hover to trigger CSS states / tooltips with randomized dwell time |
|
Click behavior:
doubleClick: true fires two rapid clicks at the same position. The element's own handlers determine focus and selection behavior; the tool does not select text or set focus beyond what the browser does natively.
Touch Gestures
Tool | Description | Key Parameters |
| Touch swipe from point A to B with configurable duration |
|
| Long-press on an element or coordinates |
|
| Pinch zoom with two-finger convergence/divergence |
|
| Double-tap for mobile interactions (zoom, edit) |
|
Ensure the viewport is set to mobile with touch: true via resize_viewport or emulate_device before using touch gestures.
Screenshots
Tool | Description | Key Parameters |
| Capture visible viewport |
|
| Capture full scrollable page |
|
Both return an image content block.
DevTools — Sources
Tool | Description | Key Parameters |
| Full outerHTML of the page | — |
| CSS stylesheet sources |
|
| JavaScript sources |
|
| List all resources with URLs & sizes |
|
| Search regex pattern across HTML and all loaded scripts |
|
DevTools — Modify
Tool | Description | Key Parameters |
| Live DOM mutation |
|
| Set inline styles |
|
DevTools — Network
Tool | Description | Key Parameters |
| Get collected request/response logs |
|
| Full details of one request |
|
| Clear collected logs | — |
| Get only failed/errored requests (4xx, 5xx, failed) |
|
Network logs are collected only from the moment the debugger attaches. To capture specific traffic, call any DevTools tool first so the debugger is attached before that traffic occurs.
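The filtering described in the table can be pictured with the sketch below. The `NetworkLogEntry` shape and both function names are hypothetical; the actual fields returned by the network tools may be named differently.

```typescript
// Hypothetical log entry shape, assumed for illustration only.
interface NetworkLogEntry {
  url: string;
  method: string;
  status?: number;  // undefined when the request never received a response
  failed?: boolean; // true when the request errored at the network level
}

// A request counts as failed if it errored or returned a 4xx/5xx status.
function isFailedRequest(entry: NetworkLogEntry): boolean {
  if (entry.failed === true) return true;
  return entry.status !== undefined && entry.status >= 400 && entry.status <= 599;
}

// Keep only entries matching every filter field that is provided.
function filterLogs(
  entries: NetworkLogEntry[],
  filter: { urlIncludes?: string; method?: string; status?: number },
): NetworkLogEntry[] {
  return entries.filter((e) =>
    (filter.urlIncludes === undefined || e.url.includes(filter.urlIncludes)) &&
    (filter.method === undefined || e.method.toUpperCase() === filter.method.toUpperCase()) &&
    (filter.status === undefined || e.status === filter.status)
  );
}
```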
DevTools — Storage
Tool | Description |
| Cookie CRUD |
| localStorage |
| sessionStorage |
DevTools — Console
Tool | Description | Key Parameters |
| Retrieve console messages by level |
|
| Run JS in the page and return result |
|
Accessibility & Auditing
Tool | Description | Key Parameters |
| Text-based accessibility tree snapshot | — |
| Raw accessibility tree as JSON |
|
| Run axe-core WCAG audit |
|
| Check text contrast ratios |
|
| Static list of focusable elements in tab order with unique selectors | — |
| Interactive — simulate Tab presses and record where focus lands, flagging invisible targets |
|
| Single snapshot of navigation timing, LCP, CLS, FID, memory | — |
| Start/stop/get timeline recording of memory, LCP, CLS over time |
|
| Verify web font loading status | — |
| Find broken images, stylesheets, fonts | — |
| Inspect CSP, HSTS, X-Frame-Options, etc. |
|
| Detect cookie consent banners and CMP patterns | — |
When to use get_tab_order vs record_focus_path: Use get_tab_order for a one-time snapshot of all focusable elements and their order. Use record_focus_path to verify the actual focus behavior during keyboard navigation. It presses Tab repeatedly and records where focus lands, which catches focus traps and focus landing on invisible or hidden elements.
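For the contrast check listed in the table above, it helps to know what a contrast ratio is. The sketch below implements the WCAG 2.x formula (relative luminance of each color, then `(lighter + 0.05) / (darker + 0.05)`). It illustrates the standard calculation and is not taken from this server or from axe-core; the function names are made up for this example.

```typescript
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0 to 255

// WCAG 2.x relative luminance of an sRGB color.
function relativeLuminance([r, g, b]: Rgb): number {
  const linear = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// Contrast ratio between two colors, from 1 (identical) to 21 (black on white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// WCAG AA requires at least 4.5:1 for normal text and 3:1 for large text.
function meetsAA(ratio: number, largeText = false): boolean {
  return ratio >= (largeText ? 3 : 4.5);
}
```

Black text on a white background gives the maximum ratio of 21:1, while two identical colors give 1:1.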
Element Inspection
Tool | Description | Key Parameters |
| Find by text, role, aria-label, CSS, or XPath |
|
| Get exists, visible, enabled, focused, checked, etc. |
|
| Query inside a single shadow root |
|
| Query through nested shadow roots by host path |
|
| Return full shadow DOM tree as JSON |
|
| Get computed CSS styles for an element |
|
QA & Assertions
Tool | Description | Key Parameters |
| Assert conditions on an element (exists, visible, text, etc.) |
|
| Assert zero console errors/warnings |
|
| Assert zero failed network requests |
|
| Assert computed style value |
|
| Assert a matching request was made |
|
| Assert navigation timing is within threshold |
|
| Check HTML5 form validation state |
|
| Simulate Tab key and track focus movement |
|
| Set files on a file input |
|
| Simulate slow/offline network |
|
| Block, modify, or allow network requests |
|
| Capture full page state (HTML, storage, cookies) | — |
| Restore from a snapshot |
|
| Poll a JS expression until true or timeout |
|
| Refresh page N times and run assertion each time |
|
| Unified diagnostic — console + network + resource errors in one call |
|
Interaction Inspection
Tool | Description | Key Parameters |
| Hover and capture DOM/style changes |
|
| Force hover/focus/active and read computed styles |
|
| Extract tooltip text from title, aria-describedby, aria-labelledby, or CSS |
|
Macro Recording
Tool | Description | Key Parameters |
| Start recording clicks, typing, and changes | — |
| Stop recording and return events JSON | — |
| Replay recorded events with speed multiplier |
|
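One way to think about the replay speed multiplier is as a divisor on the gaps between recorded events. The sketch below assumes each recorded event carries a timestamp in milliseconds. `RecordedEvent` and `replayDelays` are hypothetical, and the events JSON produced by the real recorder may look different.

```typescript
// Hypothetical event shape, assumed for illustration only.
interface RecordedEvent {
  type: string;
  timestampMs: number; // when the event was recorded
}

// Returns how long to wait before each event when replaying at the given speed.
function replayDelays(events: RecordedEvent[], speed: number): number[] {
  if (!(speed > 0)) throw new Error("speed must be a positive number");

  const delays: number[] = [];
  let previous: number | undefined;
  for (const event of events) {
    // The first event fires immediately; later ones keep their relative spacing.
    const gap = previous === undefined ? 0 : Math.max(0, event.timestampMs - previous);
    delays.push(gap / speed);
    previous = event.timestampMs;
  }
  return delays;
}
```

Under these assumptions, a speed of 2 halves every pause, so a macro recorded over 10 seconds replays in about 5.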
Visual Regression
Tool | Description | Key Parameters |
| Pixel-by-pixel diff of two base64 PNGs using pixelmatch |
|
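To show what a pixel-by-pixel comparison involves, here is a deliberately naive sketch that counts differing pixels in two decoded RGBA buffers. It is not the pixelmatch algorithm, which uses a perceptual color metric and anti-aliasing detection. The function name and the `tolerance` parameter are assumptions for illustration.

```typescript
// Counts pixels where any RGBA channel differs by more than `tolerance`.
// Both buffers must hold width * height pixels, 4 bytes per pixel.
function countDifferentPixels(
  a: Uint8Array,
  b: Uint8Array,
  width: number,
  height: number,
  tolerance = 0,
): number {
  const expectedLength = width * height * 4;
  if (a.length !== expectedLength || b.length !== expectedLength) {
    throw new Error("Buffers must hold width * height RGBA pixels");
  }

  let different = 0;
  for (let p = 0; p < expectedLength; p += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[p + c] - b[p + c]) > tolerance) {
        different++;
        break; // count each pixel at most once
      }
    }
  }
  return different;
}
```

Dividing the result by `width * height` gives a mismatch percentage that can be compared against a threshold.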
Project Structure
browser-genie-mcp-server/
├── src/
│ ├── index.ts # Entry point — stdio MCP transport
│ ├── server.ts # McpServer setup & tool registration
│ ├── websocket-bridge.ts # WebSocket server + request/response correlation
│ ├── auth.ts # API key validation helper
│ ├── types.ts # Shared types & constants (port, message shapes)
│ └── tools/ # One file per tool category
│ ├── navigation.ts
│ ├── tab-management.ts
│ ├── click.ts
│ ├── input.ts
│ ├── keyboard.ts
│ ├── hover.ts
│ ├── drag-drop.ts
│ ├── screenshot.ts
│ ├── gestures.ts
│ ├── macros.ts
│ ├── visual-regression.ts
│ ├── devtools-sources.ts
│ ├── devtools-modify.ts
│ ├── devtools-network.ts
│ ├── devtools-storage.ts
│ ├── devtools-console.ts
│ ├── accessibility.ts
│ ├── emulation.ts
│ ├── elements.ts
│ ├── audit.ts
│ ├── interaction.ts
│ ├── monitoring.ts
│ └── qa.ts
├── dist/ # Compiled output (after npm run build)
├── package.json
├── tsconfig.json
├── .gitignore
└── LICENSE
Development
# Watch mode — recompiles on every file save
npm run dev
# One-shot build
npm run build
# Run the compiled server directly
npm start
Important Notes
Debugger Banner
When any DevTools feature is first used on a tab, Chrome shows an "Extension is debugging this browser" banner. This is a Chrome security requirement and cannot be suppressed. The debugger attaches lazily — only when a DevTools tool is first called for that tab.
Service Worker Lifecycle
Chrome MV3 service workers terminate after ~30 seconds of inactivity. The extension uses chrome.alarms to keep the WebSocket alive, with automatic exponential-backoff reconnection (1 s → 2 s → 4 s → … → 30 s max).
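The reconnection schedule above (1 s, 2 s, 4 s, and so on up to 30 s) can be sketched as a simple capped doubling. This is a minimal illustration of exponential backoff, not the extension's code; the constant and function names are assumptions.

```typescript
const INITIAL_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// attempt 0 waits 1 s, attempt 1 waits 2 s, attempt 2 waits 4 s, capped at 30 s.
function reconnectDelayMs(attempt: number): number {
  return Math.min(INITIAL_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS);
}

// Delays for the first seven attempts: 1000, 2000, 4000, 8000, 16000, 30000, 30000
const schedule = [0, 1, 2, 3, 4, 5, 6].map(reconnectDelayMs);
```

A successful connection would typically reset the attempt counter to 0 so the next outage starts again at 1 s.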
Human-Like Interactions
All mouse and keyboard interactions include randomized delays and natural movement patterns to avoid bot detection. Click uses Bézier curves, typing has per-character jitter, and hover has randomized dwell time.
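As an illustration of Bézier-based mouse movement, the sketch below samples points along a cubic Bézier curve whose control points are pushed sideways from the straight line by a random amount. It shows the general technique only. The control-point placement, the offset range, and all names are assumptions, not the extension's implementation.

```typescript
interface Point {
  x: number;
  y: number;
}

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Sample steps + 1 points from start to end along a randomly bent curve.
// `random` is injectable so the path can be made deterministic in tests.
function mousePath(start: Point, end: Point, steps: number, random: () => number = Math.random): Point[] {
  if (!Number.isInteger(steps) || steps < 1) throw new Error("steps must be a positive integer");

  const dx = end.x - start.x;
  const dy = end.y - start.y;
  // Sideways offset of up to 25% of the travel distance (an arbitrary choice).
  const bend1 = (random() - 0.5) * 0.5;
  const bend2 = (random() - 0.5) * 0.5;
  // (-dy, dx) is perpendicular to the direction of travel.
  const c1: Point = { x: start.x + dx / 3 - dy * bend1, y: start.y + dy / 3 + dx * bend1 };
  const c2: Point = { x: start.x + (2 * dx) / 3 - dy * bend2, y: start.y + (2 * dy) / 3 + dx * bend2 };

  const points: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    points.push(cubicBezier(start, c1, c2, end, i / steps));
  }
  return points;
}
```

Each sampled point would then be dispatched as a mouse move with a small randomized pause in between, so the cursor arrives at the target along a curved path rather than teleporting.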
Contributing
Contributions are welcome! Please open an issue first to discuss what you'd like to change, then submit a pull request.
Fork the repository
Create a feature branch:
git checkout -b feat/my-feature
Commit your changes:
git commit -m 'feat: add my feature'
Push and open a Pull Request
Related
BrowserGenie Extension — the Chrome extension that pairs with this server
Model Context Protocol — the open protocol this server implements
License