Question 1

What can you do with this server?

Accepted Answer

OpenChrome is a harness-engineered MCP server that lets AI agents drive a real Chrome browser over the Chrome DevTools Protocol (CDP), enabling authenticated, parallel, and bot-resistant browser automation.

Navigation & Tab Management

* Navigate to URLs, go forward/back, reload pages, create/close tabs
* Manage multiple isolated workers and parallel browser lanes
* List and switch Chrome profiles for per-profile isolation

Page Interaction

* Click, type, scroll, hover, drag and drop, and upload files using natural language or precise coordinates
* Fill forms (single or multi-field), execute multi-step action sequences (act)
* Execute custom JavaScript (async/await, Shadow DOM helpers)

Content Reading & Extraction

* Read pages as DOM, accessibility tree, CSS diagnostics, semantic summary, or clean Markdown
* Find elements by natural language, CSS/XPath selectors, or vision-based annotated screenshots
* Extract structured JSON data via schema, JSON-LD, Microdata, OpenGraph, or CSS
* Inspect focused page state (forms, errors, tabs) without loading the full DOM

Parallel & Workflow Automation

* Initialize multi-worker workflows with isolated tabs visiting different sites simultaneously
* Track workflow progress, collect results, batch-execute JavaScript across tabs
* Paginate through multi-page content (keyboard, click, URL, scroll strategies)
* Execute cached plans by ID, bypassing per-step LLM calls

Network & Environment Control

* Simulate network conditions (offline, 2G–4G, custom throttling)
* Intercept, block, modify, or log network requests; capture traffic with/without response bodies
* Manage cookies, localStorage/sessionStorage, HTTP auth credentials
* Override user agent, geolocation, and device emulation (mobile/tablet/desktop)
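Because OpenChrome talks to Chrome over CDP, network simulation plausibly maps onto the protocol's `Network.emulateNetworkConditions` command. The sketch below only builds that CDP message. The preset values and the `throttle_message` helper are illustrative assumptions, not OpenChrome's actual presets or API.

```python
import json

# Hypothetical presets: latency in milliseconds, throughput in bytes/second.
# These numbers are illustrative and are not OpenChrome's real profiles.
PRESETS = {
    "offline": {"offline": True, "latency": 0,
                "downloadThroughput": 0, "uploadThroughput": 0},
    "3g": {"offline": False, "latency": 300,
           "downloadThroughput": 200_000, "uploadThroughput": 90_000},
}


def throttle_message(preset: str, message_id: int = 1) -> str:
    """Build the JSON text of a CDP Network.emulateNetworkConditions call."""
    return json.dumps({
        "id": message_id,
        "method": "Network.emulateNetworkConditions",
        "params": PRESETS[preset],
    })


print(throttle_message("3g"))
```

A CDP client would send this string over the browser's WebSocket debugging endpoint. In practice the server handles that step, so an agent only names the condition it wants.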

Crawling & Scraping

* Recursively crawl websites via BFS, best-first, or sitemap.xml with auto-discovery
* Run async, resumable crawl jobs with concurrency and content-filtering options
* Validate page health (navigation + console errors + DOM summary)
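To make the BFS strategy concrete, here is a minimal breadth-first crawl over an in-memory link graph. It is a sketch of the general technique, not OpenChrome's crawler: `bfs_crawl`, `get_links`, and the toy `SITE` map are hypothetical names, and a real crawl would fetch pages instead of reading a dictionary.

```python
from collections import deque


def bfs_crawl(start, get_links, max_pages=100, max_depth=2):
    """Visit pages breadth-first and return URLs in visit order."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue and len(order) < max_pages:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # do not expand links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return order


# Toy site standing in for real page fetches (hypothetical data).
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/c", "/"],
    "/b": ["/c"],
    "/c": ["/d"],
    "/d": [],
}

print(bfs_crawl("/", lambda url: SITE.get(url, [])))
# prints ['/', '/a', '/b', '/c']
```

The `seen` set keeps cyclic links such as `/a` back to `/` from being queued twice, and the depth limit is why `/d` is never visited here.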

Screenshots, PDFs & Monitoring

* Capture viewport or full-page screenshots (PNG/WebP/JPEG) and generate PDFs
* Capture browser console output and collect Web Vitals (LCP, CLS, INP, TTFB, FCP)
* Capture evidence bundles (DOM + screenshot + network + console + perceptual hash)
* Diff two evidence bundles deterministically for before/after comparison
* Monitor CDP connection health with heartbeat and reconnect metrics
* Detect page gates (CAPTCHA, bot-check, SSO redirect, paywall, 2FA)
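A deterministic before/after comparison can be pictured as hashing each part of a bundle in a canonical form and reporting which parts differ. The sketch below assumes a bundle is a plain dictionary of JSON-serializable parts. The field names, `fingerprint`, and `diff_bundles` are hypothetical, and it uses an exact SHA-256 hash where the real screenshot comparison uses a perceptual one.

```python
import hashlib
import json


def fingerprint(value) -> str:
    """Hash a JSON-serializable value in canonical form (sorted keys)."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_bundles(before: dict, after: dict) -> dict:
    """Report which top-level parts were added, removed, or changed."""
    shared = before.keys() & after.keys()
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            key for key in shared
            if fingerprint(before[key]) != fingerprint(after[key])
        ),
    }


# Hypothetical bundles captured before and after a form submission.
before = {"dom": {"title": "Checkout"}, "console": [], "network": ["GET /cart"]}
after = {"dom": {"title": "Order placed"}, "console": [], "screenshot": "hash-placeholder"}

print(diff_bundles(before, after))
```

Sorting keys before hashing and sorting the output lists means the same two bundles always yield the same diff, which is what makes the comparison repeatable.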

Assertions & Contracts

* Evaluate Outcome Contract assertions against page evidence for pass/fail/inconclusive verdicts
* Inspect safety policy matrix and preview policy decisions for tool/args combinations

Session & Context Management

* Save/load checkpoints, snapshot/resume session context after compaction
* Query tool call journal for session history, auditing, and handoff summaries
* Compact journal entries into model-friendly summaries

Skills, Memory & Recording

* Record and recall domain skills (reusable automation workflows keyed by domain+name)
* Store domain knowledge (selectors, tips, avoid patterns) in a memory store
* Record/replay full sessions; export as JSON, HTML, Puppeteer, Playwright, or MCP-replay formats

Task Orchestration & AI Support

* Create and manage background tasks with polling; create task-scoped browser lanes
* Generate structured reflection artifacts that guide recovery after a task failure
* Check session progress (progressing, stalling, stuck) with advisory suggestions
* Use anti-bot stealth modes (CDP-free navigation, Turnstile bypass), circuit breakers, and auto-recovery

Utilities

* Generate TOTP 2FA codes for configured domains
* Ask vision questions about screenshots via host LLM sampling
* Fetch large output handles with pagination; copy text to clipboard
* Run environment diagnostics (openchrome doctor) for health checks
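For context on the TOTP utility, the standard algorithm (RFC 6238, built on HOTP from RFC 4226) fits in a few lines of standard-library Python. This is a sketch of the public algorithm, not OpenChrome's code. The `totp` function name and its parameters are assumptions, and the secret shown is the RFC's published test key, not a real credential.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for a base32 secret."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at is None else at
    counter = int(now // step)  # number of time steps since the Unix epoch
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Base32 encoding of the RFC 6238 test key "12345678901234567890".
RFC_SECRET = "GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"

print(totp(RFC_SECRET, at=59, digits=8))  # RFC 6238 test vector: 94287082
```

Calling `totp(secret)` without `at` uses the current time, so the code rolls over every 30 seconds, as authenticator apps do.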

Question 2

Which integrations are available for this server?

Accepted Answer

Each integration lets AI agents navigate, interact with, and extract data from the target site through the OpenChrome MCP server:

* Amazon (website)
* Datadog (dashboard)
* eBay (website)
* Stripe (dashboard)
* Walmart (website)

Question 3

How do I use OpenChrome?

Accepted Answer

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@OpenChrome compare AirPods Pro prices on Amazon, Best Buy, and Walmart"

That's it! The server will respond to your query, and you can continue using it as needed.

Here is a step-by-step guide with screenshots.

| | Traditional (Playwright et al.) | OpenChrome |
|---|---|---|
| 5-site task* | ~250s (login each) | ~3s (parallel) |
| Memory* | ~2.5 GB (5 browsers) | ~300 MB (1 Chrome) |
| Re-auth | every run | never |
| Bot detection | flagged | invisible (real Chrome) |

| Subsystem | What it does |
|---|---|
| Hint engine (30+ rules) | Catches error→recovery patterns and corrects the agent before mistakes cascade. Promotes repeated patterns to permanent rules. |
| Recovery runtime | Deterministic, bounded recovery for a tool call: recover in-server, no LLM round-trip (pilot tier). |
| Ralph engine | 7-strategy interaction waterfall: AX click → CSS → CDP coords → JS → keyboard → raw mouse → human escalation. |
| 3-level circuit breaker | Element / page / global: stops the agent burning tokens on permanently broken elements. |
| Outcome classifier | Reports what actually happened after a click (SUCCESS / SILENT_CLICK / WRONG_ELEMENT). |
| 49 reliability mechanisms | 8 defense layers from process lifecycle to MCP gateway, so no single failure hangs the server. See `docs/architecture.md`. |
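The interaction waterfall in the table above is a fallback chain: try the most precise strategy first and move to the next one only when it fails. The sketch below shows that pattern under stated assumptions. The strategy order follows the table, but `run_waterfall`, the handler signatures, and the stub handlers are hypothetical and do not reflect the Ralph engine's real interface.

```python
from typing import Callable, Dict, Optional

# Strategy order taken from the table; the identifiers are illustrative.
STRATEGY_ORDER = [
    "ax_click", "css", "cdp_coords", "js",
    "keyboard", "raw_mouse", "human_escalation",
]


def run_waterfall(target: str,
                  handlers: Dict[str, Callable[[str], bool]]) -> Optional[str]:
    """Try each strategy in order; return the first that succeeds."""
    for name in STRATEGY_ORDER:
        handler = handlers.get(name)
        if handler is None:
            continue  # strategy not available in this environment
        try:
            if handler(target):
                return name
        except Exception:
            pass  # a crashing strategy falls through to the next one
    return None  # every strategy failed


def broken_ax_click(target: str) -> bool:
    raise RuntimeError("node not found in accessibility tree")


# Hypothetical stub handlers: two fail, the third succeeds.
handlers = {
    "ax_click": broken_ax_click,
    "css": lambda target: False,
    "cdp_coords": lambda target: True,
}

print(run_waterfall("Submit button", handlers))  # prints cdp_coords
```

Returning the name of the winning strategy, instead of only a boolean, lets a caller record which fallback was needed. That kind of signal is what a circuit breaker or outcome classifier could build on.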

| Topic | Link |
|---|---|
| Architecture & reliability layers | `docs/architecture.md` |
| Getting started walkthrough | `docs/getting-started.md` |
| Full tool catalogue | `docs/agent/capability-map.md` |
| CLI & playbook | `docs/cli.md` · `docs/cli/playbook.md` |
| HTTP daemon mode | `docs/getting-started/http-daemon.md` |
| Research recipes | `docs/recipes/README.md` |
| Latest release notes | `docs/releases/v1.12.5.md` |

