pilot
Browser automation for AI agents. 20x faster than the alternatives.
pilot is an MCP server that gives your AI agent a fast, persistent browser. Built on Playwright, it drives Chromium from inside the MCP server process and talks to the client over stdio: no HTTP server, no cold start after the first launch, and no per-action process or connection setup.
LLM Client → stdio (MCP) → pilot → Playwright → Chromium
                                   in-process   persistent
First call: ~3s (launch)
Every call after: ~5-50ms
Why pilot?
Feature | pilot | @playwright/mcp | BrowserMCP |
Latency/action | ~5-50ms | ~100-200ms | ~150-300ms |
Architecture | In-process stdio | Separate process | Chrome extension |
Persistent browser | Yes | Per-session | Yes |
Tools | 51 (configurable profiles) | 25+ | ~20 |
Token control | Yes (max_elements, structure_only, interactive_only) | No | No |
Iframe support | Full (list, switch, snapshot inside) | NOT_PLANNED | No |
Cookie import | Chrome, Arc, Brave, Edge, Comet | No | No |
Snapshot diffing | Track page changes between actions | No | No |
Handoff/Resume | Open headed Chrome, interact manually, resume | No | No |
Speed matters when your agent makes hundreds of browser calls in a session. At 100 actions, that's at most about 5 seconds with pilot (100 × 50ms) versus 10 to 30 seconds with the alternatives (100 × 100-300ms).
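The arithmetic behind that comparison can be checked with a short TypeScript sketch. The per-action ranges are the ones listed in the comparison table; the type and function names (`LatencyRange`, `sessionSeconds`) are made up for this illustration.

```typescript
// Per-action latency ranges in milliseconds, copied from the comparison table.
interface LatencyRange {
  minMs: number;
  maxMs: number;
}

const pilotRange: LatencyRange = { minMs: 5, maxMs: 50 };
const playwrightMcpRange: LatencyRange = { minMs: 100, maxMs: 200 };
const browserMcpRange: LatencyRange = { minMs: 150, maxMs: 300 };

// Best-case and worst-case total seconds for `actions` calls.
// The one-time ~3s browser launch is left out on purpose.
function sessionSeconds(range: LatencyRange, actions: number): [number, number] {
  return [(range.minMs * actions) / 1000, (range.maxMs * actions) / 1000];
}

const pilotTotal = sessionSeconds(pilotRange, 100); // [0.5, 5]
const playwrightMcpTotal = sessionSeconds(playwrightMcpRange, 100); // [10, 20]
const browserMcpTotal = sessionSeconds(browserMcpRange, 100); // [15, 30]
```

The gap scales linearly with the number of actions, so it widens for longer sessions.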
Quick Start
npx pilot-mcp
npx playwright install chromium
Add to your Claude Code config (.mcp.json):
{
"mcpServers": {
"pilot": {
"command": "npx",
"args": ["-y", "pilot-mcp"]
}
}
}
For Cursor, add the same config to your Cursor MCP settings.
That's it. Your AI agent now has a browser.
How It Works
Snapshot once, interact by ref. No CSS selectors needed.
pilot_snapshot → @e1 [button] "Submit", @e2 [textbox] "Email", ...
pilot_fill → { ref: "@e2", value: "user@example.com" }
pilot_click → { ref: "@e1" }
The ref system gives LLMs a simple, reliable way to interact with pages. Stale refs are auto-detected with clear error messages.
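pilot's internals are not shown in this README, so the following is only a minimal TypeScript sketch of how a ref table with stale detection could work. The `RefTable` class, the `SnapshotElement` shape, and the `invalidate` method are assumptions for illustration, not pilot's actual implementation; only the `@eN` format and the error wording come from this document.

```typescript
// Hypothetical element record; pilot's real snapshot format is not documented here.
interface SnapshotElement {
  role: string;
  name: string;
}

class RefTable {
  private elements: SnapshotElement[] = [];

  // A new snapshot replaces the table, so refs always point at the latest view.
  snapshot(elements: SnapshotElement[]): string[] {
    this.elements = elements.slice();
    return elements.map((el, i) => `@e${i + 1} [${el.role}] "${el.name}"`);
  }

  // Call on navigation or any event that makes existing refs untrustworthy.
  invalidate(): void {
    this.elements = [];
  }

  // Resolve "@e2" to its element, or throw an actionable error.
  resolve(ref: string): SnapshotElement {
    const match = /^@e(\d+)$/.exec(ref);
    const index = match ? Number(match[1]) - 1 : -1;
    const el = this.elements[index];
    if (el === undefined) {
      throw new Error(`Ref ${ref} is stale. Run pilot_snapshot for fresh refs.`);
    }
    return el;
  }
}

const refTable = new RefTable();
const refLines = refTable.snapshot([
  { role: "button", name: "Submit" },
  { role: "textbox", name: "Email" },
]);
const emailField = refTable.resolve("@e2"); // the textbox named "Email"
```

Keeping refs as small integers tied to the most recent snapshot means the model never has to construct a selector, and a ref from an older page state fails loudly instead of hitting the wrong element.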
Token Control
Large pages can blow up your context window. pilot gives you fine-grained control:
pilot_snapshot({ max_elements: 20 })
→ Returns 20 elements + "614 more elements not shown"
pilot_snapshot({ structure_only: true })
→ Pure tree structure, no text content
pilot_snapshot({ interactive_only: true, max_elements: 15 })
→ Only buttons/links/inputs, capped at 15
Combine max_elements, structure_only, interactive_only, compact, and depth to get exactly the level of detail you need. Start small, expand as needed.
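To make the interaction between max_elements and interactive_only concrete, here is a rough TypeScript sketch of how such a cap might be applied. The option names come from the examples above; the `SnapshotNode` shape, the `limitSnapshot` function, and the list of interactive roles are assumptions, and pilot's real behavior may differ.

```typescript
interface SnapshotNode {
  ref: string;
  role: string;
  name: string;
}

interface SnapshotOptions {
  max_elements?: number;
  interactive_only?: boolean;
}

// Assumed set of roles treated as interactive; pilot's real list may differ.
const INTERACTIVE_ROLES = ["button", "link", "textbox"];

function limitSnapshot(nodes: SnapshotNode[], opts: SnapshotOptions): string[] {
  // Filter first so the cap applies to the elements that will actually be shown.
  const candidates = opts.interactive_only
    ? nodes.filter((n) => INTERACTIVE_ROLES.indexOf(n.role) !== -1)
    : nodes;
  const cap = Math.max(0, opts.max_elements ?? candidates.length);
  const lines = candidates.slice(0, cap).map((n) => `${n.ref} [${n.role}] "${n.name}"`);
  const hidden = candidates.length - lines.length;
  if (hidden > 0) {
    // Tell the model that output was truncated so it can ask for more.
    lines.push(`${hidden} more elements not shown`);
  }
  return lines;
}

const sampleNodes: SnapshotNode[] = [
  { ref: "@e1", role: "heading", name: "Sign in" },
  { ref: "@e2", role: "textbox", name: "Email" },
  { ref: "@e3", role: "textbox", name: "Password" },
  { ref: "@e4", role: "button", name: "Submit" },
  { ref: "@e5", role: "paragraph", name: "Forgot your password?" },
];
const cappedLines = limitSnapshot(sampleNodes, { interactive_only: true, max_elements: 2 });
```

Reporting the number of hidden elements keeps the response small while still signalling that a follow-up call with a larger cap is worthwhile.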
Tool Profiles
Exposing all 51 tools can overwhelm LLMs (research shows degradation at 30+ tools). Set the PILOT_PROFILE environment variable to load only what you need:
Profile | Tools | Use case |
core | 9 | Simple automation: navigate, snapshot, click, fill, type, press_key, wait, screenshot |
standard | 25 | Common workflows: core + tabs, scroll, hover, drag, iframe, page reading |
| 51 | Everything (default) |
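One way a server could implement this kind of profile switch is sketched below in TypeScript. This is not pilot's source: the profile names `core` and `full` are assumptions (only `standard` appears verbatim in this README), the fallback to the largest profile for unset or unknown values is assumed from "Everything (default)", and the tool-to-profile assignments are illustrative.

```typescript
// Profiles ordered from smallest to largest. Names are assumptions.
const PROFILE_ORDER = ["core", "standard", "full"];

interface ToolDef {
  name: string;
  // Smallest profile that includes this tool.
  minProfile: string;
}

// Unset or unrecognized values fall back to the default, which loads everything.
function resolveProfile(value: string | undefined): string {
  const wanted = (value ?? "").trim().toLowerCase();
  return PROFILE_ORDER.indexOf(wanted) !== -1 ? wanted : "full";
}

// Keep every tool whose minimum profile is at or below the requested one.
function toolsForProfile(tools: ToolDef[], profile: string): ToolDef[] {
  const level = PROFILE_ORDER.indexOf(profile);
  return tools.filter((t) => PROFILE_ORDER.indexOf(t.minProfile) <= level);
}

// Illustrative assignments only.
const sampleTools: ToolDef[] = [
  { name: "pilot_click", minProfile: "core" },
  { name: "pilot_frames", minProfile: "standard" },
  { name: "pilot_import_cookies", minProfile: "full" },
];
const standardTools = toolsForProfile(sampleTools, resolveProfile("standard"));
```

In a real server the value would be read from the environment at startup. The config that follows shows how to set PILOT_PROFILE for a client.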
{
"mcpServers": {
"pilot": {
"command": "npx",
"args": ["-y", "pilot-mcp"],
"env": { "PILOT_PROFILE": "standard" }
}
}
}
Tools (51)
Navigation
Tool | Description |
| Navigate to a URL |
| Go back in browser history |
| Go forward in browser history |
| Reload the current page |
Snapshots
Tool | Description |
pilot_snapshot | Accessibility tree with @eN refs |
pilot_snapshot_diff | Unified diff showing what changed since last snapshot |
| Screenshot with red overlay boxes at each ref |
Interaction
Tool | Description |
pilot_click | Click an element by ref |
| Hover over an element |
pilot_fill | Clear and fill an input/textarea |
| Select a dropdown option by value, label, or text |
| Type text character by character |
| Press keyboard keys (Enter, Tab, Escape, etc.) |
| Drag from one element to another |
| Scroll element into view or scroll page |
| Wait for element visibility, network idle, or page load |
| Upload files to a file input |
Iframes
Tool | Description |
pilot_frames | List all frames (iframes) on the page |
pilot_frame_select | Switch context into an iframe by index or name |
| Switch back to the main frame |
After switching frames, pilot_snapshot, pilot_click, pilot_fill, and all interaction tools operate inside that iframe. Use pilot_frames to discover available iframes, then pilot_frame_select to enter one.
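As a sketch of what that sequence looks like on the wire, the TypeScript below builds the MCP `tools/call` requests a client would send. The JSON-RPC envelope follows the MCP specification; the `index` argument name for pilot_frame_select is an assumption, since this README only says frames can be selected "by index or name".

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

// Discover frames, enter one, then snapshot and interact inside it.
const iframeFlow: ToolCallRequest[] = [
  toolCall("pilot_frames"),
  toolCall("pilot_frame_select", { index: 1 }), // `index` is an assumed argument name
  toolCall("pilot_snapshot"),
  toolCall("pilot_fill", { ref: "@e2", value: "user@example.com" }),
];
```

In practice an MCP client library produces these messages for you; the point is that after the frame switch, the same snapshot and interaction tools are reused unchanged.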
Page Inspection
Tool | Description |
| Clean text extraction (strips script/style/svg) |
| Get innerHTML of element or full page |
| All links as text + href pairs |
| All form fields as structured JSON |
| All attributes of an element |
| Computed CSS property value |
| Check visible/hidden/enabled/disabled/checked/focused |
| Text diff between two URLs (staging vs production, etc.) |
Debugging
Tool | Description |
pilot_console | Console messages from circular buffer |
pilot_network | Network requests from circular buffer |
pilot_dialog | Captured alert/confirm/prompt messages |
| Run JavaScript on the page (supports |
| Get all cookies as JSON |
| Get localStorage/sessionStorage (sensitive values auto-redacted) |
| Page load performance timings (DNS, TTFB, DOM parse, load) |
Visual
Tool | Description |
| Screenshot of page or specific element |
| Save page as PDF |
| Screenshots at mobile (375), tablet (768), and desktop (1280) |
Tabs
Tool | Description |
| List open tabs |
| Open a new tab |
| Close a tab |
| Switch to a tab |
Settings & Session
Tool | Description |
| Set viewport size |
| Set a cookie |
pilot_import_cookies | Import cookies from Chrome, Arc, Brave, Edge, Comet |
| Set custom request headers (sensitive values auto-redacted) |
| Set user agent string |
| Configure dialog auto-accept/dismiss |
pilot_handoff | Open headed Chrome with full state for manual interaction |
pilot_resume | Resume automation after manual handoff |
| Close browser and clean up |
Key Features
Cookie Import
Import cookies from your real browser into the headless session. pilot reads the browser's SQLite cookie database and decrypts the values using the platform-specific safe storage key (macOS Keychain).
pilot_import_cookies({ browser: "chrome", domains: [".github.com"] })
Supports Chrome, Arc, Brave, Edge, and Comet. Use list_browsers, list_profiles, and list_domains to discover what's available.
Handoff / Resume
When headless mode hits a CAPTCHA, bot detection, or complex auth flow:
Call pilot_handoff to open a visible Chrome window with all your cookies, tabs, and localStorage
Solve the challenge manually
Call pilot_resume to continue automation with the updated state
Snapshot Diffing
Call pilot_snapshot_diff after an action to see exactly what changed on the page. Returns a unified diff. Useful for verifying actions worked, monitoring dynamic content, or debugging.
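pilot's own diff algorithm is not described here. As a simplified illustration of the idea, the TypeScript sketch below compares two snapshots line by line using a longest-common-subsequence table and emits unified-diff style prefixes (space for unchanged, `-` for removed, `+` for added) without hunk headers. The function name `diffLines` is hypothetical.

```typescript
function diffLines(before: string[], after: string[]): string[] {
  const n = before.length;
  const m = after.length;

  // lcs[i][j] = length of the longest common subsequence of before[i..] and after[j..]
  const lcs: number[][] = [];
  for (let i = 0; i <= n; i++) {
    const row: number[] = [];
    for (let j = 0; j <= m; j++) row.push(0);
    lcs.push(row);
  }
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = before[i] === after[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both lists, preferring whichever step keeps the longest common tail.
  const out: string[] = [];
  let a = 0;
  let b = 0;
  while (a < n && b < m) {
    if (before[a] === after[b]) {
      out.push(" " + before[a]);
      a++;
      b++;
    } else if (lcs[a + 1][b] >= lcs[a][b + 1]) {
      out.push("-" + before[a]);
      a++;
    } else {
      out.push("+" + after[b]);
      b++;
    }
  }
  while (a < n) out.push("-" + before[a++]);
  while (b < m) out.push("+" + after[b++]);
  return out;
}

const snapshotBefore = ['@e1 [button] "Submit"', '@e2 [textbox] "Email"'];
const snapshotAfter = ['@e1 [button] "Submit"', '@e2 [textbox] "Email"', '@e3 [alert] "Invalid email"'];
const snapshotChanges = diffLines(snapshotBefore, snapshotAfter);
// The only changed line is the added alert: '+@e3 [alert] "Invalid email"'
```

A diff like this lets an agent confirm that a click or fill had a visible effect without re-reading the whole page.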
AI-Friendly Errors
Playwright errors are translated into actionable guidance:
Timeout → "Element not found. Run pilot_snapshot for fresh refs."
Multiple matches → "Selector matched multiple elements. Use @refs from pilot_snapshot."
Stale ref → "Ref is stale. Run pilot_snapshot for fresh refs."
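A translation layer of this kind can be as small as a list of pattern-to-guidance rules. The TypeScript below is a hedged sketch: the guidance strings are the ones quoted above, but the matching patterns and the `translateError` function are assumptions, and real Playwright messages vary by version.

```typescript
interface ErrorRule {
  pattern: RegExp;
  guidance: string;
}

// Patterns are illustrative guesses at typical raw messages.
const ERROR_RULES: ErrorRule[] = [
  { pattern: /timeout/i, guidance: "Element not found. Run pilot_snapshot for fresh refs." },
  {
    pattern: /strict mode violation/i,
    guidance: "Selector matched multiple elements. Use @refs from pilot_snapshot.",
  },
  { pattern: /stale/i, guidance: "Ref is stale. Run pilot_snapshot for fresh refs." },
];

// Return actionable guidance for known failures; pass anything else through unchanged.
function translateError(err: unknown): string {
  const message = err instanceof Error ? err.message : String(err);
  for (const rule of ERROR_RULES) {
    if (rule.pattern.test(message)) {
      return rule.guidance;
    }
  }
  return message;
}

const timeoutGuidance = translateError(new Error("Timeout 30000ms exceeded."));
```

Each message tells the model what to do next, which matters more to an agent than the underlying stack trace.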
Circular Buffers
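Console, network, and dialog events are captured in fixed-size ring buffers (50K capacity) with O(1) writes. Query them with pilot_console, pilot_network, and pilot_dialog. Once a buffer is full, the oldest entries are overwritten, so memory use never grows unbounded.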
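For readers unfamiliar with the data structure, here is a minimal TypeScript ring buffer. It is a generic sketch of the technique, not pilot's code; the class and method names are illustrative.

```typescript
class RingBuffer<T> {
  private readonly capacity: number;
  private readonly slots: T[] = [];
  private next = 0; // slot the next push will write to
  private total = 0; // number of pushes so far

  constructor(capacity: number) {
    if (!(capacity >= 1) || Math.floor(capacity) !== capacity) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // O(1): once the buffer is full, each push overwrites the oldest slot.
  push(item: T): void {
    this.slots[this.next] = item;
    this.next = (this.next + 1) % this.capacity;
    this.total += 1;
  }

  // Oldest-to-newest copy of the retained items.
  toArray(): T[] {
    if (this.total <= this.capacity) {
      return this.slots.slice();
    }
    return this.slots.slice(this.next).concat(this.slots.slice(0, this.next));
  }
}

const recentEvents = new RingBuffer<number>(3);
for (let n = 1; n <= 5; n++) {
  recentEvents.push(n);
}
const retained = recentEvents.toArray(); // [3, 4, 5]
```

Writes never allocate or shift elements, so a chatty page cannot slow down capture or exhaust memory; the cost is that only the most recent entries are available to query.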
Architecture
pilot runs Playwright in the same process as the MCP server. There is no HTTP layer and no separate server process between the two: tool handlers call the Playwright API directly against a persistent Chromium instance.
┌────────────────────────────────────────────────┐
│  Your AI Agent (Claude Code, Cursor, etc.)     │
│                                                │
│  ┌──────────────┐    stdio    ┌─────────────┐  │
│  │  MCP Client  │◄───────────►│    pilot    │  │
│  └──────────────┘             │             │  │
│                               │ Playwright  │  │
│                               │  (in-proc)  │  │
│                               │      │      │  │
│                               │      ▼      │  │
│                               │  Chromium   │  │
│                               │ (persistent)│  │
│                               └─────────────┘  │
└────────────────────────────────────────────────┘
This is why it's fast. No network hops, no serialization overhead, no process spawning per action.
Requirements
Node.js >= 18
Chromium (installed via npx playwright install chromium)
Credits
The core browser automation architecture — ref-based element selection, snapshot diffing, cursor-interactive scanning, annotated screenshots, circular buffers, and AI-friendly error translation — is ported from gstack by Garry Tan.
Built on Playwright by Microsoft and the Model Context Protocol SDK by Anthropic.
License
MIT
If pilot is useful to you, star the repo — it helps others find it.