Browser Jet Pilot
Allows using OpenAI's API as the AI backend for the BrowserAgent to perform autonomous browser tasks.
Self-hosted MCP server for AI browser automation. Drop-in @browserbasehq/mcp alternative.
What's New
v1.0.0 — Base Release
API Key Authentication: Secure your HTTP transport with API key authentication using the API_KEY environment variable and X-API-Key header
Testing Infrastructure: Comprehensive test suite powered by Vitest with coverage reports and UI mode
Developer Experience: ESLint, Prettier, and pre-commit hooks (Husky + lint-staged) for consistent code quality
Enhanced Security: Constant-time API key comparison using crypto.timingSafeEqual to prevent timing attacks
Improved CI/CD: Extended workflow with type checking, linting, testing, and security audits
Bug Fixes: Fixed SVG/MathML element handling in content extraction, proper browser cleanup on session end, and enhanced type safety
New Tools
browser_disable_shaders: Block WebGL, freeze CSS animations, and throttle requestAnimationFrame to ~1 FPS for heavy shader-rendered pages
browser_restore_shaders: Restore shader operations (WebGL/RAF require page reload to fully restore)
Description
Self-hosted MCP server for browser automation. Connect to your own Chromium instance via CDP — no Browserbase subscription needed.
Drop-in alternative to @browserbasehq/mcp that runs on your infrastructure.
Features
Connect to existing browser via Chrome DevTools Protocol (CDP)
Or launch a new one managed by Playwright
19 deterministic tools — no LLM needed for basic browser control (including shader control tools)
MCP stdio transport — works with Claude Desktop, Cursor, or any MCP client
MCP HTTP transport — expose as a service on a port with optional API key authentication
BrowserAgent — AI-powered autonomous browser agent (OpenAI, Anthropic, or custom)
BrowserSkill — standardized skill interface for agent frameworks
Tab management — multi-tab workflows
Screenshot support — base64 PNG output, element or full-page
Comprehensive testing — Vitest test suite with coverage and UI
Developer tooling — ESLint, Prettier, pre-commit hooks
A little notice
I want you to understand that this repository contains a highly powerful toolset—exceptionally powerful.
It is shared in good faith, with the expectation that you will use these tools responsibly and for educational purposes, without causing harm to others. There are no guardrails, restrictions, or feature flags built in that would limit or alter their original capabilities.
As the saying goes, a weapon is only as dangerous as the hands that wield it. Power itself is neutral—its consequences are defined entirely by the intent and discipline of the user.
I won’t go into detail about what could go wrong or the many ways these tools might be misused. If you found this repository while looking for that, you likely already know what you’re doing—and I won’t be the one to outline those paths. Instead, in the next part of this document, you’ll find guidance on how you should use these tools, along with the valuable and practical functionality they were designed to provide.
Keep in mind: these tools possess an extraordinary level of power, and that power is entirely in your hands.
If any damage occurs—whether through data leaks, automated actions, replay mechanisms, or otherwise—responsibility does not lie with the tools or this project. The sole accountable party is you. ¯_(ツ)_/¯
Quick Start
1. Have a browser running with CDP
# If using Xvfb + Chromium:
chromium --no-sandbox --remote-debugging-port=9222 --disable-gpu
# Or just connect to your existing VNC Chromium (make sure it has --remote-debugging-port=9222)
2. Install and run
npm install
npm run build
# Stdio mode (for Claude Desktop, etc.)
node dist/index.js
# HTTP mode
node dist/index.js --port 3100
3. Configure Claude Desktop
Config file locations by platform:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
Add the following to your claude_desktop_config.json:
{
  "mcpServers": {
    "browser": {
      "command": "node",
      "args": ["/absolute/path/to/dist/index.js"],
      "env": {
        "CDP_URL": "http://localhost:9222"
      }
    }
  }
}
Documentation site (Mintlify)
Project docs are now available in docs/ (Mintlify format).
# Install dependencies (includes Mint CLI as dev dependency)
npm install
# Validate docs build
npm run docs:validate
# Check internal links
npm run docs:links
# Run docs locally
npm run docs:dev
Continuous integration
GitHub Actions workflow is available at .github/workflows/ci.yml.
It runs on every push and pull request, and executes:
npm ci
npm run build
npm run typecheck
npm run lint
npm run test -- --run
npm audit --audit-level=moderate
npm run docs:validate
npm run docs:links
Docker Compose startup + health wait
npm run reliability:check
Development
Available Scripts
Command | Description |
| Compile TypeScript to |
| Run in development mode with |
| Run the built server |
| Run BrowserAgent CLI |
| Run deterministic tool sequence |
| Run Vitest test suite |
| Run tests with coverage report |
| Run Vitest UI interface |
| Run ESLint |
| Fix ESLint issues automatically |
| Format code with Prettier |
| Check code formatting |
| Run full reliability test suite |
Pre-commit Hooks
The project uses Husky and lint-staged to ensure code quality:
TypeScript/TSX files: ESLint + Prettier
JSON, Markdown, YAML: Prettier
Install hooks after cloning:
npm install
The prepare script in package.json automatically sets up Husky.
Run with Docker
Quick start (Docker Compose)
docker compose up -d --build
docker compose ps
curl http://localhost:3100/healthz
Expected:
MCP endpoint: http://localhost:3100/mcp
Health endpoint: http://localhost:3100/healthz
Container status: healthy (docker compose ps)
Quick smoke test (tool flow):
npm run agent -- --server-url http://localhost:3100/mcp --sequence \
  browser_start "browser_navigate?url=https://example.com" \
  "browser_get_content?selector=body&type=text" browser_end
The first image build takes longer because Playwright Chromium binaries are installed in-container. Compose includes a built-in healthcheck against /healthz.
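The --sequence arguments in the smoke test follow a tool?key=value shape. As a rough sketch of how such a step string could be split into a tool name and arguments (parseSequenceStep and SequenceStep are hypothetical names for illustration; the project's own parser may work differently):

```typescript
// Hypothetical helper, not the project's actual parser.
interface SequenceStep {
  tool: string;
  args: Record<string, string>;
}

function parseSequenceStep(step: string): SequenceStep {
  const queryStart = step.indexOf("?");
  if (queryStart === -1) {
    // No "?" means the tool takes no arguments, e.g. "browser_start".
    return { tool: step, args: {} };
  }
  const args: Record<string, string> = {};
  // Everything after the first "?" is treated as a URL-style query string.
  new URLSearchParams(step.slice(queryStart + 1)).forEach((value, key) => {
    args[key] = value;
  });
  return { tool: step.slice(0, queryStart), args };
}

const parsedSteps = [
  "browser_start",
  "browser_navigate?url=https://example.com",
  "browser_get_content?selector=body&type=text",
  "browser_end",
].map(parseSequenceStep);
console.log(JSON.stringify(parsedSteps, null, 2));
```

Under this sketch, an argument value that itself contains & would need percent-encoding, since & separates key=value pairs.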
Thorough reliability pass (health + lifecycle + extraction + multi-tab + 5x repeat):
npm run reliability:check
# or against a custom endpoint:
npm run reliability:check -- http://localhost:3100/mcp
Stop
docker compose down
Option B: Plain Docker (no Compose)
docker build -t browser-jet-pilot:local .
docker run --rm -p 3100:3100 --shm-size=1g \
  -e PORT=3100 -e HOST=0.0.0.0 -e LAUNCH=true \
  browser-jet-pilot:local
External Chromium (CDP mode)
docker run --rm -p 3100:3100 --shm-size=1g \
  -e PORT=3100 -e HOST=0.0.0.0 -e LAUNCH=false \
  -e CDP_URL=http://host.docker.internal:9222 \
  browser-jet-pilot:local
If host.docker.internal is not available on your Linux Docker setup, add:
--add-host=host.docker.internal:host-gateway
Tools
Session
Tool | Description |
browser_start | Connect to browser (or launch). Must call first. |
browser_end | Close the current session. |
Navigation
Tool | Parameters | Description |
browser_navigate | | Go to a URL |
| | Open a new tab |
| - | List all open tabs |
| | Switch to tab by index |
| - | Get current URL, title, viewport |
Interaction
Tool | Parameters | Description |
| | Click an element |
| | Clear and fill an input |
| | Type character by character |
| | Select dropdown option |
| | Hover over an element |
| | Scroll page or element |
Observation
Tool | Parameters | Description |
browser_screenshot | | Take a screenshot (base64 PNG) |
browser_get_content | | Extract text or HTML |
| | Run JavaScript in the page |
| | Wait for an element |
Shader Control
Tool | Parameters | Description |
browser_disable_shaders | | Block WebGL, freeze CSS animations, throttle requestAnimationFrame to ~1 FPS. Use on heavy shader-rendered pages before navigating. |
browser_restore_shaders | - | Remove injected style element. WebGL/RAF need page reload to fully restore. |
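To illustrate what throttling requestAnimationFrame to roughly 1 FPS can mean in practice, here is a minimal sketch. It is an assumption about the general technique, not the server's actual injected script; makeThrottledRaf, Scheduler, and FrameCallback are hypothetical names. The scheduler is passed in so the logic can be exercised outside a browser.

```typescript
type FrameCallback = (timestamp: number) => void;
type Scheduler = (fn: () => void, delayMs: number) => number;

// Builds a requestAnimationFrame stand-in that delays every frame callback
// by 1000 / fps milliseconds, so a typical render loop that re-requests a
// frame from inside its callback runs at roughly `fps` frames per second.
function makeThrottledRaf(schedule: Scheduler, now: () => number, fps = 1) {
  const intervalMs = 1000 / fps;
  return (callback: FrameCallback): number =>
    schedule(() => callback(now()), intervalMs);
}

// Demo with a fake scheduler that records the delay and runs the callback at once.
const recordedDelays: number[] = [];
const fakeSchedule: Scheduler = (fn, delayMs) => {
  recordedDelays.push(delayMs);
  fn();
  return recordedDelays.length; // stand-in for a timer handle
};
const throttledRaf = makeThrottledRaf(fakeSchedule, () => 0, 1);
throttledRaf((t) => console.log("frame at", t));
console.log(recordedDelays); // [ 1000 ]
```

In a page, the same factory could be wired to window.setTimeout and performance.now(); because the original function is replaced, restoring full-speed frames needs a reload, as the table notes for browser_restore_shaders.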
Configuration
CLI Flags
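Flag | Env Var | Default | Description |
--port | PORT | (stdio) | Port for HTTP transport |
--host | HOST | | Bind address |
| CDP_URL | | CDP endpoint |
--launch | LAUNCH | | Launch new browser instead of connecting |
--browser-width | BROWSER_WIDTH | | Viewport width |
--browser-height | BROWSER_HEIGHT | | Viewport height |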
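Each setting can come from a CLI flag or an environment variable. The sketch below shows one common way to resolve such a pair. The precedence (flag first, then environment variable, then fallback) is an assumption, since the source does not state it, and flagValue and resolveOption are hypothetical helpers rather than the server's code.

```typescript
// Hypothetical: returns the value that follows `flag` in argv, if any.
function flagValue(argv: string[], flag: string): string | undefined {
  const i = argv.indexOf(flag);
  return i !== -1 && i + 1 < argv.length ? argv[i + 1] : undefined;
}

// Assumed precedence: CLI flag, then environment variable, then fallback.
function resolveOption(
  argv: string[],
  env: Record<string, string | undefined>,
  flag: string,
  envVar: string,
  fallback?: string,
): string | undefined {
  return flagValue(argv, flag) ?? env[envVar] ?? fallback;
}

const exampleArgv: string[] = ["--port", "3100"];
const exampleEnv: Record<string, string | undefined> = { PORT: "8080", HOST: "0.0.0.0" };

console.log(resolveOption(exampleArgv, exampleEnv, "--port", "PORT")); // "3100" (flag wins)
console.log(resolveOption(exampleArgv, exampleEnv, "--host", "HOST")); // "0.0.0.0" (from env)
```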
Environment Variables
# .env file
CDP_URL=http://localhost:9222
LAUNCH=false
BROWSER_WIDTH=1280
BROWSER_HEIGHT=720
API_KEY=your-secret-api-key # Optional: secure HTTP transport
API Key Authentication
When running in HTTP mode, you can optionally secure the server with an API key:
Set the API_KEY environment variable:
export API_KEY=your-secret-api-key
Include the key in requests via the X-API-Key header:
curl -H "X-API-Key: your-secret-api-key" http://localhost:3100/mcp
The server uses constant-time comparison (crypto.timingSafeEqual) to prevent timing attacks when validating API keys.
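For readers unfamiliar with the technique, here is a minimal sketch of a constant-time key check built on Node's crypto.timingSafeEqual. The function name apiKeysMatch and the SHA-256 pre-hashing step are assumptions made for illustration; the server's actual implementation may differ.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// timingSafeEqual throws if the buffers differ in length, so both keys are
// hashed first to get two equal-length (32-byte) digests to compare.
function apiKeysMatch(provided: string, expected: string): boolean {
  const providedDigest = createHash("sha256").update(provided).digest();
  const expectedDigest = createHash("sha256").update(expected).digest();
  return timingSafeEqual(providedDigest, expectedDigest);
}

console.log(apiKeysMatch("your-secret-api-key", "your-secret-api-key")); // true
console.log(apiKeysMatch("wrong-key", "your-secret-api-key")); // false
```

A plain === comparison can return as soon as the first differing character is found, which is the timing signal this approach is meant to avoid.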
Example: HTTP Transport
# Start the server
node dist/index.js --port 3100 --host 0.0.0.0
# It listens on http://0.0.0.0:3100/mcp
# Connect any MCP client via Streamable HTTP transport
Example: Launch Mode
No existing browser? Launch one:
node dist/index.js --launch --browser-width 1920 --browser-height 1080
Comparison with Browserbase MCP
Feature | Browserbase MCP | This Server |
Browser hosting | Browserbase cloud | Your container |
Cost | Per-session pricing | Free (your infra) |
| Stagehand + LLM | Not included (use deterministic tools) |
| Stagehand + LLM | |
| Stagehand + LLM | |
Screenshot | Via Stagehand | Native |
Tab management | Single page | Multi-tab |
Data residency | Browserbase servers | Your server |
Architecture
flowchart LR
  subgraph App["Your Application"]
    BS["BrowserSkill"]
    BA["BrowserAgent (AI)"]
    MC["MCP Client<br/>(Claude Desktop, Cursor, etc.)"]
  end
  BS -->|MCP JSON-RPC| MB["Browser Jet Pilot<br/>(stdio or HTTP)"]
  BA -->|MCP JSON-RPC| MB
  MC -->|MCP JSON-RPC| MB
  MB --> PW["Playwright"]
  PW -->|CDP| CH["Your Chromium / Chrome"]
BrowserAgent (AI-Powered)
An autonomous agent that connects to the MCP server and uses an LLM to plan and execute browser tasks from natural language.
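Each tool invocation the agent makes reaches the server as an MCP JSON-RPC message. As a rough sketch of that message, this builds a tools/call request as defined by the MCP specification. buildToolCall and ToolCallRequest are hypothetical names for illustration; the agent itself may rely on an MCP client library instead of building requests by hand.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that fills in the fixed JSON-RPC fields.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

console.log(
  JSON.stringify(buildToolCall(1, "browser_navigate", { url: "https://example.com" }), null, 2),
);
```

A real session also performs the MCP initialize handshake before any tool call, which this sketch leaves out.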
CLI Usage
# AI-powered mode (needs OPENAI_API_KEY or ANTHROPIC_API_KEY)
npm run agent -- --server-url http://localhost:3100/mcp \
  "Go to start.gg and find the latest Tekken 7 tournament"
# Use Anthropic
npm run agent -- --server-url http://localhost:3100/mcp \
  --ai-provider anthropic --ai-model claude-sonnet-4-20250514 \
  "Navigate to example.com and extract all links"
# Deterministic mode (no AI, scripted tool calls)
npm run agent -- --server-url http://localhost:3100/mcp --sequence \
  browser_start \
  "browser_navigate?url=https://example.com" \
  browser_screenshot \
  browser_end
Programmatic Usage
import { BrowserAgent } from 'browser-jet-pilot/agent';

const agent = new BrowserAgent({
  serverUrl: 'http://localhost:3100/mcp',
  aiProvider: 'openai',
  aiModel: 'gpt-4o',
  // aiApiKey: 'sk-...', // or set OPENAI_API_KEY env
  maxSteps: 20,
});

// AI-powered task
const result = await agent.run('Go to start.gg/tournament/12345 and get the bracket');
console.log(result.summary);
console.log(`Steps: ${result.steps.length}, Screenshots: ${result.screenshots.length}`);

// Deterministic sequence
const result2 = await agent.executeSequence([
  { tool: 'browser_start' },
  { tool: 'browser_navigate', args: { url: 'https://example.com' } },
  { tool: 'browser_screenshot' },
]);

await agent.disconnect();
Agent Configuration
Option | Env Var | Default | Description |
serverUrl | - | - | MCP server HTTP endpoint |
| - | | MCP server command (stdio) |
| - | | MCP server args (stdio) |
aiProvider | - | | |
aiModel | - | | LLM model name |
aiApiKey | | - | API key |
| - | OpenAI/Anthropic default | Custom endpoint |
maxSteps | - | | Safety limit per task |
BrowserSkill (Agent Framework Integration)
A standardized skill wrapper that can be loaded by any agent framework. Provides quick helper methods and automatic screenshot saving.
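Since screenshots arrive as base64 PNG strings, saving one to disk comes down to decoding and writing the bytes. The sketch below shows that step on its own. saveScreenshot and its file-name pattern are hypothetical, not BrowserSkill's internals, and the sample payload is only the 8-byte PNG signature, not a real image.

```typescript
import { mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: decodes a base64 PNG and writes it under `dir`.
// The file-name pattern is illustrative only.
function saveScreenshot(base64Png: string, dir: string, step: number): string {
  mkdirSync(dir, { recursive: true }); // create the directory if missing
  const file = join(dir, `step-${step}-${Date.now()}.png`);
  writeFileSync(file, Buffer.from(base64Png, "base64"));
  return file;
}

// Base64 of the 8-byte PNG file signature, used here as a tiny stand-in payload.
const pngSignature = "iVBORw0KGgo=";
const savedPath = saveScreenshot(pngSignature, join(tmpdir(), "screenshots"), 1);
console.log(savedPath, readFileSync(savedPath).length); // path, then 8
```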
Programmatic Usage
import { BrowserSkill } from 'browser-jet-pilot/skill';

const skill = new BrowserSkill({
  serverUrl: 'http://localhost:3100/mcp',
  saveScreenshots: true,
  screenshotDir: './screenshots',
});
await skill.init();

// Execute an AI-powered browser task
const result = await skill.execute(
  'Go to start.gg and extract tournament data',
  { workDir: './workspace' },
);
console.log(result.summary); // "Found 3 tournaments..."
console.log(result.files); // ['./screenshots/step-3-1712...png']
console.log(result.metadata); // steps, duration, tool calls

// Quick helpers (deterministic, no AI needed)
const page = await skill.goto('https://example.com');
const content = await skill.read('https://example.com', '#main-content');
const { base64, file } = await skill.capture('https://example.com', true);
const links = await skill.extract('https://example.com',
  'Array.from(document.querySelectorAll("a")).map(a => ({text: a.innerText, href: a.href}))'
);

await skill.destroy();
Skill Interface
interface SkillResult {
  success: boolean;
  summary: string;
  files: string[]; // saved screenshot paths
  data?: any; // parsed JSON from last step
  metadata: {
    steps: number;
    screenshots: number;
    totalDuration: number;
    toolCalls: Array<{ tool: string; args: Record<string, unknown> }>;
  };
}
Requirements
Node.js >= 18
Chromium with --remote-debugging-port=9222 (or launch mode)
For Playwright browser launch:
npx playwright install chromium
License
MIT
design, written and coded solely by head and <3 for you - 0xabadbabe (Sudo Qt - jet'aime)