hanzi-browse
Provides browser automation for Amazon, enabling tasks like product search, cart management, and checkout using site-specific interaction hints.
Provides browser automation for eBay, including listing search, bidding, and purchasing with optimized interaction patterns.
Provides browser automation for Figma, allowing agents to interact with design files, projects, and team collaboration features.
Provides browser automation for GitHub, enabling repository management, issue handling, pull requests, and code review workflows.
Provides browser automation for Gmail, supporting email management, keyboard shortcuts, and task execution like unsubscribing from marketing emails.
Provides browser automation for Google Docs, enabling document creation, editing, and collaboration with site-specific interaction recipes.
Provides browser automation for Indeed, allowing job search, browsing listings, and submitting applications through the browser.
Provides browser automation for Notion, enabling page and database interactions, content creation, and project management tasks.
Provides browser automation for ChatGPT (OpenAI's chat interface), enabling conversational interactions and task execution through the browser.
Provides browser automation for Reddit, supporting browsing, posting, and commenting within communities.
Provides browser automation for Slack, enabling messaging, channel management, file sharing, and workspace interactions.
Provides browser automation for Stack Overflow, allowing search, browsing questions, and interaction with Q&A content.
Provides browser automation for Target, enabling product search, browsing, and purchasing with tailored interaction hints.
Provides browser automation for Walmart, supporting product search, cart management, and checkout processes.
Provides browser automation for Zillow, enabling property search, rental exploration, and real estate browsing.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@hanzi-browse Find AI engineer jobs on LinkedIn in San Francisco".
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Hanzi Browse
The context layer for browsing agents.
Your browsing agent keeps failing on real sites — X uses Draft.js, LinkedIn hides the connect button, Gmail needs keyboard shortcuts. Hanzi Browse ships 24 site playbooks — hints for the LLM, not brittle scripts — so it actually finishes the task.
Two ways to use Hanzi Browse
Same 24 site playbooks underneath. Two install paths depending on who's driving.
For your agent — a browser sub-agent for your coding agent
One command. npx hanzi-browse setup detects every AI agent on your machine (Claude Code, Cursor, Codex, and 9 more) and wires Hanzi Browse in as an MCP tool. Your main agent delegates browser work; a sub-agent runs the loop — read page → plan next action → click/type/scroll → observe → repeat until done — and returns a clean answer. Site playbooks auto-load by URL so the model already knows the quirks.
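The sub-agent loop described above (read page, plan next action, act, observe, repeat) can be sketched roughly as follows. This is a minimal illustration only: `Action`, `Browser`, `Planner`, and `runLoop` are hypothetical names invented for this sketch and are not taken from the Hanzi Browse codebase. The real loop is presumably asynchronous and model-driven.

```typescript
// Hypothetical shapes for illustration; not the project's actual types.
type Action =
  | { kind: "click" | "type" | "scroll"; target: string; text?: string }
  | { kind: "done"; answer: string };

interface Browser {
  readPage(): string;            // observe: return a snapshot of the page
  perform(action: Action): void; // act: click / type / scroll
}

type Planner = (task: string, page: string, history: Action[]) => Action;

// Repeat read -> plan -> act until the planner reports it is done,
// with a step cap so a confused planner cannot loop forever.
function runLoop(task: string, browser: Browser, plan: Planner, maxSteps = 20): string {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const page = browser.readPage();          // read page
    const action = plan(task, page, history); // plan next action
    if (action.kind === "done") return action.answer;
    browser.perform(action);                  // click / type / scroll
    history.push(action);                     // the next iteration observes the result
  }
  throw new Error(`gave up after ${maxSteps} steps`);
}

// Toy usage: a fake page that is "finished" after two clicks.
let clicks = 0;
const fakeBrowser: Browser = {
  readPage: () => `clicks=${clicks}`,
  perform: (a) => { if (a.kind === "click") clicks++; },
};
const fakePlanner: Planner = (_task, page) =>
  page === "clicks=2"
    ? { kind: "done", answer: "finished after 2 clicks" }
    : { kind: "click", target: "#next" };

const answer = runLoop("demo task", fakeBrowser, fakePlanner);
console.log(answer); // "finished after 2 clicks"
```

In a real agent the planner would be an LLM call and the site playbook would be part of its prompt. Here both are stubbed so the control flow is easy to follow.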
For your product — browser automation for your users, described in English
Your backend calls runTask({ task: "…" }). Your users' own Chrome executes it, signed in as themselves. Same 24 playbooks as the CLI, exposed as a REST API and @hanzi-browse/sdk. Free tools on tools.hanzilla.co are built on this SDK.
Get Started
npx hanzi-browse setup
One command does everything:
npx hanzi-browse setup
│
├── 1. Detect browsers ──── Chrome, Brave, Edge, Arc, Chromium
│
├── 2. Install extension ── Opens Chrome Web Store, waits for install
│
├── 3. Detect AI agents ─── Claude Code, Cursor, Codex, Windsurf,
│ VS Code, Gemini CLI, Amp, Cline, Roo Code
│
├── 4. Configure MCP ────── Merges hanzi-browse into each agent's config
│
├── 5. Install skills ───── Copies browser skills into each agent
│
└── 6. Choose AI mode ───── Managed ($0.05/task) or BYOM (free forever)
Managed — we handle the AI. 20 free tasks/month, then $0.05/task. No API key needed.
BYOM — use your Claude Pro/Max subscription, GPT Plus, or any API key. Free forever, runs locally.
Examples
"Go to Gmail and unsubscribe from all marketing emails from the last week"
"Apply for the senior engineer position on careers.acme.com"
"Log into my bank and download last month's statement"
"Find AI engineer jobs on LinkedIn in San Francisco"
Skills & Free Tools
Hanzi Browse has two distribution channels. Both use the same browser automation engine and site domain knowledge:
Skills — for users who run Hanzi Browse locally through their AI agent. The setup wizard installs skills directly into your agent (Claude Code, Cursor, etc.). Each skill teaches the agent when and how to use the browser for a specific workflow.
Free Tools — hosted web apps that anyone can try without installing anything. Each tool is a standalone app built on the Hanzi Browse API that demonstrates a use case. Every skill can become a free tool.
Skills
Installed automatically during npx hanzi-browse setup. Your agent reads these as markdown files.
Skill | Description |
| Core skill — when and how to use browser automation |
| Test your app in a real browser, report bugs with screenshots |
| Draft per-platform posts, publish from your signed-in accounts |
| Find prospects, send personalized connection requests |
| Run accessibility audits in a real browser |
| Extract structured data from websites into CSV/JSON |
| Twitter/X marketing workflows |
Open source — add your own.
Free Tools
Try them at tools.hanzilla.co. No account needed — just install the extension and go.
Tool | What it does | Try it |
X Marketing | AI finds relevant conversations on X, drafts personalized replies, posts from your Chrome |
Site Playbooks — the context layer
Both CLI and SDK rely on a shared set of site playbooks — verified interaction recipes for complex websites. They teach the LLM how async loading works on X, which selector hides LinkedIn's connect button, that Gmail responds to keyboard shortcuts, and how to sidestep anti-bot detection on ~20 other sites.
Hints for the LLM, not brittle scripts. The model stays in control; we just hand it the cheat sheet. When the DOM shifts, the agent adapts — no adapter to rebuild.
Currently supports 24 sites: X, LinkedIn, Gmail, GitHub, Notion, Figma, Slack, Reddit, Amazon, eBay, Walmart, Target, Zillow, Apartments.com, Craigslist, Indeed, Google Docs, Sheets, Calendar, Drive, ChatGPT, Claude.ai, Stack Overflow.
All playbooks live in server/src/agent/domain-skills.json as a single shared JSON array. To add a site, open a PR appending a { domain, skill } entry.
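As a rough illustration of how a `{ domain, skill }` entry might be selected by URL, here is a minimal sketch. Only the two field names come from the text above. The matching rule (exact host or any subdomain), the function name `findPlaybook`, and the sample entries are assumptions made for this example, and the actual lookup in the server may differ.

```typescript
// Entry shape taken from the text above: { domain, skill }.
interface Playbook {
  domain: string; // e.g. "linkedin.com"
  skill: string;  // free-form hints handed to the LLM
}

// Assumption: a playbook applies to its domain and to any subdomain of it.
function findPlaybook(url: string, playbooks: Playbook[]): Playbook | undefined {
  const host = new URL(url).hostname.toLowerCase();
  return playbooks.find(
    (p) => host === p.domain || host.endsWith("." + p.domain),
  );
}

// Placeholder entries; the real hint text lives in domain-skills.json.
const samplePlaybooks: Playbook[] = [
  { domain: "mail.google.com", skill: "(hint text for Gmail)" },
  { domain: "linkedin.com", skill: "(hint text for LinkedIn)" },
];

// Matches via the subdomain rule: www.linkedin.com ends with ".linkedin.com".
const hit = findPlaybook("https://www.linkedin.com/jobs/", samplePlaybooks);
// No entry for example.com, so the agent would run without extra hints.
const miss = findPlaybook("https://example.com/", samplePlaybooks);
```

Requiring the leading dot in the suffix check keeps a look-alike host such as `notlinkedin.com` from picking up the LinkedIn playbook.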
Build with Hanzi Browse
Embed browser automation in your product. Your app calls the Hanzi Browse API, a real browser executes the task, you get the result back.
Get an API key — sign in to your developer console, then create a key
Pair a browser — create a pairing token, send your user a pairing link (/pair/{token}) — they click it and auto-pair
Run a task — POST /v1/tasks with a task and browser session ID
Get the result — poll GET /v1/tasks/:id until complete, or use runTask() which blocks
import { HanziClient } from '@hanzi-browse/sdk';
const client = new HanziClient({ apiKey: process.env.HANZI_API_KEY });
const { pairingToken } = await client.createPairingToken();
const sessions = await client.listSessions();
const result = await client.runTask({
  browserSessionId: sessions[0].id,
  task: 'Read the patient chart on the current page',
});
console.log(result.answer);
API reference · Dashboard · Sample integration
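For callers who prefer the non-blocking route (POST /v1/tasks, then poll GET /v1/tasks/:id), the polling step might look something like the sketch below. The helper names `pollUntil` and `waitForTask` are invented for this example. The `status` field and its `"complete"` value are assumptions about the response body, not the documented schema, so check the API reference before relying on them.

```typescript
// Generic poll loop. The fetcher and the "done" test are injected, so the
// helper itself makes no claim about the real response schema.
async function pollUntil<T>(
  fetchOnce: () => Promise<T>,
  isDone: (value: T) => boolean,
  opts: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<T> {
  const { intervalMs = 1000, maxAttempts = 60 } = opts;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const value = await fetchOnce();
    if (isDone(value)) return value;
    if (attempt < maxAttempts) {
      // Wait before the next poll; skipped after the final attempt.
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`task not finished after ${maxAttempts} polls`);
}

// Hypothetical wiring against the endpoint named above.
// ASSUMPTION: the response JSON has a `status` field that becomes "complete".
function waitForTask(baseUrl: string, apiKey: string, taskId: string) {
  return pollUntil(
    async () => {
      const res = await fetch(`${baseUrl}/v1/tasks/${taskId}`, {
        headers: { Authorization: `Bearer ${apiKey}` },
      });
      if (!res.ok) throw new Error(`HTTP ${res.status}`);
      return (await res.json()) as { status?: string };
    },
    (task) => task.status === "complete",
  );
}
```

As written, a task that fails would keep being polled until the attempt cap is hit. If the API reports a distinct failure state, the `isDone` predicate should treat it as terminal too.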
Tools
Tool | Description |
| Run a task. Blocks until complete. |
| Send follow-up to an existing session. |
| Check progress. |
| Stop a task. |
| Capture current page as image. |
Pricing
| Managed | BYOM |
Price | $0.05/task (20 free/month) | Free forever |
AI model | We handle it (Gemini) | Your own key |
Data | Processed on Hanzi Browse servers | Never leaves your machine |
Billing | Only completed tasks. Errors are free. | N/A |
Building a product? Contact us for volume pricing.
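To make the Managed pricing concrete, here is a small worked calculation using the figures from the table (20 free tasks per month, then $0.05 per task, errors not billed). It assumes the free allowance is counted against completed tasks only, which the table implies but does not state outright. The function names are made up for this sketch.

```typescript
const FREE_TASKS_PER_MONTH = 20; // from the pricing table
const PRICE_CENTS_PER_TASK = 5;  // $0.05, kept in integer cents to avoid float drift

// ASSUMPTION: only completed tasks count toward the free allowance,
// since errored tasks are not billed.
function managedMonthlyCostCents(completedTasks: number): number {
  const billable = Math.max(0, completedTasks - FREE_TASKS_PER_MONTH);
  return billable * PRICE_CENTS_PER_TASK;
}

function formatUsd(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

console.log(formatUsd(managedMonthlyCostCents(15)));  // "$0.00" (within the free tier)
console.log(formatUsd(managedMonthlyCostCents(21)));  // "$0.05" (1 billable task)
console.log(formatUsd(managedMonthlyCostCents(120))); // "$5.00" (100 billable tasks)
```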
Development
Prerequisites: Node.js 18+, Docker Desktop (must be running before make fresh).
First time (local setup)
git clone https://github.com/hanzili/hanzi-browse
cd hanzi-browse
make fresh
Performs full setup: installs deps, builds server/dashboard/extension, starts Postgres, runs migrations, and launches the dev server (~90s).
Run the project
make dev
Starts the backend services (Postgres + migrations + API server) and serves the dashboard UI.
Dashboard (requires Google OAuth): http://localhost:3456/dashboard
Configuration
The defaults in .env.example are enough to run the server.
Optional services:
Google OAuth (dashboard sign-in) -- add GOOGLE_CLIENT_ID / GOOGLE_CLIENT_SECRET to .env
Stripe (credit purchases) -- add test keys to .env
Vertex AI (managed task execution) -- see .env.example for setup steps
PostHog (analytics) -- add POSTHOG_API_KEY to enable local CLI telemetry, dashboard analytics, managed backend analytics, and the example apps; optionally set POSTHOG_HOST
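A quick way to see which optional services a given .env would switch on is a small check like the one below. Only the four variable names come from the list above. The function, its return shape, and the rule that an empty value counts as unset are assumptions for illustration. Stripe and Vertex AI are left out because their variable names are not given here.

```typescript
type Env = Record<string, string | undefined>;

interface OptionalServices {
  googleOAuth: boolean;            // needs both client ID and secret
  postHog: boolean;                // needs POSTHOG_API_KEY
  postHogHost: string | undefined; // optional override
}

// ASSUMPTION: an empty or whitespace-only value is treated as "not set".
function detectOptionalServices(env: Env): OptionalServices {
  const has = (key: string): boolean => (env[key] ?? "").trim() !== "";
  return {
    googleOAuth: has("GOOGLE_CLIENT_ID") && has("GOOGLE_CLIENT_SECRET"),
    postHog: has("POSTHOG_API_KEY"),
    postHogHost: has("POSTHOG_HOST") ? env["POSTHOG_HOST"] : undefined,
  };
}

// Example: only half of the Google OAuth pair is present, so sign-in stays off.
const partial = detectOptionalServices({
  GOOGLE_CLIENT_ID: "placeholder-id",
  POSTHOG_API_KEY: "placeholder-key",
});
```

In a Node script this could be called as `detectOptionalServices(process.env)` after loading the .env file. How the real server reads these variables is not shown in this document.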
Load the extension
Open chrome://extensions, enable Developer Mode, click "Load unpacked", and select the project root (the folder that contains manifest.json).
Verify everything works
After make dev is running and the extension is loaded, test both user paths:
Test 1: MCP / CLI mode (user path)
# In a separate terminal:
node server/dist/cli.js start "Go to example.com and tell me the page title"
You should see a Chrome window open, the agent navigate to example.com, and return the page title. If this works, the relay + extension + agent loop are all connected.
Test 2: Managed API mode (developer path)
# 1. Check the API is running
curl http://localhost:3456/v1/health
# 2. Open the dashboard and sign in (requires Google OAuth configured)
open http://localhost:3456/dashboard
# 3. Create an API key from the dashboard, then:
curl -X POST http://localhost:3456/v1/browser-sessions/pair \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json"
# 4. Open the pairing URL in Chrome (from the response)
open "http://localhost:3456/pair/PAIRING_TOKEN"
# 5. After pairing, run a task
curl -X POST http://localhost:3456/v1/tasks \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"task": "Go to example.com and read the title", "browser_session_id": "SESSION_ID"}'
# 6. Check the result
curl http://localhost:3456/v1/tasks/TASK_ID \
-H "Authorization: Bearer YOUR_API_KEY"
Test 3: Embed widget
Create a test HTML file and open it in Chrome:
<div id="hanzi"></div>
<script src="http://localhost:3456/embed.js"></script>
<script>
  HanziConnect.mount('#hanzi', {
    apiKey: 'YOUR_PUBLISHABLE_KEY',
    apiUrl: 'http://localhost:3456',
    onConnected: (id) => console.log('Connected:', id),
    onError: (err) => console.log('Error:', err),
  });
</script>
You should see the pairing widget with step-by-step instructions.
Notes
Local vs CLI usage -- npx hanzi-browse setup is for packaged usage and may not work in a local clone
Port conflicts -- if you see EADDRINUSE on 3456, stop existing processes or run make stop
No Google OAuth? -- The dashboard sign-in won't work, but you can seed a test workspace directly in the database and use the API key for testing
Commands
Command | What it does |
make fresh | Full first-time setup (deps + build + DB + start) |
make dev | Start everything (DB + migrate + server) |
| Rebuild server + dashboard + extension |
make stop | Stop Postgres |
| Stop + delete database volume |
| Verify Node 18+ and Docker are available |
| Show all commands |
Contributing
We welcome contributions! See CONTRIBUTING.md for setup instructions.
Good first contributions: new skills, landing pages, site-pattern files, platform testing, translations. Check the open issues.
Community
Discord · Documentation · Twitter
Privacy
Hanzi Browse operates in different modes with different data handling. Read the privacy policy.
BYOM: No data sent to Hanzi Browse servers. Screenshots go to your chosen AI provider only.
Managed / API: Task data processed on Hanzi Browse servers via Google Vertex AI.
License