

Peekaboo MCP: Lightning-fast macOS Screenshots & GUI Automation 🚀

Peekaboo Banner

🎉 NEW in v3: Complete GUI automation framework with AI Agent! Click, type, scroll, and automate any macOS application using natural language. Plus comprehensive menu bar extraction without clicking! See the GUI Automation section and AI Agent section for details.

Peekaboo is a powerful macOS utility for capturing screenshots, analyzing them with AI vision models, and now automating GUI interactions. It works both as a standalone CLI tool (recommended) and as an MCP server for AI assistants like Claude Desktop and Cursor.

🎯 Choose Your Path

🖥️ CLI Tool (Recommended for Most Users)

Perfect for:

  • Command-line workflows and automation
  • Shell scripts and CI/CD pipelines
  • Quick screenshots and AI analysis
  • System administration tasks

🤖 MCP Server (For AI Assistants)

Perfect for:

  • Claude Desktop integration
  • Cursor IDE workflows
  • AI agents that need visual context
  • Interactive AI debugging sessions

What is Peekaboo?

Peekaboo bridges the gap between visual content on your screen and AI understanding. It provides:

  • Lightning-fast screenshots of screens, applications, or specific windows
  • AI-powered image analysis using GPT-4.1 Vision, Claude, or local models
  • Complete GUI automation (v3) - Click, type, scroll, and interact with any macOS app
  • Natural language automation (v3) - AI agent that understands tasks like "Open TextEdit and write a poem"
  • Smart UI element detection - Automatically identifies buttons, text fields, links, and more with precise coordinate mapping
  • Menu bar extraction (v3) - Discover all menus and keyboard shortcuts without clicking or opening menus
  • Automatic session resolution - Commands intelligently use the most recent session (no manual tracking!)
  • Window and application management with smart fuzzy matching
  • Privacy-first operation with local AI options via Ollama
  • Non-intrusive capture without changing window focus
  • Automation scripting - Chain commands together for complex workflows

🏗️ Architecture

Peekaboo uses a modern service-based architecture:

  • PeekabooCore - Shared services for screen capture, UI automation, window management, and more
  • CLI - Command-line interface that uses PeekabooCore services directly
  • Mac App - Native macOS app with 100x+ performance improvement over CLI spawning
  • MCP Server - Model Context Protocol server for AI assistants

All components share the same core services, ensuring consistent behavior and optimal performance. See Service API Reference for detailed documentation.

🚀 Quick Start: CLI Tool

Installation

# Option 1: Homebrew (Recommended)
brew tap steipete/tap
brew install peekaboo

# Option 2: Direct Download
curl -L https://github.com/steipete/peekaboo/releases/latest/download/peekaboo-macos-universal.tar.gz | tar xz
sudo mv peekaboo-macos-universal/peekaboo /usr/local/bin/

# Option 3: npm (includes MCP server)
npm install -g @steipete/peekaboo-mcp

# Option 4: Build from source
git clone https://github.com/steipete/peekaboo.git
cd peekaboo
./scripts/build-cli-standalone.sh --install

Basic Usage

# Capture screenshots
peekaboo image --app Safari --path screenshot.png
peekaboo image --mode frontmost
peekaboo image --mode screen --screen-index 0

# List applications and windows
peekaboo list apps
peekaboo list windows --app "Visual Studio Code"

# Analyze images with AI
peekaboo analyze screenshot.png "What error is shown?"
peekaboo analyze ui.png "Find all buttons" --provider ollama

# GUI Automation (v3)
peekaboo see --app Safari          # Identify UI elements
peekaboo click "Submit"            # Click button by text
peekaboo type "Hello world"        # Type at current focus
peekaboo scroll down --amount 5    # Scroll down 5 ticks
peekaboo hotkey cmd,c              # Press Cmd+C

# AI Agent - Natural language automation (v3) 🤖
peekaboo "Open Safari and search for weather"
peekaboo "Open TextEdit and write Hello World"
peekaboo agent "Fill out the contact form" --verbose
peekaboo agent "Take a screenshot of Safari and email it"
peekaboo agent --verbose "Find all Finder windows and close them"

# Window Management (v3)
peekaboo window close --app Safari       # Close Safari window
peekaboo window minimize --app Finder    # Minimize Finder window
peekaboo window move --app TextEdit --x 100 --y 100
peekaboo window resize --app Terminal --width 800 --height 600
peekaboo window focus --app "Visual Studio Code"

# Menu Bar Interaction (v3)
peekaboo menu list --app Calculator      # List all menus and items
peekaboo menu list-all                   # List menus for frontmost app
peekaboo menu click --app Safari --item "New Window"
peekaboo menu click --app TextEdit --path "Format > Font > Bold"
peekaboo menu click-extra --title "WiFi" # Click system menu extras

# Configure settings
peekaboo config init                     # Create config file
peekaboo config edit                     # Edit in your editor
peekaboo config show --effective         # Show current settings

Debugging with Verbose Mode

All Peekaboo commands support the --verbose or -v flag for detailed logging:

# See what's happening under the hood
peekaboo image --app Safari --verbose
peekaboo see --app Terminal -v
peekaboo click --on B1 --verbose

# Verbose output includes:
# - Application search details
# - Window discovery information
# - UI element detection progress
# - Timing information
# - Session management operations

Verbose logs are written to stderr with timestamps:

[2025-01-06T08:05:23Z] VERBOSE: Searching for application: Safari
[2025-01-06T08:05:23Z] VERBOSE: Found exact bundle ID match: Safari
[2025-01-06T08:05:23Z] VERBOSE: Capturing window for app: Safari
[2025-01-06T08:05:23Z] VERBOSE: Found 3 windows for application

This is invaluable for:

  • Debugging automation scripts
  • Understanding why elements aren't found
  • Performance optimization
  • Learning Peekaboo's internals

Configuration

Peekaboo uses a unified configuration directory at ~/.peekaboo/ for better discoverability:

# Create default configuration
peekaboo config init

# Files created:
# ~/.peekaboo/config.json - Main configuration (JSONC format)
# ~/.peekaboo/credentials - API keys (chmod 600)

Managing API Keys Securely

# Set API key securely (stored in ~/.peekaboo/credentials)
peekaboo config set-credential OPENAI_API_KEY sk-...

# View current configuration (keys shown as ***SET***)
peekaboo config show --effective

Example Configuration

~/.peekaboo/config.json:

{
  // AI Provider Settings
  "aiProviders": {
    "providers": "openai/gpt-4.1,ollama/llava:latest",
    // NOTE: API keys should be in ~/.peekaboo/credentials
    "ollamaBaseUrl": "http://localhost:11434"
  },
  // Default Settings
  "defaults": {
    "savePath": "~/Desktop/Screenshots",
    "imageFormat": "png",
    "captureMode": "window",
    "captureFocus": "auto"
  },
  // Logging
  "logging": {
    "level": "info",
    "path": "~/.peekaboo/logs/peekaboo.log"
  }
}

~/.peekaboo/credentials (auto-created with proper permissions):

# Peekaboo credentials file
# This file contains sensitive API keys and should not be shared
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...

Common Workflows

# Capture and analyze in one command
peekaboo image --app Safari --path /tmp/page.png && \
  peekaboo analyze /tmp/page.png "What's on this page?"

# Monitor active window changes
while true; do
  peekaboo image --mode frontmost --json-output | jq -r '.data.saved_files[0].window_title'
  sleep 5
done

# Batch analyze screenshots
for img in ~/Screenshots/*.png; do
  peekaboo analyze "$img" "Summarize this screenshot"
done

# Automated login workflow (v3 with automatic session resolution)
peekaboo see --app MyApp            # Creates new session
peekaboo click --on T1              # Automatically uses session from 'see'
peekaboo type "user@example.com"    # Still using same session
peekaboo click --on T2              # No need to specify --session
peekaboo type "password123"
peekaboo click "Sign In"
peekaboo sleep 2000                 # Wait 2 seconds

# Multiple app automation with explicit sessions
SESSION_A=$(peekaboo see --app Safari --json-output | jq -r '.data.session_id')
SESSION_B=$(peekaboo see --app Notes --json-output | jq -r '.data.session_id')
peekaboo click --on B1 --session $SESSION_A   # Click in Safari
peekaboo type "Hello" --session $SESSION_B    # Type in Notes

# Run automation script
peekaboo run login.peekaboo.json

🤖 MCP Server Setup

For AI assistants like Claude Desktop and Cursor, Peekaboo provides a Model Context Protocol (MCP) server.

For Claude Desktop

Edit your Claude Desktop configuration:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
{ "mcpServers": { "peekaboo": { "command": "npx", "args": ["-y", "@steipete/peekaboo-mcp"], "env": { "PEEKABOO_AI_PROVIDERS": "openai/gpt-4.1,ollama/llava:latest", "OPENAI_API_KEY": "your-openai-api-key-here" } } } }

For Claude Code

Run the following command:

claude mcp add-json peekaboo '{
  "command": "npx",
  "args": ["-y", "@steipete/peekaboo-mcp"],
  "env": {
    "PEEKABOO_AI_PROVIDERS": "openai/gpt-4o,ollama/llava:latest",
    "OPENAI_API_KEY": "your-openai-api-key-here"
  }
}'

Alternatively, if you've already installed the server via Claude desktop, you can run:

claude mcp add-from-claude-desktop

For Cursor IDE

Add to your Cursor settings:

{ "mcpServers": { "peekaboo": { "command": "npx", "args": ["-y", "@steipete/peekaboo-mcp"], "env": { "PEEKABOO_AI_PROVIDERS": "openai/gpt-4.1,ollama/llava:latest", "OPENAI_API_KEY": "your-openai-api-key-here" } } } }

MCP Tools Available

Core Tools (v2)
  1. image - Capture screenshots
  2. list - List applications, windows, or check status
  3. analyze - Analyze images with AI vision models
GUI Automation Tools (v3) 🎉
  1. see - Capture screen and identify UI elements
  2. click - Click on UI elements or coordinates
  3. type - Type text into UI elements
  4. scroll - Scroll content in any direction
  5. hotkey - Press keyboard shortcuts
  6. swipe - Perform swipe/drag gestures
  7. run - Execute automation scripts
  8. sleep - Pause execution
  9. clean - Clean up session cache
  10. window - Manipulate application windows (close, minimize, maximize, move, resize, focus)

🚀 GUI Automation with Peekaboo v3

Peekaboo v3 introduces powerful GUI automation capabilities, transforming it from a screenshot tool into a complete UI automation framework for macOS. This enables AI assistants to interact with any application through natural language commands.

How It Works

The v3 automation system uses a see-then-interact workflow (a shell sketch follows the list):

  1. See - Capture the screen and identify UI elements
  2. Interact - Click, type, scroll, or perform other actions
  3. Verify - Capture again to confirm the action succeeded
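
A minimal shell sketch of this loop, using the CLI commands documented in Basic Usage (Safari and the element ID B1 are placeholders; real IDs come from the see output):

# See-then-interact loop (illustrative)
peekaboo see --app Safari      # 1. See: capture the UI and assign element IDs
peekaboo click --on B1         # 2. Interact: act on an element that 'see' reported
peekaboo sleep 1000            # give any animation time to finish
peekaboo see --app Safari      # 3. Verify: capture again to confirm the result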

🎯 The see Tool - UI Element Discovery

The see tool is the foundation of GUI automation. It captures a screenshot and identifies all interactive UI elements, assigning them unique Peekaboo IDs.

// Example: See what's on screen
await see({ app_target: "Safari" })

// Returns:
{
  screenshot_path: "/tmp/peekaboo_123.png",
  session_id: "session_456",
  elements: {
    buttons: [
      { id: "B1", label: "Submit", bounds: { x: 100, y: 200, width: 80, height: 30 } },
      { id: "B2", label: "Cancel", bounds: { x: 200, y: 200, width: 80, height: 30 } }
    ],
    text_fields: [
      { id: "T1", label: "Email", value: "", bounds: { x: 100, y: 100, width: 200, height: 30 } },
      { id: "T2", label: "Password", value: "", bounds: { x: 100, y: 150, width: 200, height: 30 } }
    ],
    links: [
      { id: "L1", label: "Forgot password?", bounds: { x: 100, y: 250, width: 120, height: 20 } }
    ]
    // ... other elements
  }
}
Element ID Format
  • B1, B2... - Buttons
  • T1, T2... - Text fields/areas
  • L1, L2... - Links
  • G1, G2... - Groups/containers
  • I1, I2... - Images
  • S1, S2... - Sliders
  • C1, C2... - Checkboxes/toggles
  • M1, M2... - Menu items

🖱️ The click Tool

Click on UI elements using various targeting methods:

// Click by element ID from see command
await click({ on: "B1" })

// Click by query (searches button labels)
await click({ query: "Submit" })

// Click by coordinates
await click({ coords: "450,300" })

// Double-click
await click({ on: "I1", double: true })

// Right-click
await click({ query: "File", right: true })

// With custom wait timeout
await click({ query: "Save", wait_for: 10000 })

⌨️ The type Tool

Type text with support for special keys:

// Type into a specific field
await type({ text: "user@example.com", on: "T1" })

// Type at current focus
await type({ text: "Hello world" })

// Clear existing text first
await type({ text: "New text", on: "T2", clear: true })

// Use special keys
await type({ text: "Select all{cmd+a}Copy{cmd+c}" })
await type({ text: "Line 1{return}Line 2{tab}Indented" })

// Adjust typing speed
await type({ text: "Slow typing", delay: 100 })
Supported Special Keys
  • {return} or {enter} - Return/Enter key
  • {tab} - Tab key
  • {escape} or {esc} - Escape key
  • {delete} or {backspace} - Delete key
  • {space} - Space key
  • {cmd+a}, {cmd+c}, etc. - Command combinations
  • {arrow_up}, {arrow_down}, {arrow_left}, {arrow_right} - Arrow keys
  • {f1} through {f12} - Function keys

📜 The scroll Tool

Scroll content in any direction:

// Scroll down 3 ticks (default)
await scroll({ direction: "down" })

// Scroll up 5 ticks
await scroll({ direction: "up", amount: 5 })

// Scroll on a specific element
await scroll({ direction: "down", on: "G1", amount: 10 })

// Smooth scrolling
await scroll({ direction: "down", smooth: true })

// Horizontal scrolling
await scroll({ direction: "right", amount: 3 })

⌨️ The hotkey Tool

Press keyboard shortcuts:

// Common shortcuts
await hotkey({ keys: "cmd,c" })          // Copy
await hotkey({ keys: "cmd,v" })          // Paste
await hotkey({ keys: "cmd,tab" })        // Switch apps
await hotkey({ keys: "cmd,shift,t" })    // Reopen closed tab

// Function keys
await hotkey({ keys: "f11" })            // Full screen

// Custom hold duration
await hotkey({ keys: "cmd,space", hold_duration: 100 })

👆 The swipe Tool

Perform swipe or drag gestures:

// Basic swipe
await swipe({ from: "100,200", to: "300,200" })

// Slow drag
await swipe({ from: "50,50", to: "200,200", duration: 2000 })

// Precise movement with more steps
await swipe({ from: "0,0", to: "100,100", steps: 50 })

📝 The run Tool - Automation Scripts

Execute complex automation workflows from JSON script files:

// Run a script
await run({ script_path: "/path/to/login.peekaboo.json" })

// Continue on error
await run({ script_path: "test.peekaboo.json", stop_on_error: false })
Script Format (.peekaboo.json)
{ "name": "Login to Website", "description": "Automated login workflow", "commands": [ { "command": "see", "args": { "app_target": "Safari" }, "comment": "Capture current state" }, { "command": "click", "args": { "query": "Email" }, "comment": "Click email field" }, { "command": "type", "args": { "text": "user@example.com" } }, { "command": "click", "args": { "query": "Password" } }, { "command": "type", "args": { "text": "secure_password" } }, { "command": "click", "args": { "query": "Sign In" } }, { "command": "sleep", "args": { "duration": 2000 }, "comment": "Wait for login" } ] }

🤖 AI Agent Automation

Peekaboo v3 introduces an AI-powered agent that can understand and execute complex automation tasks using natural language. The agent uses OpenAI's Chat Completions API with streaming support to break down your instructions into specific Peekaboo commands.

Setting Up the Agent

# Set your OpenAI API key
export OPENAI_API_KEY="your-api-key-here"

# Or save it securely in Peekaboo's config
peekaboo config set-credential OPENAI_API_KEY your-api-key-here

# Now you can use natural language automation!
peekaboo "Open Safari and search for weather"

Two Ways to Use the Agent

1. Direct Natural Language (Default)

When you provide a text argument without a subcommand, Peekaboo automatically uses the agent:

# These all invoke the agent directly
peekaboo "Click the Submit button"
peekaboo "Open TextEdit and write Hello"
peekaboo "Take a screenshot of Safari"

2. Explicit Agent Command

Use the agent subcommand for more control and options:

# With options and flags
peekaboo agent "Fill out the contact form" --verbose
peekaboo agent "Close all Finder windows" --dry-run
peekaboo agent "Install this app" --max-steps 30 --json-output

How the Agent Works

  1. Understands Your Intent - The AI agent analyzes your natural language request
  2. Plans the Steps - Breaks down the task into specific actions
  3. Executes Commands - Uses Peekaboo's automation tools to perform each step
  4. Verifies Results - Takes screenshots to confirm actions succeeded
  5. Handles Errors - Can retry failed actions or adjust approach
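
Because the agent plans before it acts, you can preview those steps without executing anything by combining the --dry-run and --verbose flags listed under Agent Options below (the task text here is just an example):

# Preview the agent's plan without performing any actions
peekaboo agent "Archive the screenshots on my Desktop" --dry-run --verbose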

Real-World Examples

# Web Automation
peekaboo "Go to github.com and search for peekaboo"
peekaboo "Click the first search result"
peekaboo "Star this repository"

# Document Creation
peekaboo "Open Pages and create a new blank document"
peekaboo "Type 'Meeting Agenda' as the title and make it bold"
peekaboo "Add bullet points for Introduction, Main Topics, and Action Items"

# File Management
peekaboo "Open Finder and navigate to Downloads"
peekaboo "Select all PDF files and move them to Documents"
peekaboo "Create a new folder called 'Archived PDFs'"

# Application Testing
peekaboo "Launch Calculator and calculate 42 * 17"
peekaboo "Take a screenshot of the result"
peekaboo "Clear the calculator and close it"

# System Tasks
peekaboo "Open System Settings and go to Display settings"
peekaboo "Change the display resolution to 1920x1080"
peekaboo "Take a screenshot to confirm the change"

Agent Options

  • --verbose - See the agent's reasoning and planning process
  • --dry-run - Preview what the agent would do without executing
  • --max-steps <n> - Limit the number of actions (default: 20)
  • --model <model> - Choose OpenAI model (default: gpt-4-turbo)
  • --json-output - Get structured JSON output
  • --resume - Resume the latest unfinished agent session
  • --resume <session-id> - Resume a specific session by ID

Agent Capabilities

The agent has access to all Peekaboo commands:

  • Visual Understanding - Can see and understand what's on screen
  • UI Interaction - Click buttons, fill forms, navigate menus
  • Text Entry - Type text, use keyboard shortcuts
  • Window Management - Open, close, minimize, arrange windows
  • Application Control - Launch apps, switch between them
  • File Operations - Save files, handle dialogs
  • Complex Workflows - Chain multiple actions together

Understanding Agent Execution

When you run an agent command, here's what happens behind the scenes:

# Your command:
peekaboo "Click the Submit button"

# Agent breaks it down into:
peekaboo see               # Capture screen and identify elements
peekaboo click "Submit"    # Click the identified button

Example Workflow

# Complex multi-step task
peekaboo agent --verbose "Create a new document in Pages with the title 'Meeting Notes' and add today's date"

# Agent will execute commands like:
# 1. peekaboo see --app Pages          # Check if Pages is open
# 2. peekaboo app launch Pages         # Launch if needed
# 3. peekaboo sleep --duration 2000    # Wait for app to load
# 4. peekaboo click "Create Document"  # Click new document
# 5. peekaboo type "Meeting Notes"     # Enter title
# 6. peekaboo hotkey cmd+b             # Make text bold
# 7. peekaboo hotkey return            # New line
# 8. peekaboo type "Date: $(date)"     # Add current date

Debugging Agent Actions

Use --verbose to see exactly what the agent is doing:

peekaboo agent --verbose "Find and click the login button"

# Output will show:
# [Agent] Analyzing request...
# [Agent] Planning steps:
#   1. Capture current screen
#   2. Identify login button
#   3. Click on the button
# [Agent] Executing: peekaboo see
# [Agent] Found elements: button "Login" at (834, 423)
# [Agent] Executing: peekaboo click "Login"
# [Agent] Action completed successfully

Tips for Best Results

  1. Be Specific - "Click the blue Submit button" works better than "submit"
  2. One Task at a Time - Break complex workflows into smaller tasks
  3. Verify State - The agent works best when it can see the current screen
  4. Use Verbose Mode - Add --verbose to understand what the agent is doing
  5. Set Reasonable Limits - Use --max-steps to prevent runaway automation

Resuming Agent Sessions

The agent supports resuming interrupted or incomplete sessions, maintaining full conversation context:

# Start a complex task
peekaboo agent "Help me write a document about automation"
# Agent creates document, starts writing...
# <Interrupted by Ctrl+C or error>

# Resume the latest session with context
peekaboo agent --resume "Continue where we left off"

# Or resume a specific session
peekaboo agent --resume session_abc123 "Add a conclusion section"

# List available sessions
peekaboo agent list-sessions

# See session details
peekaboo agent show-session --latest
peekaboo agent show-session session_abc123

How Resume Works
  1. Session Persistence - Each agent run creates a session with a unique ID
  2. Thread Continuity - Uses OpenAI's thread persistence to maintain conversation history
  3. Context Preservation - The AI remembers all previous interactions in the session
  4. Smart Recovery - Can continue from any point, understanding what was already done
Resume Examples
# Scenario 1: Continue an interrupted task
peekaboo agent "Create a presentation about AI"
# <Interrupted after creating first slide>
peekaboo agent --resume "Add more slides about machine learning"

# Scenario 2: Iterative refinement
peekaboo agent "Fill out this form with test data"
# <Agent completes task>
peekaboo agent --resume "Actually, change the email to test@example.com"

# Scenario 3: Debugging automation
peekaboo agent --verbose "Login to the portal"
# <Login fails>
peekaboo agent --resume --verbose "Try clicking the other login button"

⏸️ The sleep Tool

Pause execution between actions:

// Sleep for 1 second
await sleep({ duration: 1000 })

// Sleep for 500ms
await sleep({ duration: 500 })

🪟 The window Tool

Comprehensive window manipulation for any application:

// Close window
await window({ action: "close", app_target: "Safari" })
await window({ action: "close", window_title: "Downloads" })

// Minimize/Maximize
await window({ action: "minimize", app_target: "Finder" })
await window({ action: "maximize", app_target: "Terminal" })

// Move window
await window({ action: "move", app_target: "TextEdit", x: 100, y: 100 })

// Resize window
await window({ action: "resize", app_target: "Notes", width: 800, height: 600 })

// Set exact bounds (move + resize)
await window({ action: "set_bounds", app_target: "Safari", x: 50, y: 50, width: 1200, height: 800 })

// Focus window
await window({ action: "focus", app_target: "Visual Studio Code" })
await window({ action: "focus", window_index: 0 })  // Focus first window

// List all windows
await window({ action: "list", app_target: "Finder" })

Window Actions
  • close - Close the window (animated if the window has a close button)
  • minimize - Minimize to dock
  • maximize - Maximize/zoom window
  • move - Move to specific coordinates
  • resize - Change window dimensions
  • set_bounds - Set position and size in one operation
  • focus - Bring window to front and focus
  • list - Get information about all windows
Targeting Options
  • app_target - Target by application name (fuzzy matching supported)
  • window_title - Target by window title (substring matching)
  • window_index - Target by index (0-based, front to back order)

📋 The menu Tool

Interact with application menu bars and system menu extras:

// List all menus and items for an app
await menu({ app_target: "Calculator", subcommand: "list" })

// Click a simple menu item
await menu({ app_target: "Safari", item: "New Window" })

// Navigate nested menus with a path
await menu({ app_target: "TextEdit", path: "Format > Font > Bold" })

// Click system menu extras (WiFi, Bluetooth, etc.)
await menu({ subcommand: "click-extra", title: "WiFi" })

Subcommands
  • list - List all menus and their items (including keyboard shortcuts)
  • list-all - List menus for the frontmost application
  • click - Click a menu item (default if not specified)
  • click-extra - Click system menu extras in the status bar
Key Features
  • Pure Accessibility - Extracts menu structure without clicking or opening menus
  • Full Hierarchy - Discovers all submenus and nested items
  • Keyboard Shortcuts - Shows all available keyboard shortcuts
  • Smart Discovery - AI agents can use list to discover available options

🧹 The clean Tool

Clean up session cache and temporary files:

// Clean all sessions
await clean({})

// Clean sessions older than 7 days
await clean({ older_than_days: 7 })

// Clean specific session
await clean({ session_id: "session_123" })

// Dry run to see what would be cleaned
await clean({ dry_run: true })

Session Management

Peekaboo v3 uses sessions to maintain UI state across commands:

  • Sessions are created automatically by the see tool
  • Each session stores screenshot data and element mappings
  • Sessions persist in ~/.peekaboo/session/<PID>/
  • Element IDs remain consistent within a session
  • Sessions are automatically cleaned up on process exit
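
A small shell sketch of working with sessions explicitly, based on the JSON fields and paths documented above (the exact directory contents depend on the running process):

# The session ID for the most recent 'see' is available in its JSON output
SESSION=$(peekaboo see --app Safari --json-output | jq -r '.data.session_id')
echo "Current session: $SESSION"

# Session data lives under ~/.peekaboo/session/<PID>/ while commands run
ls ~/.peekaboo/session/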

Best Practices

  1. Always start with see - Capture the current UI state before interacting
  2. Use element IDs when possible - More reliable than coordinate clicking
  3. Add delays for animations - Use sleep after actions that trigger animations
  4. Verify actions - Call see again to confirm actions succeeded
  5. Handle errors gracefully - Check if elements exist before interacting (see the sketch after this list)
  6. Clean up sessions - Use the clean tool periodically
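
As an illustration of practice 5, here is a hedged sketch that checks for a button before clicking it; it assumes the CLI's --json-output for see exposes the same element structure shown earlier (buttons with label fields), so adjust the jq path to match your actual output:

# Only click "Sign In" if 'see' actually found such a button (assumed JSON shape)
if peekaboo see --app MyApp --json-output \
   | jq -e '.data.elements.buttons[]? | select(.label == "Sign In")' > /dev/null; then
  peekaboo click "Sign In"
else
  echo "Sign In button not found; skipping" >&2
fi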

Example Workflows

Login Automation
// 1. See the login form
const { elements } = await see({ app_target: "MyApp" })

// 2. Fill in credentials
await click({ on: "T1" })                 // Click email field
await type({ text: "user@example.com" })
await click({ on: "T2" })                 // Click password field
await type({ text: "password123" })

// 3. Submit
await click({ query: "Sign In" })

// 4. Wait and verify
await sleep({ duration: 2000 })
await see({ app_target: "MyApp" })        // Verify logged in
Web Search

// 1. Focus browser
await see({ app_target: "Safari" })

// 2. Open new tab
await hotkey({ keys: "cmd,t" })

// 3. Type search
await type({ text: "Peekaboo MCP automation" })
await type({ text: "{return}" })

// 4. Wait for results
await sleep({ duration: 3000 })

// 5. Click first result
await see({ app_target: "Safari" })
await click({ on: "L1" })
Form Filling
// 1. Capture form
const { elements } = await see({ app_target: "Forms" })

// 2. Fill each field
for (const field of elements.text_fields) {
  await click({ on: field.id })
  await type({ text: "Test data", clear: true })
}

// 3. Check all checkboxes
for (const checkbox of elements.checkboxes) {
  if (!checkbox.checked) {
    await click({ on: checkbox.id })
  }
}

// 4. Submit
await click({ query: "Submit" })

Troubleshooting

  1. Elements not found - Ensure the UI is visible and not obscured
  2. Clicks not working - Try increasing the wait_for timeout
  3. Wrong element clicked - Use specific element IDs instead of queries
  4. Session errors - Run clean tool to clear corrupted sessions
  5. Permissions denied - Grant Accessibility permission in System Settings

🔧 Configuration

Configuration Precedence

Settings follow this precedence (highest to lowest); a shell example follows the list:

  1. Command-line arguments
  2. Environment variables
  3. Credentials file (~/.peekaboo/credentials)
  4. Configuration file (~/.peekaboo/config.json)
  5. Built-in defaults
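
For example, with the default save path (all three mechanisms are documented in the options table below; the paths are placeholders):

# Config file or built-in default applies when nothing else is set
peekaboo image --mode frontmost

# An environment variable overrides the config file...
PEEKABOO_DEFAULT_SAVE_PATH=~/Desktop/Screenshots peekaboo image --mode frontmost

# ...and an explicit command-line argument overrides the environment variable
PEEKABOO_DEFAULT_SAVE_PATH=~/Desktop/Screenshots peekaboo image --mode frontmost --path /tmp/override.png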

Available Options

Setting            | Config File               | Environment Variable       | Description
AI Providers       | aiProviders.providers     | PEEKABOO_AI_PROVIDERS      | Comma-separated list (e.g., "openai/gpt-4.1,ollama/llava")
OpenAI API Key     | Use credentials file      | OPENAI_API_KEY             | Required for the OpenAI provider
Anthropic API Key  | Use credentials file      | ANTHROPIC_API_KEY          | For Claude Vision (coming soon)
Ollama URL         | aiProviders.ollamaBaseUrl | PEEKABOO_OLLAMA_BASE_URL   | Default: http://localhost:11434
Default Save Path  | defaults.savePath         | PEEKABOO_DEFAULT_SAVE_PATH | Where screenshots are saved (default: current directory)
Log Level          | logging.level             | PEEKABOO_LOG_LEVEL         | trace, debug, info, warn, error, fatal
Log Path           | logging.path              | PEEKABOO_LOG_FILE          | Log file location
CLI Binary Path    | -                         | PEEKABOO_CLI_PATH          | Override the bundled Swift CLI path (advanced usage)

Environment Variable Details

API Key Storage Best Practices

For security, Peekaboo supports three methods for API key storage (in order of recommendation):

  1. Environment Variables (Most secure for automation)
    export OPENAI_API_KEY="sk-..."
  2. Credentials File (Best for interactive use)
    peekaboo config set-credential OPENAI_API_KEY sk-... # Stored in ~/.peekaboo/credentials with chmod 600
  3. Config File (Not recommended - use credentials file instead)
AI Provider Configuration
  • PEEKABOO_AI_PROVIDERS: Comma-separated list of AI providers to use for image analysis
    • Format: provider/model,provider/model
    • Example: "openai/gpt-4.1,ollama/llava:latest"
    • The first available provider will be used
    • Default: "openai/gpt-4.1,ollama/llava:latest"
  • OPENAI_API_KEY: Your OpenAI API key for GPT-4.1 Vision
    • Required when using the openai provider
    • Get your key at: https://platform.openai.com/api-keys
  • ANTHROPIC_API_KEY: Your Anthropic API key for Claude Vision
    • Will be required when Claude Vision support is added
    • Currently not implemented
  • PEEKABOO_OLLAMA_BASE_URL: Base URL for your Ollama server
    • Default: http://localhost:11434
    • Use for custom Ollama installations or remote servers
Default Behavior
  • PEEKABOO_DEFAULT_SAVE_PATH: Default directory for saving screenshots
    • Default: Current working directory
    • Supports tilde expansion (e.g., ~/Desktop/Screenshots)
    • Created automatically if it doesn't exist
Logging and Debugging
  • PEEKABOO_LOG_LEVEL: Control logging verbosity
    • Options: trace, debug, info, warn, error, fatal
    • Default: info
    • Use debug or trace for troubleshooting
  • PEEKABOO_LOG_FILE: Custom log file location
    • Default: /tmp/peekaboo-mcp.log (MCP server)
    • For CLI, logs are written to stderr by default
Advanced Options
  • PEEKABOO_CLI_PATH: Override the bundled Swift CLI binary path
    • Only needed if using a custom-built CLI binary
    • Default: Uses the bundled binary

Using Environment Variables

Environment variables can be set in multiple ways:

# For a single command
PEEKABOO_AI_PROVIDERS="ollama/llava:latest" peekaboo analyze image.png "What is this?"

# Export for the current session
export OPENAI_API_KEY="sk-..."
export PEEKABOO_DEFAULT_SAVE_PATH="~/Desktop/Screenshots"

# Add to your shell profile (~/.zshrc or ~/.bash_profile)
echo 'export OPENAI_API_KEY="sk-..."' >> ~/.zshrc

🎨 Setting Up Local AI with Ollama

For privacy-focused local AI analysis:

# Install Ollama
brew install ollama
ollama serve

# Download vision models
ollama pull llava:latest    # Recommended
ollama pull qwen2-vl:7b     # Lighter alternative

# Configure Peekaboo
peekaboo config edit
# Set providers to: "ollama/llava:latest"

📋 Requirements

  • macOS 14.0+ (Sonoma or later)
  • Screen Recording Permission (required)
  • Accessibility Permission (optional, for window focus control)

Granting Permissions

  1. Screen Recording (Required):
    • System Settings → Privacy & Security → Screen & System Audio Recording
    • Enable for Terminal, Claude Desktop, or your IDE
  2. Accessibility (Optional):
    • System Settings → Privacy & Security → Accessibility
    • Enable for better window focus control
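
A quick way to confirm the permissions took effect is to run the status and capture commands from the Testing section; exact output will vary, but a capture fails clearly when Screen Recording permission is missing:

# Check Peekaboo's view of permissions and server status
peekaboo list server_status

# A simple capture will error out if Screen Recording permission is not granted
peekaboo image --mode screen --path /tmp/permission-test.png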

🏗️ Building from Source

Prerequisites

  • macOS 14.0+ (Sonoma or later)
  • Node.js 20.0+ and npm
  • Xcode Command Line Tools (xcode-select --install)
  • Swift 5.9+ (included with Xcode)

Build Commands

# Clone the repository
git clone https://github.com/steipete/peekaboo.git
cd peekaboo

# Install dependencies
npm install

# Build everything (CLI + MCP server)
npm run build:all

# Build options:
npm run build                                 # TypeScript only
npm run build:swift                           # Swift CLI only (universal binary)
./scripts/build-cli-standalone.sh             # Quick CLI build
./scripts/build-cli-standalone.sh --install   # Build and install to /usr/local/bin

Creating Release Binaries

# Run all pre-release checks and create release artifacts
./scripts/release-binaries.sh

# Skip checks (if you've already run them)
./scripts/release-binaries.sh --skip-checks

# Create GitHub release draft
./scripts/release-binaries.sh --create-github-release

# Full release with npm publish
./scripts/release-binaries.sh --create-github-release --publish-npm

The release script creates:

  • peekaboo-macos-universal.tar.gz - Standalone CLI binary (universal)
  • @steipete-peekaboo-mcp-{version}.tgz - npm package
  • checksums.txt - SHA256 checksums for verification
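
To verify a downloaded archive against checksums.txt, you can compare hashes manually with the stock macOS shasum tool (file names taken from the list above):

# Compute the archive's SHA256 and compare it with the published entry
shasum -a 256 peekaboo-macos-universal.tar.gz
grep peekaboo-macos-universal.tar.gz checksums.txt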

Debug Build Staleness Detection

For development, enable automatic staleness detection to ensure you're always using the latest built CLI version: git config peekaboo.check-build-staleness true. This is recommended when working with AI assistants that frequently modify source code, as it prevents using outdated binaries.

👻 Poltergeist - Swift CLI Auto-rebuild Watcher

Poltergeist is a helpful ghost that watches your Swift files and automatically rebuilds the CLI when they change. Perfect for development workflows!

Installation

First, install Watchman (required):

brew install watchman

Usage

Run these commands from the project root:

# Start the watcher
npm run poltergeist:start
# or the more thematic:
npm run poltergeist:haunt

# Check status
npm run poltergeist:status

# View activity logs
npm run poltergeist:logs

# Stop watching
npm run poltergeist:stop
# or the more thematic:
npm run poltergeist:rest

What It Does

Poltergeist monitors:

  • Core/PeekabooCore/**/*.swift
  • Core/AXorcist/**/*.swift
  • Apps/CLI/**/*.swift
  • All Package.swift and Package.resolved files

When changes are detected, it automatically:

  1. Rebuilds the Swift CLI using npm run build:swift
  2. Copies the binary to the project root for easy access
  3. Logs all activity to .poltergeist.log

Features

  • 👻 Smart Rebuilding - Only rebuilds when Swift files actually change
  • 🔒 Single Instance - Prevents multiple concurrent builds
  • 📝 Activity Logging - Track all rebuild activity with timestamps
  • Native Performance - Uses macOS FSEvents for minimal overhead
  • 🎯 Persistent Watches - Survives terminal sessions

🧪 Testing

# Test CLI directly
peekaboo list server_status
peekaboo image --mode screen --path test.png
peekaboo analyze test.png "What is shown?"

# Test MCP server
npx @modelcontextprotocol/inspector npx -y @steipete/peekaboo-mcp

📚 Documentation

🐛 Troubleshooting

Issue              | Solution
Permission denied  | Grant Screen Recording permission in System Settings
Window not found   | Try fuzzy matching or list windows first
AI analysis failed | Check API keys and provider configuration
Command not found  | Ensure Peekaboo is in your PATH or use the full path

Enable debug logging for more details:

export PEEKABOO_LOG_LEVEL=debug
peekaboo list server_status

For step-by-step debugging, use the verbose flag:

peekaboo image --app Safari --verbose 2>&1 | less

🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

📝 License

MIT License - see LICENSE file for details.

👤 Author

Created by Peter Steinberger - @steipete

🙏 Acknowledgments

  • Apple's ScreenCaptureKit for blazing-fast captures
  • The MCP team for the Model Context Protocol
  • The Swift and TypeScript communities

Note: This is Peekaboo v3.0, which introduces GUI automation and AI agent capabilities. Configuration has moved from ~/.config/peekaboo/ to ~/.peekaboo/ for better discoverability. Migration happens automatically on first run. For full upgrade details, see the CHANGELOG.
