Scout
Supports CSS selectors for element extraction and interaction.
Provides a Gin-like API for building browser automation workflows.
Execute JavaScript expressions on web pages.
Converts web pages to compact Markdown format.
Integration with Ollama for local LLM-powered browser automation.
Integration with OpenAI API for AI-driven browser automation.
Detects React framework and reads component state.
A single scout binary gives you a full CLI, a 66-tool MCP server, and a Go library with Gin-like middleware composition.
brew install felixgeelhaar/tap/scout
Quick Start
# CLI — visible browser, one-shot commands
scout observe https://example.com # structured page snapshot
scout markdown https://news.ycombinator.com # page as compact markdown
scout screenshot https://github.com # save screenshot
scout extract https://example.com h1 # extract element text
scout frameworks https://react.dev # detect React, Vue, etc.
# MCP Server — give AI agents browser superpowers
claude mcp add scout -- scout mcp serve
# Browser UI — conversational browser automation
scout ui serve --provider=ollama --model=mistral
cd ui && npm install && npm run dev # open http://localhost:3000
Install
# Homebrew
brew install felixgeelhaar/tap/scout
# Direct binary
curl -fsSL https://raw.githubusercontent.com/felixgeelhaar/scout/main/install.sh | bash
# Go
go install github.com/felixgeelhaar/scout/cmd/scout@latest
# As a library
go get github.com/felixgeelhaar/scout
MCP Server — 66 Tools
Single binary, zero runtime dependencies. Configure in any MCP client:
claude mcp add scout -- scout mcp serve # Claude Code
{"mcpServers": {"scout": {"command": "scout", "args": ["mcp", "serve"]}}}
Tool Categories
The tools fall into 16 categories: Navigation, Interaction, Forms, Extraction, Capture, Network, Tabs, Frameworks, Playback, Smart Helpers, Vision, Batch, Iframe, Trace, Diagnostics, and Utility.
All tools have MCP annotations (ReadOnly, OpenWorld, ClosedWorld, Idempotent) for smart auto-approval. Read-only tools like observe, extract, and screenshot run without permission prompts.
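As a rough illustration of how a client might act on these annotations, the sketch below gates approval on a read-only flag. The `Annotations` and `Tool` types and the `NeedsApproval` policy are hypothetical and written only for this example; they are not Scout's types or any MCP SDK's API, and `click` is an assumed tool name.

```go
package main

import "fmt"

// Annotations is a hypothetical mirror of the hint flags named above.
// It is not Scout's or any MCP SDK's actual type.
type Annotations struct {
	ReadOnly   bool // tool does not modify page or browser state
	Idempotent bool // repeating the call has no additional effect
	OpenWorld  bool // tool touches external, uncontrolled content
}

// Tool pairs a tool name with its annotations.
type Tool struct {
	Name        string
	Annotations Annotations
}

// NeedsApproval reports whether a client following this sketch's policy
// should prompt the user before running the tool. Only read-only tools
// are auto-approved.
func NeedsApproval(t Tool) bool {
	return !t.Annotations.ReadOnly
}

func main() {
	tools := []Tool{
		{Name: "observe", Annotations: Annotations{ReadOnly: true, Idempotent: true, OpenWorld: true}},
		{Name: "click", Annotations: Annotations{OpenWorld: true}}, // assumed tool name
	}
	for _, t := range tools {
		fmt.Printf("%s needs approval: %v\n", t.Name, NeedsApproval(t))
	}
}
```

A real client may apply a stricter or looser policy; the point is only that the decision can be made from the annotations without inspecting the tool's implementation.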
Runtime Configuration
Switch between headless and visible browser without restarting:
Agent: configure(headless: false) → browser window appears
Agent: navigate("https://...") → watch it work
Agent: configure(headless: true) → back to headless
Browser UI
A conversational browser automation interface. Type a request in natural language and watch the browser respond in real time.
# Start the AG-UI server (Go backend)
scout ui serve --provider=ollama --model=mistral # local, no API key
scout ui serve --provider=claude # needs ANTHROPIC_API_KEY
scout ui serve --provider=openai --model=gpt-4o # needs OPENAI_API_KEY
scout ui serve --provider=groq --base-url=https://api.groq.com/openai --model=llama-3.3-70b-versatile
# Start the Vue frontend
cd ui && npm install && npm run dev # http://localhost:3000
The UI streams AG-UI protocol events over SSE:
Chat panel with markdown rendering and quick-action pills
Live browser viewport with screenshot streaming and URL bar
Activity timeline showing tool calls in real-time
Stop button to cancel mid-stream
The Go server runs the agentic loop: the LLM decides which scout tools to call, and the server executes them and streams browser state deltas back to the frontend. Any OpenAI-compatible endpoint is supported via --base-url.
Agent Package
High-level Go API for AI agents. Structured output, auto-wait, goroutine-safe.
session, _ := agent.NewSession(agent.SessionConfig{Headless: true})
defer session.Close()
// Navigate and observe
session.Navigate("https://example.com")
obs, _ := session.Observe() // links, inputs, buttons, text + action costs
// DOM diff — only what changed (saves 50-80% tokens)
session.Click("#submit")
_, diff, _ := session.ObserveDiff()
// diff.Classification: "modal_appeared"
// diff.Summary: "Modal/dialog appeared: Login required"
// Semantic form filling — no CSS selectors
session.FillFormSemantic(map[string]string{
"Email": "user-example", "Password": "secret",
})
// Visual grounding — click by number, not selector
result, _ := session.AnnotatedScreenshot() // numbered labels on elements
session.ClickLabel(7) // click element [7]
// Multi-tab coordination
session.OpenTab("pricing", "https://example.com/pricing")
session.SwitchTab("default")
// Framework detection (19 frameworks)
frameworks, _ := session.DetectedFrameworks() // ["react", "nextjs"]
state, _ := session.ComponentState("#app") // read React/Vue state
// Network capture — read API responses directly
session.EnableNetworkCapture("/api/")
captured := session.CapturedRequests("/api/users")
// Action replay — record once, replay without LLM
session.StartRecordingPlaybook("login-flow")
// ... do stuff ...
pb, _ := session.StopRecordingPlaybook()
agent.SavePlaybook(pb, "login.json")
// Later: session.ReplayPlaybook(pb) // 100x cheaper
// Persistent profiles
session.SaveProfile("session.json") // cookies + localStorage
session.LoadProfile("session.json")
// Content distillation (5 levels)
session.Markdown() // ~2-8KB compact markdown
session.ReadableText() // ~1-4KB main content only
session.AccessibilityTree() // ~1-4KB semantic tree
session.ObserveWithBudget(500) // fit in ~500 tokens
Core Library
Gin-like Engine/Context/Group/HandlerFunc with middleware composition:
engine := browse.Default(browse.WithHeadless(true))
engine.MustLaunch()
defer engine.Close()
engine.Use(middleware.Stealth())
engine.Use(middleware.Retry(middleware.RetryConfig{MaxAttempts: 3}))
engine.Use(middleware.Timeout(30 * time.Second))
admin := engine.Group("admin", middleware.BasicAuth("admin", "secret"))
admin.Task("export", func(c *browse.Context) {
c.MustNavigate("https://app.example.com/admin")
table, _ := c.ExtractTable("#users")
c.Set("data", table)
})
engine.RunGroup("admin")
Middleware
Middleware is grouped into five categories: Resilience, Auth, Anti-detection, Network, and Utilities.
CLI
The CLI defaults to a visible browser; pass --headless to hide it:
scout navigate <url> # page info as JSON
scout observe <url> # structured observation
scout markdown <url> # compact markdown
scout screenshot <url> [--output f] # save screenshot
scout pdf <url> [--output f] # save PDF
scout extract <url> <selector> # extract text
scout eval <url> <expression> # run JavaScript
scout form discover <url> # discover form fields
scout frameworks <url> # detect frameworks
scout watch <url> [--interval=5s] # live-watch page changes
scout pipe <command> [selector] # batch process URLs from stdin
scout record <url> [--output f] # interactive recording → playbook
scout mcp serve # start MCP server
scout version # print version
Architecture
scout/
├── browse.go, engine.go, context.go # Gin-like API
├── page.go, selection.go # CDP page & element interaction
├── recorder.go # Video recording (screencast → MP4/GIF)
├── middleware/ # stealth, resilience, auth, network
├── agent/ # AI agent API (50+ methods)
│ ├── session.go # Session lifecycle, Navigate, Click, Type
│ ├── observe.go, diff.go # Observe, ObserveDiff, cost estimation
│ ├── content.go # Markdown, ReadableText, AccessibilityTree
│ ├── form.go # DiscoverForm, FillFormSemantic, MatchFormField
│ ├── annotate.go # AnnotatedScreenshot, ClickLabel
│ ├── network.go # EnableNetworkCapture, CapturedRequests
│ ├── spa.go # DetectedFrameworks, ComponentState, GetAppState
│ ├── tabs.go # OpenTab, SwitchTab, CloseTab, ListTabs
│ ├── playbook.go # StartRecording, ReplayPlaybook, SavePlaybook
│ ├── interact.go # Hover, DragDrop, SelectOption, ScrollTo
│ ├── profile.go # CaptureProfile, ApplyProfile, SaveProfile
│ ├── selector.go # Playwright :text() selector translation
│ ├── budget.go # ObserveWithBudget, EstimateTokens
│ ├── nlselect.go # SelectByPrompt, fuzzy NL element matching
│ ├── batch.go # ExecuteBatch, sequential multi-action
│ ├── vision.go # HybridObserve, FindByCoordinates
│ ├── trace.go # StartTrace, StopTrace, action tracing
│ ├── iframe.go # SwitchToFrame, SwitchToMainFrame
│ └── vitals.go # WebVitals (LCP/CLS/INP)
├── internal/cdp/ # WebSocket CDP client (context-aware)
├── internal/launcher/ # Chrome process management
├── cmd/scout/ # CLI + MCP server (66 tools)
└── docs/ # Landing page (GitHub Pages)
License
MIT