Glama

browser_snapshot

Capture an ARIA accessibility tree with stable refs for browser click/type actions. Compact by default with viewport scope, redaction, and size limits; use refs for interaction.

Instructions

Capture an ARIA-flavored accessibility tree with stable refs and pixel boxes. Compact by default: viewport-only, redacted, depth-limited, and capped to 12KB. Use refs for browser_click/browser_type. Pass mode='full' only when you intentionally need the uncapped page tree.
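
As a minimal sketch, the compact and full modes described above could be requested through an MCP `tools/call` message. The helper name `build_snapshot_call` and the request ids are illustrative assumptions; only the tool name and the argument names come from this page.

```python
import json


def build_snapshot_call(request_id, **arguments):
    """Build a JSON-RPC 2.0 `tools/call` request for browser_snapshot.

    Hypothetical helper: only the arguments the caller passes are sent,
    so the server's compact defaults apply to everything left out.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "browser_snapshot", "arguments": arguments},
    }


# Compact snapshot: rely on the documented defaults.
compact_call = build_snapshot_call(1)

# Full snapshot: opt in explicitly; maxBytes=0 asks for no size cap.
full_call = build_snapshot_call(2, mode="full", maxBytes=0)

print(json.dumps(full_call, indent=2))
```

Leaving the arguments empty keeps the request small and defers to the server's safe defaults; the full call should be reserved for cases where the uncapped page tree is actually needed.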

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| tabId | No | | |
| mode | No | compact applies safe defaults; full skips compacting unless explicit options are passed. | compact |
| scope | No | viewport drops nodes whose box is fully outside the visible area. | viewport |
| maxBytes | No | Cap on the JSON-serialized result size. Excess subtrees are pruned and `truncated:true` is set. Use 0 with mode='full' for no cap. | |
| maxDepth | No | Compact mode depth limit before child subtrees are summarized. | |
| textLimit | No | Compact mode character limit for name/text fields. | |
| includeBoxes | No | Compact mode box retention. interactive keeps boxes only for actionable or semantic nodes. | interactive |
| redact | No | Replace likely-secret strings with [REDACTED]. Defaults to true in compact mode. | |
| store | No | Store the raw snapshot in this MCP process for browser_snapshot_query/browser_snapshot_node drilldown. | |
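
To illustrate what "fully outside the visible area" means for the viewport scope, here is a rough sketch of such a filter. The `Box` shape (x, y, width, height in pixels) and the function name are assumptions made for illustration; this page does not document the server's actual box format or pruning algorithm.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Hypothetical pixel box; the real snapshot's field names are not documented here.
    x: float
    y: float
    width: float
    height: float


def fully_outside(box: Box, viewport: Box) -> bool:
    """Return True when the box has no overlap with the viewport."""
    return (
        box.x + box.width <= viewport.x
        or box.x >= viewport.x + viewport.width
        or box.y + box.height <= viewport.y
        or box.y >= viewport.y + viewport.height
    )


viewport = Box(0, 0, 1280, 720)
visible = Box(100, 650, 200, 100)    # partially visible at the bottom edge: kept
below_fold = Box(100, 900, 200, 50)  # entirely below the viewport: dropped

kept = [b for b in (visible, below_fold) if not fully_outside(b, viewport)]
```

Under this reading, a node that is only partly on screen is retained, and only nodes with no overlap at all are dropped.
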
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses key behaviors: compact defaults, depth limits, caps, redaction, and viewport scoping. It also notes that passing mode='full' skips compacting. It does not discuss performance or authentication, but such omissions are typical for a browser tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact—only three sentences. The first defines the core purpose, the second lists compact defaults, and the third provides a key usage guideline. Every sentence is essential and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 9 parameters, high schema coverage, and no output schema, the description provides sufficient context: it explains the output (accessibility tree with refs/boxes) and how to apply it. The lack of output schema is mitigated by the clear purpose and usage hints. Minor gaps exist (e.g., what happens when store=false), but overall completeness is good.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is high (89%), so parameters are well-documented. The description adds value by summarizing defaults and advising on mode usage ('Pass mode='full' only when...'). It also clarifies output usage ('Use refs for browser_click/browser_type'), which goes beyond schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool captures an 'ARIA-flavored accessibility tree with stable refs and pixel boxes', distinguishing it from visual captures like browser_screenshot or text extraction. It also references sibling tools browser_snapshot_node and browser_snapshot_query for drilldown.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies default behavior (compact, viewport-only, redacted, depth-limited, 12KB cap) and advises when to use mode='full'. It also explains that refs in the snapshot are used for browser_click/browser_type, guiding post-capture actions. However, it does not explicitly contrast with alternative tools like browser_screenshot for visual needs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DevZonayed/Mochi'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.