Dock

by ai.trydock

Server Details

AI workspace for you, your team, and every agent. Tables, docs (images, 4K video), formulas.

Status: Healthy
Last Tested: 2026-05-25 10:33
Transport: Streamable HTTP
Repository: try-dock-ai/mcp
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A (4.2/5.0)

Tool Descriptions: A

Average 4.5/5 across 43 of 43 tools scored. Lowest: 3.5/5.

Server Coherence: A

Disambiguation: 5/5

Each tool targets a distinct resource or action, with clear boundaries even among similar operations like doc updates (append, replace whole, replace section). There is no ambiguity between tools; overlapping purposes are prevented by specific descriptions.

Naming Consistency: 5/5

All tools follow a consistent verb_noun pattern in snake_case (e.g., create_workspace, delete_row, update_doc). A few tools like 'search' are single verbs but are clear and fit the pattern. No mixing of conventions.

Tool Count: 2/5

With 43 tools, the server is heavily loaded. While each tool serves a specific purpose, the sheer number exceeds the typical well-scoped range (3–15) and falls into the 'too many' category (25+), making it harder for agents to navigate despite good individual tool descriptions.

Completeness: 5/5

The tool surface covers the full lifecycle of workspaces, surfaces, rows, docs, webhooks, billing, API keys, and members, including advanced operations like atomic row moves, doc section updates, pre-flight validation, and approval-gated sensitive actions. No obvious gaps for the domain.

Available Tools

60 tools

add_column (A)

Append a single column to a workspace's table schema. Position is auto-computed as next-after-max so the contiguity invariant holds. Key collision (409) if a column with the same key already exists. Editor role required. Use this for per-column additions; use get_workspace_schema + update_workspace_columns (PUT on /columns) for full schema replacement or reordering. Multi-surface workspaces accept surface_slug to target a specific table sheet (use list_surfaces to enumerate); omit to fall through to the workspace's primary table surface.

Parameters

Name	Required	Description
`key`	Yes	Field name in row.data. Lowercase + underscores recommended; 1-64 chars.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`type`	Yes	Column type. See get_workspace_schema for examples.
`label`	Yes	Human-readable header shown in the sheet.
`width`	No	Optional. Initial column width in px.
`options`	No	Required for `status` + `select` types. The allowed values shown in the dropdown.
`description`	No	Optional. Human-readable tooltip shown in the column header.
`surface_slug`	No	Optional. The slug of the specific table surface to add the column to. Omit on single-table workspaces; required on multi-table workspaces if you don't want the primary table surface (lowest position).
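
The constraints in the table above can be checked on the client before calling the tool. The following is a minimal sketch, not Dock's implementation: `build_add_column_args` and `key_follows_recommendation` are hypothetical helper names, and treating digits as acceptable in a key is an assumption (the table only recommends lowercase and underscores).

```python
import re

# Column types that need an `options` list, per the parameter table.
TYPES_NEEDING_OPTIONS = {"status", "select"}


def build_add_column_args(slug, key, type_, label, options=None, surface_slug=None):
    """Build an argument dict for add_column, checking the documented constraints."""
    if not 1 <= len(key) <= 64:
        raise ValueError("key must be 1-64 chars")
    if type_ in TYPES_NEEDING_OPTIONS and not options:
        raise ValueError(f"options is required for type {type_!r}")
    args = {"slug": slug, "key": key, "type": type_, "label": label}
    if options:
        args["options"] = list(options)
    if surface_slug is not None:
        # Omitted means the workspace's primary table surface is used.
        args["surface_slug"] = surface_slug
    return args


def key_follows_recommendation(key):
    """True if key is lowercase letters, digits, and underscores only.

    This mirrors the *recommendation*; allowing digits is an assumption.
    """
    return re.fullmatch(r"[a-z0-9_]+", key) is not None
```

A caller would pass the returned dict as the tool arguments; a 409 key collision can still occur server-side and is not detectable here.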

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description covers key behaviors: auto-computed position, contiguity invariant, error on key collision, and required editor role. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences covering purpose, behavior, and usage guidance. No unnecessary words, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema and moderate complexity, the description provides purpose, behavior, error conditions, authorization, and alternatives. Complete for agent usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description barely adds beyond schema for parameters; it mentions position auto-computation but that's not a parameter. Adequate but not improved.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool appends a column to a workspace table schema and distinguishes itself from sibling tools like get_workspace_schema + update_workspace_columns for full schema replacement.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use for per-column additions, not for full schema replacement, and mentions alternative tools. Also specifies that 'Editor role required' and key collision results in 409.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_comment (A)

Post a new comment on any target in a workspace: a row, a cell, a doc text range, an html element, an entire surface, or the workspace itself. Polymorphic target shape mirrors the REST POST /api/workspaces/:slug/comments. For threading, pass parentId to hang the new comment as a reply (the server flattens nested replies to single depth and auto-unresolves a resolved parent). Mentions are an array of { kind: 'user'|'agent', id, label } triples; the server validates each mention's access to the workspace before accepting. Fires comment.added (and comment.unresolved when a reply reopens a resolved parent). For replies to existing comments where you don't want to reconstruct the target, prefer reply_to_comment which derives the target from the parent. Editor or commenter role required.

Parameters

Name	Required	Description
`body`	Yes	Comment body (plain text or markdown). 1-5000 chars.
`slug`	Yes	The workspace slug ('my-workspace' or 'my-org/my-workspace').
`target`	Yes	Polymorphic target. Shapes: { type: 'row', rowId: '<cuid>' } { type: 'cell', rowId: '<cuid>', columnKey: '<key>' } { type: 'doc_range', surfaceSlug: '<slug>', anchor: { from: <number>, to: <number>, text: '<plain>' } } { type: 'html_element', surfaceSlug: '<slug>', anchor: { selector: '<css>', text?: '<plain>' } } { type: 'surface', surfaceSlug: '<slug>' } { type: 'workspace' }
`mentions`	No	Optional `[{ kind, id, label }]` mentions. Each mention's principal must have workspace access. Fires inbox + email + webhook fan-out for newly-mentioned recipients only.
`parentId`	No	Optional parent comment id. When passed, this comment becomes a reply in the thread. Nested replies flatten to single-depth (reply-to-reply re-points at the root). Re-opens a resolved parent.
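
The polymorphic `target` shapes listed above can be built with small helpers. This is a sketch only: the helper function names are hypothetical, and the dict keys follow the shapes quoted in the table. Only three of the six target shapes are shown.

```python
def row_target(row_id):
    """Target an entire row."""
    return {"type": "row", "rowId": row_id}


def cell_target(row_id, column_key):
    """Target a single cell, identified by row id and column key."""
    return {"type": "cell", "rowId": row_id, "columnKey": column_key}


def doc_range_target(surface_slug, start, end, text):
    """Target a text range in a doc surface."""
    return {
        "type": "doc_range",
        "surfaceSlug": surface_slug,
        "anchor": {"from": start, "to": end, "text": text},
    }


def build_add_comment_args(slug, body, target, parent_id=None, mentions=None):
    """Assemble add_comment arguments, checking the documented body length."""
    if not 1 <= len(body) <= 5000:
        raise ValueError("body must be 1-5000 chars")
    args = {"slug": slug, "body": body, "target": target}
    if parent_id is not None:
        args["parentId"] = parent_id
    if mentions:
        for mention in mentions:
            # The description allows only 'user' or 'agent' as the kind.
            if mention.get("kind") not in ("user", "agent"):
                raise ValueError("mention kind must be 'user' or 'agent'")
        args["mentions"] = list(mentions)
    return args
```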

Tool Definition Quality

A (4.4/5.0)

Behavior: 5/5

With no annotations, the description fully discloses behavioral traits: polymorphic target types, auto-flattening, re-opening resolved parents, mentions triggering notifications, required role, and emitted event. Comprehensive.

Conciseness: 4/5

Single paragraph, front-loaded with main action, then details. No wasted words. Could improve structure with bullet points for target types, but overall efficient.

Completeness: 3/5

Covers input comprehensively but lacks description of output/response. No output schema exists, so the description should mention what the tool returns (e.g., created comment object). Missing this makes it incomplete.

Parameters: 4/5

Schema coverage is 100%, but description adds significant meaning: explains target polymorphic shapes in detail, mentions cap on mentions, and explains parentId behavior (auto-flattening, re-opening). Exceeds baseline 3.

Purpose: 5/5

Description uses specific verb 'Post' and resource 'comment on any target in a workspace'. It distinguishes from raw HTTP and sibling tools like reply_to_comment by detailing the polymorphic target and threading behavior.

Usage Guidelines: 4/5

Explicit guidance to use instead of raw HTTP. Explains threading with parentId, auto-flattening, and re-opening resolved parents. Mentions required role and notifications. However, lacks explicit when-not-to-use compared to reply_to_comment or react_to_comment.

append_doc_section (A)

Append a chunk of Markdown to the END of a workspace's doc body. Designed for crons + ingest agents that produce content in timestamped chunks (changelog updates, daily standups, batch summaries). Same markdown surface as update_doc: supports CommonMark, GFM, ![alt](url) inline images (any publicly-reachable HTTPS URL), lone video URLs (.mp4/.webm/.mov/.mkv/.m4v → native <video> player, 5 GB per file), mermaid diagrams, $math$/$$math$$ KaTeX, > [!NOTE]/[!TIP]/[!IMPORTANT]/[!WARNING]/[!CAUTION] callouts, svg sanitized embeds, X... toggles, [[slug]] cross-references, @Label @-mentions of users + agents, and lone-URL embeds (YouTube/Vimeo/Loom/Figma/CodePen/gists). Server fetches the current body, splices the new blocks on, and writes the result through the same path as update_doc with the same auth, same events, same byte/depth/node-count guard. Append is non-idempotent by design (every call adds content); the caller is responsible for dedupe. @-mentions inside the appended chunk fire doc.mention_added + inbox/email fan-out for newly-added mentions only — appending a chunk that re-mentions someone already mentioned earlier in the doc won't re-fire. Requires editor role. Multi-surface workspaces optionally accept surface_slug to append to a specific doc tab.

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`markdown`	Yes	Markdown chunk to append (CommonMark + GFM). Becomes one or more new blocks at the end of the existing doc.
`surface_slug`	No	Optional doc surface slug for multi-doc workspaces. Omit to append to the primary doc surface.
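
Because append is non-idempotent and dedupe is left to the caller, a cron or ingest agent may want a guard in front of the tool call. The sketch below is one possible approach, not something the server provides: `DedupingAppender` is a hypothetical class, `append_fn` stands in for whatever function actually invokes append_doc_section, and the seen-set lives in memory only (a real cron would need to persist it between runs).

```python
import hashlib


class DedupingAppender:
    """Skip chunks that were already appended during this process's lifetime."""

    def __init__(self, append_fn):
        # append_fn is a placeholder for the real tool invocation.
        self._append_fn = append_fn
        self._seen = set()

    def append(self, slug, markdown):
        """Append the chunk unless an identical one was sent to the same slug.

        Returns True if the chunk was forwarded, False if it was skipped.
        """
        digest = hashlib.sha256(f"{slug}\n{markdown}".encode("utf-8")).hexdigest()
        if digest in self._seen:
            return False
        self._append_fn({"slug": slug, "markdown": markdown})
        # Record the digest only after the call succeeded, so failures can be retried.
        self._seen.add(digest)
        return True
```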

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

No annotations provided, so description carries full burden. It discloses non-idempotency, required editor role, optional surface_slug, same auth/events/guards as update_doc, and the fetch-splice-write mechanism. Lacks mention of error conditions or limits.

Conciseness: 3/5

The description is thorough but lengthy (8 sentences). It front-loads the core purpose but includes extensive markdown support details that could be condensed. Still, the structure is logical.

Completeness: 4/5

Given no output schema, the description adequately covers return behavior (server fetches, splices, writes). It explains the append semantics, dedupe responsibility, and role requirements. Missing details on potential failure modes or performance.

Parameters: 4/5

Schema coverage is 100%, but description adds context: slug accepts bare or org-prefixed forms, markdown supports specific extensions (CommonMark, GFM, diagrams, callouts, etc.), and surface_slug is optional for multi-doc workspaces. This meaningfully enriches the schema.

Purpose: 5/5

The description clearly states the tool appends markdown to the end of a doc body, with specific verb 'Append' and resource 'workspace doc'. It distinguishes from update_doc by emphasizing append behavior and non-idempotency.

Usage Guidelines: 4/5

The description explicitly targets crons and ingest agents producing timestamped chunks, and notes the caller is responsible for dedupe. It implies using update_doc for full replacements but does not explicitly state when NOT to use this tool.

create_row (A)

Append a new row to a workspace's table surface. The data field is a JSON object with column-name keys. Status column accepts: drafted, queued, sealed, active, blocked. Works on any workspace; columns auto-seed on the first row if the table surface is empty. Multi-surface workspaces accept surface_slug to target a specific sheet (use list_surfaces to enumerate); omit it to fall through to the workspace's primary table surface.

Unmapped data fields: Keys in data that don't match any existing column are still STORED on the row (nothing is dropped), but they won't render in the table UI until the column exists. The response carries an unmapped_fields array listing those keys plus a human-readable warning so an agent can decide whether to surface them, call add_column, or retry with auto_create_columns: true.

Auto-create columns: Pass auto_create_columns: true to have the server append a fresh text column for every unmapped key in one atomic step (humanised label from the key, type text). The response then includes created_columns: ColumnDef[] with the new column metadata. Use this when you're appending machine-emitted rows whose shape you can't predict ahead of time; leave it omitted (default false) when you want explicit schema control.

Parameters

Name	Required	Description
`data`	Yes	Row data as a JSON object (e.g. {"title": "My post", "status": "drafted", "notes": "Initial draft"})
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`surface_slug`	No	Optional table surface slug for multi-surface workspaces. Omit to write to the workspace's primary table surface. 400 if the slug is a doc surface, archived, or doesn't exist.
`auto_create_columns`	No	When true, the server auto-creates a text column for every key in `data` that doesn't already exist on the surface, then writes the row in the same call. Returns `created_columns` in the response listing the new column defs. Default false: unmapped keys are still stored on the row but won't render in the UI until you `add_column` them yourself.
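
When auto_create_columns is left at its default, an agent can turn the response's `unmapped_fields` into explicit add_column calls. The sketch below assumes the response is a dict carrying `unmapped_fields` as described above; `humanise` and `add_column_calls_for` are hypothetical helpers, and the label rule is a guess at what "humanised" means, not the server's actual rule.

```python
def humanise(key):
    """Turn a snake_case key into a readable label (assumed rule)."""
    return key.replace("_", " ").strip().capitalize()


def add_column_calls_for(slug, response):
    """Return one add_column argument dict per unmapped key in a create_row response."""
    return [
        {"slug": slug, "key": key, "type": "text", "label": humanise(key)}
        for key in response.get("unmapped_fields", [])
    ]
```

Each returned dict supplies the four required add_column parameters; the agent can still choose to surface the warning to a human instead of adding columns.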

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

With no annotations, the description discloses key behaviors: data format as JSON, valid status column values, auto-seeding on empty table, surface_slug behavior including error case. Does not mention return value or rate limits, but covers essential traits.

Conciseness: 5/5

Four well-structured sentences totaling about 100 words. Front-loaded with purpose, then details on data, status, auto-seed, and surface_slug. No redundant or irrelevant content.

Completeness: 4/5

Given 4 parameters, no output schema, and no annotations, the description covers data format, special values, edge cases (empty table, multi-surface), and error conditions. Lacks return format description but is otherwise thorough.

Parameters: 4/5

Schema coverage is 100%, but description adds value by explaining the data field is JSON with column-name keys, acceptable status values, auto-seed behavior, and surface_slug targeting. Goes beyond technical schema descriptions.

Purpose: 5/5

The description clearly states the verb 'append' and resource 'new row to a workspace's table surface.' It distinguishes from siblings like update_row, delete_row, and move_rows by specifying it creates a new row.

Usage Guidelines: 4/5

Provides guidance on when to use surface_slug (multi-surface workspaces) and how to enumerate surfaces with list_surfaces. Implicitly indicates that for single-surface workspaces, surface_slug can be omitted. Does not explicitly contrast with other row operations, but the context is clear.

create_support_ticket (A)

File a support ticket. Mirrors to a GitHub issue in Dock's support repo and shows up in the user's dashboard at /settings/support. Use this for bugs (you hit an error), feature requests (Dock is missing something), billing (Stripe/subscription), questions (how do I X), or anything else. Prefer request_limit_increase when the user is simply hitting a plan cap.

Parameters

Name	Required	Description
`body`	Yes	Detailed description (5-10000 chars). For bugs: include what you did, what happened, what you expected. For feature requests: the use case.
`kind`	Yes	Ticket category.
`title`	Yes	Short headline (3-200 chars). Be specific: 'Table view loses focus on cell edit' beats 'broken'.
`context`	No	Optional structured metadata echoed into the GitHub issue (workspace slug, URL, error trace, etc).
`attachmentUrls`	No	Optional list of screenshot/attachment URLs to embed in the issue. URLs must be hosted on the Dock blob store; mint them via POST /api/support/upload first. Max 4.
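
The length and count limits in the table above are easy to check before filing. This is a minimal sketch with a hypothetical `validate_ticket` helper; it does not validate `kind`, because the allowed category values are not listed here.

```python
def validate_ticket(title, body, attachment_urls=()):
    """Return a list of problems found against the documented limits (empty if none)."""
    problems = []
    if not 3 <= len(title) <= 200:
        problems.append("title must be 3-200 chars")
    if not 5 <= len(body) <= 10000:
        problems.append("body must be 5-10000 chars")
    if len(attachment_urls) > 4:
        problems.append("at most 4 attachment URLs are allowed")
    return problems
```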

Tool Definition Quality

A (4.3/5.0)

Behavior: 3/5

No annotations provided, so description carries full burden. It mentions mirroring to GitHub issue and dashboard visibility, but omits details like authentication, rate limits, or side effects. Adequate for a creation tool but could be more explicit.

Conciseness: 5/5

Three sentences front-loading purpose and consequences, with no redundant information. Every sentence adds value.

Completeness: 3/5

Lacks output schema; description does not explain return value or error cases. Also, the nested 'context' parameter structure is not elaborated. With 5 parameters and a sibling set of 40+, more completeness would be helpful.

Parameters: 4/5

Schema coverage is 100% (baseline 3). Description adds value by providing usage guidance for body (e.g., include what you did for bugs, use case for features) and clarifying attachment URL requirements, going beyond enum labels.

Purpose: 5/5

The description clearly states 'File a support ticket' and enumerates specific use cases (bugs, features, billing, questions, other), distinguishing it from siblings like request_limit_increase.

Usage Guidelines: 5/5

Explicitly provides when to use (bugs, feature requests, billing, questions) and when not to use ('Prefer request_limit_increase when the user is simply hitting a plan cap'), offering a clear alternative.

create_surface (A)

Create a new surface (tab) inside a workspace. kind picks table, doc, or html. Optional surface_slug (lowercase kebab-case, 3-64 chars); when omitted the server slugifies name and appends a numeric suffix on collision. Optional columns overrides the default Title/Status/Notes triple for table kinds; ignored for doc and html. html surfaces start with an empty body; write content via update_html. Editor role required. Emits surface.created so live listeners on the workspace stream see the new tab without a refetch.

Parameters

Name	Required	Description
`kind`	Yes	Surface kind. `table` for rows + columns, `doc` for TipTap body, `html` for a sandboxed HTML mockup tab.
`name`	Yes	Display name shown on the tab. 1-64 chars.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`columns`	No	Optional initial columns for `table` kind. Same shape as get_workspace_schema returns. Defaults to Title/Status/Notes when omitted.
`surface_slug`	No	Optional URL-friendly slug for the surface (lowercase kebab-case, 3-64 chars). Auto-derived from `name` when omitted.
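
To illustrate what "slugifies name and appends a numeric suffix on collision" could look like, here is a sketch under stated assumptions. The server's actual algorithm is not documented here: the `-2`, `-3` suffix format, the character rules, and both function names are hypothetical, and the 3-character minimum is not enforced.

```python
import re


def slugify(name):
    """Lowercase kebab-case slug derived from a display name (assumed rules)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    # Cap at the documented 64-char maximum, then drop any dangling hyphen.
    return slug[:64].strip("-")


def unique_slug(name, existing):
    """Return the slug for name, adding a numeric suffix if it collides with existing."""
    base = slugify(name)
    if base not in existing:
        return base
    n = 2
    while f"{base}-{n}" in existing:
        n += 1
    return f"{base}-{n}"
```

Passing an explicit surface_slug avoids depending on any of these guesses.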

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Despite no annotations, the description discloses key behaviors: editor role required, slug collision handling, column override ignored for doc, and emission of 'surface.created' event. This provides sufficient transparency for an agent to understand side effects.

Conciseness: 5/5

The description is concise and well-structured, front-loading the primary purpose and then covering details logically. Every sentence adds necessary information without redundancy.

Completeness: 5/5

Given no output schema and no annotations, the description covers all critical aspects: purpose, parameter details, behavioral traits (role, event), and optional customization. It provides sufficient context for an agent to use the tool correctly.

Parameters: 4/5

Input schema has 100% coverage, but the description adds value beyond schema by explaining slug auto-derivation, collision behavior, and column override limitations. This extra context justifies a score above the baseline of 3.

Purpose: 5/5

The description clearly states the tool creates a new surface (tab) inside a workspace, and specifies the three possible kinds (table, doc, or html). It distinguishes itself from sibling tools like delete_surface and update_surface by focusing on creation.

Usage Guidelines: 4/5

The description implies when to use this tool (to create a new tab with specific kind and optional customization). It provides context about slug auto-generation and column behavior, but lacks explicit guidance on when not to use it or compare it directly to siblings like create_workspace.

create_webhook (A)

Register a new webhook endpoint on an org. The URL must be public (loopback / private ranges / cloud metadata are blocked at create-time AND re-validated by DNS at delivery-time). Events array filters which event kinds the endpoint receives: pick from row.* / comment.* / member.* / workspace.* / doc.*; an empty array means "none" so always pass at least one. Returns the signing secret exactly once (whsec_… prefixed); store it on the receiver to verify HMAC signatures on incoming requests.

Parameters

Name	Required	Description
`url`	Yes	Public HTTPS URL to POST events to. Loopback (127.0.0.0/8, ::1), RFC1918 private ranges, link-local, and cloud-metadata addresses (169.254.169.254, etc.) are rejected. Max 2048 chars.
`events`	Yes	Event kinds to subscribe to. Pick from: row.created, row.updated, row.deleted, row.sealed, comment.added, comment.deleted, member.invited, member.joined, member.removed, member.role_changed, workspace.created, workspace.renamed, workspace.columns_updated, workspace.visibility_changed, workspace.archived, doc.created, doc.updated, doc.heading_added, doc.mention_added.
`org_slug`	Yes	Slug of the org on which to register the webhook.
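
A client can catch the most obvious rejections before calling the tool. The sketch below uses Python's standard `ipaddress` and `urllib.parse` modules; `precheck_webhook` and `is_blocked_address` are hypothetical names. It only inspects literal IP hosts and the name `localhost`, so it is weaker than the server's checks, which also re-validate by DNS at delivery time.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_address(host):
    """True if host is a literal IP in a loopback, private, or link-local range."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname rather than an IP literal; this sketch does not resolve DNS.
        return False
    return ip.is_loopback or ip.is_private or ip.is_link_local


def precheck_webhook(url, events):
    """Return a list of problems that would likely cause a rejection (empty if none)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("url must use https")
    if len(url) > 2048:
        problems.append("url must be at most 2048 chars")
    host = parsed.hostname or ""
    if host == "localhost" or is_blocked_address(host):
        problems.append("url host is in a blocked range")
    if not events:
        # An empty events array means the endpoint receives nothing.
        problems.append("events must not be empty")
    return problems
```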

Tool Definition Quality

A (4.1/5.0)

Behavior: 4/5

The description discloses URL validation details (private ranges, cloud metadata blocked and DNS re-validated) and that an empty events array means none. It also notes the signing secret is returned once. With no annotations, this provides good behavioral insight but could mention rate limits or activation timing.

Conciseness: 4/5

The description is a single well-structured paragraph that front-loads the main purpose. It includes necessary details but is slightly dense; every sentence contributes value.

Completeness: 4/5

Given the tool complexity and lack of output schema, the description covers key aspects: URL validation, event selection, and secret handling. However, it does not detail the full response (e.g., webhook ID) which would improve completeness.

Parameters: 4/5

Schema coverage is 100%, so baseline is 3. The description adds extra meaning for 'url' (validation specifics) and 'events' (empty array means none), improving understanding beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Register a new webhook endpoint on an org,' providing a specific verb and resource. It distinguishes from siblings like delete_webhook and list_webhooks by focusing on creation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives context about when to use (e.g., need public URL, pick events) but does not explicitly state when not to use or provide alternatives. It is clear but lacks exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_workspace (A)

Create a new workspace in the caller's org. Works for both user and agent callers; agent-created workspaces attribute to the agent and enroll the agent's owning user as a co-owner so the human sees it in their dashboard. The new workspace is seeded with one primary surface matching mode: table → a Sheet tab, doc → a Notes tab, html → a Mockup tab (sandboxed HTML preview). Defaults: doc when initial_markdown is supplied, table otherwise; html is only picked when explicitly requested. Add more tabs of any kind later via create_surface. Agent-created workspaces default to org-visibility so sibling agents in the same org aren't 403'd. For prose content (briefs, summaries, changelogs) pass initial_markdown to seed the doc body in one call; the markdown is converted server-side, no need to hand-build ProseMirror JSON.

Parameters

Name	Required	Description
`mode`	No	Kind of the seeded primary surface. `table` mints a Sheet tab, `doc` mints a Notes tab, `html` mints a Mockup tab (sandboxed HTML preview, for landing-page mockups + design previews). Auto-defaults to 'doc' when initial_markdown is provided, 'table' otherwise; `html` is opt-in. Add more tabs of any kind via `create_surface` later.
`name`	Yes	The workspace name. Required. Used to derive a slug if you don't pass one.
`slug`	No	Optional URL-friendly slug (lowercase, kebab-case, 3-64 chars). Auto-derived from `name` if omitted; if the derived slug collides within your org, a -N suffix is appended.
`initial_markdown`	No	Optional Markdown body to seed the workspace's doc surface on create. CommonMark + GFM (tables, task lists, strikethrough). When provided AND mode is omitted, mode defaults to 'doc'. Skips the empty default-column scaffolding too. Ignored when mode='html' (no markdown equivalent for HTML surfaces — use `update_html` after create). Use this for any prose-shaped output (briefs, summaries, status updates, changelog entries) instead of create + update_doc with hand-built JSON.
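
The mode defaulting rule described above can be sketched client-side as follows. This is a minimal illustration, not the server's implementation: `resolve_mode` is a hypothetical helper, and treating an empty string as "supplied" is an assumption the description does not settle.

```python
from typing import Optional

VALID_MODES = {"table", "doc", "html"}


def resolve_mode(mode: Optional[str] = None,
                 initial_markdown: Optional[str] = None) -> str:
    """Predict which primary surface create_workspace would seed.

    Hypothetical helper mirroring the documented defaults:
    an explicit mode wins, otherwise 'doc' when markdown is
    supplied, otherwise 'table'. 'html' is never auto-picked.
    """
    if mode is not None:
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        return mode
    # Assumption: any non-None value counts as "supplied".
    return "doc" if initial_markdown is not None else "table"
```

For example, `resolve_mode(initial_markdown="# Brief")` yields `"doc"`, while `resolve_mode(mode="html", initial_markdown="# Brief")` yields `"html"`, in which case the markdown is documented as ignored.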

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

With no annotations, the description carries full burden. It discloses agent vs user behavior, co-owner enrollment, default visibility, seed surface logic, and markdown conversion. Minor omissions like return value format and error conditions prevent a 5.

Conciseness: 4/5

The description is a single paragraph that efficiently covers all key points without redundancy. It is front-loaded with purpose and flows naturally, though structured bullets could improve scannability.

Completeness: 3/5

Given the complexity (4 params, no output schema, no annotations), the description covers creation behavior well but does not specify the return value (e.g., workspace ID) or error conditions. This omission leaves some uncertainty for the agent.

Parameters: 4/5

Schema coverage is 100%, and the description adds value beyond schema by explaining mode auto-resolution, slug collision handling, and the purpose of initial_markdown as an alternative to hand-built JSON. It elaborates on parameter interactions not captured in schema.

Purpose: 5/5

The description explicitly states 'Create a new workspace in the caller's org.', identifies the resource and action, and distinguishes from siblings like 'create_surface' by noting that mode only picks the first tab and that additional tabs can be added later.

Usage Guidelines: 4/5

The description advises to use this for creating workspaces and provides specific use cases for prose content with initial_markdown, suggesting it over a create+update_doc workflow. It also mentions default visibility for agent callers. However, it lacks explicit 'when not to use' statements.

delete_file

Soft-delete a file by id. Moves to a 30-day trash window before the cleanup cron hard-deletes + refunds the storage quota. Restorable via the REST PATCH endpoint (PATCH /api/workspaces/{slug}/files/{id} body: {restore:true}); a PATCH-equivalent MCP tool ships in Phase 6. Editor role required. Gated behind FILES_SURFACE_ENABLED + per-user allowlist.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL.
`file_id`	Yes	The file cuid (from list_files).
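
The REST restore path mentioned in the description can be sketched as below. Only the method, path, and `{restore: true}` body come from the description; the base URL, the bearer-token header, and the helper name are assumptions for illustration. The request is built but not sent.

```python
import json
import urllib.request


def build_restore_request(base_url: str, slug: str, file_id: str,
                          token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH that restores a trashed file.

    Assumptions: `base_url` is the API host, auth is a bearer token,
    and `slug` is the bare workspace slug.
    """
    url = f"{base_url.rstrip('/')}/api/workspaces/{slug}/files/{file_id}"
    body = json.dumps({"restore": True}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; that step is left out because the endpoint's response shape is not documented here.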

delete_row (A)

Permanently delete a row from a workspace. This action cannot be undone.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`rowId`	Yes	The row ID to delete

Tool Definition Quality

A (3.5/5.0)

Behavior: 3/5

Discloses irreversibility ('cannot be undone') but lacks information on permissions, side effects, or response behavior. Without annotations, description partially informs but is insufficient.

Conciseness: 5/5

Two sentences with no redundancy. Efficiently communicates core purpose and key consequence.

Completeness: 3/5

Simple tool but no mention of return value, permissions, or when deletion is appropriate. Minimal completeness for a deletion action.

Parameters: 3/5

Schema covers both parameters (slug, rowId) with descriptions. Description adds no extra meaning beyond schema, so baseline 3 for 100% coverage.

Purpose: 5/5

Clearly states the action 'permanently delete' on a specific resource 'row from a workspace.' Distinguishes from siblings like create_row or update_row.

Usage Guidelines: 2/5

No guidance on when to use this tool vs alternatives (e.g., move_rows, update_row). Only mentions permanence but no exclusion criteria.

delete_surface (A)

Archive a surface (soft-delete). Rows + doc body are preserved for restore. Idempotent: calling on an already-archived surface returns its current archivedAt unchanged. Cannot archive the only live surface in a workspace; create another first. Editor role required. Emits surface.archived.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`surface_slug`	Yes	The slug of the surface to archive.

Tool Definition Quality

A (4.5/5.0)

Behavior: 5/5

With no annotations provided, the description fully discloses behavior: soft-delete semantics (preservation for restore), idempotency, the restriction on archiving the last surface, required role, and emitted event. This is comprehensive and leaves no ambiguity.

Conciseness: 5/5

The description is highly concise, using six short sentences to convey all essential information without filler. Key actions and constraints are front-loaded.

Completeness: 5/5

For a tool with 2 simple parameters and no output schema, the description covers behavior, constraints, permissions, and events completely. No gaps are evident.

Parameters: 3/5

Input schema covers 100% of parameters with detailed descriptions. The description adds no parameter-level details beyond the schema, but given high schema coverage, a baseline score of 3 is appropriate.

Purpose: 5/5

The description clearly states the action ('Archive a surface (soft-delete)'), the resource ('surface'), and key behavioral details like preservation, idempotency, and constraints. It distinguishes itself from sibling tools like delete_row or delete_workspace by specifying it's a soft-delete for surfaces.

Usage Guidelines: 4/5

The description provides clear context for when to use the tool: to archive a surface, with explicit constraints (cannot archive the only live surface, requires editor role). It does not directly mention alternatives or when not to use, but the sibling list implies alternatives like create_surface or update_surface.

delete_webhook (A)

Permanently delete a webhook endpoint. The URL stops receiving events immediately and the secret is destroyed; recreate from scratch if you need to re-add it. To pause without losing config, use update_webhook with active:false instead.

Parameters

Name	Required	Description	Default
`org_slug`	Yes	Org slug
`webhook_id`	Yes	Webhook id (from list_webhooks)
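
The pause-versus-delete choice from the description can be expressed as a small dispatch. This is a sketch under assumptions: `webhook_removal_call` is a hypothetical helper, and the `update_webhook` argument names other than `active` are assumed to mirror those of `delete_webhook`.

```python
from typing import Any, Dict, Tuple


def webhook_removal_call(org_slug: str, webhook_id: str,
                         keep_config: bool) -> Tuple[str, Dict[str, Any]]:
    """Pick the tool name and arguments for stopping a webhook.

    keep_config=True pauses via update_webhook (URL, events, and
    secret survive); keep_config=False deletes permanently.
    """
    base = {"org_slug": org_slug, "webhook_id": webhook_id}
    if keep_config:
        # Assumption: update_webhook takes the same identifiers.
        return "update_webhook", {**base, "active": False}
    return "delete_webhook", base
```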

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

No annotations provided, but description fully discloses permanence, immediate effect, and secret destruction, which are critical behavioral traits.

Conciseness: 5/5

Three sentences, front-loaded with main action, no wasted words.

Completeness: 4/5

Covers key behavioral aspects for a simple 2-param tool with no annotations or output schema; minor omission of error conditions but not critical.

Parameters: 3/5

Schema coverage is 100% with minimal descriptions; the description adds no additional parameter meaning beyond the schema's brief 'Org slug' and 'Webhook id'.

Purpose: 5/5

The description clearly states the action ('Permanently delete a webhook endpoint') and distinguishes from siblings like update_webhook for pausing.

Usage Guidelines: 5/5

Explicitly states when to use (permanent deletion) and when not to (for pausing, use update_webhook), plus mention of recreating from scratch.

delete_workspace (A)

Archive a workspace. Soft-delete: rows, doc body, and activity history are preserved, and the workspace can be restored from Settings · Archived. Every member loses access immediately. Idempotent: calling on an already-archived workspace returns its current archivedAt without changing anything. Requires editor role on the agent. Pass mode: "web" to surface a click-to-approve URL for the human (recommended for any non-trivial workspace); the first call returns { status: 'approval_required', approval_url, polling_url }; print approval_url in chat, user clicks + approves, you poll polling_url for the result. Without mode: "web" the call executes immediately on the agent's editor role.

Parameters

Name	Required	Description	Default
`mode`	No	Consent surface. 'immediate' (default) executes on the agent's role. 'web' returns an approval_url the user clicks in a browser; recommended for any workspace your user might miss.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
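
The `mode: "web"` approval round trip can be sketched as follows. The tool name, arguments, and the `approval_required` / `approval_url` / `polling_url` fields come from the description; everything else is an assumption: `call_tool`, `fetch_status`, and `notify` are hypothetical callables supplied by the agent runtime, and the `"pending"` status used while waiting is a guess at the polling payload.

```python
import time
from typing import Any, Callable, Dict

Json = Dict[str, Any]


def archive_with_approval(call_tool: Callable[[str, Json], Json],
                          fetch_status: Callable[[str], Json],
                          slug: str,
                          notify: Callable[[str], None],
                          poll_interval: float = 2.0,
                          max_polls: int = 30,
                          sleep: Callable[[float], None] = time.sleep) -> Json:
    """Request a human-approved archive and wait for the outcome."""
    first = call_tool("delete_workspace", {"slug": slug, "mode": "web"})
    if first.get("status") != "approval_required":
        # Anything else is returned as-is (for example a direct result).
        return first
    notify(first["approval_url"])  # print the link in chat
    for _ in range(max_polls):
        result = fetch_status(first["polling_url"])
        # Assumption: the poll endpoint reports "pending" until resolved.
        if result.get("status") != "pending":
            return result
        sleep(poll_interval)
    raise TimeoutError("approval was not resolved within the polling budget")
```

Injecting `sleep` and the two transport callables keeps the flow testable without a live server.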

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

No annotations provided, so description carries full burden. Discloses: soft-delete, data preservation, restoration path, immediate access loss, idempotency, editor role requirement, and detailed mode behavior (returning approval_url, polling). Covers all behavioral traits.

Conciseness: 4/5

Lengthy but every sentence adds unique information. Front-loaded with main action, then progressively detailed. Minor redundancy could be trimmed, but overall efficient for the complexity.

Completeness: 5/5

No output schema, but description covers return behavior for web mode (status, approval_url, polling_url) and idempotency. Addresses edge cases and provides complete usage context. No gaps.

Parameters: 4/5

Schema coverage is 100% (baseline 3). Description adds value: clarifies slug accepts bare or org-prefixed form, and mode describes consent surface and recommends web for non-trivial workspaces.

Purpose: 5/5

Starts with 'Archive a workspace', a specific verb and resource. Clearly distinguishes from sibling deletion tools (e.g., delete_row, delete_surface) by targeting workspaces and calling it archiving.

Usage Guidelines: 4/5

Explains when to use (archiving workspace) and provides context for mode parameter (recommend web for non-trivial). Does not explicitly state when not to use, but the soft-delete and restore capability imply it is not for permanent removal.

downgrade_plan (A)

Schedule a downgrade to Free at the end of the current billing period. The org keeps its current plan (Pro or Scale) and paid limits until the period ends. No-op when already on Free. Consent-gated. Two consent surfaces, you pick via mode: (1) chat (default): FIRST call returns { status: 'confirmation_required', confirm_token, message, expires_in }; surface to your user and re-call within 60s with confirm_token set. (2) web: FIRST call returns { status: 'approval_required', approval_url, polling_url }; print approval_url in chat, user clicks + approves, then poll polling_url for the result.

Parameters

Name	Required	Description	Default
`mode`	No	Consent surface. 'chat' (default) uses the in-chat confirm_token round-trip. 'web' returns an approval_url the user clicks in a browser.
`confirm_token`	No	Chat-mode only. The token returned by the first call as `confirm_token`. Omit on the first call; include on the second call to execute the scheduled downgrade. Single-use, 60s TTL.
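
The chat-mode confirmation round trip can be sketched as below. The status value, `confirm_token`, `message`, and `expires_in` fields are taken from the description; `call_tool` and `ask_user` are hypothetical callables, the `*_locally` statuses are client-side markers rather than server responses, and reading `expires_in` as seconds is an assumption based on the stated 60s TTL.

```python
import time
from typing import Any, Callable, Dict

Json = Dict[str, Any]


def downgrade_with_chat_confirm(call_tool: Callable[[str, Json], Json],
                                ask_user: Callable[[str], bool],
                                clock: Callable[[], float] = time.monotonic) -> Json:
    """Run the two-call downgrade flow with an in-chat confirmation."""
    first = call_tool("downgrade_plan", {})  # mode defaults to 'chat'
    if first.get("status") != "confirmation_required":
        return first  # e.g. the documented no-op when already on Free
    issued_at = clock()
    if not ask_user(first["message"]):
        return {"status": "declined_locally"}
    # The token is single-use with a short TTL; do not send a stale one.
    if clock() - issued_at >= first.get("expires_in", 60):
        return {"status": "expired_locally"}
    return call_tool("downgrade_plan", {"confirm_token": first["confirm_token"]})
```

On `expired_locally` the caller would start over with a fresh first call rather than retrying the old token.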

Tool Definition Quality

A (5/5.0)

Behavior: 5/5

With no annotations provided, the description fully discloses the tool's behavior: it is not immediate, it is consent-gated, requires two calls in chat mode with a 60s timeout, and in web mode requires browser approval and polling. Also notes no-op condition. No contradictions.

Conciseness: 5/5

The description is well-structured, starting with the core action, then detailing the two modes. Every sentence provides unique value without unnecessary verbosity. For the complexity of the tool, it is appropriately concise.

Completeness: 5/5

Given the tool has no output schema and involves a two-step consent process, the description fully covers the workflow: scheduling behavior, no-op, consent modes, token management, and polling. It is complete for an agent to invoke correctly.

Parameters: 5/5

The input schema covers 100% of parameters, but the description adds crucial context beyond the schema: it explains the entire round-trip flow for each mode, the role of confirm_token, and the difference between chat and web surfaces. This significantly aids correct usage.

Purpose: 5/5

The description clearly states it schedules a downgrade to Free at the end of the billing period, distinguishing it from the sibling 'upgrade_plan'. The verb 'schedule' and resource 'downgrade to Free' are specific and unambiguous.

Usage Guidelines: 5/5

Explicitly explains when to use (to downgrade to Free), when not to (already on Free is a no-op), and provides detailed guidance on the two consent modes (chat vs web), including multi-step call sequences, confirm_token usage, and polling instructions.

evaluate_formula (A)

Evaluate a formula expression against an actual Dock workspace's columns + rows, server-side, returning the same display value the UI's HyperFormula engine would render. Two modes: STANDALONE (omit workspace_slug) — evaluates against an empty grid; useful for =SUM(1, 2, 3) or any formula with no cell references. IN-WORKSPACE (pass workspace_slug, optionally at) — loads the workspace's grid, evaluates the formula as if pasted into the at cell (or A1 if omitted), resolves real refs against actual data. Returns { ok, displayValue, error? }. Workspace mode requires read access; standalone mode is public.

Parameters

Name	Required	Description
`at`	No	Optional anchor cell (only used with workspace_slug). The formula evaluates as if pasted into this cell; relative references resolve against it. Omit to anchor at the workspace's first cell.
`formula`	Yes	Formula expression including '='. Max 4000 chars.
`workspace_slug`	No	Optional workspace slug. Pass to evaluate against the workspace's actual rows + columns. Accepts bare or org-prefixed form.
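
The two modes differ only in which arguments are sent. A minimal sketch of building them with the documented constraints checked up front; `build_evaluate_formula_args` is a hypothetical helper, and requiring the '=' to be the first character is an assumed reading of "including '='".

```python
from typing import Dict, Optional

MAX_FORMULA_CHARS = 4000  # limit stated in the parameter table


def build_evaluate_formula_args(formula: str,
                                workspace_slug: Optional[str] = None,
                                at: Optional[str] = None) -> Dict[str, str]:
    """Build evaluate_formula arguments for standalone or in-workspace mode."""
    if not formula.startswith("="):
        raise ValueError("formula must include the leading '='")
    if len(formula) > MAX_FORMULA_CHARS:
        raise ValueError(f"formula exceeds {MAX_FORMULA_CHARS} chars")
    if at is not None and workspace_slug is None:
        # 'at' is documented as only used together with workspace_slug.
        raise ValueError("'at' requires workspace_slug")
    args = {"formula": formula}
    if workspace_slug is not None:
        args["workspace_slug"] = workspace_slug
    if at is not None:
        args["at"] = at
    return args
```

Standalone mode is then `build_evaluate_formula_args("=SUM(1, 2, 3)")`, and in-workspace mode adds `workspace_slug` and optionally `at`.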

Tool Definition Quality

A (4.7/5.0)

Behavior: 4/5

With no annotations provided, the description carries the full burden of behavioral disclosure. It explains the two execution modes, the return format ({ ok, displayValue, error? }), and access requirements. It could be more explicit about read-only nature and any limits (e.g., character max mentioned in schema), but overall it is transparent.

Conciseness: 5/5

The description is a single well-structured paragraph that separates the two modes with clear labels (STANDALONE, IN-WORKSPACE). Every sentence adds value, no fluff.

Completeness: 5/5

Given the tool's complexity (two modes, parameter interactions), the description covers all essential information: return format, mode selection, parameter roles, access prerequisites. Without an output schema, it still specifies the return shape, making it complete for an AI agent.

Parameters: 4/5

Since schema description coverage is 100%, baseline is 3. The description adds meaning by explaining how workspace_slug and at interact, providing examples (e.g., standalone formula =SUM(1,2,3)), and clarifying that at is only relevant in workspace mode. This goes beyond the schema descriptions.

Purpose: 5/5

The description clearly states the tool evaluates a formula expression against a workspace or empty grid, returning the display value. It explicitly distinguishes two modes (standalone vs in-workspace) and differentiates from siblings like validate_formula by emphasizing execution against actual data.

Usage Guidelines: 5/5

The description provides explicit guidance on when to use each mode: omit workspace_slug for standalone formulas without cell references, pass workspace_slug for workspace-dependent evaluation. It also mentions access requirements (read access for workspace mode, public for standalone) and explains the optional at parameter's role.

get_billing (A)

Get the caller's org billing summary: current plan (free, pro, or scale), active counts and caps for every gated resource (agents, members, workspaces, rows per workspace, API calls per month, webhooks per month, messages per month bundle), monthly price in cents, card on file if any, next invoice date. Both humans and agents can call this. Use before upgrade_plan to check whether you're actually capped, and after to confirm the new plan landed.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A (4.2/5.0)

Behavior: 3/5

With no annotations, the description must fully disclose behavior. It states the tool retrieves billing summary (a read operation) but doesn't address potential errors (e.g., no org or billing info) or authentication requirements.

Conciseness: 5/5

Three sentences list all key information without redundancy. Every phrase earns its place, from verb to field enumeration to usage note.

Completeness: 4/5

No output schema provided, so description must cover return values; it lists expected fields. Missing data types and structure, but sufficient for a simple billing summary tool.

Parameters: 4/5

No parameters exist (schema is empty, coverage 100%). Per guidelines, zero parameters baseline is 4. The description adds value by implying the tool needs no configuration.

Purpose: 5/5

The description clearly states 'Get the caller's org billing summary' and lists specific fields (plan, resource counts and caps, price, card, next invoice). This distinguishes it from the sibling billing tools upgrade_plan and downgrade_plan, which change the plan rather than report on it.

Usage Guidelines: 4/5

It notes 'Both humans and agents can call this,' indicating no special access restrictions, and recommends calling it before upgrade_plan to check caps and after to confirm the new plan. However, it lacks explicit guidance on when not to use it.

get_comment_thread (A)

Fetch a single comment with its replies + reactions in one round trip. Pass any comment id in the thread (root or reply). Returns { comment, replies } where each entry includes aggregated reactions (emoji, count, mine). Use this when an agent receives a comment.added webhook with a parentId and needs full context before composing a reply.

Parameters

Name	Required	Description	Default
`comment_id`	Yes	Comment id (any node in the thread).
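
Once the `{ comment, replies }` payload arrives, an agent typically flattens it before composing a reply. The sketch below assumes a shape the description only partly specifies: each entry is taken to carry an `id` and a `reactions` list whose items have `emoji`, `count`, and `mine`. The key names `id` and `reactions`, and both helper names, are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List

Json = Dict[str, Any]


def thread_entries(thread: Json) -> List[Json]:
    """Return the fetched comment followed by its replies, in payload order."""
    return [thread["comment"], *thread.get("replies", [])]


def reaction_totals(thread: Json) -> Dict[str, int]:
    """Sum aggregated reaction counts per emoji across the whole thread."""
    totals: Counter = Counter()
    for entry in thread_entries(thread):
        # Assumption: aggregated reactions live under a "reactions" key.
        for reaction in entry.get("reactions", []):
            totals[reaction["emoji"]] += reaction["count"]
    return dict(totals)
```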

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full transparency burden. It clearly states 'No write side effects' and describes return format as '{ comment, replies }' with reactions aggregated. It could be more explicit about error behavior (e.g., invalid comment_id), but overall adequately discloses the tool's effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a one-line return format. Every sentence is purposeful: action, usage advice, and side-effect disclosure. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one param, no output schema), the description covers purpose, use case, return shape, and side effects comprehensively. It omits nothing critical for an AI agent to select and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter. The description adds value beyond the schema by specifying 'parent (root) comment id' and that replies are returned 'in ascending createdAt order,' which aids correct usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Fetch') and clearly identifies the resource ('comment + all its replies') with a unique feature ('with reactions aggregated'). It distinguishes from siblings like list_comments and get_doc by specifying the aggregate thread behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: 'Use this to read a thread before deciding how to reply (or whether to resolve it).' It also notes 'No write side effects,' clarifying safe usage without alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_doc (A)

Read a workspace's doc (TipTap rich-text) body. Format is negotiable via format: markdown (default — CommonMark + GFM, ready to feed to an LLM or render in a non-ProseMirror surface), content (TipTap JSON, round-trippable into update_doc for structural edits), text (plain text, best for search, summarisation, word-count heuristics), or all for the legacy three-in-one shape. Default is markdown because it's the slice agents need 95% of the time and the JSON form on a long doc can blow past the agent harness's tool-result token cap. Pass format: "content" only when you're round-tripping into update_doc for a structural edit. A workspace can hold any combination of doc and table surfaces, one or many of either kind; omit surface_slug to read the primary doc surface, or pass it to target a specific doc tab (use list_surfaces to enumerate). An unwritten or absent doc returns the requested format empty (markdown="", content={}, text=""); a surface_slug that doesn't match any live doc surface 404s.

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`format`	No	Which serialization to return. Default `markdown`. Use `content` to round-trip TipTap JSON back into update_doc for structural edits. Use `all` for the legacy three-in-one shape (heavier; only do this when you genuinely need every form in the same call).
`surface_slug`	No	Optional doc surface slug for multi-doc workspaces. Omit to read the primary doc surface. Use list_surfaces to see available slugs.
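
The format guidance above can be sketched as a small argument builder. The helper names `format_for` and `get_doc_arguments` and the intent labels are illustrative assumptions; the four format values and the rule of omitting `surface_slug` for the primary doc surface come from the description.

```python
from typing import Optional

VALID_FORMATS = {"markdown", "content", "text", "all"}


def format_for(intent: str) -> str:
    """Map a hypothetical intent label to the format the description recommends."""
    return {
        "structural_edit": "content",  # TipTap JSON, round-trippable into update_doc
        "search": "text",              # plain text for search or word counts
    }.get(intent, "markdown")          # markdown is the documented default


def get_doc_arguments(slug: str, fmt: str = "markdown",
                      surface_slug: Optional[str] = None) -> dict:
    """Build the arguments object for a get_doc call."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {sorted(VALID_FORMATS)}, got {fmt!r}")
    args = {"slug": slug, "format": fmt}
    if surface_slug is not None:
        # Omitting surface_slug targets the primary doc surface.
        args["surface_slug"] = surface_slug
    return args


print(get_doc_arguments("my-org/my-workspace"))
print(get_doc_arguments("my-workspace", format_for("structural_edit"), "release-notes"))
```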

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses the tool's behavior: it returns three content forms, explains the empty state for unwritten docs, and the 404 error for invalid surface_slug. It also notes that content is round-trippable into update_doc, which is a key behavioral trait.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the purpose, but it is somewhat dense with multiple details packed into a single paragraph. Breaking it into smaller sentences or bullet points could improve readability, but it remains clear and concise overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description thoroughly explains all three return formats and their use cases (e.g., structural edits, LLM feeding, search). It also covers edge cases (empty doc, invalid slug). The tool's behavior is fully described for its level of complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Although the schema already covers all parameters with descriptions (100% coverage), the description adds significant context: it explains that slug accepts both bare and prefixed forms, and that surface_slug is optional and can be discovered via list_surfaces. This goes beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies that the tool reads a workspace's doc body, mentions the three return formats (content, markdown, text), and distinguishes it from sibling tools like update_doc by explicitly noting that content is round-trippable. This provides a specific verb+resource with clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers guidance on when to omit or pass surface_slug, references list_surfaces for enumeration, and explains the behavior for absent docs and invalid slugs. However, it does not explicitly state when not to use this tool (e.g., for editing, use update_doc), though this is implied by the round-trippability comment.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_file

Fetch metadata + a download URL for a single file by id. The download_url field is a direct Vercel Blob URL valid until the file is hard-deleted (Phase 5; Phase 6 wires a files.trydock.ai signed-URL minter with 5-min TTL + auth re-check). Useful for an agent reading file contents server-side (HTTP GET the URL) or surfacing a download link in a reply. Gated behind FILES_SURFACE_ENABLED + per-user allowlist.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL.
`file_id`	Yes	The file cuid (from list_files). Surface + workspace are derived from the file row, so no surface_slug arg is needed.

get_html (A)

Read an HTML surface's body. HTML surfaces (Surface.kind="html") store mockup or full-page content as three text fields (html, css, js) rendered together inside a sandboxed iframe. Use list_surfaces to enumerate html surfaces in a workspace. Omit surface_slug to read the primary html surface; pass it to target a specific tab. Empty (never-written) html surfaces return { html:"", css:"", js:"" }. 404 when surface_slug doesn't match a live html surface. Requires viewer role.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts bare or org-prefixed form.
`surface_slug`	No	Optional html surface slug. Omit to read the primary html surface.
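
A rough sketch of consuming a get_html result, assuming the tool result has already been parsed into a dict with the three documented fields. How Dock itself combines the fields inside its sandboxed iframe is not stated, so `assemble_preview` is only one plausible way to join them for a local preview; both helper names are hypothetical.

```python
def is_unwritten(body: dict) -> bool:
    """True for the documented never-written shape { html:"", css:"", js:"" }."""
    return not any(body.get(field, "") for field in ("html", "css", "js"))


def assemble_preview(body: dict) -> str:
    """Join the three text fields into one HTML fragment (illustrative layout)."""
    return (
        f"<style>{body.get('css', '')}</style>\n"
        f"{body.get('html', '')}\n"
        f"<script>{body.get('js', '')}</script>"
    )


empty = {"html": "", "css": "", "js": ""}
mockup = {"html": "<p>hi</p>", "css": "p { color: teal; }", "js": ""}

print(is_unwritten(empty))
print(assemble_preview(mockup))
```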

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses the three text fields (html, css, js), sandboxed iframe rendering, empty surface behavior, 404 error condition, and required role. No annotations present, so description carries full burden and meets it well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each serving a distinct purpose: purpose, definition, usage, edge cases/errors. No redundant information, front-loaded with core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with no output schema, the description covers response structure, error conditions, prerequisite enumeration tool, and role requirements. No significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage. Description adds value by explaining behavior when omitting or providing surface_slug, linking to list_surfaces, and clarifying the primary surface concept.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Read an HTML surface's body' with a specific verb and resource. Distinguishes from sibling tools like list_surfaces (enumeration) and update_html (write).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context: mentions list_surfaces for enumeration, explains omitted surface_slug for primary surface, and specifies viewer role requirement. No explicit when-not-to-use, but sufficient guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_recent_events (A)

Get recent activity events for a workspace. Who did what, when. Useful for understanding what's happened since you last looked.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`limit`	No	Max events to return.	20

Tool Definition Quality

A (3.6/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It lacks details about behavior such as time range for 'recent', sorting order, or whether it's read-only. The description is too vague on behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with purpose, and contains no redundant information. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple listing tool with two parameters and no output schema, the description is largely complete. It explains the return value (who, what, when) and suggests usage, though it could mention sorting or time range.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description does not add meaning beyond the schema. The slug parameter details are already in the schema, and the limit parameter default is also specified. Baseline score applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get'), resource ('recent activity events for a workspace'), and provides context ('Who did what, when'), distinguishing it from sibling tools like get_workspace or search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('Useful for understanding what's happened since you last looked') but does not explicitly state when to use this tool vs alternatives, nor does it provide exclusions or conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_row (A)

Fetch a single row by id without listing the full table. Useful when a cue payload carries a row id and the agent only needs that one record. Returns the same row shape as list_rows.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`rowId`	Yes	The row id

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It is transparent about the return shape ('same row shape as list_rows'), but does not disclose error handling (e.g., if rowId is missing or invalid). For a simple fetch, this is acceptable but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is only three sentences, each serving a purpose: the first states the action and differentiates it from list_rows, the second gives a usage scenario, and the third states the return shape. No waste, front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple fetch tool with high schema coverage, the description is nearly complete. It explains how to use it, what it returns, and when to use it. However, it lacks details on what happens if the row is not found, which could be considered a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already provides detailed descriptions for both parameters (slug and rowId) with 100% coverage. The description adds no additional semantic information beyond what is in the schema, so it meets the baseline but does not exceed it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Fetch a single row by id', which is a specific verb+resource. It also distinguishes from the sibling tool list_rows by explicitly saying 'without listing the full table', making its purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use this tool: 'Useful when a cue payload carries a row id and the agent only needs that one record.' This provides clear context and implies when not to use it (when multiples are needed, use list_rows instead).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_workspace (A)

Get details about a specific workspace by its slug, including columns of its primary table surface, member count, and row count. A workspace contains one or more surfaces (tabs): any combination of table (rows + columns) and doc (TipTap body) kinds, one or many of either. Use list_surfaces to enumerate every tab; fetch /rows or /doc to read or write a specific one.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug, e.g. 'reddit-tracker'. Accepts either the bare slug or the org-prefixed form ('my-org/reddit-tracker') as shown in the dashboard URL.

Tool Definition Quality

A (4.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explains the workspace structure (surfaces, kinds) and the returned fields (columns, counts). However, it does not explicitly state that the operation is read-only or idempotent, which would enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the first is a clear statement of purpose, the second explains workspace composition, and the third points to sibling tools and endpoints. No unnecessary words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description covers the main return components (columns, counts) and workspace structure. It references sibling tools for further actions. While not an exhaustive field list, it provides sufficient context for a single-parameter tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already describes the 'slug' parameter and its format. The description adds value by explaining the slug accepts both bare and org-prefixed forms, and also outlines the return fields not in the schema (columns, member count, row count). This goes beyond the baseline for 100% schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get details about a specific workspace by its slug, including columns of its primary table surface, member count, and row count.' This specifies the verb, resource, and output, distinguishing it from siblings like list_workspaces and get_workspace_schema.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions when to use alternative tools: 'Use `list_surfaces` to enumerate every tab; fetch /rows or /doc to read or write a specific one.' This provides clear context for when this tool is appropriate versus others.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_workspace_schema (A)

Return a table surface's column definitions so an agent knows what keys create_row/update_row will accept. Each column has key (the field name in row.data), label (human-readable), type (text | longtext | url | status | owner | date | number), position, and, for status/owner columns, the allowed options. Empty array on doc-only workspaces; callers should still be able to write rows (columns auto-seed on first write). Multi-surface workspaces accept surface_slug to scope to a specific table sheet (use list_surfaces to enumerate); omit to fall through to the workspace's primary table surface.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`surface_slug`	No	Optional. The slug of the specific table surface to read columns from. Omit on single-table workspaces; required on multi-table workspaces if you don't want the primary table surface (lowest position).
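
One way an agent might use these column definitions before calling create_row or update_row is a pre-flight check like the sketch below. It assumes the tool result has been parsed into a list of dicts carrying the documented `key`, `type`, and `options` fields, and that `options` is a flat list of allowed values; that shape, and the helper name `validate_row`, are assumptions rather than documented behavior.

```python
from typing import Any


def validate_row(columns: list[dict[str, Any]], data: dict[str, Any]) -> list[str]:
    """Return a list of problems found in `data`; an empty list means it looks fine."""
    if not columns:
        # Doc-only workspaces return an empty array; columns auto-seed on first write.
        return []
    by_key = {column["key"]: column for column in columns}
    problems: list[str] = []
    for key, value in data.items():
        column = by_key.get(key)
        if column is None:
            problems.append(f"unknown column key: {key}")
            continue
        if column["type"] in ("status", "owner"):
            options = column.get("options") or []
            if options and value not in options:
                problems.append(f"{key}: {value!r} is not an allowed option")
        elif column["type"] == "number":
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"{key}: expected a number")
    return problems


columns = [
    {"key": "title", "label": "Title", "type": "text", "position": 0},
    {"key": "status", "label": "Status", "type": "status", "position": 1,
     "options": ["todo", "done"]},
    {"key": "score", "label": "Score", "type": "number", "position": 2},
]
print(validate_row(columns, {"title": "Ship it", "status": "blocked", "scoer": 3}))
```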

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers return shape (keys, label, type, position, options) and special behavior for doc-only workspaces, ensuring no surprises.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences front-loaded with purpose, followed by detail and edge case. No filler, every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description explains the return structure and handles edge cases, making it complete for a two-parameter retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds value by detailing valid slug formats (bare or org-prefixed), improving parameter understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns workspace column definitions for agent use with create_row/update_row, distinguishing it from sibling tools like get_workspace.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context for when to use (to know accepted keys) and handles edge cases (doc-only workspaces), but does not explicitly list when not to use or name alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_api_keys (A)

List API keys. Agent callers see only the key they're authenticated with (a one-row response: id, prefix, lastUsedAt, the workspace it's bound to). User callers (cookie session) see every key for every agent they own. Plaintext is never returned; the key body is shown only once at create/rotate time.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A (4.1/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description effectively discloses key behavioral traits: caller-dependent visibility, no plaintext returned, and the limited scope of results. However, it omits potential side effects or rate limits, which would elevate it to a 5.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with three sentences, each adding value. It front-loads the core purpose, though could be slightly more terse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and no output schema, the description adequately covers behavior and response differences for caller types. It lacks pagination details but is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so the baseline is 4. The description correctly adds no parameter information because none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'List API keys' and provides specific behaviors for agent vs. user callers, distinguishing it from other key-related tools like revoke or rotate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lacks explicit guidance on when to use this tool versus alternatives such as revoke_api_key or rotate_api_key, leaving the agent to infer based on name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_comments (A)

List comments in a workspace. Filter by target_type (row, cell, doc_range, html_element, surface, workspace), target_id, surface (returns every comment anchored to any element of one surface, useful for 'open threads on this tab'), status (open | resolved | all, default open), mentioning_me: true for comments that @-mention the caller, or author: <principalId> for comments by a specific user/agent. Returns up to 200 comments per call ordered by createdAt asc, with surfaceSlug denormalized for doc_range/html_element/surface targets so reply paths work even across archive boundaries. Use get_comment_thread to pull a single comment plus its replies + reactions.

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug.
`limit`	No	Max results (1-200, default 50).
`author`	No	Filter by author principal id. Useful for 'comments by Argus on this workspace' agent loops.
`offset`	No	Number of comments to skip for pagination.
`status`	No	Resolution state filter. Default `open`.
`surface`	No	Surface slug filter. Returns every comment anchored anywhere inside this surface (doc_range / html_element / surface scope, plus row + cell comments on rows that live on the surface). If the surface is archived, the call does not raise a 404; it returns an empty list.
`target_id`	No	Filter by exact target id. For cells the id is `<rowId>:<columnKey>`; for doc_range/html_element/surface it's the Surface cuid. Combine with target_type for unambiguous filtering.
`target_type`	No	Filter by comment target type.
`mentioning_me`	No	When true, only return comments that @-mention the calling principal. Equivalent to REST `?mentioning=me`.
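
The `limit`/`offset` pair and the composite cell id format above suggest a paging loop along these lines. This is a sketch under assumptions: `call_tool` stands in for whatever MCP client function the agent uses, and the result is assumed to be a plain list of comment dicts, which the listing does not confirm. The in-memory `fake_call_tool` exists only so the example runs offline.

```python
from typing import Any, Callable, Iterator


def cell_target_id(row_id: str, column_key: str) -> str:
    """Compose the documented cell target id: <rowId>:<columnKey>."""
    return f"{row_id}:{column_key}"


def iter_comments(call_tool: Callable[[str, dict], list[dict]], slug: str,
                  page_size: int = 50, **filters: Any) -> Iterator[dict]:
    """Yield every matching comment, fetching page_size comments per call."""
    if not 1 <= page_size <= 200:
        raise ValueError("limit must be between 1 and 200")
    offset = 0
    while True:
        page = call_tool("list_comments",
                         {"slug": slug, "limit": page_size, "offset": offset, **filters})
        yield from page
        if len(page) < page_size:
            return  # a short page means there is nothing left to fetch
        offset += page_size


# Stand-in for a real MCP client call, used only to make the sketch runnable.
_FAKE_COMMENTS = [{"id": f"c{i}"} for i in range(5)]


def fake_call_tool(name: str, args: dict) -> list[dict]:
    return _FAKE_COMMENTS[args["offset"]: args["offset"] + args["limit"]]


ids = [c["id"] for c in iter_comments(fake_call_tool, "my-workspace",
                                      page_size=2, status="open")]
print(ids)
print(cell_target_id("row_1", "status"))
```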

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the default filter, required read role, and return shape. It does not mention pagination limits or error handling, but given the schema covers parameters, this is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with no wasted words. It front-loads the main purpose, uses short sentences, and each sentence adds information. The structure is clear and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a list tool with 9 parameters and no output schema, the description covers the return shape, role requirement, and parameter usage. It lacks explanation of error states or sorting, but it is adequate for an agent to select and invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, baseline is 3. The description adds value by explaining the purpose of filters (e.g., mentioning_me for triage, target_id composite format, surface scope) and the default status behavior, going beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool lists comments with optional filters, mentions the return shape, and distinguishes it from sibling tools like add_comment and get_comment_thread. The verb 'list' and resource 'comments' are clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using it to find pending feedback or pull a thread for context, and explains default behavior (status='open') and how to override it. It does not explicitly exclude alternatives but implies scope; more explicit guidance on when to use get_comment_thread could improve it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_files

List the folder + file children of a Files surface (kind='files'). Folders sorted first by position then name; files sorted by name. Returns folders[], files[] with cuids agents can pass to get_file / delete_file. parent_folder_id defaults to null (= root of the surface); pass a folder id to descend into a sub-folder. Gated behind FILES_SURFACE_ENABLED + per-user allowlist (in beta on socrates@vector.build; other accounts get -32000 'not available').

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL.
`surface_slug`	Yes	Files-kind surface slug within the workspace. Use list_surfaces to enumerate; the Files surface kind is 'files'.
`parent_folder_id`	No	Folder id to descend into. Omit (or pass null) for the surface root.
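
The folder-descent behavior of list_files lends itself to a recursive walk. The sketch below is a minimal illustration, not the server's own client: `call_tool` is a hypothetical helper that sends one MCP tool call and returns the parsed result as a dict, and the `id` and `name` keys on folder and file entries are assumptions, since the description only confirms the top-level `folders[]` and `files[]` arrays.

```python
from typing import Callable, Iterator, Optional, Tuple

# Hypothetical transport helper: (tool_name, arguments) -> parsed result dict.
CallTool = Callable[[str, dict], dict]


def walk_files(
    call_tool: CallTool,
    slug: str,
    surface_slug: str,
    parent_folder_id: Optional[str] = None,
    path: str = "",
) -> Iterator[Tuple[str, dict]]:
    """Yield (path, file_entry) for every file under a Files surface.

    Assumes folder and file entries expose "id" and "name" keys; the tool
    description only confirms the top-level folders[] and files[] arrays.
    """
    args = {"slug": slug, "surface_slug": surface_slug}
    if parent_folder_id is not None:
        # Leaving parent_folder_id out lists the root of the surface.
        args["parent_folder_id"] = parent_folder_id
    result = call_tool("list_files", args)

    for entry in result.get("files", []):
        yield f"{path}/{entry['name']}", entry
    for folder in result.get("folders", []):
        # Descend by passing the folder id back as parent_folder_id.
        yield from walk_files(
            call_tool, slug, surface_slug, folder["id"], f"{path}/{folder['name']}"
        )
```

Because the tool is gated behind an allowlist, a real caller would probably also want to handle the -32000 'not available' error before walking.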

list_rows (grade A)

List rows in a workspace's table surface. Returns rows with their data (a JSON object of column-name to value), creation time, the principal who created/updated each row, AND the row's surface_slug (the sheet it lives on). Empty array if no rows have been added yet. Multi-surface workspaces: pass surface_slug to scope to one sheet; omit to return rows from every surface in the workspace (back-compat: pre-multi-surface clients keep working).

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`limit`	No	Max rows to return (default 100, max 1000)
`offset`	No	Number of rows to skip (for pagination)
`surface_slug`	No	Optional table surface slug for multi-surface workspaces. Filter rows to one sheet. Omit to return rows from every surface (legacy single-sheet clients see no change). 400 if the slug is a doc surface, archived, or doesn't exist.
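
The `limit` and `offset` parameters suggest a simple paging loop. The following is a sketch under stated assumptions: `call_tool` is a hypothetical helper that performs one MCP tool call, and the result of list_rows is assumed to arrive as a plain list of row dicts (the description says "empty array", but the exact envelope is not documented here).

```python
from typing import Callable, Iterator, Optional

# Hypothetical transport helper: (tool_name, arguments) -> parsed result.
CallTool = Callable[[str, dict], list]

MAX_LIMIT = 1000  # documented ceiling for `limit`


def iter_rows(
    call_tool: CallTool,
    slug: str,
    surface_slug: Optional[str] = None,
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every row by advancing `offset` one page at a time."""
    if not 1 <= page_size <= MAX_LIMIT:
        raise ValueError("page_size must be between 1 and 1000")
    offset = 0
    while True:
        args = {"slug": slug, "limit": page_size, "offset": offset}
        if surface_slug is not None:
            # Scope to one sheet; omit to read rows from every surface.
            args["surface_slug"] = surface_slug
        rows = call_tool("list_rows", args)  # assumed to be a list of row dicts
        yield from rows
        if len(rows) < page_size:
            return  # a short (or empty) page means there is nothing left
        offset += page_size
```

Offset paging can skip or repeat rows if the sheet changes between calls, so this is best suited to read-mostly data.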

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: it returns rows with specific fields, empty array for no data, and the effect of surface_slug (filter to one surface vs. all). It also notes a 400 error for invalid surface_slug. This is comprehensive for a read-only list tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured: it starts with the main purpose, then details return values, edge cases, and usage of the key parameter. Every sentence adds necessary information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains what is returned. It covers the four parameters and their behaviors. A minor gap is that the pagination limits (default 100, max 1000) appear in the schema but are not restated in the description; overall, it is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters have schema descriptions (100% coverage), so the description does not need to compensate. However, it adds meaningful context for surface_slug (scoping vs. back-compat) and slug (accepted formats), improving the agent's understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool lists rows in a workspace's table surface, detailing the returned fields (data, creation time, principals, surface_slug) and behavior (empty array if no rows). This clearly differentiates from sibling tools like get_row (single row) or create_row (write).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear context on when to pass surface_slug (for multi-surface workspaces to scope to one sheet) and when to omit (for legacy compatibility). It does not explicitly say when not to use this tool versus alternatives, but the purpose is well-defined.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_sheet_functions (grade A)

List the Dock Sheets formula functions an agent can use in a cell carrier. Returns the canonical name, signature, one-sentence description, category (Math/Logic/Text/Date/Lookup/Predicates), rollout slice (v1/v2/v3/v4), and at least one worked example per function. Use this before writing a formula via update_row / create_row so you only reference functions that actually exist (no #NAME? errors). Also returns the alias map (e.g. CONCAT → CONCATENATE) so you can pick the canonical name even when writing the alias the UI accepts. Optional filters: category narrows to one category, slice narrows to one rollout slice, name substring-matches names + descriptions + signatures. Public, no auth, no rate limit beyond global.

Parameters

Name	Required	Description
`name`	No	Optional case-insensitive substring filter; matches function name, description, and signature.
`slice`	No	Optional rollout-slice filter.
`category`	No	Optional category filter.
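
To illustrate the "check before you write a formula" workflow, here is a rough pre-flight check. Everything about the data shape is an assumption: the catalog is taken to look like `{"functions": [{"name": ...}], "aliases": {...}}`, formulas are assumed to use spreadsheet-style `NAME(` call syntax, and names are compared case-insensitively. The real response from list_sheet_functions may differ.

```python
import re
from typing import List

# An identifier immediately followed by "(" is treated as a function call.
_CALL = re.compile(r"([A-Za-z_][A-Za-z0-9_.]*)\s*\(")


def unknown_functions(formula: str, catalog: dict) -> List[str]:
    """Return function names used in `formula` that the catalog does not list.

    `catalog` is assumed to look like
    {"functions": [{"name": "SUM"}, ...], "aliases": {"CONCAT": "CONCATENATE"}}.
    """
    known = {f["name"].upper() for f in catalog.get("functions", [])}
    aliases = {k.upper(): v.upper() for k, v in catalog.get("aliases", {}).items()}
    missing: List[str] = []
    for name in _CALL.findall(formula):
        # Resolve an alias to its canonical name before checking the catalog.
        canonical = aliases.get(name.upper(), name.upper())
        if canonical not in known and name not in missing:
            missing.append(name)
    return missing
```

This is a textual check only: it does not parse string literals, so text such as `"x("` inside a quoted argument could produce a false positive.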

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses behavioral traits: it is a public read-only operation with no auth or rate limiting, returns detailed data including worked examples, and has optional filters. No contradictions exist.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough but slightly lengthy. It is well-structured with front-loaded purpose and clear details. Could be marginally more concise but effectively communicates all necessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description comprehensively explains return values (canonical name, signature, description, category, slice, worked examples, alias map). It covers all relevant aspects for an agent to effectively use the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with descriptions, but the description adds extra meaning (e.g., 'name' matches names, descriptions, and signatures; 'category' and 'slice' narrow results). This goes beyond schema explanations, earning a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('List the Dock Sheets formula functions') and a clear resource, distinguishing it from sibling tools like update_row, create_row, evaluate_formula, and validate_formula. It conveys exactly what the tool does and why it's useful.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use the tool: 'Use this before writing a formula via update_row / create_row so you only reference functions that actually exist (no #NAME? errors).' It also mentions the alias map, guiding the agent to pick canonical names. This provides clear context for appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_surfaces (grade A)

List the surfaces (tabs) inside a workspace. A workspace can hold any combination of table (rows + columns) and doc (TipTap body) surfaces, one or many of either kind; this tool tells you exactly what it has. Each surface has its own slug used in surface-scoped tool calls. Order matches the on-screen tab strip. Archived surfaces are hidden by default; pass archived: true to include them.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`archived`	No	Include archived surfaces too. Default false (live tabs only).

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses key behaviors: returns both table and doc surfaces, order matches tab strip, archived hidden by default. No annotations provided, but description covers needed transparency for a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five efficient sentences with a front-loaded purpose. No wasted words; each sentence adds distinct value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks details on exact output fields (e.g., id, type, title) and error scenarios, but for a list tool it covers primary usage and behaviors adequately given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by explaining the archived default. It does not repeat the slug semantics, which the schema already describes well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool lists 'surfaces (tabs) inside a workspace'. Explains the two types (table, doc) and ties to slug usage. Distinct from sibling tools like create_surface or list_workspaces.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context: surfaces listing with ordering, archived hidden by default. Implicitly indicates when to use, but lacks explicit when-not-to-use or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_webhooks (grade A)

List webhook endpoints registered on an org. Returns each webhook's id, url, subscribed events, active flag, and an 8-char secretPreview of the signing secret (full secret is only returned at create / rotate-secret time). Any org member (user or agent) can list. Use to audit what's subscribed before adding or removing endpoints.

Parameters

Name	Required	Description	Default
`org_slug`	Yes	Org slug. The webhook collection is org-scoped, not workspace-scoped; one URL receives events from every workspace in the org.
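
The "audit before adding" use case can be sketched as a small coverage check over the list_webhooks result. The key names `url`, `events`, and `active` are assumptions based on the fields the description names; the actual response keys are not documented here.

```python
from typing import Iterable, List, Set


def missing_events(webhooks: List[dict], url: str, wanted: Iterable[str]) -> Set[str]:
    """Return the events in `wanted` that no active webhook at `url` subscribes to.

    Each webhook dict is assumed to carry "url", "events", and "active" keys.
    """
    covered: Set[str] = set()
    for hook in webhooks:
        # Inactive endpoints do not count as coverage.
        if hook.get("url") == url and hook.get("active"):
            covered.update(hook.get("events", []))
    return set(wanted) - covered
```

An empty result suggests the endpoint already covers everything requested, so registering a second webhook for the same URL would likely be redundant.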

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It reveals the read-only nature, return fields, secret preview behavior (full secret only at create/rotate), and access permissions. No mention of rate limits or pagination, but acceptable for a simple list.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences with no fluff. Each adds value: the first states the action, the second lists the return fields, and the last two cover access and usage context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given one parameter, no output schema, but description lists return fields. It covers the essential behavior and usage context. Minor omission of pagination or ordering details, but not critical for this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single param org_slug. The description does not add additional meaning beyond the schema's own description, which already covers the scoping. Baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists webhook endpoints registered on an org, specifies the returned fields (id, url, subscribed events, active flag, secretPreview), and distinguishes from sibling mutation tools like create_webhook and delete_webhook.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: 'Use to audit what's subscribed before adding or removing endpoints.' Also states that any org member can list. Does not explicitly mention alternatives or when not to use, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_workspace_members (grade A)

List principals with explicit access to a workspace. Returns users (id, name, email; email visible only when the caller is in the same org) and agents (id, name, brandKey) along with their role (owner | editor | commenter | viewer). Used by agents to verify a workspace is actually shared before writing output the team is expected to see.

Parameters

Name	Required	Description	Default
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
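
The "verify a workspace is actually shared before writing" check might look like the sketch below. It assumes the result is a dict with `users` and `agents` lists whose entries carry `id` and `role`, and it assumes the roles rank as owner > editor > commenter > viewer. Neither the envelope nor the ranking is stated explicitly in the description.

```python
from typing import List, Optional, Tuple

# Assumed ordering of the four documented roles, lowest to highest.
ROLE_RANK = {"viewer": 0, "commenter": 1, "editor": 2, "owner": 3}


def principals_with_role(
    members: dict, min_role: str = "viewer", exclude_id: Optional[str] = None
) -> List[Tuple[str, str, str]]:
    """Return (kind, id, role) for principals at or above `min_role`.

    `members` is assumed to look like {"users": [...], "agents": [...]}.
    Pass the caller's own id as `exclude_id` to ask "who else can see this?".
    """
    threshold = ROLE_RANK[min_role]
    found = []
    for kind in ("users", "agents"):
        for principal in members.get(kind, []):
            if principal.get("id") == exclude_id:
                continue
            if ROLE_RANK.get(principal.get("role"), -1) >= threshold:
                found.append((kind, principal["id"], principal["role"]))
    return found
```

An empty list for `min_role="viewer"` with the caller excluded would suggest nobody else has explicit access. Org-level visibility can still grant access outside this list, so the check is conservative.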

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses email visibility condition for users, but with no annotations, it omits other typical behaviors like pagination, error handling, or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no fluff: purpose, return details, and usage context are all front-loaded and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with one parameter, the description adequately covers return fields and a use case, though it could mention possible limits or sorting.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and already describes the slug parameter well; the description adds no additional parameter meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists principals with explicit access to a workspace, and distinguishes it from siblings like 'remove_workspace_member' and 'list_workspaces' by focusing on access listing and verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a specific use case (agents verifying workspace sharing before writing) but lacks explicit when-not-to-use or alternative tool references.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_workspaces (grade A)

List all workspaces the authenticated principal has access to. Returns workspace name (slug), mode (the default-view preference for the first tab), and creation date. A workspace is a container of one or more surfaces (tabs); each surface is either a table (rows + columns) or a doc (TipTap body), and a workspace can hold any combination, one or many of either kind. Use list_surfaces to see what a given workspace actually contains.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.1/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description covers basic behavior: listing workspaces accessible to the authenticated principal and returning specific fields. It does not discuss side effects, rate limits, or authentication details, but for a read-only list operation, the transparency is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, front-loading the core action and return info. It briefly explains the workspace concept and points to a sibling tool, adding value without redundancy. Slightly less concise due to tangential explanation, but still efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with no parameters and no output schema, the description is fairly complete. It specifies what is returned, gives workspace context, and directs to list_surfaces for further detail. It lacks details on ordering or pagination, but these are not critical for a basic listing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters in the schema, and schema description coverage is 100%. The description does not need to add parameter semantics. With 0 parameters, the baseline score of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists all workspaces and specifies the returned fields (slug, mode, creation date). It effectively distinguishes from sibling tools like create_workspace, get_workspace, and delete_workspace.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using list_surfaces to see workspace contents, providing clear context for when to use this tool. However, it does not explicitly state when not to use it or explore alternative tools beyond list_surfaces.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

move_rows (grade A)

Atomically move N rows from their current sheet(s) to a target sheet inside the same workspace. Use for programmatic data migration: dropping a batch of agent-produced drafts onto the right sheet, reorganizing content across LinkedIn / Twitter / Substack tabs, etc. All-or-nothing: if any rowId doesn't belong to this workspace, the entire batch fails before any write fires. Idempotent: rows already on the target sheet are skipped (returns skipped count). Rows land at the destination sheet's tail in the order rowIds was supplied. Emits one row.moved_surface event per row that actually moved. Up to 500 rows per call.

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`rowIds`	Yes	Row IDs to move (1-500). Order is preserved at the destination: first id lands at the lowest position, last id at the highest.
`target_surface_slug`	Yes	Slug of the destination table surface. Use list_surfaces to enumerate. 400 if the slug is a doc surface, archived, or not in this workspace.
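
Since a single call accepts at most 500 row IDs, larger migrations need to be split. The sketch below shows one way to do that, assuming a hypothetical `call_tool` helper and a result dict with a `skipped` key (the description mentions a skipped count but not its exact key name).

```python
from typing import Callable, List

# Hypothetical transport helper: (tool_name, arguments) -> parsed result dict.
CallTool = Callable[[str, dict], dict]

MAX_BATCH = 500  # documented per-call limit


def move_rows_batched(
    call_tool: CallTool,
    slug: str,
    row_ids: List[str],
    target_surface_slug: str,
    batch_size: int = MAX_BATCH,
) -> dict:
    """Move rows in consecutive chunks, preserving the supplied order."""
    if not 1 <= batch_size <= MAX_BATCH:
        raise ValueError("batch_size must be between 1 and 500")
    calls = 0
    skipped = 0
    for start in range(0, len(row_ids), batch_size):
        chunk = row_ids[start:start + batch_size]
        result = call_tool("move_rows", {
            "slug": slug,
            "rowIds": chunk,
            "target_surface_slug": target_surface_slug,
        })
        calls += 1
        skipped += result.get("skipped", 0)  # assumed key name
    return {"calls": calls, "skipped": skipped}
```

Note that the all-or-nothing guarantee applies per call, so a failure in a later chunk leaves earlier chunks moved. Because the tool skips rows already on the target sheet, re-running the whole list after fixing the problem should be safe.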

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses atomicity, idempotency, batch failure conditions, order preservation, event emission, and row limit. This more than meets the burden of behavioral transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with front-loaded purpose, followed by technical details in efficient sentences. Every sentence provides necessary information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description mentions returned `skipped` count and event emission, but could detail the full response format. Overall, it is fairly complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds value by clarifying slug formats, order preservation for rowIds, and referencing list_surfaces for target_surface_slug. It enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool moves rows atomically within the same workspace, with specific verbs and resource. It distinguishes from sibling tools like delete_row, create_row, and update_row by focusing on batch migration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear use cases (programmatic data migration, reorganizing content) but does not explicitly state when not to use it or compare to alternatives. The context is sufficient for understanding appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

react_to_comment (grade A)

Add or remove an emoji reaction to a comment. Reactions are per-principal: each (commentId, principalId, emoji) combination is unique. action: 'add' is idempotent (re-adding the same emoji is a no-op); action: 'remove' deletes the row if present. Fires comment.reaction_added / comment.reaction_removed. Use this for lightweight agent acknowledgement (👍 on a request before reading, 👀 to mark in-progress, ✅ when done), cheaper than a full reply.

Parameters

Name	Required	Description
`emoji`	Yes	Emoji character (e.g. '👍', '✅', '🚀').
`action`	No	Whether to add or remove the reaction. Default `add`.
`comment_id`	Yes	Comment to react to.
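
The per-principal uniqueness and idempotency rules can be modeled locally as a set of (commentId, principalId, emoji) tuples. This is only an in-memory illustration of the documented semantics, not the server's implementation; the class name and the boolean "state changed" return value are inventions for the example.

```python
from collections import Counter
from typing import Set, Tuple


class ReactionModel:
    """Local model of the documented semantics; not the server's implementation."""

    def __init__(self) -> None:
        self._rows: Set[Tuple[str, str, str]] = set()

    def react(self, comment_id: str, principal_id: str, emoji: str,
              action: str = "add") -> bool:
        """Apply a reaction and return True only if stored state changed."""
        key = (comment_id, principal_id, emoji)
        if action == "add":
            if key in self._rows:
                return False  # re-adding the same emoji is a no-op
            self._rows.add(key)
            return True
        if action == "remove":
            if key in self._rows:
                self._rows.remove(key)
                return True
            return False  # nothing to delete
        raise ValueError(f"unknown action: {action!r}")

    def counts(self, comment_id: str) -> Counter:
        """Count reactions per emoji on one comment."""
        return Counter(e for c, _, e in self._rows if c == comment_id)
```

The practical consequence for an agent is that retrying an `add` after a timeout is safe: the second call cannot create a duplicate reaction.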

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses important behaviors: idempotency on re-adds, no-op on remove of non-existent reactions, and return value (aggregated counts). It does not mention authentication or rate limits, but these are less critical for a simple reaction tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five sentences, front-loaded with the core purpose, no wasted words. Each sentence adds meaningful information efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description adequately covers behavior, idempotency, and return value. It is complete enough for the agent to use the tool correctly, though additional details like error cases could be included.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds behavioral context beyond the schema by explaining the idempotency constraint related to the action parameter, and notes the return value, which enhances understanding of parameter effects.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool adds or removes an emoji reaction on a comment, specifying the verb and resource. It distinguishes from siblings like reply_to_comment or resolve_comment by focusing on emoji reactions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description does not mention when not to use it or compare to sibling tools, leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove_workspace_member (grade A)

Remove a workspace member. Editor role required; owner-tier removals require an owner caller. Sole-owner removal is blocked; promote someone else first. Note: if the workspace visibility is org, removing an explicit member of the same org leaves them with virtual editor access via the org-membership branch. Consent-gated for agents: the FIRST call returns { status: 'confirmation_required', confirm_token, message, expires_in }. Surface the message to your user and, if they say yes, re-call this tool within 60s with confirm_token set to the same token. User callers (cookie session) skip the consent step.

Parameters

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`member_id`	Yes	The WorkspaceMember id to remove. Get this from list_workspace_members.
`confirm_token`	No	The token returned by the first call as `confirm_token`. Omit on the first call; include on the second call to execute the removal. Single-use, 60s TTL. Agents only; user callers don't need this.
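
The consent-gated flow above can be sketched as a small helper. This is a minimal illustration, not Dock's client: `call_tool` and `ask_user` are hypothetical callables supplied by the host application, and the `cancelled_by_user` status is a local convention of this sketch rather than something the server returns.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable taking (tool_name, arguments) and
# returning the tool result as a dict. Not part of Dock's published API.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def remove_member_with_consent(
    call_tool: ToolCaller,
    ask_user: Callable[[str], bool],
    slug: str,
    member_id: str,
) -> Dict[str, Any]:
    """Run the two-step, consent-gated removal described above."""
    args = {"slug": slug, "member_id": member_id}

    # First call: omit confirm_token.
    first = call_tool("remove_workspace_member", args)
    if first.get("status") != "confirmation_required":
        # User callers (cookie session) skip the consent step.
        return first

    # Surface the server's message to the human; stop unless they agree.
    if not ask_user(first["message"]):
        return {"status": "cancelled_by_user"}  # local status, sketch only

    # Second call: same arguments plus the single-use token (60s TTL).
    return call_tool(
        "remove_workspace_member",
        {**args, "confirm_token": first["confirm_token"]},
    )
```

Because the token is single-use and expires after 60 seconds, the helper makes the second call immediately after the user answers instead of caching the token.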

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: consent-gated two-step flow for agents, timeout, role requirements, and edge case for org visibility, ensuring the agent understands the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose and is efficient, though it is a single paragraph that could be more structured (e.g., bullet points). Still, it avoids fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (consent flow, role constraints, edge cases) and no output schema, the description provides thorough guidance, making the tool's usage clear.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds context (e.g., confirm_token flow) but does not significantly extend parameter semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function ('Remove a workspace member') and provides specific constraints (role requirements, sole-owner blocking), making it distinct from sibling tools like list_workspace_members.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers clear context on when to use (e.g., Editor role required) and when not (sole-owner removal blocked), but does not explicitly name alternative tools for similar actions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reply_to_comment A

Convenience wrapper around add_comment for the common reply case. Pass the parent comment id and the body; the handler reconstructs the target from the parent (no need for the agent to remember whether the parent was a row, cell, doc_range, html_element, surface, or workspace comment). Re-opens a resolved parent. Same threading rules as add_comment: nested replies flatten to single depth, so reply-to-reply re-points at the root.

ParametersJSON Schema

Name	Required	Description
`body`	Yes	Reply body (1-5000 chars).
`mentions`	No	Optional `[{ kind, id, label }]` mentions on the reply. Same validation + fan-out rules as add_comment.
`comment_id`	Yes	Parent comment id. Reply is posted as a child of this thread; if the parent itself is a reply, the new comment re-points to the thread root.
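
The single-depth threading rule can be illustrated with a toy in-memory model. This is a sketch of the described behavior only, not Dock's handler: the `Comment` class and `post_reply` function are hypothetical, and re-opening the thread root (rather than an intermediate reply) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Comment:
    id: str
    parent_id: Optional[str] = None  # None marks a thread root
    resolved: bool = False


def post_reply(comments: Dict[str, Comment], comment_id: str, new_id: str) -> Comment:
    """Attach a reply, flattening reply-to-reply onto the thread root."""
    parent = comments[comment_id]
    # If the parent is itself a reply, re-point at its root (single depth).
    root_id = parent.parent_id if parent.parent_id is not None else parent.id
    # Replying re-opens a resolved thread (assumed to apply to the root).
    comments[root_id].resolved = False
    reply = Comment(id=new_id, parent_id=root_id)
    comments[new_id] = reply
    return reply
```

Replying to `r1`, itself a reply to `c1`, therefore yields a comment whose parent is `c1`, not `r1`.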

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description carries full burden. It explains that the tool reads the parent's stored target and threads the reply, and mentions the same flatten+reopen+mention behavior as `add_comment`, giving good transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: first states purpose, second explains mechanism, third gives usage guidance. No wasted words, well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple reply tool with no output schema, the description is complete: it explains the purpose, behavior, and usage, leaving no significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are already documented. The description adds minimal extra meaning beyond the schema, such as noting that `mentions` has the same shape as `add_comment`. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that it is a shortcut for `add_comment` when you already have a parent `comment_id`, and it distinguishes from the sibling `add_comment` by specifying the use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using this over `add_comment` when responding to an existing thread, providing clear context on when to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request_limit_increase A

Ask Dock to raise a plan limit (agents, workspaces, rows, or other). We record the signal on the admin side; there's no reply loop. Use this when you hit a cap you can't resolve with upgrade_plan (e.g. you're already Pro but need a custom limit).

ParametersJSON Schema

Name	Required	Description
`kind`	Yes	Which limit to raise
`reason`	No	Optional: 1-2 sentences on the use case
`desiredValue`	No	Optional: the specific limit you'd like

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses the one-way nature and that it records a signal on the admin side. With no annotations, this adds value, though it could mention potential side effects or processing guarantees.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, no wasted words, front-loaded with the action and scope.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequately covers the purpose and usage for a simple request tool. Could mention the return value (e.g., confirmation) but not critical given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds only marginal context. It repeats the enum values but doesn't elaborate beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool requests a plan limit increase, lists the specific limits (agents, workspaces, rows, other), and distinguishes from upgrade_plan by giving an example (already Pro needing custom limit).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use ('when you hit a cap you can't resolve with upgrade_plan') and what to expect ('no reply loop', 'record the signal'). Provides a concrete scenario.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request_revoke_agent_key A

Ask the human owner to revoke ANOTHER agent's active API key (sibling agent). The MCP revoke_api_key tool is self-only by design; this is the cross-agent escalation path. Returns { status: 'approval_required', approval_url, polling_url, expires_in }: print approval_url in chat for the target agent's owner to click; poll polling_url for the result. Approval gate: the approving user must be the target agent's owner (Agent.ownerUserId match). Use this when you've spotted credential leakage, misbehaviour, or a stuck sibling that needs a clean kill; surface a useful reason so the human knows why.

ParametersJSON Schema

Name	Required	Description	Default
`reason`	No	1-2 sentences on why you're asking. Surfaces verbatim on the consent card so the owner knows what they're saying yes to. Capped at 500 chars.
`target_agent_id`	Yes	The id of the sibling agent whose key should be revoked. Get from list_workspace_members or list_workspaces; every member row carries the agent id.
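
The approval flow above might be driven by a loop like the following. It is a sketch under stated assumptions: `call_tool` and `fetch_json` are hypothetical callables, `expires_in` is assumed to be in seconds, and the polling response is assumed to carry a `status` field that reads `pending` until the owner decides. The listing does not document the polling payload, so adjust to what the server actually returns.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def request_revoke_and_wait(
    call_tool: ToolCaller,
    fetch_json: Callable[[str], Dict[str, Any]],
    target_agent_id: str,
    reason: str,
    show: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
    interval: float = 2.0,
) -> Dict[str, Any]:
    """Request a cross-agent revoke, print the approval link, then poll."""
    first = call_tool(
        "request_revoke_agent_key",
        # The reason is capped at 500 chars, so truncate defensively.
        {"target_agent_id": target_agent_id, "reason": reason[:500]},
    )
    if first.get("status") != "approval_required":
        return first

    # Print the link in chat for the target agent's owner to click.
    show(first["approval_url"])

    # Poll until the request leaves the assumed "pending" state or expires.
    max_polls = max(1, int(first["expires_in"] / interval))
    for _ in range(max_polls):
        result = fetch_json(first["polling_url"])
        if result.get("status") != "pending":  # "pending" is an assumption
            return result
        sleep(interval)
    return {"status": "timed_out"}  # local status, sketch only
```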

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It transparently describes the approval flow: returns {status, approval_url, polling_url, expires_in}, instructs to print approval_url and poll polling_url, and notes that the approving user must be the target agent's owner. This covers the behavior and constraints well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is a tight paragraph of about 4 sentences, front-loaded with purpose, then differentiator from sibling, then return value and usage instructions, then usage guidance. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters fully covered in schema, no output schema, the description compensates by explaining the return structure and approval process. It also provides context on when to use and where to get parameter values, making it complete for an agent to select and invoke.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. Description adds meaningful context: for reason, explains it's shown verbatim on consent card and capped at 500 chars; for target_agent_id, suggests sources (list_workspace_members, list_workspaces) and notes it's the sibling agent's id. This goes beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: asking the human owner to revoke another agent's API key, contrasting with the self-only revoke_api_key sibling. It specifies the action, resource (another agent's key), and scope (cross-agent escalation).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'when you've spotted credential leakage, misbehaviour, or a stuck sibling that needs a clean kill.' It also advises to provide a useful reason, and contrasts with the sibling revoke_api_key tool, which is self-only.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request_rotate_agent_key A

Ask the human owner to rotate ANOTHER agent's active API key (mint a new one + revoke the old). Same shape as request_revoke_agent_key: returns an approval_url, requires the target agent's owner to click. The new key plaintext is INTENTIONALLY not returned to the requesting agent; it's surfaced only to the human owner via Settings → Agents, who hands it to the target agent out of band. Use when you've spotted leakage and the target needs a clean credential without going dark mid-task.

ParametersJSON Schema

Name	Required	Description	Default
`reason`	No	1-2 sentences on why. Surfaces on the consent card. Capped at 500 chars.
`target_agent_id`	Yes	The id of the sibling agent whose key should be rotated.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description discloses key behaviors: returns an approval_url, requires human owner consent, intentionally does not return the new key plaintext to the requesting agent but surfaces it to the human owner via Settings. This fully informs the agent of the security model.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded: it states the core action first, then adds context about similarity to another tool, followed by security implications and usage guidance. Every sentence adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description adequately explains the return format (approval_url) and the outcome (key rotated, new key out-of-band). The justification for not returning the key and the security pattern are fully covered, making the tool's behavior predictable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining that 'reason' surfaces on a consent card and is capped at 500 chars, and that 'target_agent_id' is a sibling agent. This extra context improves clarity beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool rotates another agent's active API key, specifying the action (mint new + revoke old). It distinguishes from siblings like request_revoke_agent_key and rotate_api_key by focusing on agent-level rotation with consent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear use case: 'when you've spotted leakage and the target needs a clean credential without going dark mid-task.' It also references request_revoke_agent_key as a sibling with similar shape. However, it does not explicitly list when not to use or compare to other rotation tools like rotate_api_key.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resolve_comment A

Mark a comment thread resolved. Idempotent: calling on an already-resolved thread returns the existing resolvedAt unchanged. Fires comment.resolved. Pair with unresolve_comment for the reverse. Used by agents to close a feedback thread once they've iterated on the change the reviewer asked for.

ParametersJSON Schema

Name	Required	Description	Default
`comment_id`	Yes	Comment id to resolve. Use the thread root; resolving a reply targets the reply itself, not the thread.
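
The idempotency guarantee can be pictured with a toy model in which a second resolve returns the stored timestamp instead of writing a new one. The `store` dict and this local `resolve_comment` function are hypothetical stand-ins for the server's state, not its implementation, and event firing is deliberately left out.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


def resolve_comment(
    store: Dict[str, Optional[datetime]],
    comment_id: str,
    now: Callable[[], datetime] = _utc_now,
) -> datetime:
    """Return resolvedAt for the thread, setting it only on the first call."""
    existing = store[comment_id]
    if existing is not None:
        # Already resolved: hand back the existing resolvedAt unchanged.
        return existing
    store[comment_id] = now()
    return store[comment_id]
```

A retry after a dropped response is therefore safe: it cannot move the resolution time forward.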

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully covers behavioral traits: permissions, idempotency (no-op on already resolved, returns 200 without re-firing event), reversibility, and side effect of replying auto-reopening. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences packed with critical information. Each sentence serves a purpose: action and role, idempotency and reversibility, interaction with replies. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one required parameter and no output schema, the description covers all necessary context: permissions, safe-to-call behavior (idempotent), how to reverse, and related tools. Agents can confidently invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage for the single parameter 'comment_id', noting it is a string. The description adds value by clarifying that the ID must be for a root comment, not a reply, which is beyond the schema's simple type description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description begins with 'Close a comment thread,' providing a clear action and object. It distinguishes from sibling tools like 'unresolve_comment' by naming reversibility, and from 'reply_to_comment' by noting auto-reopen behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states that anyone with 'commenter' role can resolve any thread (not author-only), that it is idempotent with specific behavior on already-resolved threads, and that it is reversible via 'unresolve_comment'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

revoke_api_key A

Revoke an API key (soft-delete via revokedAt). Subsequent requests with the key return 401. Agents may revoke ONLY their own key; calling this is effectively a self-destruct: the response itself completes, but the very next request will fail. Users may revoke any key they own. To swap creds without going dark in the gap, use rotate_api_key instead.

ParametersJSON Schema

Name	Required	Description	Default
`id`	No	API key id to revoke. Omit when called by an agent; defaults to the agent's own current key.

Tool Definition Quality

A4.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses soft-delete, 401 on subsequent requests, self-destruct for agents. Could mention irreversibility or restoration possibilities but is informative.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single paragraph, front-loaded, every sentence adds value, no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers behavior, consequences, alternative, and user types. Adequate despite no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds context: omission defaults to agent's own key, enhancing the schema description significantly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states action 'Revoke an API key' and effect 'soft-delete via revokedAt' and 'return 401'. Distinguishes from sibling tool rotate_api_key.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly explains when to use this vs rotate_api_key, and differentiates agent vs user capabilities.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rotate_api_key A

Atomically mint a new API key with the same agent / workspace / scopes / name and revoke the old one. Returns the new plaintext (key) once; store it before discarding the response. Subsequent requests with the OLD key return 401, so swap creds before retrying. Agents may rotate ONLY their own key (omit id to default to it); users may rotate any key they own. Use this for routine credential hygiene or after a suspected leak.

ParametersJSON Schema

Name	Required	Description	Default
`id`	No	API key id to rotate. Omit when called by an agent; defaults to the agent's own current key. Required for user callers to disambiguate when more than one key exists.
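
Since the new plaintext is returned only once and the old key fails immediately, the order of operations matters. A minimal sketch of an agent-side rotation, assuming a hypothetical `call_tool` transport, a hypothetical `persist` callback that writes to durable secret storage, and a result dict that exposes the plaintext under `key` as the description suggests:

```python
from typing import Any, Callable, Dict

# Hypothetical transport: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class Credentials:
    """Holds the key the agent sends on every request."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key


def rotate_and_swap(
    call_tool: ToolCaller,
    creds: Credentials,
    persist: Callable[[str], None],
) -> Credentials:
    """Rotate the agent's own key, then swap it in before any retry."""
    # Agents omit `id`; it defaults to the agent's own current key.
    result = call_tool("rotate_api_key", {})
    new_key = result["key"]  # returned once; field name taken from the description

    # Store the plaintext first: it cannot be fetched again later.
    persist(new_key)

    # Swap in memory so the next request does not hit a 401 with the old key.
    creds.api_key = new_key
    return creds
```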

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses atomicity, one-time return of plaintext, 401 on old key, and ownership rules. No contradictions with missing annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences, front-loaded with key action, no redundant words. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and one parameter, description fully covers return value, lifecycle, caller guidance, and failure mode. Complete for complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers the single parameter with 100% coverage. Description adds valuable context beyond schema: when to omit (agents) vs required (users), and default behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool atomically mints a new API key and revokes the old one, with specific details on scopes and naming. Distinct from siblings like revoke_api_key and request_rotate_agent_key.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (routine hygiene or after leak), how to call (agents omit id, users require id), and consequences (old key invalid, must store new key).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rotate_webhook_secret A

Mint a fresh signing secret for a webhook. The new secret is returned exactly once; copy it to the receiver before the next event lands. After this call, deliveries are signed with the new secret only; receivers still validating against the old one will reject (401) until updated. Use after a suspected leak or as part of routine rotation hygiene.

ParametersJSON Schema

Name	Required	Description	Default
`org_slug`	Yes	Org slug
`webhook_id`	Yes	Webhook id (from list_webhooks)
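
To see why a receiver still holding the old secret starts rejecting deliveries, consider this toy receiver. The signing scheme here (hex-encoded HMAC-SHA256 over the raw body) is purely an assumption for illustration; the listing does not state how Dock signs webhook payloads, so check the actual scheme before relying on it.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Assumed scheme: hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def receiver_status(secret: bytes, body: bytes, signature: str) -> int:
    """Return the HTTP status a receiver validating with `secret` would send."""
    expected = sign(secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return 200 if hmac.compare_digest(expected, signature) else 401


old_secret = b"old-secret"
new_secret = b"new-secret"
body = b'{"event": "comment.resolved"}'

# After rotation, deliveries are signed with the new secret only.
signature = sign(new_secret, body)
stale_receiver = receiver_status(old_secret, body, signature)
updated_receiver = receiver_status(new_secret, body, signature)
```

Under this assumed scheme the receiver rejects every delivery until its secret is updated, which is why the description advises copying the new secret across before the next event lands.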

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses key behaviors: the secret is returned only once, immediate effect on signing, and that old secrets cause 401 errors. This provides complete transparency for a critical security operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no wasted words. The first sentence immediately states the core action, followed by critical usage details. Perfectly front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description explains what is returned (new secret once) and the downstream effect (deliveries signed with new secret, old receivers reject). Together with schema coverage, it provides complete context for correct tool usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear property descriptions. The description adds only minor context (e.g., 'from list_webhooks' for webhook_id) but does not significantly enhance understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb 'Mint' and resource 'fresh signing secret for a webhook,' clearly distinguishing from sibling tools like create_webhook or update_webhook which deal with different aspects of webhook management.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'after a suspected leak or as part of routine rotation hygiene.' It doesn't explicitly list alternatives but the context makes it clear this tool is only for secret rotation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search A

Search across everything the caller can already touch: workspace names, row cell values, and doc sections/paragraphs. Returns ranked hits (score 0-1) with a navigable URL per hit so the agent can open the exact row or doc section. Access-gated; never returns hits from workspaces the caller can't open. Use when the user references something by keyword ("find my launch-plan workspace", "which row mentions Redis?"). Faster than listing workspaces and iterating.

ParametersJSON Schema

Name	Required	Description
`q`	Yes	Search query. Case-insensitive substring match.
`kind`	No	Narrow to one surface. 'all' (default) searches workspace names + row cells + doc sections. 'workspace' is fastest when the user is naming something, 'row' targets table data, 'doc-section' targets headings and paragraphs in doc-mode.
`limit`	No	Max hits to return (default 20, max 100).
`offset`	No	Hits to skip for pagination (default 0).
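
The `limit` and `offset` parameters support straightforward pagination. A minimal sketch, assuming a hypothetical `call_tool` transport and a result dict that lists hits under a `hits` key (the response shape is not documented in this listing):

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def search_all(
    call_tool: ToolCaller,
    q: str,
    kind: str = "all",
    page_size: int = 20,
    max_hits: int = 200,
) -> List[Any]:
    """Collect hits page by page until a short page or max_hits is reached."""
    page_size = min(page_size, 100)  # the tool caps limit at 100
    hits: List[Any] = []
    offset = 0
    while len(hits) < max_hits:
        result = call_tool(
            "search",
            {"q": q, "kind": kind, "limit": page_size, "offset": offset},
        )
        page = result["hits"]  # assumed response key
        hits.extend(page)
        if len(page) < page_size:
            break  # a short page means there is nothing further to fetch
        offset += page_size
    return hits[:max_hits]
```

Passing `kind="workspace"` when the user is naming a workspace keeps the search narrow, in line with the parameter notes above.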

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses key behaviors: returns ranked hits (score 0-1), navigable URL per hit, access-gated (never returns from inaccessible workspaces). This adequately informs the agent of read-only nature and response structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the tool's purpose. It is efficient but could be slightly trimmed without losing meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the return format (ranked hits, score, URL). Pagination via limit/offset is covered by the parameter schema, which together with the description provides sufficient context for a search tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters have schema descriptions (100% coverage). The description adds value by elaborating on the 'kind' parameter (e.g., 'workspace is fastest when the user is naming something') and clarifying the search behavior of 'q' (case-insensitive substring match, already in schema).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs and resources: 'Search across...workspace names, row cell values, and doc sections/paragraphs.' It clearly distinguishes from sibling tools like list_workspaces or get_row by emphasizing multi-surface search and ranked hits.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states when to use: 'when the user references something by keyword' and provides a comparison: 'Faster than listing workspaces and iterating.' It implies not to use for exact lookups but doesn't explicitly list alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_message (A)

Send a direct message to another agent or human in the messaging substrate. Wires through cue.dock.svc, the same path the /live UI uses, so the recipient sees this message in their drawer (and, once they have a Dock-connected agent worker running, their agent harness's inbox). Address format is <agent_slug>@<user_slug>: flint@socrates targets the flint agent owned by user socrates; self@<user_slug> targets a human's synthetic self-agent (use this to message a human directly when you don't know which of their agents to ping). Use this when an agent legitimately needs to ask a teammate (human or agent) for help, hand off work, or follow up async; don't use it as a chat-ops side-channel for things that belong in workspace events. Sender identity follows the caller: agent callers send AS themselves, user callers send AS their self-agent (self@<their_slug>). Body cap is 32,000 chars. Returns { messageId, threadId, to } on success. The recipient is resolved against the substrate's identity space, NOT against your accessible workspace set: this is messaging, not workspace write access. Pre-cue.dock.svc-deploy environments return cue_not_configured (callers should treat this as 'messaging not deployed yet').

ParametersJSON Schema

Name	Required	Description
`to`	Yes	Recipient address in the form `<agent_slug>@<user_slug>`. Examples: `flint@socrates` (agent), `self@govind` (human's self-agent — use to DM a person directly).
`body`	Yes	Message text. Plain string, 1-32000 chars. `@<slug>` mentions inside the body CC the named agent on the message.
`replyTo`	No	Optional cue message id to thread under. When set, the recipient's drawer renders this as a reply with an inline parent-preview. Get the id from a prior `send_message` response or from the recipient's inbox listing.
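
The address format and body cap described above can be checked on the client before calling the tool. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and because the source does not define which characters a slug may contain, the check only requires two non-empty parts around a single `@`.

```python
from typing import Optional, Tuple

MAX_BODY_CHARS = 32_000  # documented body cap


def parse_address(address: str) -> Tuple[str, str]:
    """Split `<agent_slug>@<user_slug>` into its two parts."""
    agent_slug, sep, user_slug = address.partition("@")
    # Assumption: slugs are non-empty and contain no further "@".
    if not sep or not agent_slug or not user_slug or "@" in user_slug:
        raise ValueError(f"expected <agent_slug>@<user_slug>, got {address!r}")
    return agent_slug, user_slug


def self_address(user_slug: str) -> str:
    """Address a human directly via their synthetic self-agent."""
    return f"self@{user_slug}"


def send_message_args(to: str, body: str, reply_to: Optional[str] = None) -> dict:
    """Build an arguments dict for `send_message`."""
    parse_address(to)
    if not 1 <= len(body) <= MAX_BODY_CHARS:
        raise ValueError("body must be 1-32000 chars")
    args = {"to": to, "body": body}
    if reply_to is not None:
        args["replyTo"] = reply_to  # threads the message under a prior one
    return args
```

For example, `send_message_args(self_address("govind"), "Ready for review")` targets a person rather than one of their agents.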

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully covers behavioral traits: wiring path, recipient visibility, address format, sender identity, character limit, return format, identity resolution, and error conditions. This is comprehensive and leaves no ambiguity about how the tool behaves.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is information-dense and front-loaded, but slightly long. Every sentence adds value, though it could be streamlined slightly without loss of clarity. Still, it is well-structured and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but the description explains the return format ({ messageId, threadId, to }). It covers all aspects: purpose, usage, behavior, parameters, errors, and return values. For a moderately complex tool with 3 parameters, the description is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds significant meaning beyond the schema: examples of address formats, explanation of body mentions, and details on replyTo threading behavior. It enriches the parameter understanding well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: sending direct messages to agents or humans in the messaging substrate. It specifies the verb (send), resource (direct message), and target audience, and distinguishes it from sibling tools by noting that it is not for workspace events or chat-ops side-channels.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (e.g., asking a teammate for help, handoff, async follow-up) and when not to use it (not for workspace events). It also provides guidance on addressing format and error handling in pre-deploy environments, making the usage context clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

share_workspace (A)

Invite a human (by email) to a workspace at a specified role. If the email already belongs to a Dock user they're added immediately and a notification email is sent; if not, a 7-day invite token is minted that auto-accepts on magic-link sign-in. Editor role required on the workspace. Emits member.joined (existing user) or member.invited (new user). Use update_workspace_member to change a role afterwards, remove_workspace_member to revoke.

ParametersJSON Schema

Name	Required	Description
`role`	No	Role to grant. Defaults to `editor`. Owner-tier transitions require an owner caller.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`email`	Yes	Email address of the human to invite.

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility for behavioral disclosure. It discloses conditional behavior (immediate add vs. invite token), required role (editor), and emitted events (member.joined/member.invited). While it doesn't cover idempotency or error handling, it provides significant insight beyond the schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences: the first states the core purpose, the second explains conditional behavior, the third states the role requirement, the fourth lists emitted events, and the fifth names the follow-up tools. It is front-loaded, concise, and every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately covers the tool's behavior, prerequisites, and side effects. However, it does not describe the return value or common error conditions (e.g., duplicate invite). For a mutation tool, this is a minor gap; overall, it is fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the baseline is 3. The description adds value by explaining the email parameter's behavior based on user existence and the role parameter's default and owner-tier constraint. This extra context enhances understanding beyond the schema's type/enum descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: 'Invite a human (by email) to a workspace at a specified role.' It differentiates from sibling tools like update_workspace_member and remove_workspace_member by explicitly naming them for subsequent actions. The verb 'invite' and resource 'human to workspace' are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: 'Editor role required on the workspace' and directs when to use alternatives ('Use update_workspace_member to change a role afterwards, remove_workspace_member to revoke'). It also distinguishes between inviting existing vs. new users, helping the agent choose the correct context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

unresolve_comment (A)

Re-open a previously-resolved comment thread. Idempotent on already-unresolved comments. Fires comment.unresolved with reason: 'manual'. (Auto-unresolve on reply fires the same event with reason: 'reply' and is handled by add_comment / reply_to_comment.)

ParametersJSON Schema

Name	Required	Description
`comment_id`	Yes	Comment id to re-open.

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses idempotency (no-op if already open) and event emission details (reason: 'manual' vs 'reply'), providing rich behavioral context beyond the bare action. Since no annotations exist, this fully covers transparency needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences plus a parenthetical note, each adding essential information. Front-loaded action statement followed by behavioral notes. No superfluous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers all relevant behavioral traits: purpose, idempotency, event side-effect. Complete for agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter 'comment_id' described as 'Comment id to re-open.' The description adds no new parameter-level detail beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb and resource: 'Re-open a previously-resolved comment thread.' It clearly distinguishes from sibling tools like resolve_comment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions idempotency but does not explicitly state when to use this tool versus alternatives like resolve_comment or reply_to_comment. Usage context is implied but not fully guided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_doc (A)

Replace a workspace's doc body. Takes EITHER TipTap JSON (content) OR Markdown (markdown): pass markdown when you're producing prose from scratch (CommonMark + GFM is the format every LLM emits natively), pass TipTap JSON when you need structural edits to an existing doc (round-trip from get_doc, mutate, write back). Beyond CommonMark + GFM, the markdown layer recognizes:

`![alt](url)` → inline image. Use ANY publicly-reachable URL (HTTPS preferred, since HTTP fires browser mixed-content warnings; data: URIs are rejected by allowBase64: false). Renders block-feeling via CSS (max-width 100%, rounded corners, drop shadow) even though the underlying node is inline. The alt text is the accessible label and shows in place of the image if the URL fails to load, so always include it. To attach a user-uploaded file, hit POST /api/workspaces/:slug/upload-image from the human-side UI first to get a Vercel Blob URL, then reference that URL in the doc markdown.
A lone video-file URL on its own line (extension .mp4 / .m4v / .webm / .mov / .mkv, signed-params + timestamp fragments tolerated) → native HTML5 <video controls preload="metadata"> player. Source URL is referenced directly: no iframe, no transcoding, no quality loss. Vercel Blob is the canonical hosting (5 GB per file, served with HTTP range requests so 4K masters stream cleanly), but ANY publicly-reachable HTTPS URL works. Sample shape: a paragraph containing only https://cdn.dock.ai/2025-launch-walkthrough.mp4. Mid-paragraph URLs stay as plain links — surrounding prose disqualifies the auto-promotion (matches the oEmbed convention).
A `mermaid` fenced code block → diagram (15 sub-types: flowchart, sequence, gantt, ER, state, class, mindmap, timeline, pie, quadrant, sankey, XY-chart, packet, block, journey)
$x$ inline math, $$x$$ block math (LaTeX, KaTeX-rendered, scripts/href disabled)
> [!NOTE] / [!TIP] / [!IMPORTANT] / [!WARNING] / [!CAUTION] GFM-style callouts
An `svg` fenced code block → sanitized SVG embed (the universal escape hatch for custom diagrams; scripts and event handlers stripped at write time)
`<details><summary>X</summary>BODY</details>` → collapsible toggle
[[slug]] / [[org/slug]] / [[slug#tab]] / [[slug#row-id]] / [[slug|display]] → cross-references to another workspace, surface, or row. Resolved against your accessible workspace set; targets you can't see render as plain text on the reader's side (no info leak). Every cross-ref creates a Backlink row so the target's 'referenced from' sidebar shows this doc.
@Label → @-mention of a user or agent. <kind> is agent or human; <id> is the principal id. Optional query params ?org=<slug> (agents) or ?email=<addr> (humans) for renderer hints. Mentioning a human writes a doc_mention row to their inbox + sends a deep-link email; mentioning an agent fires the doc.mention_added webhook so the agent service can wake up and reply. Re-saving a doc that already mentions someone does NOT re-fire — only newly-added mentions notify (computed from a diff against the previous body). Use this from agent code to ping a teammate when a doc you wrote needs their eyes.
A lone URL on its own line from a safelisted provider (YouTube, Vimeo, Loom, Figma, CodePen, GitHub gists) → sandboxed iframe embed. Other URLs stay as regular links. Surrounding prose disqualifies the auto-embed.

Per-format caps: max 50 Mermaid diagrams (30 KB source each), max 500 math expressions (8 KB source each), max 50 SVG blocks (100 KB source each post-sanitize), max 200 cross-refs per doc, max 500 @-mentions per doc, max 20 embeds per doc, max 20 videos per doc (5 GB per file at upload time), max 200 images per doc. See /docs/doc-formats for examples. Last-write-wins; no CRDT merge. Emits doc.updated + doc.heading_added + doc.mention_added events as applicable. Requires editor role. Multi-surface workspaces optionally accept surface_slug to write to a specific doc tab; omitted writes the primary doc surface. Append-only updates have a dedicated append_doc_section tool that doesn't require fetching the body first.
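
The lone-URL video rule described above can be approximated on the client to predict which lines would be promoted to a player. This is a rough sketch only: the function name is hypothetical, the case-insensitive extension match is an assumption, and the server's actual matcher may differ.

```python
from urllib.parse import urlparse

# Extensions listed in the tool description.
VIDEO_EXTENSIONS = (".mp4", ".m4v", ".webm", ".mov", ".mkv")


def is_lone_video_url(line: str) -> bool:
    """Return True when a line holds nothing but a video-file URL."""
    text = line.strip()
    # Surrounding prose (any inner whitespace) disqualifies the line.
    if not text or any(ch.isspace() for ch in text):
        return False
    parsed = urlparse(text)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Query strings and fragments are ignored by looking at the path only.
    return parsed.path.lower().endswith(VIDEO_EXTENSIONS)
```

Checking only the URL path is what lets signed query parameters and timestamp fragments pass, as the description says they should.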

ParametersJSON Schema

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`content`	No	TipTap document JSON: `{ type: 'doc', content: [ ... ] }`. Use this when round-tripping from get_doc to preserve formatting. Mutually exclusive with `markdown` (content wins if both are passed).
`markdown`	No	Markdown body (CommonMark + GFM). Converted server-side to TipTap JSON via the same converter that powers PUT /api/workspaces/:slug/doc. Use this when authoring prose from scratch; no need to hand-build ProseMirror nodes.
`surface_slug`	No	Optional doc surface slug for multi-doc workspaces. Omit to write the primary doc surface. Use list_surfaces to see available slugs.
`if_unmodified_since`	No	Optional precondition. ISO 8601 timestamp (typically the `updatedAt` you read via `get_doc`). When set and the doc has changed since this cutoff, the write is rejected with `code: -32602`, message describing the conflict, and `data: { conflict: true, currentUpdatedAt, precondition }` so your agent can refetch + merge instead of silently clobbering a concurrent write. Without this, two agents PUTting near-simultaneously will both succeed and the last write wins (the previous content vanishes). Use this in multi-agent co-authoring flows; skip it for greenfield writes where you know you're the only writer.
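
The `if_unmodified_since` precondition lends itself to a refetch-and-retry loop. The sketch below is illustrative, not the server's client: `call_tool`, the `ToolError` class, and the assumption that `get_doc` returns `markdown` and `updatedAt` keys are all hypothetical; only the error code `-32602` and the `data.conflict` flag come from the parameter description above.

```python
from typing import Callable, Optional


class ToolError(Exception):
    """Hypothetical client-side error carrying the JSON-RPC error fields."""

    def __init__(self, code: int, message: str, data: Optional[dict] = None):
        super().__init__(message)
        self.code = code
        self.data = data or {}


def is_conflict(err: ToolError) -> bool:
    """Recognize the documented precondition failure."""
    return err.code == -32602 and err.data.get("conflict") is True


def update_with_retry(
    call_tool: Callable[[str, dict], dict],  # hypothetical client callable
    slug: str,
    edit: Callable[[str], str],
    max_attempts: int = 3,
) -> dict:
    """Read, edit, and write back, refetching whenever a conflict is reported."""
    for _ in range(max_attempts):
        doc = call_tool("get_doc", {"slug": slug})  # assumed response shape
        try:
            return call_tool("update_doc", {
                "slug": slug,
                "markdown": edit(doc["markdown"]),
                "if_unmodified_since": doc["updatedAt"],
            })
        except ToolError as err:
            if not is_conflict(err):
                raise  # unrelated failure: surface it unchanged
    raise RuntimeError(f"gave up after {max_attempts} conflicting writes")
```

Re-running `edit` on the freshly fetched body is the simplest merge strategy; a real co-authoring flow might need something more careful.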

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden and thoroughly discloses behavior: last-write-wins semantics, emitted events (doc.updated, doc.heading_added, doc.mention_added), permission requirements, detailed input format constraints and limits, cross-reference backlink creation, and auto-embed rules. This is comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly long and includes many detailed format specifications (e.g., Mermaid sub-types, math constraints) that could be referenced externally. While well-structured, it could be more concise by pointing to documentation for edge cases without losing essential guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema and annotations, the description covers all essential aspects: core function, parameter differentiation, behavioral traits, constraints, related events, permission requirements, and references to additional resources. It is sufficient for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds value by clarifying the purpose of each parameter: 'content' for round-tripping from get_doc, 'markdown' for prose from scratch, slug format variants, and optional surface_slug. It also explains mutual exclusivity and conversion behavior, going beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Replace a workspace's doc body.' and distinguishes between content formats (TipTap JSON vs Markdown) and explicitly calls out the sibling tool `append_doc_section` for append-only updates, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use guidance: 'pass markdown when you're producing prose from scratch... pass TipTap JSON when you need structural edits to an existing doc'. It also mentions the alternative `append_doc_section` tool and specifies the required 'editor role', setting clear prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_doc_section (A)

Replace a single section of a workspace's doc body, identified by its heading text. The targeted edit complement to update_doc (full replacement) and append_doc_section (append-only at the end). Use this when the agent maintains a recurring section (e.g., a 'Status' block in a launch-prep doc, an 'Outcomes' block in a meeting note) and only needs to refresh that one piece. Without it, agents are forced into 'GET → splice → PUT' which costs tokens, costs latency, and races against any concurrent human edit elsewhere in the doc (last-write-wins clobbers). Section semantics: the FIRST heading whose plain text matches heading exactly (case-sensitive on trimmed text) is found, and everything from that heading up to the next heading at the same OR shallower level is replaced. So a ## Outcomes section ends at the next ## … or # …; nested ### … subsections stay part of the replaced range. Returns 404 when no matching heading exists; strict by design so a misremembered heading fails loudly. markdown is the FULL replacement, INCLUDING the heading line: pass it back as-is to keep the heading, change it to rename or rewrite the heading, change the heading level, or omit the heading entirely (collapses the section into the prior one). Empty markdown deletes the section. Same markdown surface as update_doc / append_doc_section (CommonMark + GFM + ![alt](url) images + lone-URL videos (mp4/webm/mov/mkv/m4v) + Mermaid + KaTeX + callouts + SVG + details + cross-refs + @-mentions + URL embeds). Identity / attribution / events / doc-guard all flow through the same writeDocBody path as the other doc endpoints, so @-mentions in the new section fire doc.mention_added for newly-added mentions just like update_doc does. Requires editor role. Multi-surface workspaces optionally accept surface_slug to target a specific doc tab.

ParametersJSON Schema

Name	Required	Description
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace').
`heading`	Yes	Plain text of the heading to find (case-sensitive, trimmed). For `## Outcomes`, pass `Outcomes`. Hash marks and surrounding whitespace are stripped from the comparison automatically by the markdown converter. Use `get_doc` first if you need to enumerate the headings actually present.
`markdown`	Yes	FULL replacement markdown for the section, including the heading line if you want to keep / rename / restructure it. Empty string deletes the section.
`surface_slug`	No	Optional doc surface slug for multi-doc workspaces. Omit to target the primary doc surface.
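
The section semantics described above (first exact heading match, replaced up to the next heading at the same or shallower level) can be sketched over plain markdown lines. This is a simplified illustration with hypothetical function names: it handles only ATX (`#`) headings, does not skip heading-like lines inside fenced code, and is not the server's `writeDocBody` implementation.

```python
import re
from typing import List, Optional, Tuple

# ATX heading: 1-6 hashes, whitespace, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*?)[ \t]*$")


def section_range(lines: List[str], heading: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) line indexes of the first matching section."""
    start = None
    level = 0
    for i, line in enumerate(lines):
        m = HEADING_RE.match(line)
        if not m:
            continue
        this_level, text = len(m.group(1)), m.group(2)
        if start is None:
            if text == heading.strip():  # case-sensitive on trimmed text
                start, level = i, this_level
        elif this_level <= level:  # same or shallower heading ends the section
            return start, i
    return (start, len(lines)) if start is not None else None


def replace_section(markdown: str, heading: str, replacement: str) -> str:
    """Replace one section; an empty replacement deletes it."""
    lines = markdown.split("\n")
    found = section_range(lines, heading)
    if found is None:
        # The tool itself responds with a 404 in this case.
        raise LookupError(f"no heading matches {heading!r}")
    start, end = found
    new_lines = replacement.split("\n") if replacement else []
    return "\n".join(lines[:start] + new_lines + lines[end:])
```

Note that a nested `###` subsection under a `##` heading falls inside the replaced range, matching the behavior described for the tool.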

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses section semantics (heading matching, end of section detection), strict 404 on no match, markdown parameter behavior (including deletion), markdown surface compatibility, identity/attribution/events/doc-guard flow, required editor role, and optional surface_slug. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured and every sentence adds value, but it is somewhat lengthy. Front-loaded with purpose and use case, then parameter details. Excellent but not maximally concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, so description must explain behavior. It covers error cases (404), permissions, markdown compatibility, and relationship to other endpoints. Fully addresses the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. Description adds significant value beyond schema: for `heading` explains hash marks are stripped and advises using `get_doc` first; for `markdown` explains inclusion of heading line and deletion on empty; for `slug` explains accepted forms; for `surface_slug` explains optionality.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States verb 'Replace' and resource 'a single section of a workspace's doc body, identified by its heading text.' Explicitly distinguishes from siblings `update_doc` (full replacement) and `append_doc_section` (append-only).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use: 'Use this when the agent maintains a recurring section... and only needs to refresh that one piece.' Mentions alternatives and explains why they are inferior (token cost, latency, race conditions). Advises using `get_doc` if heading is uncertain.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_html (A)

Write an HTML surface's body. Pass any of html / css / js; omitted fields stay unchanged. Pass empty string to clear. The surface renders in a sandboxed iframe on a separate origin (render.trydock.ai) with no access to Dock cookies, storage, or parent DOM — you have free rein inside that boundary. Use any web technology the browser supports: external CDN fonts and CSS (Google Fonts, Tailwind CDN, Fontsource), JS libraries (three.js, GSAP, Chart.js, anime.js), inline <script>, Web Workers, WebGL, video, audio, canvas, dynamic DOM, complex CSS animations. Per-field caps: html 50 KB, css 100 KB, js 100 KB, total 250 KB. The sanitizer strips a small set of style smells: inline on*= event-handler attributes, javascript: and data:text/html URIs, <meta http-equiv> tags; use addEventListener and <script> instead. Requires editor role.

ParametersJSON Schema

Name	Required	Description
`js`	No	JS source. Stored alongside html/css; at v2 you'll typically inline `<script>` in the html field instead (same execution context, fewer round-trips). 100 KB cap.
`css`	No	CSS source. Applied inside the sandbox iframe. At v2 you can also `<link rel="stylesheet">` external stylesheets from any HTTPS CDN — useful for Tailwind CDN, Google Fonts, icon kits.
`html`	No	HTML body. Sanitized server-side (smells stripped, but `<script>` and `<link>` allowed at v2 — load any CDN, write inline scripts, dynamic DOM). Use `validate_html` first for a pre-flight check.
`slug`	Yes	The workspace slug.
`surface_slug`	No	Optional html surface slug. Omit to write the primary html surface.
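
The partial-update rule and per-field caps above can be enforced before the call. This is a minimal sketch with a hypothetical helper name; it assumes sizes are measured in UTF-8 bytes and that 1 KB means 1024 bytes, neither of which the source specifies. Since the per-field caps sum to the 250 KB total, no separate total check is made here.

```python
from typing import Optional

KB = 1024  # assumption: the source does not say 1000 vs 1024
FIELD_CAPS = {"html": 50 * KB, "css": 100 * KB, "js": 100 * KB}


def update_html_args(
    slug: str,
    html: Optional[str] = None,
    css: Optional[str] = None,
    js: Optional[str] = None,
    surface_slug: Optional[str] = None,
) -> dict:
    """Build an arguments dict for `update_html`, checking per-field caps."""
    args = {"slug": slug}
    for name, value in (("html", html), ("css", css), ("js", js)):
        if value is None:
            continue  # omitted fields stay unchanged on the server
        size = len(value.encode("utf-8"))  # assumption: caps count bytes
        if size > FIELD_CAPS[name]:
            raise ValueError(f"{name} is {size} bytes, over the cap")
        args[name] = value  # an empty string clears the field
    if surface_slug is not None:
        args["surface_slug"] = surface_slug
    return args
```

Distinguishing `None` (leave unchanged) from `""` (clear) mirrors the description's omitted-versus-empty behavior.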

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: sanitization that strips inline on*= event-handler attributes, javascript: and data:text/html URIs, and <meta http-equiv> tags (while <script> remains allowed), per-field size caps (html 50 KB, css 100 KB, js 100 KB, total 250 KB), rendering in a sandboxed iframe on a separate origin, and the requirement for editor role. The parameter schema adds that validate_html can be used for a pre-flight check.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, then details constraints and behavior. Every sentence adds value, though it is slightly longer than necessary. Still efficient for the amount of information conveyed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, parameters, behaviors, and constraints comprehensively. It lacks explicit mention of the return value, but given the tool updates a surface (typically returns success or updated object), this is a minor gap. Overall sufficient for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds contextual semantics beyond parameter descriptions: partial updates (omitted fields unchanged), clearing via empty string, and hint to use validate_html. This significantly enhances the agent's understanding of parameter usage, going beyond the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool writes an HTML surface's body and lists the fields (html, css, js). It distinguishes from siblings like get_html (read) and validate_html (pre-flight) by focusing on writing/updating body content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (to update body content) and mentions partial updates and clearing, but does not explicitly contrast with update_surface (which may handle other properties) or provide when-not-to-use guidance. Sibling tools are not directly compared.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_row (A)

Update specific fields of an existing row. Only the fields provided in data are updated; others are preserved. Setting surface_slug to a different sheet than the row currently lives on MOVES the row to that sheet (position recomputes to the new sheet's tail unless position is also set). Same surface as current → no-op move.

Unmapped data fields: Keys in data that don't match any existing column on the row's surface are still STORED on the row, but they won't render in the table UI until the column exists. The response carries an unmapped_fields array plus a human-readable warning. Pass auto_create_columns: true to have the server append a fresh text column for every unmapped key in one atomic step; the response then also includes created_columns: ColumnDef[]. Default false: store-but-don't-render is the safe choice for explicit schema management.

Parameters

Name	Required	Description
`data`	Yes	Partial row data with fields to update (e.g. {"status": "sealed"}). Pass an empty object {} when the call is purely a move (surface_slug change with no field updates).
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`rowId`	Yes	The row ID to update
`position`	No	Optional. Override the row's position. When moving across surfaces, omit to land at the new surface's tail; pass a number to land at a specific slot.
`surface_slug`	No	Optional. When set to a different surface than the row currently lives on, moves the row to that surface and emits a `row.moved_surface` event. Same-surface is a no-op. 400 if the slug is a doc surface, archived, or not in this workspace.
`auto_create_columns`	No	When true, the server auto-creates a text column for every key in `data` that doesn't already exist on the surface, then applies the update in the same call. Returns `created_columns` in the response. Default false: unmapped keys are still merged into row.data but won't render in the UI until you `add_column` them yourself.
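
To make the three call shapes described above concrete (partial update, pure move, and update with auto-created columns), here is a small sketch that builds the arguments object for each. The helper `update_row_args` and the values `row_123`, `archive`, and `priority` are hypothetical; only the parameter names come from the table above.

```python
# Illustrative argument payloads for update_row; all values are made up.

def update_row_args(slug, row_id, data=None, surface_slug=None,
                    position=None, auto_create_columns=False):
    """Build the arguments object, omitting optional keys that are unset."""
    args = {"slug": slug, "rowId": row_id, "data": data or {}}
    if surface_slug is not None:
        args["surface_slug"] = surface_slug
    if position is not None:
        args["position"] = position
    if auto_create_columns:
        args["auto_create_columns"] = True
    return args


# 1. Partial update: only "status" changes, other fields are preserved.
partial = update_row_args("my-workspace", "row_123", {"status": "sealed"})

# 2. Pure move: empty data, new surface, lands at the tail (no position).
move = update_row_args("my-workspace", "row_123", surface_slug="archive")

# 3. Update with an unmapped key, asking the server to create the column.
auto = update_row_args("my-workspace", "row_123", {"priority": "high"},
                       auto_create_columns=True)
```

Leaving `position` out of the move payload matches the documented default of landing at the new surface's tail.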

Tool Definition Quality

A4.1/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden. It explains partial update semantics and move behavior, but omits details like the triggering of a row.moved_surface event (present only in schema descriptions) and does not discuss permissions or error states beyond what's in schemas.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no superfluous words. Front-loaded with the primary purpose, then succinctly covers the nuanced move behavior. Every sentence adds necessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's behavior well but does not mention the return value (likely the updated row) since no output schema exists. Error conditions are only in parameter descriptions, not the main description. Could be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the interaction between surface_slug and position ('position recomputes to the new sheet's tail unless position is also set'), which is not in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Update specific fields of an existing row' with a specific verb and resource. It distinguishes the tool from siblings like create_row and delete_row by emphasizing partial updates and the move behavior via surface_slug, although that move behavior overlaps with move_rows.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for use: partial updates and moving rows via surface_slug. However, it does not explicitly exclude alternatives like move_rows or explain when to prefer this tool over move_rows for moves, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_surface (A)

Rename, reslug, reorder, OR replace the column schema of a surface. Pass any subset of name, new_surface_slug, position, columns. Position is 0-based and is normalised across siblings so positions stay contiguous. Editor role required. Emits surface.updated.

Column schema (columns): table surfaces only. Pass a full ColumnDef[] to REPLACE the existing schema atomically (no per-column add/remove churn, no row data loss: existing row.data keys that are no longer mapped are preserved on disk and surface in future writes' unmapped_fields). Each ColumnDef = { key, label, type, position, width?, hidden?, description?, options? }. Type ∈ text | longtext | url | status | owner | date | number; options is required on status/owner. The call is rejected with a 400 table-only error if the surface is a doc or html kind. Use get_workspace_schema first to fetch the current shape, mutate it, and send it back.

Parameters

Name	Required	Description
`name`	No	New display name. 1-64 chars.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`columns`	No	Optional. Full replacement ColumnDef[] for the surface's table schema. Table surfaces only; doc/html surfaces return 400 with a table-only error. Each item: `{ key, label, type, position, width?, hidden?, description?, options? }`. type ∈ text | longtext | url | status | owner | date | number. Existing row.data keys not present in the new schema are preserved on disk but stop rendering in the UI (they'll surface as `unmapped_fields` on the next row write). Use get_workspace_schema → mutate → send the full array back; this is a REPLACE, not a merge.
`position`	No	0-based index in the tab strip. Other surfaces shift to keep positions contiguous.
`surface_slug`	Yes	The current slug of the surface to update.
`new_surface_slug`	No	New slug for the surface (lowercase kebab-case, 3-64 chars). Must be unique within the workspace.
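
Because `columns` is a full replacement, a local sanity check on the proposed array can catch obvious mistakes before the call. The sketch below applies only the rules stated above (required keys, allowed types, options on status/owner). The helper name `check_columns` is hypothetical, the hard-coded `current` list stands in for a `get_workspace_schema` result, and the shape of `options` (a list of strings) is an assumption.

```python
# Local sanity check for a ColumnDef list before sending it as `columns`.
# Rules come from the description above; the helper name is hypothetical.
ALLOWED_TYPES = {"text", "longtext", "url", "status", "owner", "date", "number"}
REQUIRED_KEYS = ("key", "label", "type", "position")


def check_columns(columns):
    """Return a list of problems found in the proposed schema."""
    problems = []
    for i, col in enumerate(columns):
        for k in REQUIRED_KEYS:
            if k not in col:
                problems.append(f"column {i}: missing '{k}'")
        ctype = col.get("type")
        if ctype is not None and ctype not in ALLOWED_TYPES:
            problems.append(f"column {i}: unknown type '{ctype}'")
        if ctype in ("status", "owner") and not col.get("options"):
            problems.append(f"column {i}: '{ctype}' requires options")
    return problems


# Read-mutate-replace: start from the current schema (hard-coded here in
# place of a get_workspace_schema result), append one column, send it all.
current = [{"key": "title", "label": "Title", "type": "text", "position": 0}]
proposed = current + [{
    "key": "stage", "label": "Stage", "type": "status", "position": 1,
    "options": ["todo", "done"],  # option shape is an assumption
}]
```

Sending `proposed` rather than only the new column is the point of the pattern: any column left out of the array is dropped from the schema.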

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description covers mutability, position normalization, event emission, and required role. It does not disclose rate limits or idempotency, but provides sufficient behavioral context for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no wasted words. All information is front-loaded and dense, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 6-parameter update tool with no output schema, the description covers purpose, parameters, behavior, and prerequisites. It lacks mention of return values, but is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The description adds context about subsets and position normalization, enhancing understanding beyond schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states that the tool renames, reslugs, reorders, or replaces the column schema of a surface, with specific fields such as name, new_surface_slug, position, and columns. It clearly distinguishes itself from siblings like create_surface by focusing on updates.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It specifies when to use the tool (updating existing surfaces), mentions optional subset of fields, explains position normalization, and notes required editor role. However, it does not explicitly state when not to use or suggest alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_webhook (A)

Toggle a webhook's active flag on or off. Inactive webhooks are skipped at delivery time (no retry queue, no log row) but the endpoint config is preserved so flipping back is one call. Use to silence a noisy receiver during maintenance without losing its URL + secret + event subscription.

Parameters

Name	Required	Description
`active`	Yes	true to enable delivery, false to silence.
`org_slug`	Yes	Org slug
`webhook_id`	Yes	Webhook id (from list_webhooks)
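
The maintenance use case described above (silence, do the work, flip back) can be sketched as a context manager. This is an illustration only: `call_tool(name, args)` is a caller-supplied stand-in for whatever MCP client is in use, and the sketch assumes the webhook was active beforehand, since it always re-enables on exit.

```python
from contextlib import contextmanager


@contextmanager
def webhook_silenced(call_tool, org_slug, webhook_id):
    """Silence a webhook for the duration of a block, then re-enable it."""
    args = {"org_slug": org_slug, "webhook_id": webhook_id}
    call_tool("update_webhook", {**args, "active": False})
    try:
        yield
    finally:
        # Runs even if the maintenance work raises.
        call_tool("update_webhook", {**args, "active": True})
```

Events that fire while the webhook is inactive are skipped rather than queued, so nothing is replayed once it is re-enabled.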

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses that inactive webhooks are skipped, no retry queue, no log row, and endpoint config preserved. This provides behavioral context beyond the parameter schema, though it doesn't mention any potential side effects beyond what's intended.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each adding value: first states the action, second explains the effect when inactive, third gives a concrete use case. No wasted words, front-loaded with primary purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple toggle tool with 100% schema coverage and no output schema, the description is complete. It covers purpose, effect of both states, and a recommended use case, making it fully actionable for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The description reinforces the meaning of active flag (enable/silence) but doesn't add new information beyond what schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: 'Toggle a webhook's `active` flag on or off.' This is a specific verb+resource pair that distinguishes it from siblings like create_webhook, delete_webhook, and rotate_webhook_secret.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Use to silence a noisy receiver during maintenance without losing its URL + secret + event subscription.' It explains effect of toggling off (no retry, no log row) and that config is preserved, making it easy to flip back. It does not explicitly mention when not to use, but the use case is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_workspace (A)

Rename a workspace, change its slug, switch its default-view mode, or flip its visibility (private | org | unlisted | public). Pass any subset of name, new_slug, mode, visibility; fields you omit are left unchanged. Slug renames preserve old URLs via WorkspaceSlugAlias so previously-shared links keep resolving. Visibility flips disconnect every live SSE subscriber so reconnects re-authenticate against the new visibility. Editor role required. Emits workspace.renamed and/or workspace.visibility_changed. Visibility WIDENING (private → org/unlisted/public, org → unlisted/public, unlisted → public) is consent-gated: pass consent_mode: "web" to return an approval_url the user clicks; otherwise the call returns consent_required and you must re-issue with consent_mode set. Visibility narrowing + non-visibility updates execute immediately on the agent's role.

Parameters

Name	Required	Description
`mode`	No	New default-view preference for the workspace's first tab. Optional. Doesn't add or remove surfaces; use `create_surface` / `delete_surface` to change the actual tab set.
`name`	No	New display name. Optional.
`slug`	Yes	The current workspace slug
`new_slug`	No	New URL slug (lowercase kebab-case, 3-64 chars). Optional. Must be unique within the org. Old slug stays redirectable via the alias table.
`visibility`	No	New visibility. Optional. `private` = explicit members only; `org` = every org member gets virtual editor; `unlisted` = anyone with the URL can view; `public` = listed and viewable to all. Widening transitions are consent-gated; see `consent_mode`.
`consent_mode`	No	Required when `visibility` widens audience. Pass 'web' to surface a click-to-approve URL the user opens in their browser; first call returns { status: 'approval_required', approval_url, polling_url }, you print approval_url in chat and poll polling_url for the result.
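
The widening rule above amounts to an ordering over the four visibility values. As a minimal sketch, a client could decide up front whether `consent_mode` is needed. The helper names are hypothetical, and the caller is assumed to already know the workspace's current visibility.

```python
# Visibility levels ordered from narrowest to widest audience.
VISIBILITY_ORDER = ["private", "org", "unlisted", "public"]


def is_widening(current, new):
    """True when `new` exposes the workspace to a wider audience."""
    return VISIBILITY_ORDER.index(new) > VISIBILITY_ORDER.index(current)


def update_workspace_args(slug, current_visibility, new_visibility):
    """Build arguments, adding consent_mode only for widening changes."""
    args = {"slug": slug, "visibility": new_visibility}
    if is_widening(current_visibility, new_visibility):
        args["consent_mode"] = "web"
    return args
```

For example, `update_workspace_args("my-workspace", "org", "public")` includes `consent_mode`, while a change from `public` to `private` does not, matching the rule that narrowing executes immediately.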

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: slug renames preserve URLs via aliases, visibility changes disconnect SSE subscribers, editor role required, consent-gating for widening, and emitted events. This is comprehensive and leaves no hidden side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long but every sentence adds crucial information. It is front-loaded with the main actions and then elaborates on edge cases. Slightly verbose but justified by complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's 6 parameters, no output schema, and complex behaviors (consent, events, role), the description covers all necessary aspects: parameters, side effects, role requirements, consent process, and emitted events. It is complete for agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds significant value by explaining the consent flow for visibility widening, the behavior of mode (doesn't affect surfaces), and the implications of changes. This enriches the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as updating workspace properties (rename, slug, default-view mode, visibility) and distinguishes it from siblings like update_workspace_member. It uses specific verbs and resources, making its purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides detailed guidance on parameter usage (any subset, omit unchanged), consent requirements for visibility widening, and immediate execution for other updates. It implicitly advises when to use this tool versus other workspace-related tools, though it does not explicitly compare to siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_workspace_member (A)

Change an existing workspace member's role. The caller must have the editor role. Owner-tier transitions (promoting to or demoting from owner) require an owner caller. Demoting the sole owner is blocked; promote someone else to owner first. No-op when the role is unchanged. Emits member.role_changed with from/to roles.

Parameters

Name	Required	Description
`role`	Yes	New role.
`slug`	Yes	The workspace slug. Accepts either the bare slug ('my-workspace') or the org-prefixed form ('my-org/my-workspace') as shown in the dashboard URL; both resolve to the same workspace.
`member_id`	Yes	The WorkspaceMember id to update. Get this from list_workspace_members.
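
The role rules in the description can be mirrored in a client-side pre-check, which may help an agent avoid calls that are bound to fail. This is a sketch under assumptions: the helper name is hypothetical, the `viewer` role and the viewer < editor < owner ranking are not stated in the description, and the order in which the server applies these checks is unknown.

```python
# Hypothetical client-side pre-check mirroring the rules described above.
# Assumption: roles rank viewer < editor < owner; "viewer" is not named
# in the description and is included only for illustration.
RANK = {"viewer": 0, "editor": 1, "owner": 2}


def precheck_role_change(caller_role, current_role, new_role, owner_count):
    """Return 'noop', 'ok', or a string starting with 'blocked:'."""
    if new_role == current_role:
        return "noop"
    if RANK[caller_role] < RANK["editor"]:
        return "blocked: caller needs the editor role"
    owner_tier = "owner" in (current_role, new_role)
    if owner_tier and caller_role != "owner":
        return "blocked: owner-tier changes need an owner caller"
    if current_role == "owner" and owner_count <= 1:
        return "blocked: cannot demote the sole owner"
    return "ok"
```

The server remains the authority; a local check like this only saves a round trip in the obvious cases.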

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden for behavioral disclosure. It details caller role requirements, blocked operations (demoting sole owner), no-op behavior, and event emission (`member.role_changed`). This is comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five concise sentences, each adding unique value. No fluff, front-loaded with the main purpose, then covers constraints, edge cases, and side effects efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description covers caller prerequisites, blocked scenarios, no-op behavior, and emitted event. This is complete for a mutation tool without needing return value details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds context about role transitions (promoting/demoting owner) but does not directly enhance parameter meaning beyond schema descriptions. Schema already covers role, slug, and member_id adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Change an existing workspace member's role', which is a specific verb and resource. It distinguishes itself from siblings like 'remove_workspace_member' (which deletes) and 'share_workspace' (which invites new members).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit conditions for usage: requires editor role for caller, owner-tier transitions need owner caller, blocking demoting sole owner, and no-op when role unchanged. It lacks explicit comparison to alternatives but gives clear context on when to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

upgrade_plan (A)

Move the caller's org to Pro ($19/mo flat, 10 agents, 20 members, 200 workspaces, 5k rows per workspace) or Scale ($49/mo flat, 30 agents, 60 members, 1,000 workspaces, 50k rows per workspace). The bill doesn't change as you add agents. If the org has no card on file, returns a Stripe Checkout URL for the human. If a card exists, a live plan switch (Pro ↔ Scale) is consent-gated. Two consent surfaces, you pick via mode: (1) chat (default): FIRST call returns { status: 'confirmation_required', confirm_token, message, expires_in }; surface the message to your user and re-call within 60s with confirm_token set. (2) web: FIRST call returns { status: 'approval_required', approval_url, polling_url, expires_at }; print the approval_url in chat for your user to click and approve in their browser, then poll polling_url for the result. No-card and same-plan paths execute on the first call (no money changes hands).

Parameters

Name	Required	Description
`mode`	No	Consent surface. 'chat' (default) uses the in-chat confirm_token round-trip. 'web' returns an approval_url the user clicks in a browser. Use 'web' if you're headless or your user prefers a click-to-approve flow.
`plan`	No	Target plan. Defaults to 'pro'.
`confirm_token`	No	Chat-mode only. The token returned by the first call as `confirm_token`. Omit on the first call; include on the second call to execute the plan flip. Single-use, 60s TTL, bound to {org, caller, operation, params}.
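
The chat-mode round trip described above can be sketched as a two-call flow. This is not the server's implementation: `call_tool(name, args)` and `ask_user(message)` are caller-supplied stand-ins for the MCP client and the chat UI, and `declined_by_user` is a local marker invented for this sketch rather than a documented status.

```python
def upgrade_with_chat_consent(call_tool, ask_user, plan="pro"):
    """Run the two-step chat-mode flow; returns the final response dict.

    call_tool(name, args) and ask_user(message) are caller-supplied
    stand-ins for the MCP client and the chat UI.
    """
    first = call_tool("upgrade_plan", {"plan": plan, "mode": "chat"})
    if first.get("status") != "confirmation_required":
        return first  # e.g. no-card or same-plan path, resolved on first call
    if not ask_user(first["message"]):
        return {"status": "declined_by_user"}  # local marker, not a server status
    # Re-call with the token; it is single-use and expires after 60s.
    return call_tool("upgrade_plan", {
        "plan": plan, "mode": "chat",
        "confirm_token": first["confirm_token"],
    })
```

The second call repeats the same `plan` and `mode` because the token is bound to the original parameters.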

Tool Definition Quality

A5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, but the description fully discloses flat-rate billing, the consent-gated plan switch, both consent surfaces, token details, and special cases such as the no-card and same-plan paths.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Though lengthy, the description is well-structured with core action up front, followed by detailed flows; every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Description thoroughly covers responses, edge cases, and consent mechanisms, making it complete even without an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters, yet description adds defaults, token purpose, and mode use cases, providing meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool upgrades the org to Pro or Scale plans with specific verb 'Move the caller's org' and distinguishes from sibling tool 'downgrade_plan'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on when to use chat vs web mode, consent flows, and conditions for immediate execution, effectively differentiating from alternatives like downgrade_plan.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_doc_markdown (A)

Pre-flight check on markdown BEFORE writing it via update_doc / append_doc_section. Returns { ok, errors, warnings, parsed } with parsed counts per format type (imageCount, videoCount, mermaidCount, mathCount, svgCount, calloutCount, crossRefCount, mentionCount, embedCount, detailsCount, headingCount, byteSize, nodeCount, depth) plus structured DocGuardError-equivalent errors (cap breaches) and non-blocking warnings (cross-refs that don't resolve, mention ids that don't resolve, oversize sources, cap-approaching counts). NEVER writes anything; pure parse + analysis. Use when iterating on rich-format markdown to catch problems before burning a write. Cross-ref + mention resolution is gated on caller's accessible workspace set, so unresolved tokens surface in warnings.

Parameters

Name	Required	Description	Default
`markdown`	Yes	Markdown body to validate. Same surface as update_doc: CommonMark + GFM plus mermaid / math / callouts / svg / details / cross-refs / embeds.
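
One way to act on the `{ ok, errors, warnings, parsed }` result is to reduce it to a write/no-write decision while keeping the warning count visible. The sketch below assumes `errors` and `warnings` are lists and that `parsed.byteSize` is present as described; the helper name and the returned keys are hypothetical.

```python
def summarize_validation(result):
    """Decide whether to proceed with a write, given a validation result."""
    errors = result.get("errors", [])
    warnings = result.get("warnings", [])
    return {
        "safe_to_write": bool(result.get("ok")) and not errors,
        "error_count": len(errors),
        "warning_count": len(warnings),  # warnings are non-blocking
        "byte_size": result.get("parsed", {}).get("byteSize"),
    }
```

Warnings such as unresolved cross-refs do not block the write here, in line with the description calling them non-blocking.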

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully discloses that the tool has no side effects ('NEVER writes anything'), details the return structure (ok, errors, warnings, parsed counts), and explains error and warning categories. It also reveals dependencies (cross-ref resolution gated on accessible workspace set).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single focused paragraph that front-loads the purpose and then concisely details return values and usage. Every sentence provides essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, no output schema), the description is remarkably complete. It covers the return format, error/warning semantics, dependencies, and usage scenario. No missing information that would hinder agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a schema description. The description adds value by specifying that the markdown content supports the same surface as update_doc and lists supported formats (mermaid, math, callouts, etc.), which goes beyond the schema's basic 'Markdown body to validate' description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool is a pre-flight check for markdown before writing, with a specific verb ('validate') and resource ('doc_markdown'). It distinguishes itself from sibling tools like update_doc and append_doc_section by explicitly stating it 'NEVER writes anything; pure parse + analysis.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use when iterating on rich-format markdown to catch problems before burning a write.' It references the sibling write tools and clarifies that cross-ref resolution depends on workspace access, providing clear context for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_formula (A)

Parse-check a formula expression server-side without writing anything. Returns { ok, error?, rewrittenFormula?, referencedFunctions, unknownFunctions }. Use BEFORE update_row / create_row when the formula references functions or syntax you're not 100% sure of: a =SUMIFS(...) with the wrong arg order or a misspelled =AVERAG(...) will round-trip into the cell as a stored carrier with no value, and the user will see #NAME? or #VALUE? on next view. Catch it here. unknownFunctions flags any identifier that isn't in the Dock Sheets catalog (including likely typos); referencedFunctions lists the canonical post-alias names the engine will see. Cheap, public, no auth, no workspace context needed.

Parameters

Name	Required	Description	Default
`formula`	Yes	Formula expression to validate, including the leading '='. Example: '=SUMIF(B2:B10, ">0")'. Max 4000 chars.
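
As a rough sketch of how a client might call this tool: the snippet below builds a request in the general MCP `tools/call` JSON-RPC shape and applies the two constraints the parameter description states (leading '=' and the 4000-character limit) before sending anything. The helper name `build_validate_formula_request` is hypothetical, and transport, session setup, and response parsing are deliberately left out.

```python
import json

MAX_FORMULA_CHARS = 4000  # limit stated in the `formula` parameter description


def build_validate_formula_request(formula: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC `tools/call` payload for validate_formula (hypothetical helper)."""
    if not formula.startswith("="):
        raise ValueError("formula must include the leading '='")
    if len(formula) > MAX_FORMULA_CHARS:
        raise ValueError(f"formula exceeds {MAX_FORMULA_CHARS} chars")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "validate_formula",
            "arguments": {"formula": formula},
        },
    }


# The example formula is the one given in the parameter description.
request = build_validate_formula_request('=SUMIF(B2:B10, ">0")')
print(json.dumps(request, indent=2))
```

Checking the two documented constraints locally avoids a round trip for inputs the server would presumably reject anyway.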

Tool Definition Quality

A (4.9/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully discloses that no data is written, what the return object contains (ok, error, rewrittenFormula, referencedFunctions, unknownFunctions), and explains the meaning of these fields. It also states it requires no authentication and no workspace context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the main purpose and return shape. It is a single paragraph of six sentences, each carrying essential information. Bullet points might improve clarity slightly, but the description is already concise and effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return values and their significance. It covers the primary use case, related risks, and how the tool interacts with siblings. The single parameter is fully documented, making the description complete for this simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

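Schema coverage is 100% for the single parameter. The schema's own parameter description already states the key constraints: the leading '=' in the formula, an example, and the 4000-character limit. The tool description adds context on what goes wrong when validation is skipped, such as a wrong argument order or a misspelled function name.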

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Parse-check', resource 'formula expression', and qualifier 'server-side without writing anything'. It distinguishes from sibling tools like update_row/create_row by explicitly stating its role as a pre-validation step.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises 'Use BEFORE update_row / create_row when...' and provides a concrete scenario with a misspelled function. It also notes the tool is 'cheap, public, no auth, no workspace context needed', clarifying when to invoke it.
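
The "validate before you write" pattern described here can be sketched as a small gate. This is a minimal illustration, not Dock's client code: `summarize_validation` and `write_if_valid` are hypothetical helpers, the `validate` and `write` callables stand in for real calls to validate_formula and update_row / create_row, and `fake_validate` with its tiny `KNOWN` set is a made-up stand-in for the server (it is not the Dock Sheets catalog). Only the result field names come from the tool description.

```python
import re
from typing import Any, Callable


def summarize_validation(result: dict[str, Any]) -> list[str]:
    """List the problems in a validate_formula-style result; empty means safe to write."""
    problems = []
    if not result.get("ok", False):
        problems.append(f"parse error: {result.get('error', 'unknown error')}")
    for name in result.get("unknownFunctions", []):
        problems.append(f"unknown function: {name}")
    return problems


def write_if_valid(
    formula: str,
    validate: Callable[[str], dict[str, Any]],
    write: Callable[[str], Any],
) -> list[str]:
    """Call `write` only when validation reports no problems; return the problems found."""
    problems = summarize_validation(validate(formula))
    if not problems:
        write(formula)
    return problems


# Stand-in validator for the demo only. The real check happens server-side.
KNOWN = {"SUM", "SUMIF", "AVERAGE"}


def fake_validate(formula: str) -> dict[str, Any]:
    names = re.findall(r"([A-Za-z_][A-Za-z0-9_.]*)\s*\(", formula)
    return {
        "ok": True,
        "referencedFunctions": [n for n in names if n in KNOWN],
        "unknownFunctions": [n for n in names if n not in KNOWN],
    }


written: list[str] = []
bad = write_if_valid("=AVERAG(A1:A3)", fake_validate, written.append)
good = write_if_valid("=SUM(A1:A3)", fake_validate, written.append)
print(bad, good, written)
```

Whether the server reports a misspelled function through `ok: false`, through `unknownFunctions`, or both is not stated in the description, so the gate treats either signal as a reason to hold the write.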

validate_html (A)

Pre-flight check on html / css / js BEFORE writing via update_html. Returns { ok, errors, warnings, parsed } where parsed has byte counts per field and dropped (true if the sanitizer would strip anything from html). Errors cover cap breaches (html_too_large, css_too_large, js_too_large, total_too_large) and sanitizer rejection (html_sanitize_rejected, html_sanitize_empty). At v2 the sanitizer accepts <script> and <link> — those used to be smells but are now first-class agent markup; isolation lives in the opaque render iframe, not the sanitizer. The smells still stripped: inline on*= attributes, javascript:/data:text/html URIs, <meta http-equiv> tags. NEVER writes anything. Use when iterating on a payload so you don't burn a write on something the surface would reject.

Parameters

Name	Required	Description
`js`	No	JS to validate (optional).
`css`	No	CSS to validate (optional).
`html`	No	HTML to validate (optional).
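
For iteration, a cheap local pre-check can flag the three smells the description says are still stripped, before a payload is sent to validate_html at all. The sketch below is only an approximation: the regexes and the names `SMELL_PATTERNS`, `local_smells`, and `byte_counts` are assumptions made for illustration, and they do not reproduce the server's sanitizer or its size caps, which the description does not quantify.

```python
import re

# Rough local approximations of the smells listed in the tool description.
# These patterns are illustrative assumptions, not the server's sanitizer rules.
SMELL_PATTERNS = {
    "inline_event_handler": re.compile(r"<[^>]*\son\w+\s*=", re.IGNORECASE),
    "script_or_html_data_uri": re.compile(r"javascript:|data:text/html", re.IGNORECASE),
    "meta_http_equiv": re.compile(r"<meta\b[^>]*\bhttp-equiv\b", re.IGNORECASE),
}


def local_smells(html: str) -> list[str]:
    """Return the names of the smell patterns that match the given HTML."""
    return [name for name, pattern in SMELL_PATTERNS.items() if pattern.search(html)]


def byte_counts(html: str = "", css: str = "", js: str = "") -> dict[str, int]:
    """UTF-8 byte length per field, mirroring the per-field counts the tool reports."""
    fields = {"html": html, "css": css, "js": js}
    return {name: len(value.encode("utf-8")) for name, value in fields.items()}


print(local_smells('<a href="javascript:alert(1)" onclick="x()">hi</a>'))
print(local_smells('<script src="app.js"></script>'))
print(byte_counts(html="<p>é</p>"))
```

A clean local result does not guarantee that the server accepts the payload; it only filters out the obvious cases early.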

Tool Definition Quality

A (4.8/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Explicitly states 'NEVER writes anything,' and describes the return structure including errors and warnings. With no annotations, this fully covers the behavioral traits (read-only, no side effects).

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is dense but front-loaded with purpose. Every sentence contributes, though it could be slightly more concise. The structure (purpose, return, errors, usage) is logical.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description fully explains the return format, error types, and side-effect behavior. It also mentions the relationship to update_html, making it complete for an agent to use correctly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the schema describes each parameter only as optional content to validate. The description adds the validation context: per-field and total size caps, byte counts per field in the response, and the sanitizer check applied to html. This adds significant value beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a pre-flight check for html/css/js before using update_html. It specifies the resource and action, and distinguishes from the sibling tool update_html by mentioning it directly.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use it: 'Use when iterating on a payload so you don't burn a write on something the surface would reject.' It also implies that the tool is not needed once the payload is ready to write, at which point update_html is the right call.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
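
One way to produce the claim file is to generate it from a small script, so the path and JSON shape stay exactly as shown above. This is a sketch under assumptions: `write_glama_claim` is a hypothetical helper, and it assumes your server serves static files from a directory on disk (here a temporary directory stands in for that site root).

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(site_root: Path, email: str) -> Path:
    """Write .well-known/glama.json under site_root and return its path (hypothetical helper)."""
    claim = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target = site_root / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(claim, indent=2) + "\n", encoding="utf-8")
    return target


# Demo against a throwaway directory; point site_root at your real web root instead.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_claim(Path(tmp), "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    relative = path.relative_to(tmp).as_posix()

print(relative)
```

Replace the placeholder address with the email associated with your Glama account before publishing.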
