vibedeploy

by be.vibedeploy

Server Details

Deploy and host AI-built websites on EU infrastructure, straight from your AI agent.

Status: Healthy
Last Tested: 2026-07-21 17:14
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

Grade: A (4/5.0)

Tool Descriptions: A

Average 4.4/5 across 39 of 39 tools scored. Lowest: 3.3/5.

Server Coherence: A

Disambiguation: 4/5

Tools are mostly distinct, but some overlap exists. For example, deploy_site, update_site, and the begin_deploy/add_files/commit_deploy sequence all handle deployment but differ in mode (replace vs. patch vs. multi-call). The file-editing tools (add_files, write_source_files, update_file_content) target different contexts (staging, source, dist), which the descriptions make clear. deploy_from_url and deploy_site serve similar purposes; the former fetches the site from a URL. Overall, an agent can distinguish the tools with careful reading.

Naming Consistency: 5/5

All tool names follow a consistent verb_noun pattern in snake_case, e.g., deploy_site, add_custom_domain, list_sites, read_file. Even compound names like deploy_from_url and build_and_deploy preserve the pattern. There is no mixing of camelCase or other conventions.

Tool Count: 3/5

With 39 tools, the server is on the high side. The domain is broad (deployment, file management, custom domains, forms, analytics, etc.), so many of the tools are justified. However, the count exceeds the typical 15-tool threshold for 'heavy' and falls in the 'too many' range of the scoring guidelines. It is still not an extreme mismatch: 50 or more tools would score 2.

Completeness: 4/5

The tool surface covers the full lifecycle of site deployment and management: create, read, update, delete operations on sites, files (both dist and source), custom domains, DNS, forms, and analytics. Minor gaps exist, such as no explicit rollback tool (though snapshots provide backup) and no team management tools. Overall, it is well-rounded for its stated purpose.

Available Tools

39 tools

abort_deploy: Abort a staging session. Grade: A

Idempotent

Discard a staging session and its scratch dir. Live site is untouched. Returns immediately; cleanup is best-effort and the sweeper will retry if it fails.

Parameters

Name	Required	Description	Default
`deployId`	Yes	Session id to abort.

Output Schema

Name	Required	Description
`status`	Yes
`deployId`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

Grade: A (4.5/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint and destructiveHint, but the description adds critical behavior: immediate return, best-effort cleanup, and retry by sweeper. This goes beyond the annotations to inform the agent of asynchronous and non-blocking behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no wasted words. The first sentence defines the action and scope, the second adds important behavioral context. Each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and an output schema (presumably standard), the description fully covers what the tool does, its side effects, and return behavior. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a single parameter deployId described as 'Session id to abort.' The description adds no further parameter-level details, but the schema itself is sufficient. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool discards a staging session and its scratch dir, with the specific verb 'discard' and resource scope. It distinguishes from siblings like commit_deploy by clarifying that it aborts rather than finalizes a deploy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use this tool (to cancel a staging session) and that it does not affect the live site, providing clear context. It does not give explicit when-not or alternative tools, but the purpose is unambiguous given the name and sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_custom_domain: Attach a custom domain (step 1 of 2). Grade: A

Idempotent

Start attaching a user-owned domain to an existing site. Returns a TXT record the user must add at their DNS provider. Idempotent: calling twice with the same (siteName, domain) returns the existing record instead of creating a duplicate. After the TXT is published (typically within minutes; up to 24h), call verify_custom_domain with the returned recordId. The site itself must already exist on a platform subdomain (e.g. {name}.vibedeploy.be or {name}.vibedeploy.eu). Call deploy_site first if it doesn't.

Parameters

Name	Required	Description	Default
`domain`	Yes	The user-owned hostname to attach (e.g. 'tester.subsite.site'). Must be a valid FQDN.
`siteName`	Yes	The VibeDeploy site name to attach the domain to (e.g. 'tester').

Output Schema

Name	Required	Description
`domain`	Yes
`status`	Yes	pending_verification on first attach; verified if the domain was already set up earlier.
`nextCall`	No	Structured hint for the next tool call (e.g. verify_custom_domain). Lets an agent chain without parsing instructions.
`recordId`	Yes	Pass this to verify_custom_domain after the TXT is in place.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`instructions`	Yes	Plain-English instructions for the user.
`alreadyAttached`	No	True when the call returned an existing record instead of creating one (idempotent path).
`dnsAutoConfigured`	No	True when the verification TXT was written automatically because the domain is managed through VibeDeploy's Gandi account. The caller can call verify_custom_domain immediately without waiting for the user to add a TXT manually. Absent / false means the user has to add the record at their own DNS provider before verify will succeed.
`verificationRecord`	No	Only present when status is pending_verification.
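
The two-step flow described above can be sketched from the client side. This is a minimal illustration under stated assumptions, not VibeDeploy's own client: `tool_call` builds a standard MCP `tools/call` JSON-RPC request, and `next_step` is a hypothetical helper that reads the documented output fields (`status`, `recordId`, `dnsAutoConfigured`). The argument key `recordId` for verify_custom_domain is an assumption; the listing only says to pass the returned recordId.

```python
from typing import Any, Dict, Optional, Tuple


def tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def next_step(result: Dict[str, Any]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Decide what to do with an add_custom_domain result (hypothetical helper).

    Returns (action, verify_arguments), where action is one of
    "done", "verify_now", or "wait_for_user".
    """
    if result["status"] == "verified":
        return ("done", None)
    # Assumption: verify_custom_domain accepts the id under the key "recordId".
    verify_args = {"recordId": result["recordId"]}
    if result.get("dnsAutoConfigured"):
        # The TXT record was written automatically, so verification can run now.
        return ("verify_now", verify_args)
    # The user must publish the TXT record first (minutes, up to 24h).
    return ("wait_for_user", verify_args)
```

In the "wait_for_user" case, an agent would typically show the returned `instructions` to the user and call verify_custom_domain later with the same arguments.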

Tool Definition Quality

Grade: A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses key behaviors: returns a TXT record, idempotent, timing (minutes to 24h), prerequisite (site must exist on platform subdomain). These go beyond annotations, which already indicate idempotency.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each earning its place. The purpose is front-loaded, and the description efficiently covers idempotency, prerequisites, return value, and next steps.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the two-step process and complexity, the description covers all necessary context: return type (TXT record), retry behavior, timing, prerequisite, and follow-up action. The presence of an output schema further reduces burden.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description does not add significant meaning beyond the schema's parameter descriptions, though it provides context for the overall process.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Start attaching a user-owned domain to an existing site.' It specifies the verb (attach), resource (custom domain), and distinguishes it from sibling tools like 'verify_custom_domain' and 'remove_custom_domain'.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use (attach domain) and when not to use (if site doesn't exist on a subdomain, must call deploy_site first). It also provides the next step (verify_custom_domain) and notes idempotency for repeated calls.

add_file_chunk: Append one chunk of a single file to a staging session. Grade: A

Stream a single file across multiple calls when its content exceeds the per-MCP-call output budget. LAST RESORT — try these first: (1) add_files with encoding:'gzip+base64' fits ~250 KB of text source in ONE call (gzip locally, base64, send — no chunking, no ordering hazards); (2) begin_deploy's uploadUrl takes a 100 MB tarball in one HTTP POST if your sandbox can reach mcp.vibedeploy.be; (3) deploy_from_url if the files are fetchable from a public URL. Only chunk when none of those work. When you DO chunk, gzip+base64 each chunk too — it quadruples the source bytes per chunk. Mark the first chunk with isFirst=true (truncates + mkdir) and the last with isLast=true (returns assembled size). Send chunks for the same path serially — concurrent chunks interleave and corrupt the file.

Parameters

Name	Required	Description
`path`	Yes	Target path inside the site root, e.g. 'portaal-admin.html'. Same path validation as add_files.
`isLast`	Yes	True on the FINAL chunk. Triggers an assembled-size stat and refreshes session file count. Mid-stream chunks set false.
`content`	Yes	This chunk's bytes. Either raw UTF-8 (default) or base64-encoded — set encoding accordingly. PRACTICAL CHUNK SIZE: bounded by your LLM client's tool-output token budget, NOT by VibeDeploy's server. Empirically ~80 KB of base64 (≈60 KB raw bytes) per chunk is the safe upper bound for current Claude / GPT clients before tool output gets truncated. The server itself accepts up to 100 MB per call (Caddy cap) and 500 MB cumulative across the session. If you keep hitting truncation: split into smaller chunks, OR sidestep tool-output entirely via `deploy_from_url` (publish a tarball to github raw / gist / S3 → 1 tool call) or POST to begin_deploy's uploadUrl from your code-execution sandbox if it can reach mcp.vibedeploy.be.
`isFirst`	Yes	True on the FIRST chunk of a file. Truncates any existing scratch entry at this path and creates parent directories. Subsequent chunks must set false.
`deployId`	Yes	Session id returned by begin_deploy.
`encoding`	No	utf8 (default), base64 (binary files), or gzip+base64 (compress this chunk's bytes locally first; server gunzips before append). Encoding is per-chunk — you can mix across chunks of the same file (e.g. gzip+base64 for big text chunks, base64 for binary tail).
`expectedByteOffset`	No	Optional alignment check. The byte offset where THIS chunk should start in the assembled file: 0 for isFirst, otherwise the sum of all prior chunks' decoded bytes for this path. If the server's actual offset disagrees, the call fails with MISALIGNED_CHUNK before any bytes are written — catches the classic 'split base64 on a 4-char boundary that wasn't a byte boundary' bug. Omit to skip the check.
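
The chunking rules above (isFirst/isLast markers, serial ordering, and the optional expectedByteOffset check) can be sketched as a small planner that produces the argument dictionaries in send order. This is an illustrative sketch, not VibeDeploy's client: `plan_chunks` is a hypothetical helper, and it uses plain `base64` rather than the recommended `gzip+base64` so that each offset is unambiguously the count of raw bytes already sent. Splitting the raw bytes before encoding avoids the misaligned-base64 bug the parameter table mentions.

```python
import base64
from typing import Any, Dict, Iterator


def plan_chunks(deploy_id: str, path: str, data: bytes,
                raw_chunk_size: int = 60_000) -> Iterator[Dict[str, Any]]:
    """Yield add_file_chunk arguments for `data`, in the order they must be sent.

    60,000 raw bytes become 80,000 base64 characters, which matches the
    roughly 80 KB per-chunk bound quoted in the parameter table.
    """
    if raw_chunk_size <= 0:
        raise ValueError("raw_chunk_size must be positive")
    # An empty file still needs one chunk that is both first and last.
    offsets = list(range(0, len(data), raw_chunk_size)) or [0]
    for offset in offsets:
        piece = data[offset:offset + raw_chunk_size]
        yield {
            "deployId": deploy_id,
            "path": path,
            "content": base64.b64encode(piece).decode("ascii"),
            "encoding": "base64",
            "isFirst": offset == offsets[0],
            "isLast": offset == offsets[-1],
            # Sum of all prior chunks' decoded bytes for this path.
            "expectedByteOffset": offset,
        }
```

Each yielded dictionary should be sent as its own call, one at a time, waiting for the previous response before sending the next.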

Output Schema

Name	Required	Description
`path`	Yes
`isLast`	Yes
`deployId`	Yes
`fileSize`	No	Assembled file size on the pod after this chunk. Returned only when isLast=true so the caller can verify the concat succeeded.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes	Session-wide cumulative bytes across all add_files / add_file_chunk calls.
`totalFiles`	No	Session-wide file count after this chunk. Returned only when isLast=true.
`bytesWritten`	Yes	Decoded bytes written by THIS chunk.
`remainingBudget`	Yes	Bytes still available before hitting the 500 MB cap.

Tool Definition Quality

Grade: A (4.8/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Goes far beyond annotations by describing chunking behavior (truncation on isFirst, corruption on concurrent sends, assembled-size stat on isLast), server limits (100 MB per call, 500 MB cumulative), client-side truncation risks, and error details (MISALIGNED_CHUNK). No contradiction with annotations.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with purpose and alternatives, then detailed usage. It is verbose but every sentence adds value given the complexity. Could be slightly trimmed but remains effective.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers behavior, alternatives, encoding, chunking strategy, error cases, and client/server limits. Given the tool's complexity (7 params, 5 required, output schema exists), it provides comprehensive context for correct invocation.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds significant value beyond schema: practical chunk size bounds, encoding suggestions, alignment check purpose, and mixing encodings. While schema already documents each parameter, the description provides operational context that aids correct usage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: streaming a single file across multiple calls when content exceeds the per-MCP-call output budget. It distinguishes itself from siblings by explicitly naming alternatives like add_files with gzip+base64, begin_deploy's uploadUrl, and deploy_from_url, making it a specific fallback.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use and when-not-to-use guidance: it labels itself as a 'LAST RESORT' and lists three preferred alternatives with reasoning. It also gives step-by-step chunking instructions (gzip+base64, markers, serial sending).

add_files: Add files to a staging session. Grade: A

Idempotent

Append files to an open staging session. Call as many times as needed; commit_deploy applies them all at once. Validates path/extension/encoding on every call so a bad file fails fast. Same 500 MB cap as single-call deploys, but cumulative across the session. LARGE TEXT FILES: a file that looks too big to inline (100-250 KB of HTML/CSS/JS) usually still fits in ONE call — gzip it locally, base64 the result, send with encoding:'gzip+base64' (text compresses 3-5×, so ~250 KB of source ≈ ~70 KB on the wire). Prefer that over add_file_chunk: one call, no ordering hazards. Only chunk when a single file exceeds ~250 KB of source even after gzip, or when you have no way to gzip locally. If your environment can run shell but can't reach this host, gzip+base64 via add_files is the fastest path; if it CAN reach this host, begin_deploy's uploadUrl (tarball POST, 100 MB) beats everything.

Parameters

Name	Required	Description	Default
`files`	Yes	Files to append to the staging scratch dir. Same wire shape as deploy_site/update_site — array form supports binary via encoding:'base64'; map form is utf8-only. Re-adding a path overwrites the previously staged version. Cumulative cap across the whole session: 500 MB.
`deployId`	Yes	Session id returned by begin_deploy.
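
The "gzip locally, base64 the result" step for large text files can be sketched as follows. This is a minimal example under assumptions: `gzip_b64_entry` is a hypothetical helper, and the entry keys `path`, `content`, and `encoding` are inferred from the parameter names used elsewhere in this listing; the exact array-form shape of `files` is not shown here.

```python
import base64
import gzip
from typing import Dict


def gzip_b64_entry(path: str, text: str) -> Dict[str, str]:
    """Build one array-form file entry encoded as gzip+base64.

    Assumption: entries use the keys "path", "content", and "encoding".
    """
    compressed = gzip.compress(text.encode("utf-8"))
    return {
        "path": path,
        "content": base64.b64encode(compressed).decode("ascii"),
        "encoding": "gzip+base64",
    }
```

The resulting dictionary would go inside the `files` array alongside the `deployId` from begin_deploy. Actual savings depend on how repetitive the text is, so it is worth checking the encoded length before assuming a file fits in one call.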

Output Schema

Name	Required	Description
`deployId`	Yes
`warnings`	No
`filesAdded`	Yes	Files written by this call.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes	Total bytes staged so far across all add_files calls.
`totalFiles`	Yes	Total files now in the scratch dir.
`remainingBudget`	Yes	Bytes still available before hitting the 500 MB cap.

Tool Definition Quality

Grade: A (4.8/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: it mentions validation on every call (fast fail), cumulative 500 MB cap across the session, overwrite behavior on re-adding paths, and detailed encoding instructions. No contradiction with annotations found.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long but well-structured, with a clear introductory sentence followed by focused paragraphs on validation, size caps, and encoding guidance. Every sentence contributes value, though some minor redundancy exists (e.g., 'Call as many times as needed' is implicit in the append concept).

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (2 parameters, one with nested objects and encoding options) and the presence of an output schema, the description covers all necessary aspects: usage, alternatives, error handling (validation), size limits, and encoding strategies. It leaves no obvious gaps.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds extra context on parameter usage, especially for the 'files' parameter (e.g., encoding choices, overwrite semantics). While the schema already provides detailed descriptions for encoding, the tool-level description adds strategic guidance (e.g., prefer gzip+base64 over chunking).

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action: 'Append files to an open staging session.' It uses a specific verb (append) and resource (staging session). It also distinguishes from sibling tools like add_file_chunk and begin_deploy, providing explicit alternatives.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool vs alternatives. It recommends add_files over add_file_chunk for most cases, and compares with begin_deploy's uploadUrl. It also details when to use different encodings and when chunking is necessary.

apply_edits: Multi-file find/replace in one call. Grade: A

Destructive

Apply find/replace edits across MANY files in one tool call. Batch sibling of update_file_content. Per-file edit semantics identical (count: 1 default, -1 = all, positive int asserts exact count). Whole call is atomic across files: validation runs first, writes only proceed if every edit's count check passes.

Parameters

Name	Required	Description
`name`	Yes
`files`	Yes	Files + edits to apply. Up to 25 files / 200 total edits per call. All-or-nothing: if any edit's match count differs from its expected count, NOTHING is written.
`target`	No	Tree to edit, dist (default) or source. Same tree applies to every file in this call.
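
The count semantics and the all-or-nothing behavior can be modeled locally to preview whether a batch would pass validation. This is a rough model under assumptions, not the server's implementation: `apply_edits_locally` is a hypothetical function, edits within a file are assumed to apply in order against the already-edited text, and a count of -1 is assumed to accept any number of matches (including zero).

```python
from typing import Dict, List, Tuple

Edit = Tuple[str, str, int]  # (find, replace, count)


def apply_edits_locally(files: Dict[str, str],
                        edits: Dict[str, List[Edit]]) -> Dict[str, str]:
    """Return a new mapping with every edit applied, or raise ValueError.

    Nothing is written unless every count check passes: results are staged
    first and the input mapping is never mutated.
    """
    staged: Dict[str, str] = {}
    for path, file_edits in edits.items():
        if path not in files:
            raise ValueError(f"{path}: no such file")
        text = files[path]
        for find, replace, count in file_edits:
            if not find:
                raise ValueError(f"{path}: empty search string")
            found = text.count(find)
            # -1 means "replace all"; any other value asserts an exact match count.
            if count != -1 and found != count:
                raise ValueError(
                    f"{path}: expected {count} match(es) of {find!r}, found {found}"
                )
            text = text.replace(find, replace)
        staged[path] = text
    return {**files, **staged}
```

A batch that raises here would presumably also be rejected by the tool, but the reverse is not guaranteed, since the server's matching rules may differ from this model.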

Output Schema

Name	Required	Description
`name`	Yes
`files`	Yes
`siteId`	Yes
`target`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalEdits`	Yes

Tool Definition Quality

Grade: A (4.8/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds critical behavioral details beyond annotations: atomicity across files, count semantics (default 1, -1 for all, positive for exact), and validation-before-write order. No contradiction with annotations (destructiveHint=true is consistent).

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each serving a distinct purpose: defining the tool, explaining per-file edit behavior, and describing atomicity. No redundant or extraneous text.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (batch atomic edits with count checks), the description provides all necessary context for correct invocation. It covers batch limits, atomicity, and count semantics. Output schema exists but is not shown; the description does not need to explain return values.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Description clarifies count parameter semantics that go beyond the schema's min/max. Also, the schema's description for files mentions the 25-file/200-edit limit, but the description reinforces all-or-nothing behavior. Schema coverage is 67%, and the description compensates effectively.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it's a 'Multi-file find/replace in one tool call' and 'Batch sibling of update_file_content', specifying the verb (apply) and resource (edits across files). This distinguishes it from the single-file sibling.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description labels it as a batch sibling of update_file_content, strongly implying single-file edits should use the sibling. However, it does not explicitly state when not to use or list alternatives, though the sibling tool is named.

begin_deploy: Begin a multi-call deploy session. Grade: A

Opens a staging session for a multi-call deploy. Use when the site is too large to fit in a single deploy_site/update_site call. Pair with add_files (one or more times) OR a single tarball upload to the returned uploadUrl, then commit_deploy. Active session limit per token: 5. Default TTL: 1 hour.

Parameters

Name	Required	Description	Default
`mode`	Yes	How commit_deploy will apply the staged files. 'replace' wipes the live site and atomic-renames the staged set into place. 'patch' layers staged files on top of the live site (kept files = live + staged; deletes via commit_deploy's `delete` array).
`name`	Yes	Site name to deploy to. Must already exist; multi-call sessions don't auto-create sites — use deploy_site for that, or call this against an existing site.

Output Schema

Name	Required	Description
`mode`	Yes
`siteId`	Yes
`status`	Yes
`deployId`	Yes	Pass this id to add_files / commit_deploy / abort_deploy / list_deploys.
`siteName`	Yes
`expiresAt`	Yes	ISO timestamp. The session will be auto-expired and the scratch dir cleaned at this time.
`uploadUrl`	Yes	POST a tar(.gz) archive to this URL to stage many files in one HTTP call — bypasses the per-tool-call output budget that bounds add_files. The URL already embeds a single-purpose upload_token narrowly scoped to THIS staging session, so no Authorization header is needed when using it. Example: `tar -czf - -C dist . | curl --data-binary @- -H "Content-Type: application/octet-stream" "<uploadUrl>"`. After upload, call commit_deploy normally. Body limit: 100 MB (gzipped). TIP: pair with list_file_hashes BEFORE staging, so you can skip files that haven't changed.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`uploadToken`	Yes	Same single-purpose token already embedded in uploadUrl, exposed separately if you'd rather pass it via Authorization: Bearer header than as a query parameter. Valid only for POST /upload/<this deployId>. Cannot be used for /mcp tool calls or any other deploy session.
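
For environments without a shell, the tarball upload in the `uploadUrl` row can be approximated with the Python standard library. This is a sketch under assumptions: `build_tarball` and `upload_request` are hypothetical helpers, the 100 MB limit is read conservatively as 100,000,000 bytes, and the request is only constructed here, not sent.

```python
import io
import tarfile
import urllib.request
from typing import Dict

MAX_UPLOAD_BYTES = 100_000_000  # conservative reading of the 100 MB body limit


def build_tarball(files: Dict[str, bytes]) -> bytes:
    """Pack {relative path: content} into an in-memory .tar.gz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def upload_request(upload_url: str, tarball: bytes) -> urllib.request.Request:
    """Prepare the POST; the upload token is already embedded in upload_url."""
    if len(tarball) > MAX_UPLOAD_BYTES:
        raise ValueError("tarball exceeds the 100 MB upload limit")
    return urllib.request.Request(
        upload_url,
        data=tarball,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )
```

Passing the prepared request to `urllib.request.urlopen` would perform the upload, after which commit_deploy is called as usual.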

Tool Definition Quality

Grade: A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-read-only (readOnlyHint=false) and non-idempotent (idempotentHint=false). The description adds valuable behavioral details: session opening, active session limit, and default TTL. No contradictions with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each delivering essential information: core action, use case, workflow pairing, and constraints. Information is front-loaded and no extraneous content.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's purpose, when to use it, the required workflow, and active session limits. Given that an output schema likely documents return values, it is nearly complete. Could mention error handling or session lifecycle, but it is already robust.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and parameter descriptions in the schema are detailed. The description adds minimal extra meaning beyond the schema (e.g., clarifying 'replace' behavior). With high coverage, baseline is 3; the description does not significantly enhance parameter understanding.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'opens' and the resource 'staging session' for a multi-call deploy. It distinguishes from sibling tools like deploy_site and update_site by explicitly noting the tool is for sites too large for a single call.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit usage context ('when the site is too large to fit in a single deploy_site/update_site call') and outlines the required workflow (pair with add_files or tarball upload, then commit_deploy). It also notes constraints (5 active sessions, 1 hour TTL). It could mention abort_deploy as an alternative for cancellation, but the guidance is clear.

build_and_deploy: Build editable source and ship the result as the new dist. Grade: A

Run a build inside a hardened one-shot pod against the site's editable source tree (write source first via write_source_files / list_source_files autoPromote), then atomically swap the build output into the live dist. Reuses the same build pod the GitProject git-deploy flow uses, so the same isolation guarantees apply: no SA token, no DB/Vault reach, NetworkPolicy-restricted egress. The first run writes the chosen buildCommand/outputDir into Site.sourceManifest; subsequent calls can omit those fields.

ParametersJSON Schema

Name	Required	Description
`name`	No
`siteId`	No
`rootPath`	No	Subdirectory inside the source tree where package.json lives. Empty string = source root. Useful for monorepos.
`outputDir`	No	Override for which directory to ship as the new dist. If omitted, uses the manifest, then auto-detects (dist > build > out > public).
`buildCommand`	No	Override for the build script's `npm run build` step (e.g. 'npm run build:prod' or 'pnpm vite build'). If omitted, uses the manifest stored on the site, then falls back to 'npm run build'.
`saveManifest`	No	If true (default), persists the merged manifest back onto the site so future builds default to these settings. Set false to do a one-off build without changing the saved manifest.
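
The fallback order described for `buildCommand` and `outputDir` can be sketched roughly as below. This is only an illustration of the documented precedence (per-call override, then saved manifest, then default or auto-detect), not the server's actual code; `resolve_build_settings` and `existing_dirs` are hypothetical names introduced here.

```python
# Auto-detect order quoted in the outputDir description above.
AUTO_DETECT_ORDER = ["dist", "build", "out", "public"]
DEFAULT_BUILD_COMMAND = "npm run build"


def resolve_build_settings(overrides: dict, manifest: dict, existing_dirs: set) -> dict:
    """Hypothetical helper: merge per-call overrides with the saved manifest."""
    # Per-call override wins, then the stored manifest, then the default.
    build_command = (
        overrides.get("buildCommand")
        or manifest.get("buildCommand")
        or DEFAULT_BUILD_COMMAND
    )
    output_dir = overrides.get("outputDir") or manifest.get("outputDir")
    if not output_dir:
        # Assumption: auto-detect picks the first listed directory that exists.
        output_dir = next((d for d in AUTO_DETECT_ORDER if d in existing_dirs), None)
    return {"buildCommand": build_command, "outputDir": output_dir}
```

With an empty manifest and a source tree that produced `build/` and `public/`, this sketch would pick `npm run build` and `build`, which is why later calls can usually omit both fields once a manifest has been saved.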

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`siteId`	Yes
`buildLog`	Yes	Combined orchestrator + builder log; truncated to ~32 KB to fit MCP responses.
`manifest`	Yes
`outputDir`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes
`filesDeployed`	Yes

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: it details the isolated pod environment, atomic swap, and manifest mutation. It does not contradict the annotations, and provides a thorough account of what happens during execution.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized, each sentence adds meaningful information, and it front-loads the core action. While slightly technical, it remains efficient without being overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 optional parameters, output schema present, many siblings), the description covers workflow, prerequisites, isolation, and manifest behavior. It does not explicitly address failure modes, but is otherwise thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 67% schema coverage, the description adds value by explaining the manifest persistence for buildCommand and outputDir, clarifying how these parameters behave across calls. However, it does not compensate for the undocumented 'name' and 'siteId' parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: building from an editable source tree and atomically deploying. It distinguishes from sibling tools by mentioning the hardened one-shot pod, editable source, and manifest persistence, making it specific and unique.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises writing source first via write_source_files and explains the manifest behavior for first vs subsequent runs. However, it does not explicitly state when to use this tool versus other deploy tools like deploy_site or begin_deploy, nor does it provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_domain_availability: Check domain availability and indicative price (grade A)

Read-only, Idempotent

Check whether a domain can be registered and get an INDICATIVE retail price. IMPORTANT: this is a read-only lookup — it does NOT buy, register, reserve, or pay for any domain, and it changes nothing. The returned price is GROSS (includes 21% VAT) and indicative only. Set alternatives:true to also check the same name across other common TLDs (be, com, net, eu, nl, io, dev, app). Requires a valid team token but is not tied to a specific site.

ParametersJSON Schema

Name	Required	Description	Default
`domain`	Yes	The domain to check, e.g. "example.com".
`alternatives`	No	When true, also check the same second-level name across a canonical TLD set (be, com, net, eu, nl, io, dev, app) and return each one's availability + indicative price.

Output Schema

ParametersJSON Schema

Name	Required	Description
`note`	Yes	Reminder that this is an indicative gross price and not a purchase.
`domain`	Yes
`currency`	Yes	ISO currency code for the price (e.g. EUR).
`available`	Yes	Whether the domain can be registered right now.
`priceCents`	Yes	Gross (incl. 21% VAT) indicative retail price in cents, or null if unavailable / no price is published.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`alternatives`	No	Present only when alternatives:true was requested.
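
Because `priceCents` is a gross amount that already includes 21% VAT, a client that wants to show the net price has to divide by 1.21 rather than subtract 21%. A minimal sketch of that arithmetic follows; `split_gross_price` is a hypothetical helper, and the half-up rounding is an assumption, since the page does not say how the server rounds.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional

VAT_RATE = Decimal("0.21")  # the 21% rate stated in the tool description


def split_gross_price(price_cents: Optional[int]) -> Optional[dict]:
    """Split a gross priceCents value into net and VAT parts (all in cents)."""
    if price_cents is None:
        # priceCents is null when the domain is unavailable or no price is published.
        return None
    net = (Decimal(price_cents) / (1 + VAT_RATE)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP  # assumption: round half up to whole cents
    )
    return {"gross": price_cents, "net": int(net), "vat": price_cents - int(net)}
```

For a gross price of 1210 cents this gives a net of 1000 cents and 210 cents of VAT. The result is still indicative only, like the price it is derived from.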

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint false. Description adds that the price is gross (21% VAT) and indicative, and that no changes are made. This provides useful context beyond annotations, though the core safety is already covered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise, with five front-loaded sentences. It uses a clear structure with an important note highlighted. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With annotations and an output schema present, the description provides sufficient context: read-only nature, price details, alternatives behavior, and requirements. The tool is well-explained for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value by explaining the effect of the alternatives parameter and listing the TLDs checked. The domain parameter includes an example format. This enriches the schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool checks domain registration availability and returns an indicative price. The verb 'check' and resource 'domain availability' are specific, and the tool is distinct from siblings which deal with site management and deployment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explicitly states the tool is a read-only lookup that does not buy or register domains. It also explains when to use the alternatives parameter. While it does not explicitly mention when not to use this tool versus others, the context makes it clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

commit_deploy: Commit a staged deploy (grade A)

Destructive

Atomically apply a staging session's files to the live site. Runs preflight + secret/malware scan against the complete staged set; on failure the session stays open and can be re-attempted or aborted. For replace-mode against a site with existing files, requires confirm:"I-want-to-replace-all-files".

ParametersJSON Schema

Name	Required	Description
`delete`	No	Patch-mode only: site-relative paths to remove from the live site as part of this commit. Useful for renames (write new path via add_files, delete old path here).
`dryRun`	No	If true, preview what commit would do without touching the live site or scratch dir. Returns the diff (filesDeployed, deletedFiles) plus would-be confirmation gate / preflight outcomes. Skips the secret/malware scan to keep the preview fast — the real commit will still scan. Recommended before any replace-mode commit on a populated site.
`confirm`	No	Required only for replace-mode commits against a site that already has files. Pass exactly "I-want-to-replace-all-files" to acknowledge that the live files will be deleted and replaced with the staged set.
`deployId`	Yes	Session id returned by begin_deploy.
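
The parameter rules above (dry-run first, `delete` in patch mode only, `confirm` only for replace-mode on a populated site) can be sketched as a small argument builder. This is an illustration under assumptions: `commit_args` is a hypothetical helper, `mode` and `site_has_files` are values the caller is assumed to know already, and leaving `confirm` out of the dry-run call is a guess based on the dry-run reporting the "would-be confirmation gate".

```python
REPLACE_CONFIRM = "I-want-to-replace-all-files"  # exact string quoted in the schema


def commit_args(deploy_id, *, mode, site_has_files, dry_run=False, delete=None):
    """Hypothetical helper: build the arguments for one commit_deploy call."""
    args = {"deployId": deploy_id}
    if dry_run:
        args["dryRun"] = True
    if delete:
        if mode != "patch":
            # The schema describes `delete` as patch-mode only.
            raise ValueError("delete is only valid for patch-mode sessions")
        args["delete"] = list(delete)
    if mode == "replace" and site_has_files and not dry_run:
        # Assumption: the confirmation string is only sent on the real commit.
        args["confirm"] = REPLACE_CONFIRM
    return args
```

A cautious flow would call the tool once with `commit_args(..., dry_run=True)`, inspect `filesDeployed` and `deletedFiles`, and only then send the real commit.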

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`mode`	Yes
`dryRun`	No	True if this was a dry-run; nothing was committed.
`siteId`	Yes
`deployId`	Yes
`warnings`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`deletedFiles`	Yes
`filesDeployed`	Yes

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds significant context beyond annotations: atomic apply, preflight and malware scan, session persistence on failure, and the confirm gate. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, clear and to the point. Slightly dense but not overly verbose. Could be broken into steps for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key aspects for a destructive tool: atomic commit, scans, failure handling, and replace-mode requirement. Output schema exists, so return value explanation not needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description needs only minimal per-parameter detail. It adds overall context but does not explain parameters beyond the schema. Baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool atomically applies a staging session's files to the live site, distinguishing it from sibling tools like abort_deploy and begin_deploy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context on when to use (for committing staged files), mentions preflight and scan, failure behavior, and the special confirm requirement for replace-mode. Does not explicitly mention alternatives but is clear overall.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_snapshot: Create a manual snapshot (backup) for a site (grade A)

Destructive

Take a point-in-time Longhorn snapshot of a site's served files. This is an additive backup — it does not change anything served. It records a manual-snapshot history entry and runs retention cleanup. Viewers cannot create snapshots. Returns NO_VOLUME if the site has no volume yet (it has never been deployed).

ParametersJSON Schema

Name	Required	Description
`name`	No
`label`	No	Optional human-readable label for this backup.
`siteId`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`created`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`snapshotName`	Yes	The name of the snapshot that was created.

Tool Definition Quality

A4.3/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond annotations: it clarifies the tool is additive, non-destructive, records a history entry, runs retention cleanup, and returns NO_VOLUME if the site has no volume. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences, tightly packed with essential information, no redundant phrases, and front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, behavior, user restrictions, and an error case. Missing parameter details slightly reduces completeness, but the presence of an output schema partly compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 33% (one param has a description). The tool description does not explain any parameters or their roles, leaving the agent without guidance on how to use name, label, or siteId beyond the minimal schema info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool takes a point-in-time Longhorn snapshot of a site's served files, explicitly distinguishing it from destructive operations by noting it's additive and does not change anything served. It differentiates from sibling tools like list_snapshots and delete_site.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use the tool (to create a manual backup) and adds a constraint (viewers cannot create snapshots). However, it does not explicitly compare with alternatives or give a when-not-to-use scenario.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_site: Delete a site (soft, 7-day recovery) (grade A)

Destructive, Idempotent

Soft-delete a site. confirm=true is required. The site moves to status 'deleted' immediately (its hostname is freed and it stops serving), and is fully purged after the team's recovery window by a sweeper. Use this for the normal 'remove this from my dashboard' flow. The response field 'accepted' is true when the soft-delete is recorded; the response also includes 'purgesAt' so you can tell the user when recovery becomes impossible.

ParametersJSON Schema

Name	Required	Description
`name`	No
`siteId`	No
`confirm`	Yes	Must be exactly true to actually delete the site.

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`siteId`	Yes
`status`	Yes
`accepted`	Yes
`purgesAt`	Yes	ISO timestamp when the soft-delete becomes a hard purge (~7 days from now).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
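
Since `purgesAt` is an ISO timestamp, an agent can turn it into a "time left to recover" figure for the user. The sketch below assumes a standard ISO 8601 string; `time_until_purge` is a hypothetical helper, and treating a timestamp without an offset as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone


def time_until_purge(purges_at: str, now: datetime) -> timedelta:
    """Return how long recovery remains possible, never less than zero."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    parsed = datetime.fromisoformat(purges_at.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return max(parsed - now, timedelta(0))
```

Passing `now` explicitly (for example `datetime.now(timezone.utc)`) keeps the helper easy to test and avoids mixing naive and timezone-aware values.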

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that deletion is soft with a recovery window, that confirm=true is required, immediate effects (hostname freed, stops serving), and future purge. Also notes response includes 'accepted' and 'purgesAt' fields. This adds significant context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five sentences with no fluff. Key information is front-loaded: 'Soft-delete a site. confirm=true is required.' Efficient and scannable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers behavior, response, and usage flow. Lacks mention of which identifier (name or siteId) is required or error handling, but output schema exists for details. Adequate for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Reinforces that confirm must be exactly true, which adds meaning beyond schema's const constraint. Does not explain name or siteId, but they are typical identifiers; with low schema coverage (33%), the description compensates somewhat.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'soft-delete a site' and explains the effect (status deleted, hostname freed, stops serving). Distinguishes from sibling tools like delete_source_file or remove_custom_domain by focusing on site-level deletion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this for the normal remove this from my dashboard flow', giving clear usage context. Does not mention when not to use or alternatives, but the description implies it's the standard deletion method.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_source_file: Delete a file from the source tree (grade B)

Destructive, Idempotent

Remove one file from the site's editable source tree. The served dist is unchanged.

ParametersJSON Schema

Name	Required	Description
`name`	No
`path`	Yes	Source-relative path to delete.
`siteId`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`path`	Yes
`siteId`	Yes
`existed`	Yes	True if the file was present and removed; false if it didn't exist (no-op).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

B3.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare destructiveHint=true and readOnlyHint=false. The description adds the key behavioral detail that only the source tree is affected, not the served dist. However, it omits error conditions or side effects for missing paths, making the transparency adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is only two sentences, front-loaded with the core action. Every word serves a purpose, with no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having an output schema, the description lacks completeness for a destructive tool with many siblings. It does not explain required parameters, success/error outcomes, or how it interacts with other file operations, leaving significant gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 33% schema description coverage, the description should compensate by explaining the parameters. It does not mention 'name', 'path', or 'siteId' beyond what the schema provides. The term 'one file' is too vague to clarify parameter usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Remove' and the resource 'file from the site's editable source tree', distinguishing it from sibling tools like 'delete_site' or 'update_file_content'. It specifies the scope (source tree only) and notes that the served dist is unchanged.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides minimal usage context, only implying that the tool is for source files. It does not explicitly state when to use it versus alternatives like 'delete_site' or 'read_source_file', nor does it offer exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deploy_from_url: Deploy a site from a public archive URL (grade A)

Destructive

Publish a website to a live URL from a public archive link. Point this at a tar(.gz) archive on github / gist / S3 and the server fetches and deploys it, no upload from your side. Server-side fetch of a tar(.gz) archive from a public HTTPS URL, then deploy its contents. Sidesteps the case where your code-execution sandbox can reach github / gist / S3 etc. but not mcp.vibedeploy.be's upload endpoint. Equivalent to begin_deploy → POST uploadUrl → commit_deploy in one call. Hostname allowlist enforced; see the archiveUrl description.

ParametersJSON Schema

Name	Required	Description
`mode`	Yes	How the archive's files apply: replace wipes the live dist; patch merges them in.
`name`	Yes	Site name to deploy to.
`archiveUrl`	Yes	Public HTTPS URL of a tar(.gz) archive. The server fetches it (max 100 MB, 60s timeout), parses the tarball, and deploys its files. Allowed hosts: github.com / raw.githubusercontent.com / gist.github.com / gist.githubusercontent.com / gitlab.com / bitbucket.org / codeberg.org / *.amazonaws.com / *.r2.cloudflarestorage.com / *.backblazeb2.com / *.workers.dev / *.pages.dev / transfer.sh / 0x0.st / mcp.vibedeploy.be. Use this when your runtime sandbox can reach the host above but can't reach mcp.vibedeploy.be's upload endpoint directly.
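
A client can pre-check an archive URL against the listed hosts before calling the tool, which avoids a round trip that the allowlist would reject anyway. The sketch below is a local approximation only: `is_allowed_archive_url` is a hypothetical helper, and reading the cloud-storage entries as "any subdomain of this suffix" is an assumption about how the server matches them.

```python
from urllib.parse import urlparse

# Hosts listed verbatim in the archiveUrl description.
EXACT_HOSTS = {
    "github.com", "raw.githubusercontent.com", "gist.github.com",
    "gist.githubusercontent.com", "gitlab.com", "bitbucket.org",
    "codeberg.org", "transfer.sh", "0x0.st", "mcp.vibedeploy.be",
}
# Assumption: these entries match any subdomain of the given suffix.
SUFFIX_HOSTS = (
    ".amazonaws.com", ".r2.cloudflarestorage.com", ".backblazeb2.com",
    ".workers.dev", ".pages.dev",
)


def is_allowed_archive_url(url: str) -> bool:
    """Best-effort local check; the server's allowlist remains authoritative."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False  # the schema requires a public HTTPS URL
    host = (parsed.hostname or "").lower()
    return host in EXACT_HOSTS or host.endswith(SUFFIX_HOSTS)
```

The check does not cover the 100 MB size limit or the 60s fetch timeout, which only the server can enforce.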

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`mode`	Yes
`name`	Yes
`siteId`	Yes
`archiveUrl`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`bytesFetched`	Yes
`filesDeployed`	Yes

Tool Definition Quality

A5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate destructiveHint=true, and the description confirms destructive behavior. It adds extra behavioral details: server-side fetch, size/timeout limits, host allowlist, and equivalence to a multi-step process. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (6 sentences) but packs essential information: purpose, mechanism, use case, constraints, and alternatives. Front-loaded with key verb and resource. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 required params, full schema coverage, output schema exists, and annotations present, the description is complete. It covers when to use, constraints, and the one-call nature, providing all necessary context for correct tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage, but description adds significant value beyond schema: for archiveUrl it lists allowed hosts, max size, timeout; for mode it reinforces the 'replace' vs 'patch' semantics. The description also explains the one-call equivalence.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The title and description clearly state the tool's purpose: deploy a site from a public archive URL. It uses specific verbs and resources and distinguishes itself from siblings like begin_deploy, commit_deploy by explaining it's equivalent to the three-step process in one call.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool: when the sandbox can reach archive hosts but not the upload endpoint. It names alternatives (begin_deploy → upload → commit) and mentions the hostname allowlist constraint.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deploy_site: Deploy a site (create or full-replace) (grade A)

Destructive, Idempotent

Publish a website to a live URL. Deploy a static site or single-page app you built (with AI or by hand) to your platform subdomain (e.g. {name}.vibedeploy.be or {name}.vibedeploy.eu) with automatic SSL, and optionally a custom domain. The fastest way to get a localhost project or an AI-generated site online. DESTRUCTIVE on existing sites: replaces every file on the named site with the supplied set. Files not in this call are deleted. For a new site, creates and provisions it. For an existing site, requires confirm: "I-want-to-replace-all-files" to proceed; without confirm the call is rejected before anything is touched. Use update_site (default mode:'patch') if you want to add or change individual files without removing the rest. Use dryRun:true to preview the diff. LARGE FILES: don't split a big text file across a placeholder deploy + chunked follow-ups — a 100-250 KB HTML/CSS/JS file fits in THIS call when sent with encoding:'gzip+base64' (gzip locally, base64 the result; text compresses 3-5×). The site is published at your platform subdomain (e.g. {name}.vibedeploy.be or {name}.vibedeploy.eu). After deploy, call add_custom_domain to also serve at a user-owned hostname.
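
The LARGE FILES advice (gzip locally, then base64 the result) might look like the following in practice. This is a sketch, not an official client: `encode_file_entry` is a hypothetical helper, and the entry shape follows the `{path, content, encoding?}` form listed for the `files` parameter.

```python
import base64
import gzip


def encode_file_entry(path: str, text: str) -> dict:
    """Build one files entry with gzip+base64 encoded content."""
    compressed = gzip.compress(text.encode("utf-8"))  # gzip locally
    return {
        "path": path,
        "content": base64.b64encode(compressed).decode("ascii"),  # base64 the result
        "encoding": "gzip+base64",
    }
```

The actual size reduction depends on the content, so it is worth checking the encoded length of a real file rather than relying on the 3-5× figure quoted in the description.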

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Site subdomain. Lowercase, 3-63 chars, alphanumeric + hyphens. Must not start or end with a hyphen.
`files`	Yes	Either an array of {path, content, encoding?} entries OR a path->content map. Total payload <= 500 MB.
`dryRun`	No	If true, validate input + introspect what would change but don't write or delete. Returns the same shape with `dryRun: true` and `deletedFiles` showing what would be removed. Strongly recommended before any deploy_site against an existing site.
`confirm`	No	Required when the named site already exists. Pass exactly "I-want-to-replace-all-files" to acknowledge that every existing file will be deleted and replaced with this new fileset. Omit on first deploy of a new site. If you want to add or change files without removing the others, use update_site instead — it defaults to patch mode.
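
The `name` rules (lowercase, 3-63 characters, alphanumeric plus hyphens, no leading or trailing hyphen) can be checked locally before a deploy. The pattern below encodes only the rules stated in the schema; the server may apply further checks (reserved names, for instance) that are not documented here, and `is_valid_site_name` is a hypothetical helper.

```python
import re

# One leading alphanumeric, 1-61 middle characters, one trailing alphanumeric:
# total length 3-63, never starting or ending with a hyphen.
SITE_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_site_name(name: str) -> bool:
    """Return True if name satisfies the documented subdomain rules."""
    return SITE_NAME_RE.fullmatch(name) is not None
```

For example, `my-site` passes, while `ab` (too short), `-abc` (leading hyphen) and `My-Site` (uppercase) are rejected.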

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes	Live URL of the deployed site.
`dryRun`	No	True if this was a dry-run; nothing was written or deleted.
`siteId`	Yes
`created`	Yes	True if the site was created by this call.
`warnings`	No	Surfaced issues that did not block the deploy. Common types: DOTFILE_PUBLIC (a .well-known/* file is served publicly, confirm intent), or secret-scanner findings (AWS Access Key, Stripe Key, JWT Token, etc.).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`deletedFiles`	Yes	Files that existed before this call and were removed by it. Empty for brand-new sites.
`filesDeployed`	Yes

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses destructive behavior (replaces all files, deletes unmentioned ones) and the required confirm flag, going beyond the destructiveHint annotation. It also covers the dryRun preview and large-file handling without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose and destruction warning, well-structured with clear sections. Slightly verbose but every sentence adds value; could be mildly more concise without losing important guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete for a complex, destructive tool with 4 params. Covers new vs existing, confirm, dryRun, large files encoding, and post-deploy steps. Output schema exists so return values not required; description covers all other context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds extra context: files array vs object use cases, binary support, confirm exact string requirement, dryRun preview behavior. Enhances understanding beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool deploys a website to a live URL, specifying it's for static sites or single-page apps. It distinguishes from siblings like update_site (patch mode) and accurately identifies the verb 'deploy' and resource 'site'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly guides when to use this tool vs update_site, recommends dryRun for preview, explains confirm requirement for existing sites, gives large file encoding advice, and mentions post-deploy custom domain call. Covers when-not and alternatives thoroughly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_account: Read plan, limits, and current usage (A)

Read-onlyIdempotent

Inspect

Return the team's plan, its limits, and current usage. Use this BEFORE deploy_site or add_custom_domain to know whether a deploy would trip a plan limit, instead of provoking PLAN_LIMIT_EXCEEDED. Also returns the per-token MCP rate-limit ceiling (live remaining is in X-RateLimit-Remaining response header).

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
`plan`	Yes	Effective plan name: Free, Freemium, Maker, Studio, Business, Ultimate.
`team`	Yes
`usage`	Yes
`limits`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`planExpiresAt`	Yes	ISO timestamp when the plan downgrades to Free, or null if no expiry set.
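
A pre-flight check along the lines the description recommends might look like the sketch below. The shapes of `limits` and `usage` are not documented in this listing, so the code assumes both are flat dicts of numbers that share keys; the key names used in any call (such as `sites`) are hypothetical.

```python
def limits_at_risk(account: dict, planned: dict) -> list:
    """Return the limit keys that the planned increases would exceed.

    Assumes account["limits"] and account["usage"] are flat dicts of numbers
    sharing the same keys; the real shapes are not documented here.
    """
    limits = account["limits"]
    usage = account["usage"]
    at_risk = []
    for key, extra in planned.items():
        limit = limits.get(key)
        if limit is None:
            continue  # no known limit for this key, so nothing to compare
        if usage.get(key, 0) + extra > limit:
            at_risk.append(key)
    return at_risk
```

An empty result would suggest the deploy is unlikely to trip PLAN_LIMIT_EXCEEDED, under the assumptions above.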

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint true, destructiveHint false, idempotentHint true. The description adds behavioral context about rate-limit ceiling and header, which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: purpose first, then usage, then additional rate-limit info. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists, the description does not need to explain return values. It covers purpose, usage, and extra rate-limit info, which is complete for this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters, so schema coverage is 100%. Baseline 4 applies. Description does not need to add param info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the team's plan, limits, and usage. It uses specific verb 'Return' and resource 'team's plan, limits, usage'. It distinguishes from siblings by mentioning usage for deploy_site and add_custom_domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use: 'Use this BEFORE deploy_site or add_custom_domain to know whether a deploy would trip a plan limit'. It also implies when not to use: 'instead of provoking PLAN_LIMIT_EXCEEDED'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_forms_config: Get forms-relay config for a site (A)

Read-onlyIdempotent

Inspect

Read the form-to-email relay config of a site, plus the resolved delivery mode, the active From address, and (for a custom sender domain) the DNS records to publish and their verification status. Submissions: POST JSON to the returned endpoint with Content-Type: application/json (UTF-8). Flat object of form fields (strings/numbers/booleans; checkbox groups may be arrays of strings, joined with ', '). Max 30 fields, 5000 chars/field, 20000 total. Response: {success:true,data:{ok:true}} or {success:false,error:{code,message}}. Rate limit: 10 submits per IP per 10 minutes. Include a hidden honeypot input (default "_gotcha") and leave it empty.
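
The submission limits quoted above can be checked client-side before posting. The following is a minimal sketch, not the service's own validation: it assumes "20000 total" means total characters across all fields, that array values are measured after joining with ', ', and that numbers and booleans are measured by their string form.

```python
MAX_FIELDS = 30
MAX_FIELD_CHARS = 5000
MAX_TOTAL_CHARS = 20000
HONEYPOT = "_gotcha"  # default honeypot name from the description


def validate_submission(fields: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if len(fields) > MAX_FIELDS:
        problems.append(f"too many fields: {len(fields)} > {MAX_FIELDS}")
    total = 0
    for name, value in fields.items():
        if isinstance(value, list):
            if not all(isinstance(v, str) for v in value):
                problems.append(f"{name}: arrays must contain only strings")
                continue
            text = ", ".join(value)  # checkbox groups are joined with ', '
        elif isinstance(value, (str, int, float, bool)):
            text = str(value)
        else:
            # Nested objects break the "flat object" requirement.
            problems.append(f"{name}: unsupported type {type(value).__name__}")
            continue
        if len(text) > MAX_FIELD_CHARS:
            problems.append(f"{name}: {len(text)} chars > {MAX_FIELD_CHARS}")
        total += len(text)
    if total > MAX_TOTAL_CHARS:
        problems.append(f"total {total} chars > {MAX_TOTAL_CHARS}")
    if fields.get(HONEYPOT):
        problems.append(f"{HONEYPOT} must be left empty")
    return problems
```

A payload that passes would then be sent as JSON to the `endpoint` value returned by this tool.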

ParametersJSON Schema

Name	Required	Description	Default
`siteName`	Yes	The site whose forms-relay config to read.

Output Schema

ParametersJSON Schema

Name	Required	Description
`notes`	Yes
`enabled`	Yes
`delivery`	Yes	platform | verified-domain | verified-domain-pending | custom-relay.
`endpoint`	Yes	URL the site's form should POST to.
`siteName`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`formsConfig`	Yes	Stored config (smtpRelay.password redacted to hasPassword).
`activeSender`	Yes	The From that will actually be used right now.
`senderDomain`	No	DNS records to publish + verification status (verified-domain path).

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description adds context about what specific data is returned (endpoint, delivery mode, DNS records). The submission instructions are peripheral but do not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy and includes extensive submission instructions (rate limits, honeypot, etc.) that are not directly about the tool itself. These details could be moved to an output schema or separate documentation, reducing front-loaded clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity (one required parameter, read-only), the description covers the purpose and key return values adequately. The output schema likely provides additional structure, so the description is complete enough despite containing extraneous submission details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, siteName, is described in the schema with identical wording ('The site whose forms-relay config to read'). Since schema coverage is 100%, the description adds no extra value beyond the schema's description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Read the form-to-email relay config of a site', providing a specific verb and resource. It distinguishes from siblings like 'set_forms_config' and 'verify_forms_sender_domain' by its read-only nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It clearly indicates the tool is for reading config, but does not explicitly state when not to use it or provide alternative tools. However, the context of siblings implies when to use other tools. The inclusion of submission instructions could confuse the tool's intended usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_site: Get site details (A)

Read-onlyIdempotent

Inspect

Return name, url, plan, last deploy time, and recent deploy history.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No
`siteId`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`name`	Yes
`plan`	Yes
`files`	Yes	Number of files currently served by the site (live count from the pod, excluding lost+found and _staging). After update_site(mode:'patch'), this may be larger than the most recent deploy's fileCount because patch keeps the existing files. After update_site(mode:'replace'), it equals the most recent deploy's fileCount.
`siteId`	Yes
`status`	Yes	Lifecycle state. Sites are usable only in 'active'. 'deleted' is the soft-delete recovery bucket (returned until the team's restore window expires and the sweeper purges the row). 'deleting' is the transient state of an in-flight hard-delete request.
`history`	Yes
`bandwidth`	Yes
`filePaths`	No	Site-relative paths of every file currently on the pod (same scope as `files`). Lets a caller see what's there before deciding which paths to patch or delete, without having to download the site. Omitted when the live introspection step fails (e.g. pod not ready) — `files` then falls back to the most recent deploy's fileCount.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`lastDeployAt`	Yes
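
Two of the rules in this output table, the `status` lifecycle and the `files` fallback, lend themselves to a short sketch. The function names are invented for illustration; the field names come from the table.

```python
def site_is_usable(site: dict) -> bool:
    """Sites are usable only in the 'active' lifecycle state."""
    return site.get("status") == "active"


def file_count_source(site: dict) -> str:
    """Say where `files` came from, based on whether `filePaths` is present."""
    if "filePaths" in site:
        return "live pod count"
    # filePaths is omitted when live introspection fails; `files` then falls
    # back to the most recent deploy's fileCount.
    return "most recent deploy's fileCount"
```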

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true. The description adds value by listing the returned fields (name, url, plan, etc.), providing clarity beyond the annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with no unnecessary words. It conveys the essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While an output schema exists (reducing need to explain return values), the description fails to address how parameters are used—e.g., whether both are required, which takes priority, or behavior with missing parameters. This gap leaves the tool incomplete for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for parameters. The description does not explain the purpose or usage of 'name' and 'siteId', failing to add meaning beyond the schema. For a tool with undocumented parameters, the description should compensate but does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Return' and specifies the resource as site details, listing specific fields. It distinguishes from sibling tools like 'list_sites' which returns all sites, and 'get_site_analytics' which is for analytics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving details of a specific site, which is clear from context. However, it does not explicitly state when to use this tool versus alternatives, such as using 'list_sites' for multiple sites, nor does it provide exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_site_analytics: Get site traffic analytics (A)

Read-onlyIdempotent

Inspect

Return a privacy-safe traffic summary for a site over the last `period` days (default 7): total page views, distinct-visitor count, top pages, daily counts, device/browser breakdowns, and Web Vitals averages. Never exposes raw visitor IPs or user-agents.

ParametersJSON Schema

Name	Required	Description
`name`	No	Site name. Provide this or siteId.
`period`	No	Number of days to aggregate over (1-90). Defaults to 7.
`siteId`	No	Site id. Provide this or name.
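
The constraints in this parameter table can be enforced before calling the tool. This sketch assumes at least one of `name` or `siteId` must be given (the table says "provide this or siteId" without saying whether both are accepted) and rejects out-of-range periods rather than clamping them.

```python
def analytics_args(name=None, site_id=None, period=7):
    """Build get_site_analytics arguments, checking the documented ranges."""
    if name is None and site_id is None:
        raise ValueError("provide name or siteId")
    if not 1 <= period <= 90:
        raise ValueError("period must be between 1 and 90 days")
    args = {"period": period}
    if name is not None:
        args["name"] = name
    if site_id is not None:
        args["siteId"] = site_id
    return args
```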

Output Schema

ParametersJSON Schema

Name	Required	Description
`devices`	Yes
`browsers`	Yes
`topPages`	Yes	Up to 10 most-viewed paths, descending.
`pageViews`	Yes	Total page views in the window.
`webVitals`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`dailyCounts`	Yes	Page views per day.
`uniqueVisitors`	Yes	Distinct-visitor COUNT (by IP). Raw IPs are never returned.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, and non-destructive behavior. The description adds privacy safety details (no raw IPs or user-agents) and lists return components, providing valuable behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences with no wasted words, front-loaded with the primary purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown), the description adequately covers all needed information: purpose, parameters, safety, and return summary. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds the default value for 'period' and hints at mutual exclusivity of 'name' and 'siteId', providing added meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a privacy-safe traffic summary for a site over a specified period, specifying the verb 'Return' and the resource, and distinguishes from siblings as no other tool deals with analytics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on what the tool does and mentions default period, but does not explicitly state when to use it versus alternatives or when not to use it. However, since no sibling tool provides similar functionality, the implicit usage is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_custom_domains: List custom domains on a site (A)

Read-onlyIdempotent

Inspect

Return all custom domains attached to a site. Each entry has a recordId you can pass to verify_custom_domain or remove_custom_domain.

ParametersJSON Schema

Name	Required	Description	Default
`siteName`	Yes	The site whose custom domains to list.

Output Schema

ParametersJSON Schema

Name	Required	Description
`domains`	Yes
`siteName`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint; the description adds that the tool returns all custom domains with a recordId, which is consistent and provides extra context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no redundancy, action front-loaded, every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple tool, existing output schema, and annotations, the description fully covers necessary information for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the only parameter (siteName), the description adds no additional parameter meaning beyond the schema, meriting the baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Return') and resource ('custom domains attached to a site'), and distinguishes from siblings by mentioning recordId usage in related tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context by noting that recordId can be used with verify_custom_domain or remove_custom_domain, but does not explicitly state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_deploys: List staging sessions (A)

Read-onlyIdempotent

Inspect

Return staging sessions for the team this token belongs to. Defaults to currently-active ones (open + committing). Up to 50 rows.

ParametersJSON Schema

Name	Required	Description	Default
`status`	No	Filter by status. Default lists 'open' and 'committing' (the actionable ones). Pass an explicit status to inspect history.

Output Schema

ParametersJSON Schema

Name	Required	Description
`sessions`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A4.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds value by noting the 50-row limit and default status filter, which are useful but not critical behavioral details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short, direct sentences with no unnecessary words. Information is efficiently presented.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (1 optional parameter, no required params, output schema present), the description fully covers purpose, default behavior, and usage. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a single parameter. The description enhances it by explaining why the default is 'open' and 'committing' (actionable ones) and suggesting the use of explicit status for history. This adds practical context beyond enum values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool returns staging sessions for the team, with default filtering to active ones (open + committing) and a row limit of 50. This is specific and distinguishes from sibling tools like begin_deploy or commit_deploy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains default behavior and how to inspect history by passing an explicit status. It does not mention alternatives or when not to use, but the sibling context makes it clear this is the primary list tool for staging sessions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_dns_records: List DNS records for a site (A)

Read-onlyIdempotent

Inspect

Read the DNS records VibeDeploy tracks for a site (the records it created/manages on your behalf), oldest first. Returns each record's host, type, and value. Any team member, including viewers, can read DNS records. This tool is read-only and does NOT create, change, or delete any DNS record.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Site name to look up DNS records for.
`siteId`	No	Site id to look up DNS records for. Provide name or siteId.

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`siteId`	Yes
`records`	Yes	DNS records VibeDeploy tracks for this site, oldest first. Read-only — DNS changes are not made through this tool.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
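
Since each record is said to carry a host, type, and value, a caller might group the `records` list by type for display. The sketch assumes the JSON keys are literally `host`, `type`, and `value`, which the description implies but the schema above does not spell out.

```python
from collections import defaultdict


def group_by_type(records: list) -> dict:
    """Group records by DNS type, preserving the oldest-first order."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["type"]].append((record["host"], record["value"]))
    return dict(grouped)
```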

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description reinforces annotations by stating read-only and non-destructive, adding that records are managed on behalf of the user and ordering. Annotations already cover safety; description adds context about scope and permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences, no fluff. Key action stated first, then returned fields and permissions, then safety note. Optimal structure for agent consumption.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with rich annotations and output schema, description covers ordering, permissions, and scope. No missing critical information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. Description does not add new meaning beyond 'provide name or siteId'. Baseline score of 3 is appropriate as schema already documents parameters well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it reads DNS records managed by VibeDeploy for a site, specifies ordering (oldest first) and returned fields (host, type, value). Distinguishes from sibling tools like list_custom_domains which deal with custom domains, not DNS records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States when to use: to read DNS records. Notes that any team member can use it. Lacks explicit when-not-to-use or alternatives, but context of siblings shows no other DNS reading tool, so guidance is adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_file_hashes: List SHA-256 hashes of every file on a site (A)

Read-onlyIdempotent

Inspect

Return SHA-256 + size for every file currently served. Use BEFORE re-deploying to skip files whose content hasn't changed: hash your local files, diff against this list, and only ship the differences via update_site mode:'patch' or begin_deploy → add_files. For SPAs with content-hashed bundle names this typically reduces a full-site redeploy to a handful of files.
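
The hash-and-diff workflow described above could be sketched as follows. The entry shape of `files` is not shown in this listing, so the code assumes each remote entry looks like `{"path": ..., "sha256": ...}` with a lowercase hex digest and a site-relative POSIX path; those field names are hypothetical.

```python
import hashlib
from pathlib import Path


def local_hashes(root: str) -> dict:
    """Map site-relative POSIX paths to SHA-256 hex digests for files under root."""
    base = Path(root)
    return {
        p.relative_to(base).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def changed_paths(local: dict, remote_files: list) -> list:
    """Return local paths that are new or whose hash differs from the remote list.

    Assumes each remote entry looks like {"path": ..., "sha256": ...};
    the real field names are not documented here.
    """
    remote = {f["path"]: f["sha256"] for f in remote_files}
    return sorted(path for path, digest in local.items() if remote.get(path) != digest)
```

Only the paths returned by `changed_paths` would need to be shipped in a patch. Files that exist remotely but not locally are outside the scope of this sketch.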

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Site name or custom domain. Same lookup rules as get_site.
`siteId`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`files`	Yes
`siteId`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes
`totalFiles`	Yes

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. Description adds context about returning data for all files served but no further behavioral traits like rate limits or failure modes. Consistent with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: first states purpose, second gives primary use case, third provides concrete example. Efficient and front-loaded, but the example could be merged with the use case.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given two parameters, output schema present, and comprehensive annotations, the description is complete. It explains the tool's role in a deployment workflow and how to use it with sibling tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50%: name parameter has description referencing get_site lookup rules; siteId has no description. Description does not elaborate on siteId. Partially compensates for missing schema details but leaves a gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it returns SHA-256 and size for every file currently served, using specific verb+resource. It distinguishes from siblings like list_source_files by focusing on hashes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using before redeploying to skip unchanged files and references alternative deployment methods (update_site mode:'patch', begin_deploy → add_files). Provides clear when-to-use and how-to-use context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_history: List deploy/snapshot history for a site (A)

Read-onlyIdempotent

Inspect

Return the most recent 50 deploy and snapshot history entries for a site, newest first. Includes the source (how it was triggered), an optional label, the associated Longhorn snapshot name (if any), the file count, and the number of secrets detected. Any team member can read history.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No
`siteId`	No

Output Schema

ParametersJSON Schema

Name	Required	Description
`entries`	Yes	Most recent 50 deploy/snapshot history entries, newest first.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A (3.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds that it returns most recent 50 entries, ordering, and included fields. It also adds access context ('Any team member'). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences long: the first states the core purpose and ordering, the second lists the returned fields, and the third adds access context. It is front-loaded and contains no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains what is returned (most recent 50 entries, fields) and ordering, but does not mention pagination, how to get more than 50, or which parameter is required (likely siteId). The output schema exists but is not shown; description is adequate but has gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning no descriptions for parameters 'name' and 'siteId'. The description does not explain what these parameters represent or when they are needed. It fails to compensate for the lack of schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'return' and resource 'deploy and snapshot history entries' with specific scope ('most recent 50', 'newest first'). It distinguishes from sibling tools like list_deploys and list_snapshots by combining both types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Any team member can read history' which gives permission context but does not explicitly state when to use this tool versus alternatives like list_deploys or list_snapshots. No when-not or alternatives guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_sites: List your sites (grade B)

Read-only, Idempotent

List sites for the team this connection belongs to.

Parameters

Name	Required	Description	Default
`includeDeleted`	No	If true, include soft-deleted sites still in their plan-specific recovery window (status: 'deleted'). Defaults to false: deleted sites can't accept deploys, so an agent rarely wants them in a working list. Use true when you specifically need the recovery view.

Output Schema

Name	Required	Description
`sites`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

B (3.3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint. The description adds no additional behavioral traits beyond the purpose. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise single-sentence description. Parameter guidance is detailed but lives in the schema. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema present, return values need no explanation. The description covers the core purpose and the team scope. It could mention pagination, but a simple list tool with one optional parameter seems adequately specified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter 'includeDeleted', which already includes explanation. The tool description does not add meaning beyond the schema, so baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb+resource (list sites). Scope 'for the team this connection belongs to' adds specificity. Does not explicitly differentiate from sibling tools like 'get_site' or 'list_deploys', but the name is unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or alternatives guidance for the tool overall. However, the 'includeDeleted' parameter description provides context on when to use it vs default, offering implicit usage guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_snapshots: List snapshots for a site (grade A)

Read-only, Idempotent

List the Longhorn volume snapshots for a site. Snapshots are point-in-time backups of the site's served files. Any team member can list snapshots. Returns NO_VOLUME if the site has no volume yet (it has never been deployed).

Parameters

Name	Required	Description	Default
`name`	No
`siteId`	No

Output Schema

Name	Required	Description
`snapshots`	Yes	Longhorn snapshot objects for the site's volume (name, created timestamp, size, etc.).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A (3.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true and destructiveHint=false. The description adds useful context: permission info ('Any team member') and a specific error return (NO_VOLUME when no volume exists), which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences with no wasted words. The main action comes first, then a definition of snapshots, then access context, then the error condition. Front-loaded and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given an output schema exists (not shown), return value explanation is not needed. The description covers purpose, permissions, and a failure mode, but lacks parameter explanations. For a simple list tool, it is adequate but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It does not explain the parameters 'name' and 'siteId' at all—neither their purpose nor whether they are optional. The agent has to guess their meanings.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'Longhorn volume snapshots for a site'. It explains what snapshots are (point-in-time backups) and distinguishes from siblings like 'create_snapshot' by focusing on listing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Any team member can list snapshots' indicating no special permissions, and describes a failure condition (NO_VOLUME). However, it does not provide explicit when-not-to-use guidance or comparisons to alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_source_files: List the editable source tree for a site (grade A)

Read-only, Idempotent

Return SHA-256 + size for every file in the site's editable source tree (the platform's copy of the pre-build code, not the served dist). Use BEFORE editing so you know which paths exist and which haven't changed since the last build. autoPromote:true will mirror the served dist into source for static-only sites whose source tree is empty (does nothing if the dist looks built).

Parameters

Name	Required	Description
`name`	No
`siteId`	No
`autoPromote`	No	If true and the site has no source tree yet but its dist looks static, copy dist → source on the fly. Default: false.
`forcePromote`	No	If true, mirror dist → source EVEN when dist looks built (e.g. minified Vite output). Use when the original source isn't recoverable and you're willing to edit the build artefact directly. Sets manifest.noBuild=true automatically when no package.json is in the dist, so subsequent build_and_deploy short-circuits to a direct source→dist copy. forcePromote implies autoPromote.

Output Schema

Name	Required	Description
`name`	Yes
`files`	Yes
`siteId`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes
`totalFiles`	Yes
`autoPromoted`	No	Set to true when this call ran the auto-promote (dist → source) before listing. Lets the caller learn the source tree was just synthesised from the served dist.
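
The "use BEFORE editing" advice can be sketched as a hash comparison between local files and the returned listing. This is a minimal illustration, not the platform's client code: the shape of each `files` entry is not documented here, so the `path` and `sha256` keys (and a lowercase hex digest) are assumptions, and `diff_against_listing` is a hypothetical helper name.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def diff_against_listing(local_files: dict, listing: list) -> dict:
    """Classify local paths relative to a list_source_files-style listing.

    `local_files` maps site-relative paths to file bytes.
    `listing` stands in for the `files` array of the response; the
    `path` and `sha256` keys are assumed names, not documented fields.
    """
    remote = {entry["path"]: entry["sha256"] for entry in listing}
    result = {"new": [], "changed": [], "unchanged": [], "remote_only": []}
    for path, data in sorted(local_files.items()):
        if path not in remote:
            result["new"].append(path)
        elif remote[path] == sha256_hex(data):
            result["unchanged"].append(path)
        else:
            result["changed"].append(path)
    # Paths that exist on the platform but not locally.
    result["remote_only"] = sorted(set(remote) - set(local_files))
    return result
```

Only the `new` and `changed` paths would need to be written back; `unchanged` paths can be skipped.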

Tool Definition Quality

A (4.3/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, and non-destructive behavior. The description adds valuable context about the return values (hash and size) and the special behaviors of autoPromote and forcePromote beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences efficiently convey purpose, usage, and special behavior without redundancy. The structure is front-loaded with core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, output schema, annotations), the description fully explains behavior, return values, and edge cases (autoPromote/forcePromote), making it comprehensive for a listing tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 50% of parameters with descriptions for autoPromote and forcePromote. The tool description expands on these behaviors but does not add meaning for name and siteId, which remain generic strings.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns SHA-256 and size for every file in the editable source tree, distinguishing it from sibling reading tools like 'read_source_file' by focusing on metadata instead of content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises use before editing to know existing paths and detect changes, providing clear context. However, it does not explicitly mention alternatives like 'read_source_file' for reading content.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_file: Read a file from a deployed site (grade A)

Read-only, Idempotent

Return the bytes of one file currently served by the site. Use this to inspect or edit existing content (call read_file → modify → update_site mode:'patch') so a new chat can iterate on a site without re-uploading. Files larger than 5242880 bytes can't be read in one call. Use the list of paths from get_site.filePaths to discover what's available.

Parameters

Name	Required	Description
`name`	No	Site name (subdomain) or custom domain. Same lookup rules as get_site.
`path`	Yes	Site-relative path of the file to read (e.g. 'index.html', 'assets/main.css'). No leading slash, no '..'.
`siteId`	No	Alternative to name. One of name\|siteId is required.
`maxBytes`	No	Per-file size cap. Default 1048576, hard max 5242880. If the file is larger, the call fails with FILE_TOO_LARGE rather than returning truncated bytes — splitting source mid-token would corrupt downstream edits.

Output Schema

Name	Required	Description
`name`	Yes
`path`	Yes	Echoes the input path, normalized (leading slash stripped, backslashes converted).
`size`	Yes	Size in bytes of the file on the pod.
`siteId`	Yes
`content`	Yes
`encoding`	Yes	How to interpret `content`. utf8 means the file is text and `content` is the raw text. base64 means the file is binary (image/font/etc.) and `content` is base64 — decode before use.
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
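
The `encoding` and `maxBytes` rules above can be sketched on the client side as follows. This is an illustrative sketch under stated assumptions: the response is modelled as a plain dict with the `content` and `encoding` fields from the output schema, and `decode_content` and `clamp_max_bytes` are hypothetical helper names. Clamping a requested cap to the hard maximum is a client-side choice made here; how the server treats an over-limit `maxBytes` is not stated in the listing.

```python
import base64
from typing import Optional

DEFAULT_MAX_BYTES = 1_048_576  # documented default (1 MiB)
HARD_MAX_BYTES = 5_242_880     # documented hard max (5 MiB)


def clamp_max_bytes(requested: Optional[int]) -> int:
    """Pick a maxBytes value that stays within the documented hard max."""
    if requested is None:
        return DEFAULT_MAX_BYTES
    return max(1, min(requested, HARD_MAX_BYTES))


def decode_content(response: dict) -> bytes:
    """Turn a read_file-style response into raw file bytes.

    utf8   -> `content` is the raw text, so encode it back to bytes.
    base64 -> `content` is base64 of a binary file, so decode it.
    """
    encoding = response["encoding"]
    content = response["content"]
    if encoding == "utf8":
        return content.encode("utf-8")
    if encoding == "base64":
        return base64.b64decode(content)
    raise ValueError(f"unexpected encoding: {encoding!r}")
```

Working on bytes after decoding keeps text and binary files on the same code path before a patch-mode update.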

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, idempotent, and non-destructive. The description adds crucial details: size limit (5242880 bytes), failure mode (FILE_TOO_LARGE rather than truncation), and rationale for not splitting bytes. This goes well beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with the core purpose. Each adds distinct value: purpose, usage pattern, size constraint, and file discovery. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, usage, size limits, error behavior, and path conventions. The output schema handles return value details. Only minor gap: the description does not mention the utf8/base64 encoding of the returned content, though the output schema's `encoding` field documents it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the size limit behavior and path conventions (no leading slash, no '..'), which are not fully captured in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it returns the bytes of one file and distinguishes from siblings like read_files (plural) and read_source_file. The use case for inspection and editing is clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a concrete usage pattern (read_file → modify → update_site) and mentions using get_site.filePaths for discovery. It doesn't explicitly state when to avoid using this tool, but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_files: Read multiple files from a deployed site in one call (grade A)

Read-only, Idempotent

Batched version of read_file. Pass up to 50 paths; each is fetched independently with the same per-file rules as read_file. The whole batch is capped at 8388608 bytes total — once that's exhausted, remaining paths fail with BATCH_BUDGET_EXCEEDED so the agent can re-request them in another call.

Parameters

Name	Required	Description
`name`	No
`paths`	Yes	Site-relative paths to read (1..50). Order is preserved in the response.
`siteId`	No
`maxBytesPerFile`	No	Per-file cap. Default 1048576, hard max 5242880.

Output Schema

Name	Required	Description
`name`	Yes
`files`	Yes	One entry per requested path, in the same order. Each entry is independent: a missing file or oversized file fails its own entry but does not abort the whole batch. If the cumulative byte budget is exhausted partway through, remaining entries fail with code BATCH_BUDGET_EXCEEDED.
`siteId`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes	Sum of bytes returned across successful entries.
`budgetExceededAt`	Yes	Index of the first path that was skipped due to the cumulative byte budget, or null if everything fit.
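
The batching limits above suggest a simple client loop: split the path list into calls of at most 50, then re-request whatever fell past the byte budget. The sketch below shows only the bookkeeping, not the tool call itself. It assumes `budgetExceededAt` is a zero-based index into the requested `paths` (the listing says "index" without giving the base), and `chunk_paths` and `paths_to_retry` are hypothetical helper names.

```python
MAX_PATHS_PER_CALL = 50  # documented limit on `paths`


def chunk_paths(paths: list, size: int = MAX_PATHS_PER_CALL) -> list:
    """Split a path list into consecutive batches of at most `size` items."""
    return [paths[i:i + size] for i in range(0, len(paths), size)]


def paths_to_retry(requested: list, response: dict) -> list:
    """Return the paths skipped because the cumulative byte budget ran out.

    Assumes `budgetExceededAt` is a zero-based index into `requested`,
    or None (JSON null) when everything fit.
    """
    index = response.get("budgetExceededAt")
    if index is None:
        return []
    return list(requested[index:])
```

Entries that failed for their own reasons (missing or oversized files) sit before `budgetExceededAt` and are not retried by this helper; they would need to be inspected per entry.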

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds important behavioral details: batch limit (50), total byte cap (8388608), per-file rules same as read_file, and explicit error handling (BATCH_BUDGET_EXCEEDED). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with key information (batched version, limits, error). Every sentence adds value without redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of annotations and an output schema, the description is largely complete. It covers limits, error handling, and references per-file rules from read_file. It does not repeat output schema details, which is appropriate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50% (paths and maxBytesPerFile have descriptions; name and siteId do not). The description adds context about the batch behavior and total byte cap but does not explain the name or siteId parameters. It partially compensates for the coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a batched version of read_file, specifying the verb (read), resource (files from a deployed site), and scope (batch of up to 50 paths). It distinguishes itself from the sibling read_file by being a batch operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on usage: pass up to 50 paths, each fetched independently with the same per-file rules as read_file, and a total byte cap. It implies when to use (batch reading) but does not explicitly state when not to use or compare to other siblings beyond read_file.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_source_file: Read a file from the editable source tree (grade A)

Read-only, Idempotent

Return the bytes of one source file (the platform's editable copy of the pre-build code), letting an AI in any future chat fetch and edit content without needing the original local files. Use list_source_files first to discover paths. For the served dist, use read_file instead.

Parameters

Name	Required	Description
`name`	No
`path`	Yes	Site-relative path inside the source tree, e.g. 'src/App.tsx'.
`siteId`	No
`maxBytes`	No	Per-file size cap. Default 1048576, hard max 5242880.

Output Schema

Name	Required	Description
`name`	Yes
`path`	Yes
`size`	Yes
`siteId`	Yes
`content`	Yes
`encoding`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A (4.3/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the tool is read-only. The description adds useful context: it returns bytes, targets the editable source tree, and mentions the maxBytes parameter for size caps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each focused and purposeful. The first states the core action; the second and third give usage context (path discovery via list_source_files, and read_file for the served dist). No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main purpose and usage, and an output schema exists to explain return values. However, it omits explanations for two parameters (name, siteId) and does not mention that the tool reads only a single file, which is clear from the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (descriptions for path and maxBytes). The description reinforces path's meaning but does not clarify name or siteId parameters, which lack schema descriptions. This adds some value but does not fully compensate for missing parameter info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies 'Return the bytes of one source file', clearly stating the verb and resource. It distinguishes from siblings by referencing list_source_files for discovery and read_file for the served dist.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool versus alternatives: 'For the served dist, use read_file instead.' Also provides prerequisite: 'Use list_source_files first to discover paths.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_source_files: Read multiple source files in one call (grade A)

Read-only, Idempotent

Batched read across the editable source tree (up to 50 paths). Each entry is independent: a missing/oversized file fails its own slot but doesn't abort the batch. Cumulative cap 8388608 bytes; remainder fails with BATCH_BUDGET_EXCEEDED.

Parameters

Name	Required	Description	Default
`name`	No
`paths`	Yes
`siteId`	No
`maxBytesPerFile`	No

Output Schema

Name	Required	Description
`name`	Yes
`files`	Yes
`siteId`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes
`budgetExceededAt`	Yes

Tool Definition Quality

A (4.4/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral details beyond annotations: per-entry independence, error handling, cumulative byte cap, and batch budget error, which are not covered by readOnlyHint or idempotentHint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise, front-loaded sentences with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key behavioral aspects (batch independence, error mode, budget limit) and output schema exists, but could mention that result is per-path. Still sufficiently complete for a read tool with annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. Description explains 'paths' and cumulative cap but does not detail 'name', 'siteId', or 'maxBytesPerFile', partially compensating for missing schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'batched read across the editable source tree' with a specific resource and limit of 50 paths, distinguishing it from siblings like single-file reads.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the batched read behavior and error isolation but does not explicitly say when to use this tool versus siblings like 'read_source_file' for single files.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rehost_images: Rehost a site's external images locally (grade A)

Destructive, Idempotent

Download the external images a site references (e.g. from the old site it was rebuilt from), store them on this VibeDeploy site under assets/img/, and rewrite the HTML references to local paths so the site no longer depends on the original. Call this once AFTER deploying a site rebuilt with the Website Converter. Auto-detects the external image URLs from the site's own HTML; downloads are SSRF-guarded, size/count/time capped, and applied atomically (patch mode). Images already hosted on vibedeploy.be are skipped.

Parameters

Name	Required	Description	Default
`name`	Yes	Site name (subdomain) whose external images should be downloaded and rehosted locally.

Output Schema

Name	Required	Description
`name`	Yes
`siteId`	Yes
`failures`	Yes	External image URLs that could not be rehosted (left untouched in the HTML).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`bytesHosted`	Yes
`imagesHosted`	Yes	Number of external images downloaded and stored on the site.
`htmlFilesUpdated`	Yes	Number of HTML files whose img references were rewritten to local paths.
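
The detect-and-rewrite step described for rehost_images can be illustrated with a small standard-library sketch. It is not the platform's implementation: it performs no downloads and no SSRF checks, it only looks at `<img src>` attributes, and the hashed file-naming scheme under assets/img/ is an assumption made for the example. All function names are hypothetical.

```python
import hashlib
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for key, value in attrs:
                if key == "src" and value:
                    self.sources.append(value)


def is_external(url: str, own_domain: str = "vibedeploy.be") -> bool:
    """True for absolute URLs whose host is outside `own_domain`."""
    host = urlparse(url).hostname
    if host is None:
        return False  # relative path or data: URI, already local
    return not (host == own_domain or host.endswith("." + own_domain))


def local_path_for(url: str) -> str:
    """Derive a local path under assets/img/ (naming scheme is assumed)."""
    ext = posixpath.splitext(urlparse(url).path)[1] or ".bin"
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return f"assets/img/{digest}{ext}"


def plan_rewrites(html: str) -> dict:
    """Map each external image URL found in `html` to a local path."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return {src: local_path_for(src) for src in collector.sources if is_external(src)}


def apply_rewrites(html: str, mapping: dict) -> str:
    """Replace double-quoted occurrences of each URL with its local path."""
    for src, local in mapping.items():
        html = html.replace(f'"{src}"', f'"{local}"')
    return html
```

The replacement is a plain string substitution, so URLs written with HTML entities or single quotes would be missed; a production version would rewrite through a real HTML serializer.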

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses SSRF-guarded downloads, size/count/time caps, atomic application, and skipping of images already hosted on vibedeploy.be. Annotations confirm idempotent (hint true) and non-destructive (false), with no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each adding essential information. No fluff. Front-loaded with the main action, then timing, safeguards, and the skip rule. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, output schema exists), the description covers all necessary behavioral and usage aspects. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (name) with 100% schema coverage. Description adds context beyond schema by clarifying that it's the site whose external images should be processed, and that the site must already be deployed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action (download, store, rewrite HTML references), resource (site's external images), and scope (local rehosting). It distinguishes itself from sibling tools by stating it auto-detects images and is for use after deploying a site rebuilt with the Website Converter.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit instruction to call once after deploying a site rebuilt with the Website Converter. No explicit when-not-to-use or alternatives, but the context makes it clear this is for external image rehosting, not for other tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove_custom_domain: Detach a custom domain (grade A)

Destructive, Idempotent

Remove a custom domain from a site. The site itself is unaffected; only the custom hostname is detached. The {name}.vibedeploy.be subdomain keeps serving the site.

ParametersJSON Schema

Name	Required	Description
`domain`	No	The custom domain to remove (e.g. 'tester.subsite.site'). Provide this OR recordId.
`recordId`	No	The recordId returned by add_custom_domain. Provide this OR domain.
`siteName`	Yes	The site to detach the domain from.
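
A minimal sketch of how a client might assemble the arguments, assuming `domain` and `recordId` are mutually exclusive (the schema only says to provide one OR the other; whether the server rejects both together is not stated). The helper name `remove_custom_domain_args` is hypothetical.

```python
def remove_custom_domain_args(site_name, domain=None, record_id=None):
    """Build the arguments object for remove_custom_domain.

    Assumption: exactly one of domain / record_id should be sent.
    """
    if (domain is None) == (record_id is None):
        raise ValueError("provide exactly one of domain or record_id")
    args = {"siteName": site_name}
    if domain is not None:
        args["domain"] = domain
    else:
        args["recordId"] = record_id
    return args


# Example: detach by hostname (site name and domain are placeholders).
example_args = remove_custom_domain_args("my-site", domain="tester.subsite.site")
```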

Output Schema

ParametersJSON Schema

Name	Required	Description
`domain`	Yes
`removed`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds context beyond annotations: it clarifies that the site itself is unaffected despite destructiveHint=true, and that the vibedeploy.be subdomain continues serving. This addresses potential concerns about destructiveness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no filler, front-loading the purpose and providing essential behavioral details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 params (1 required), input schema with full coverage, and an output schema, the description adequately explains the core action, effects on the site, and idempotent nature implied by annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for each parameter (e.g., explanation of mutual exclusivity of domain vs recordId). The description does not add significant extra meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Title 'Detach a custom domain' and description clearly state the action: remove a custom domain from a site. It specifies the scope (only the hostname is detached, site unaffected) and distinguishes from sibling tools like add_custom_domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description makes clear when to use the tool (to detach a custom domain) and implies that the site remains served via the default subdomain. However, it does not explicitly list when not to use it or provide alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_files: Grep across a site's files (grade A)

Read-only, Idempotent

Search for a literal string or basic regex across all files in either the served dist or the editable source tree. Use this BEFORE batch-reading files to find candidates — saves the 'read 14 batches just to find which 3 files matter' round trip. Pass target: "source" to search the editable tree (requires Site.sourceStored=true).

ParametersJSON Schema

Name	Required	Description
`glob`	No	Filename glob filter, e.g. '*.js' or '*.{js,html}'. Applied via find before grep so we don't read non-matching files.
`name`	No
`regex`	No	When true, the pattern is interpreted as a basic regular expression. Default: false (literal substring match).
`siteId`	No
`target`	No	Where to search. 'dist' (default) searches the served files. 'source' searches the editable source tree (requires Site.sourceStored=true).
`pattern`	Yes	Pattern to search for. Treated as a literal string by default; pass regex:true to use it as a basic regex (BusyBox grep BRE; no PCRE features).
`maxMatches`	No	Cap on returned matches. Default 200, hard max 1000. Truncation is reported via budgetExceeded.
`caseInsensitive`	No	Default: false. When true, adds -i to grep.
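
As a rough sketch, the documented defaults and caps can be applied client-side before calling the tool. The helper name `search_files_args` is hypothetical, and requiring at least one of `name` or `siteId` is an assumption (the schema marks both as optional).

```python
def search_files_args(pattern, *, name=None, site_id=None, target="dist",
                      regex=False, case_insensitive=False, max_matches=200):
    """Build the arguments object for search_files."""
    if name is None and site_id is None:
        # Assumption: the server needs some way to identify the site.
        raise ValueError("pass name or site_id")
    if target not in ("dist", "source"):
        raise ValueError("target must be 'dist' or 'source'")
    args = {
        "pattern": pattern,
        "target": target,
        "regex": regex,  # False means literal substring match
        "caseInsensitive": case_insensitive,
        # Documented default is 200 with a hard maximum of 1000.
        "maxMatches": max(1, min(max_matches, 1000)),
    }
    if name is not None:
        args["name"] = name
    if site_id is not None:
        args["siteId"] = site_id
    return args


# Example: literal search in the served files (site name is a placeholder).
example_args = search_files_args("fetch(", name="my-site", max_matches=5000)
```

If the result comes back with `budgetExceeded` set to true, narrowing the pattern or adding a `glob` filter is likely more useful than raising `maxMatches` further.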

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`siteId`	Yes
`target`	Yes
`matches`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalMatches`	Yes
`budgetExceeded`	Yes	True if the search hit maxMatches and there are likely more matches not returned.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Disclosures: read-only behavior (matching annotations), literal vs regex mode with limitations (BusyBox grep BRE, no PCRE), glob pre-filtering via find, maxMatches truncation reported, target behavior. Adds value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three succinct sentences: the first defines the core function, the second gives actionable guidance, and the third explains the source target. No redundant or filler content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 parameters (1 required), available output schema, and annotations, description covers essential behavior, usage advice, parameter caveats, and limitations. No significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 75%; description adds context for key parameters (target requires sourceStored, regex limitations, glob applied before grep, maxMatches cap). Does not describe every parameter but fills practical gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description specifies the verb ('Search') and the resource (all files of a site) with a clear scope ('served dist or editable source tree'). It differentiates the tool from siblings that read files rather than search their content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly recommends using before batch-reading files to save round trips, and explains when to use 'source' target (requires Site.sourceStored=true). No explicit alternatives named, but context is strong.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_forms_config: Enable or update the forms relay for a site (grade A)

Destructive, Idempotent

Configure the built-in form-to-email relay, fully self-service. Supports a custom From (via a verified sender domain or your own SMTP relay), an explicit Reply-To, and full email branding (subject template, field labels/order, logo, accent color, or a custom HTML body). Requires team role owner or admin. Pass config:null to switch the relay off. If you set a custom sender without an smtpRelay, the response returns the DNS records to publish; then call verify_forms_sender_domain. Submissions: POST JSON to the returned endpoint with Content-Type: application/json (UTF-8). Flat object of form fields (strings/numbers/booleans; checkbox groups may be arrays of strings, joined with ', '). Max 30 fields, 5000 chars/field, 20000 total. Response: {success:true,data:{ok:true}} or {success:false,error:{code,message}}. Rate limit: 10 submits per IP per 10 minutes. Include a hidden honeypot input (default "_gotcha") and leave it empty.

ParametersJSON Schema

Name	Required	Description	Default
`config`	Yes	Full config to store (replaces existing). Pass null to disable and clear.
`siteName`	Yes	The site to configure.
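
The submission limits in the description can be checked before posting. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the 5000 and 20000 character limits apply to field values (with checkbox arrays measured after joining with ', '), which the description does not spell out.

```python
import json

MAX_FIELDS = 30
MAX_FIELD_CHARS = 5000
MAX_TOTAL_CHARS = 20000


def check_submission(fields):
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if len(fields) > MAX_FIELDS:
        problems.append(f"{len(fields)} fields exceeds {MAX_FIELDS}")
    total = 0
    for key, value in fields.items():
        if isinstance(value, list):
            if not all(isinstance(item, str) for item in value):
                problems.append(f"{key}: arrays may only contain strings")
                continue
            text = ", ".join(value)  # checkbox groups are joined with ', '
        elif isinstance(value, (str, int, float, bool)):
            text = str(value)
        else:
            problems.append(f"{key}: unsupported type {type(value).__name__}")
            continue
        if len(text) > MAX_FIELD_CHARS:
            problems.append(f"{key}: {len(text)} chars exceeds {MAX_FIELD_CHARS}")
        total += len(text)
    if total > MAX_TOTAL_CHARS:
        problems.append(f"total of {total} chars exceeds {MAX_TOTAL_CHARS}")
    return problems


def build_body(fields, honeypot="_gotcha"):
    """Serialize a flat form object as UTF-8 JSON with an empty honeypot field."""
    problems = check_submission(fields)
    if problems:
        raise ValueError("; ".join(problems))
    return json.dumps({**fields, honeypot: ""}, ensure_ascii=False).encode("utf-8")
```

The resulting bytes would be sent as the body of a POST to the `endpoint` value returned by this tool, with `Content-Type: application/json`.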

Output Schema

ParametersJSON Schema

Name	Required	Description
`notes`	Yes
`enabled`	Yes
`delivery`	Yes	platform \| verified-domain \| verified-domain-pending \| custom-relay.
`endpoint`	Yes	URL the site's form should POST to.
`siteName`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`formsConfig`	Yes	Stored config (smtpRelay.password redacted to hasPassword).
`activeSender`	Yes	The From that will actually be used right now.
`senderDomain`	No	DNS records to publish + verification status (verified-domain path).

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations mark the tool as destructive and idempotent. The description adds critical behavior: the effect of passing null, DNS record generation, the verification requirement, and submission details (endpoint, rate limit, honeypot, field constraints). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence serves a purpose. Well-organized: starts with purpose, then prerequisites, configuration details, submission behavior, response format. No fluff, yet comprehensive. Front-loads key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (many config properties, submission rules, rate limits, verification flow), the description covers all essential aspects. Includes response format, rate limit, honeypot, and field constraints. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. Description adds meaningful context beyond schema: explains that config:null disables, explains sender verification process, describes submission constraints (max fields, chars, rate limit, honeypot). Adds value without redundancy.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'configure' and identifies the resource as 'built-in form-to-email relay'. It clearly distinguishes from sibling tools like get_forms_config (read) and verify_forms_sender_domain (verification step).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions when to use (self-service configuration) and required role (owner/admin). Provides clear guidance on disabling (config:null) and next steps (call verify_forms_sender_domain after setting custom sender without SMTP relay). Lacks explicit 'when not to use' or alternative tools, but context is strong.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_file_content: Surgical find/replace in one file (grade A)

Destructive

Apply one or more literal find/replace edits to a single file on the site, in one tool call. Designed for tiny edits where uploading the full file would be wasteful (one nav-button reference, one encoding fix, one env var bump). Each edit must specify how many matches it expects; mismatches abort the whole call with NO writes. For dist edits the change goes live immediately; for source edits you still need to call build_and_deploy.

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Site name.
`path`	Yes	File-relative path inside the chosen target tree.
`edits`	Yes	Ordered list of edits to apply atomically. Each is `{find, replace, count?}`. If any edit's match count doesn't equal its expected count, the whole call aborts with no writes.
`target`	No	Which tree to edit. 'dist' (default) edits the served file directly — visitors see the change immediately. 'source' edits the editable source tree; you'll need build_and_deploy (or it short-circuits via noBuild) to ship.
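
The all-or-nothing behavior can be mimicked locally to sanity-check an edit list before sending it. This is a sketch under assumptions, not the server's code: it assumes edits run in order with each one seeing the output of the previous one, and that an omitted `count` means exactly one expected match.

```python
def apply_edits(text, edits):
    """Apply literal find/replace edits; raise without side effects on any mismatch."""
    result = text
    for index, edit in enumerate(edits):
        find = edit["find"]
        if not find:
            raise ValueError(f"edit {index}: find must not be empty")
        expected = edit.get("count", 1)  # assumed default when count is omitted
        found = result.count(find)       # literal, non-overlapping matches
        if found != expected:
            raise ValueError(
                f"edit {index}: expected {expected} match(es) of {find!r}, found {found}"
            )
        result = result.replace(find, edit["replace"])
    # Only reached when every edit matched, so the caller never sees a partial result.
    return result


# Example: update two links in one call.
page = '<a href="/old">Home</a> <a href="/old">Back</a>'
updated = apply_edits(page, [{"find": "/old", "replace": "/new", "count": 2}])
```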

Output Schema

ParametersJSON Schema

Name	Required	Description
`name`	Yes
`path`	Yes
`edits`	Yes
`siteId`	Yes
`target`	Yes
`afterBytes`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`beforeBytes`	Yes

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructive, non-readOnly, non-idempotent; description adds atomicity details and deployment implications. No contradiction, and it enriches understanding of side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences front-loaded with purpose, each adding unique value. No redundant or unclear phrasing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (find/replace, counts, targets), description covers key aspects: atomicity, mismatch behavior, deployment differences. Output schema exists, so return value details are not needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters with descriptions. The description paraphrases some schema info but adds little new meaning beyond clarifying the count logic and atomicity. Baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool applies find/replace edits to a single file, specifying it's for 'tiny edits' and listing example use cases. It distinguishes from siblings like add_files or write_source_files by emphasizing surgical precision and atomicity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use (tiny edits) and notes behavioral nuances (dist vs source, atomic abort on mismatch). It lacks explicit alternatives but implies larger edits warrant other tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_site: Update an existing site (grade A)

Destructive, Idempotent

Patch or replace files on an existing site. Defaults to patch mode: only the listed files change; everything else stays. Pass mode:'replace' to wipe-and-replace the whole site (the legacy behaviour, surfaced explicitly so it can't happen by accident). Use delete: [paths] in patch mode to remove specific files without wiping the rest. Use dryRun: true to preview the diff before committing. LARGE FILES: a 100-250 KB text file fits in one call with encoding:'gzip+base64' (gzip locally, base64 the result) — prefer that over begin_deploy + add_file_chunk streaming. Errors if the site does not exist.

ParametersJSON Schema

Name	Required	Description
`mode`	No	patch (default): write only the listed files; everything else stays. replace: delete all existing files and write only the listed ones. Use replace only when you genuinely want to throw away the rest of the site.
`name`	No	Site name (preferred).
`files`	No	Files to write. Array form `[{path, content, encoding?}]` (preferred) supports binary via encoding:'base64'; map form `{path: content}` is utf8-only. <= 500 MB total. Optional when `delete` is provided in patch mode for delete-only deploys.
`delete`	No	Patch-mode only: site-relative paths to remove from the pod. Files not in this list are kept. Reported back in `deletedFiles` listing only entries that actually existed. Combine with `files` to atomically rename in one call (write new path + delete old path). Rejected in mode:'replace' since replace already removes anything not in `files`.
`dryRun`	No	If true, validate input + introspect what would change but don't write or delete. Returns the same shape with `dryRun: true` and `deletedFiles` showing what would be removed. Use this before any destructive call (replace mode, or patch with `delete`) to verify the diff.
`siteId`	No	Site id (alternative to name).
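
The gzip+base64 encoding mentioned in the description can be produced with the Python standard library. A minimal sketch: the helper name `gzip_base64_file`, the site name, and the file path are placeholders, and the entry follows the `{path, content, encoding?}` array form from the `files` parameter.

```python
import base64
import gzip


def gzip_base64_file(path, text):
    """Return a files[] entry whose content is gzip-compressed, then base64-encoded."""
    compressed = gzip.compress(text.encode("utf-8"))
    return {
        "path": path,
        "content": base64.b64encode(compressed).decode("ascii"),
        "encoding": "gzip+base64",
    }


# Example arguments for a patch-mode dry run (names are placeholders).
entry = gzip_base64_file("data/catalog.json", '{"items": []}' * 1000)
update_site_args = {"name": "my-site", "files": [entry], "dryRun": True}
```

Starting with `dryRun: true` follows the parameter table's advice to verify the diff before a call that writes or deletes.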

Output Schema

ParametersJSON Schema

Name	Required	Description
`url`	Yes
`mode`	Yes	The mode that was actually applied.
`dryRun`	No	True if this was a dry-run; nothing was written or deleted.
`siteId`	Yes
`warnings`	No	Surfaced issues that did not block the deploy (e.g. DOTFILE_PUBLIC, leaked-secret patterns).
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`deletedFiles`	Yes	Files removed by this call. For patch mode this is the entries from `delete` that actually existed; for replace mode it's every pre-existing file not in `files`.
`filesDeployed`	Yes

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Disclosures beyond annotations: default patch mode, replace mode as legacy (to prevent accidental wipes), delete-only deploy, dryRun for safe preview, large file handling with gzip+base64, and error if site missing. Annotations already indicate destructiveHint true, but description adds rich safety context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with core purpose and then expands on modes. Each sentence adds value, though slightly verbose in places. Overall well-structured and efficient for its depth.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 parameters, modes, encoding, output schema present), the description covers all major use cases, edge cases, and tool interactions. Complete for a mutation tool with multiple modes.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage, so baseline 3. Description adds value by explaining patch vs replace semantics, dryRun usage, delete parameter interaction, and encoding options for large files, making it more actionable than schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Patch or replace files on an existing site,' using a specific verb and resource. It distinguishes from siblings like delete_site and deploy_site by detailing the update behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly explains when to use patch vs replace mode, how to delete files with `delete`, and encourages dryRun for preview. It also contrasts with begin_deploy + add_file_chunk streaming, guiding the agent to prefer this tool for large files.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_custom_domain: Verify a custom domain (step 2 of 2) (grade A)

Idempotent

Check the TXT record the user added at step 1 and, if found, attach the domain to the site's ingress. If verification fails, the most common cause is DNS propagation delay; wait a few minutes and try again. Once verified, the domain serves the site immediately (HTTPS issues automatically within ~30s).

ParametersJSON Schema

Name	Required	Description	Default
`recordId`	Yes	The recordId returned by add_custom_domain.
`siteName`	Yes	The site the domain was attached to.

Output Schema

ParametersJSON Schema

Name	Required	Description
`domain`	Yes
`message`	Yes
`verified`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses external DNS checking, attachment mutation, propagation delay, and HTTPS timing. Annotations already indicate non-readonly and idempotency; the description adds valuable behavioral context without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the core action, no fluff. Each sentence serves a purpose: action, failure handling, and outcome.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the action, common failure, and result. It assumes knowledge of step 1, but the title and sibling tools provide that context. The existence of an output schema compensates for missing return value details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for both parameters. The description adds only the context that recordId comes from add_custom_domain, but that is already implied by the schema and title.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks a TXT record and attaches the domain to the site's ingress. The title explicitly marks it as step 2 of 2, distinguishing it from sibling tools like add_custom_domain and remove_custom_domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains it is the follow-up to step 1 (add_custom_domain) and gives guidance on handling DNS propagation delays. It does not explicitly list when not to use it, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_forms_sender_domain: Verify a forms custom sender domain (grade A)

Idempotent

Check the DNS records for a site's custom sender domain (DKIM TXT + SPF include). Once the DKIM record is observed, the sender domain is marked verified and the relay sends from the custom From (DKIM-signed). Until then it falls back to the platform address. DNS can take a few minutes to propagate — re-run if it fails the first time. Not needed when the site uses a custom smtpRelay.

ParametersJSON Schema

Name	Required	Description	Default
`siteName`	Yes	The site whose custom sender domain to (re)check.

Output Schema

ParametersJSON Schema

Name	Required	Description
`domain`	Yes
`status`	Yes	pending \| verified \| failed
`lastError`	Yes
`dnsRecords`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`spfVerified`	Yes
`dkimVerified`	Yes

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explains the full verification flow: the DNS check, marking the domain verified, and the relay switching from the platform fallback address to the custom DKIM-signed From. It also mentions propagation delay and re-running the check. Annotations (idempotentHint, readOnlyHint false) are consistent and supplemented.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences, each serving a purpose: action, consequence, fallback, retry guidance, and when the tool is not needed. It is front-loaded with the core action and contains no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple schema (one param) and presence of an output schema, the description covers behavior, fallback, propagation, and exclusion. It could mention prerequisites like domain setup, but overall it is complete for a well-documented tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter siteName, so the baseline is 3. The description does not add additional parameter details, but the purpose is clear from context. No extra value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it checks DNS records (DKIM TXT + SPF include) for a site's custom sender domain used in forms. It explains the effect of marking verified and the fallback behavior. This distinguishes it from similar sibling tools like verify_custom_domain (general domain verification) and set_forms_config.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says it is not needed when the site uses a custom smtpRelay, giving a clear condition to bypass. It also advises re-running if DNS propagation fails. While alternatives are not named, the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

write_source_files: Write files into the editable source tree (grade A)

Destructive, Idempotent

Stage edits to a site's editable source tree (not the live dist). Use list_source_files first to discover what's there. The dist is unchanged until you re-deploy via update_site or run build_and_deploy. Sites have source storage enabled by default; if a legacy site doesn't, the call fails with SOURCE_STORAGE_NOT_ENABLED and the user should contact VibeDeploy support to enable it.

ParametersJSON Schema

Name	Required	Description
`name`	No
`files`	Yes	Files to write into the source tree. Same wire shape as add_files. Re-writing a path overwrites the previous source. Per-file cap 5 MB; per-call cap 50 MB; max 200 files per call.
`siteId`	No
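The caps listed for `files` (5 MB per file, 50 MB per call, 200 files per call) suggest splitting large edits into several calls. The sketch below groups paths into batches that respect those caps. It assumes decimal megabytes (the more conservative reading) and that the caps apply to raw file sizes; the listing states neither, and `plan_batches` is a hypothetical helper, not part of the server.

```python
MB = 1_000_000  # assumption: decimal megabytes, the more conservative reading
PER_FILE_CAP = 5 * MB
PER_CALL_CAP = 50 * MB
MAX_FILES_PER_CALL = 200


def plan_batches(sizes: dict[str, int]) -> list[list[str]]:
    """Group paths into batches that each respect the documented caps."""
    batches: list[list[str]] = []
    current: list[str] = []
    current_bytes = 0
    for path, size in sizes.items():
        if size > PER_FILE_CAP:
            raise ValueError(f"{path} is {size} bytes, over the per-file cap")
        over_bytes = current_bytes + size > PER_CALL_CAP
        over_count = len(current) >= MAX_FILES_PER_CALL
        # Start a new batch when adding this file would break either cap.
        if current and (over_bytes or over_count):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


# 450 tiny files split on the file-count cap; 11 files of 5 MB split on bytes.
small = plan_batches({f"f{i}.txt": 1 for i in range(450)})
large = plan_batches({f"big{i}.bin": 5 * MB for i in range(11)})
```

Each resulting batch would then go into its own write_source_files call. Because re-writing a path overwrites the previous source, retrying a failed batch should be safe.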

Output Schema


Name	Required	Description
`name`	Yes
`siteId`	Yes
`written`	Yes
`request_id`	No	Server-assigned request correlation id. Quote it when contacting support.
`totalBytes`	Yes
`totalFiles`	Yes

Tool Definition Quality

Grade A, 4.4/5.0

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds significant behavioral context beyond annotations: staging nature, persistence until redeploy, per-file and per-call size caps, max files per call, and legacy site error condition. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences, each adding value. Front-loaded with purpose and key workflow steps. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of writing to a staging source tree with constraints, the description covers prerequisites, error handling, capacity limits, and deployment workflow. Output schema handles return values, so completeness is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 33% schema description coverage (of the three parameters, only files is described), the description does not compensate for the missing details on name and siteId. It adds no information about parameter meaning beyond what the schema already provides for files.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
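The 33% figure is consistent with one of three parameters carrying a description. The sketch below shows one plausible way to compute such a coverage ratio from a JSON Schema. How Glama actually calculates it is not stated here. The function name, the `type` values in the sample schema, and the treatment of an empty schema as fully covered are all assumptions.

```python
def description_coverage(input_schema: dict) -> float:
    """Fraction of top-level properties that have a non-empty description."""
    props = input_schema.get("properties", {})
    if not props:
        return 1.0  # assumption: nothing to describe counts as fully covered
    described = sum(
        1
        for spec in props.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return described / len(props)


# Sample schema mirroring the parameter table: only `files` is described.
# The `type` values are guesses for illustration.
sample_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "files": {"type": "array", "description": "Files to write into the source tree."},
        "siteId": {"type": "string"},
    },
    "required": ["files"],
}

coverage_percent = round(description_coverage(sample_schema) * 100)
```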

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states that it stages edits to the editable source tree, distinguishing that tree from the live dist. It mentions the sibling tool list_source_files as a prerequisite and is consistent with the write_source_files name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Advises using list_source_files first and explains that the dist is unchanged until redeploy. It provides context but does not say when the tool should not be used, though this suffices for the typical workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
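Before publishing, the file can be sanity-checked locally. The sketch below only mirrors the example structure shown above; the real connector schema may allow or require other fields. `check_glama_claim` is a hypothetical helper, not a Glama tool.

```python
import json

SCHEMA_URL = "https://glama.ai/mcp/schemas/connector.json"


def check_glama_claim(text: str) -> list[str]:
    """Return a list of problems found in a glama.json body (empty if none)."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]
    problems = []
    if doc.get("$schema") != SCHEMA_URL:
        problems.append("$schema does not match the example URL")
    maintainers = doc.get("maintainers")
    if not isinstance(maintainers, list) or not maintainers:
        problems.append("maintainers must be a non-empty list")
    else:
        for i, entry in enumerate(maintainers):
            email = entry.get("email") if isinstance(entry, dict) else None
            # A loose check only; it does not validate the address itself.
            if not isinstance(email, str) or "@" not in email:
                problems.append(f"maintainers[{i}] needs an email string")
    return problems


example = json.dumps(
    {"$schema": SCHEMA_URL, "maintainers": [{"email": "your-email@example.com"}]}
)
example_problems = check_glama_claim(example)
```

An empty result means only that the file matches the example's shape. Verification still requires the email to match the one on the Glama account.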
