mailbox

by bot.mailbox

Server Details

Physical mail API for AI agents. Send letters, certified mail, postcards from code via MCP.

Status: Healthy
Last Tested: 2026-05-25 08:33
Transport: Streamable HTTP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

Grade: A (3.9/5.0)

Tool Descriptions: A

Average 4.1/5 across 22 of 22 tools scored. Lowest: 3.5/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have clearly distinct purposes, but there is overlap between request_scan and request_action (which also includes scanning). This could cause agent confusion. Otherwise, the set is well-disambiguated.

Naming Consistency: 5/5

All tool names follow a consistent verb_noun pattern in snake_case, e.g., add_note, get_package, send_outbound_mail. No deviations or mixed conventions.

Tool Count: 4/5

22 tools is slightly above the typical well-scoped range (3-15) but still reasonable for a comprehensive mailbox management server. No tool seems redundant.

Completeness: 3/5

Core operations are covered (list, get, create, send), but rules can only be created (create_rule has no update or delete counterpart), tags cannot be removed, and outbound mail cannot be canceled. These gaps may hinder some workflows.

Available Tools

22 tools

add_note (Grade: A)

Add an observation or context note to a package. Notes are visible to the facility operator and the renter. Use for recording decisions, observations, or agent reasoning.

Parameters

Name	Required	Description
`note`	Yes	Note text (e.g. "Appears to be the replacement GPU from RMA #4521").
`metadata`	No	Optional structured metadata attached to the note (e.g. { "rma_number": "4521", "vendor": "NVIDIA" }).
`package_id`	Yes	UUID of the package to annotate.
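
As a rough sketch of how the parameters above might be packaged, the snippet below builds an MCP `tools/call` JSON-RPC request for add_note. Only the parameter names come from the table. The helper name `build_add_note_call`, the sample UUID, and the non-empty check on `note` are illustrative assumptions, not documented behavior of this server.

```python
import json
import uuid


def build_add_note_call(package_id, note, metadata=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the add_note tool."""
    # package_id is documented as a UUID, so fail early if it does not parse.
    uuid.UUID(package_id)
    # Assumption: an empty note is never useful, so reject it client-side.
    if not note:
        raise ValueError("note is required and must be non-empty")
    arguments = {"package_id": package_id, "note": note}
    if metadata is not None:
        # metadata is optional structured data attached to the note.
        arguments["metadata"] = metadata
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "add_note", "arguments": arguments},
    }


request = build_add_note_call(
    "123e4567-e89b-12d3-a456-426614174000",  # made-up package UUID
    "Appears to be the replacement GPU from RMA #4521",
    metadata={"rma_number": "4521", "vendor": "NVIDIA"},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally build and send this envelope; the sketch only shows where each parameter ends up.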

Tool Definition Quality

Grade: A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes that notes are visible to facility operator and renter, adding contextual info beyond annotations. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with key information, no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple add-note operation. No output schema is provided and the description does not mention the response, but this is not critical for a tool this simple.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already well-documented. The tool description adds little extra meaning for parameters beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action 'Add' and resource 'note to a package', and specifies visibility to facility operator and renter, distinguishing it from sibling tools like 'add_tag'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context for use ('recording decisions, observations, or agent reasoning'), but does not explicitly mention when not to use or alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_tag (Grade: A)

Annotations: Idempotent

Add a tag/label to a package for categorization and filtering. Tags are free-form strings. Adding the same tag twice is a no-op.

Parameters

Name	Required	Description
`tag`	Yes	Tag name (e.g. "hardware-order", "urgent", "return-needed"). Free-form, case-sensitive.
`package_id`	Yes	UUID of the package to tag.
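
The idempotent, case-sensitive behavior described above can be modeled locally. This is a minimal in-memory sketch of the documented semantics, not the server's implementation; the `TagStore` class and its True/False return value are assumptions made for illustration.

```python
class TagStore:
    """In-memory stand-in for per-package tag storage."""

    def __init__(self):
        self._tags = {}

    def add_tag(self, package_id, tag):
        """Add a tag; return True if it was new, False if it was a no-op."""
        tags = self._tags.setdefault(package_id, set())
        if tag in tags:
            return False  # same tag twice: nothing changes
        tags.add(tag)
        return True

    def tags_for(self, package_id):
        return sorted(self._tags.get(package_id, set()))


store = TagStore()
pkg = "123e4567-e89b-12d3-a456-426614174000"  # made-up package UUID
first = store.add_tag(pkg, "urgent")
second = store.add_tag(pkg, "urgent")  # repeated call is a no-op
third = store.add_tag(pkg, "Urgent")   # different case, so a distinct tag
print(first, second, third, store.tags_for(pkg))
```

Because tags are case-sensitive, an agent may want to normalize case before tagging if it does not want "urgent" and "Urgent" to coexist.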

Tool Definition Quality

Grade: A (3.8/5.0)

Behavior: 4/5

Discloses idempotent behavior (adding same tag twice is a no-op), which adds value beyond the idempotentHint annotation. However, no mention of error handling or side effects beyond that.

Conciseness: 5/5

Three succinct sentences with no unnecessary words. Purpose and key behavior are front-loaded.

Completeness: 4/5

For a simple tool with two well-described parameters, the description covers purpose and idempotency. Missing potential error conditions but adequate for typical use.

Parameters: 3/5

Input schema covers both parameters with 100% description coverage. The description adds only that tags are 'free-form strings', which the schema already states. Baseline score of 3 is appropriate.

Purpose: 4/5

Clearly states the action: add a tag to a package, with purpose (categorization/filtering). However, it does not explicitly differentiate from sibling tool 'add_note' which might serve a similar function for notes vs tags.

Usage Guidelines: 3/5

Implies usage for tagging packages but lacks explicit guidance on when to use this tool versus alternatives or any preconditions.


create_rule (Grade: A)

Create a standing instruction that auto-triggers actions when incoming packages match conditions. Rules run on every new package and execute the specified action if all conditions match. Use requires_approval to add a human review step before execution.

Parameters

Name	Required	Description
`name`	Yes	Human-readable rule name (e.g. "Forward Amazon packages", "Shred junk mail").
`conditions`	Yes	Conditions that must ALL match for the rule to trigger.
`action_type`	Yes	Action to auto-trigger when conditions match.
`action_params`	Yes	Parameters for the action (e.g. forwarding address for "forward", scan_type for "scan").
`requires_approval`	No	If true, matched packages require human approval before the action executes.
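
The matching behavior described above (all conditions must match, with an optional approval gate) can be sketched as a small local evaluator. The listing does not show the shape of `conditions` or `action_params`, so the field names `sender`, `carrier`, and `address_id`, and the returned status strings, are hypothetical.

```python
def rule_matches(conditions, package):
    """Return True only if every condition equals the package's value."""
    return all(package.get(field) == expected for field, expected in conditions.items())


def evaluate_rule(rule, package):
    """Decide what would happen to a new package under one rule."""
    if not rule_matches(rule["conditions"], package):
        return "skip"
    if rule.get("requires_approval", False):
        # A human must approve before the action executes.
        return "pending_approval"
    return "execute:" + rule["action_type"]


rule = {
    "name": "Forward Amazon packages",
    "conditions": {"sender": "Amazon", "carrier": "UPS"},  # hypothetical fields
    "action_type": "forward",
    "action_params": {"address_id": "home"},               # hypothetical shape
    "requires_approval": True,
}

matched = evaluate_rule(rule, {"sender": "Amazon", "carrier": "UPS"})
unmatched = evaluate_rule(rule, {"sender": "Amazon", "carrier": "FedEx"})
print(matched, unmatched)
```

Setting `requires_approval` is a cautious default for rules whose action is hard to undo.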

Tool Definition Quality

Grade: A (3.8/5.0)

Behavior: 3/5

Annotations (readOnlyHint=false, destructiveHint=false) indicate it's a write operation but not destructive. The description adds context: rules auto-trigger on every new package and require all conditions to match. It does not elaborate on potential side effects (e.g., irreversible actions like 'dispose') or execution guarantees, so transparency is adequate but not thorough.

Conciseness: 5/5

The description is three sentences, front-loaded with the primary purpose, followed by how rules execute and a specific usage tip. Every sentence contributes without redundancy.

Completeness: 2/5

The tool has 5 parameters (4 required) and nested objects, but lacks an output schema. The description does not explain what the tool returns (e.g., rule ID) or how to structure 'action_params'. Given complexity, more detail is needed for full context.

Parameters: 3/5

Schema description coverage is 100%, so each parameter is documented. The description adds value by explaining the purpose of 'requires_approval' (human review step), but does not provide additional semantics for other parameters beyond the schema. Baseline 3 is appropriate.

Purpose: 5/5

The description clearly states the verb 'Create' and the resource 'standing instruction' that auto-triggers actions on incoming packages. It distinguishes from sibling tools like 'request_action' and 'update_action' by explicitly defining the rule's persistent behavior.

Usage Guidelines: 4/5

The description explains the rule runs on every new package and executes the action if conditions match, implying automated recurring usage. It also mentions the 'requires_approval' option for human review. However, it does not explicitly compare to alternatives like 'request_action' for one-time actions, leaving room for improvement.


get_facility_messages (Grade: A)

Annotations: Read-only, Idempotent

Read the message thread with a specific facility. Returns messages in reverse chronological order with sender role (member, facility, agent). Supports cursor-based pagination. Automatically marks facility messages as read.

Parameters

Name	Required	Description
`limit`	No	Maximum number of messages to return (1-100). Defaults to 50.
`before`	No	Cursor: only return messages sent before this ISO 8601 timestamp. Use the oldest message timestamp from the previous page.
`facility_id`	Yes	UUID of the facility whose conversation to read.
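
The cursor pattern described for `before` (feed the oldest timestamp of the previous page into the next request) might look like the sketch below. `fetch_page` is a local stand-in for the real tool call, and the `sent_at` and `sender_role` field names are assumptions, since the listing does not include an output schema.

```python
from datetime import datetime, timedelta, timezone


def fetch_page(messages, limit=50, before=None):
    """Stand-in for get_facility_messages: newest first, optional `before` cursor."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    ordered = sorted(messages, key=lambda m: datetime.fromisoformat(m["sent_at"]), reverse=True)
    if before is not None:
        cutoff = datetime.fromisoformat(before)
        ordered = [m for m in ordered if datetime.fromisoformat(m["sent_at"]) < cutoff]
    return ordered[:limit]


def fetch_all(messages, limit=50):
    """Walk the whole thread by passing the oldest timestamp back as `before`."""
    collected, cursor = [], None
    while True:
        page = fetch_page(messages, limit=limit, before=cursor)
        if not page:
            return collected
        collected.extend(page)
        cursor = page[-1]["sent_at"]  # oldest message on this page


start = datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc)
thread = [
    {"sent_at": (start + timedelta(minutes=i)).isoformat(), "sender_role": "facility"}
    for i in range(7)
]
history = fetch_all(thread, limit=3)
print(len(history), history[0]["sent_at"], history[-1]["sent_at"])
```

One caveat of any timestamp cursor with a strict "before" comparison: if two messages share the exact same timestamp across a page boundary, one of them could be skipped.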

Tool Definition Quality

Grade: A (3.6/5.0)

Behavior: 1/5

The description states 'Automatically marks facility messages as read', a side effect that contradicts the readOnlyHint=true annotation. A contradiction between description and annotations scores 1.

Conciseness: 5/5

Four sentences, front-loaded with core action. Every sentence adds value, no redundant text.

Completeness: 4/5

No output schema, but description explains return order and sender roles. Covers pagination and side effect. Adequate for a read tool, though output fields could be more explicit.

Parameters: 3/5

Schema coverage is 100% with good descriptions. The description adds a mention of cursor-based pagination but no per-parameter details beyond the schema. Baseline of 3 applicable.

Purpose: 5/5

Clearly states the verb 'Read' and resource 'message thread with a specific facility'. Distinguishes from send_facility_message (write) and list_facility_conversations (list).

Usage Guidelines: 4/5

Implies when to use (read messages for a facility) and provides context like pagination and auto-mark as read. Does not explicitly state alternatives or exclusions, but sibling context helps.


get_mailbox (Grade: A)

Annotations: Read-only, Idempotent

Get your agent's postal mailing address, suite number, facility details, and current mailbox status. Returns the full street address you can use as a return address on outbound mail.

Parameters

No parameters.

Tool Definition Quality

Grade: A (4/5.0)

Behavior: 3/5

Annotations already indicate read-only and idempotent behavior. The description adds context about returning the street address and mailbox status but does not disclose any additional traits like error handling or rate limits.

Conciseness: 5/5

Two sentences, clear and direct, with no unnecessary information. Every word adds value.

Completeness: 4/5

The description adequately explains the return values (full street address, mailbox status) for a simple get tool with no output schema. It lacks details on error conditions or missing mailbox setup but is sufficient for basic use.

Parameters: 4/5

Input schema has no parameters, so description coverage is 100%. The description does not need to add parameter info; a baseline of 4 is appropriate for zero-parameter tools.

Purpose: 5/5

The description clearly states the tool retrieves the agent's physical mailing address, suite number, facility details, and mailbox status, specifying it returns a street address usable as a return address. This is specific and distinct from sibling tools like get_mailbox_md.

Usage Guidelines: 3/5

The description implies use when needing a physical address but does not provide explicit guidance on when to use this tool over alternatives like get_mailbox_md or get_usage. No exclusions or when-not-to-use context.


get_mailbox_md (Grade: A)

Annotations: Read-only, Idempotent

Get the renter's MAILBOX.md standing instructions for this agent. Returns the full instruction text, version number, content hash, and last update timestamp. Call this on startup and cache the version — you must pass it to send_outbound_mail and update_action for sync verification.

Parameters

No parameters.
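
The "call on startup and cache the version" guidance could be wired up roughly as follows. The listing does not give field or argument names, so `version`, `content_hash`, `mailbox_md_version`, and `mail_class` are all hypothetical, and `fake_get_mailbox_md` stands in for the real tool call.

```python
class MailboxMdCache:
    """Caches the MAILBOX.md version fetched at startup."""

    def __init__(self, fetch):
        # `fetch` is any callable returning the get_mailbox_md result as a dict.
        self._fetch = fetch
        self.version = None
        self.content_hash = None

    def refresh(self):
        """Fetch the instructions and remember their version and hash."""
        result = self._fetch()
        self.version = result["version"]            # hypothetical field name
        self.content_hash = result["content_hash"]  # hypothetical field name
        return result

    def with_version(self, arguments):
        """Return a copy of `arguments` with the cached version attached."""
        if self.version is None:
            raise RuntimeError("call refresh() before sending mail or updating actions")
        # "mailbox_md_version" is a hypothetical argument name.
        return {**arguments, "mailbox_md_version": self.version}


def fake_get_mailbox_md():
    return {"text": "Forward anything from the IRS.", "version": 7, "content_hash": "abc123"}


cache = MailboxMdCache(fake_get_mailbox_md)
cache.refresh()
send_args = cache.with_version({"mail_class": "first_class"})  # hypothetical argument
print(send_args)
```

If a later call fails sync verification, calling `refresh()` again and retrying with the new version would be the natural recovery path under these assumptions.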

Tool Definition Quality

Grade: A (4.3/5.0)

Behavior: 3/5

Annotations already provide readOnlyHint=true and idempotentHint=true. Description adds return fields (full text, version, hash, timestamp) and caching advice, but no further behavioral traits like rate limits or auth needs. Adequate given annotations.

Conciseness: 5/5

Three sentences: the first two state the purpose and outputs, the third provides usage guidance. No redundant text, efficient and front-loaded.

Completeness: 5/5

Despite no output schema, description lists all return fields. Context of startup caching and sync verification is sufficient. No gaps given tool's simplicity and annotation coverage.

Parameters: 4/5

Input schema has 0 parameters, so the baseline is 4. No parameter detail is needed, and the description adds none.

Purpose: 5/5

Clearly states 'Get the renter's MAILBOX.md standing instructions' with specific verb and resource. Distinguishes from siblings like get_mailbox and get_outbound_mail by clarifying it's for the instructions file and its version.

Usage Guidelines: 4/5

Explicitly says 'Call this on startup and cache the version' and explains why (sync verification for send_outbound_mail and update_action). Lacks explicit when-not-to-use but provides strong usage context.


get_outbound_mail (Grade: A)

Annotations: Read-only, Idempotent

Get full details of an outbound mail job including recipient address, mail class, page count, cost breakdown, current status, fulfillment photos, and a time-limited signed URL to download the original PDF.

Parameters

Name	Required	Description
`mail_id`	Yes	UUID of the outbound mail job to retrieve.

Tool Definition Quality

Grade: A (3.9/5.0)

Behavior: 4/5

Annotations already declare read-only and idempotent; description adds specific details like time-limited signed URL and fulfillment photos, enhancing transparency beyond annotations.

Conciseness: 4/5

Single, well-structured sentence with clear listing of included fields; slightly verbose due to enumeration but remains concise.

Completeness: 4/5

Given no output schema, description covers major return fields thoroughly; could mention potential errors or missing fields but is sufficient for a retrieval tool.

Parameters: 3/5

Schema coverage is 100% and description does not add additional parameter semantics beyond what the schema provides.

Purpose: 5/5

Describes getting full details of an outbound mail job with a specific list of fields, clearly distinguishing from sibling tools like list_outbound_mail and send_outbound_mail.

Usage Guidelines: 3/5

Implies usage when full details of a specific mail job are needed, but lacks explicit guidance on when to use versus alternatives (e.g., list_outbound_mail) or when not to use.


get_package (Grade: A)

Annotations: Read-only, Idempotent

Get full package details including photos, tracking events, shipping label data (carrier, addresses, weight), forwarding status, storage location, and action history.

Parameters

Name	Required	Description
`package_id`	Yes	UUID of the package to retrieve.

Tool Definition Quality

Grade: A (4/5.0)

Behavior: 4/5

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true. The description adds value by detailing the data returned (photos, tracking, shipping label, etc.), which goes beyond the annotation's safety profile.

Conciseness: 5/5

The description is a single, front-loaded sentence that lists all included details with no redundancy, making it highly efficient and easy to scan.

Completeness: 4/5

The description covers the scope of data returned for a retrieval tool with one parameter, but lacks details on the response format or structure since there is no output schema. Still nearly complete for its simplicity.

Parameters: 3/5

Schema description coverage is 100%, and the description does not add extra meaning to the package_id parameter beyond what is already in the schema. Baseline of 3 applies as no additional parameter context is needed.

Purpose: 5/5

The description uses a specific verb 'Get' plus resource 'package', and explicitly lists the full scope of details returned (photos, tracking, shipping, etc.), clearly distinguishing it from siblings like get_package_photos and list_packages.

Usage Guidelines: 3/5

The description implies usage when full package details are needed by listing what is included, but does not provide explicit when-to-use or when-not-to-use guidance, nor name alternatives like get_package_photos for specific data.


get_package_photos (Grade: A)

Annotations: Read-only, Idempotent

Get photos for a package with OCR-extracted text and confidence scores. Filter by photo type to get only exterior shots, label closeups, barcode scans, or content scans.

Parameters

Name	Required	Description
`package_id`	Yes	UUID of the package to get photos for.
`photo_type`	No	Filter by photo type. "exterior" = package exterior, "label" = shipping label closeup, "barcode" = barcode scan, "content_scan" = opened package contents.
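
Since `photo_type` accepts only the four values listed above, a client could validate the filter before calling the tool. This is a small sketch; the `PhotoType` enum and the `build_photo_arguments` helper are illustrative names, not part of the server.

```python
from enum import Enum


class PhotoType(str, Enum):
    """The four photo_type values named in the parameter table."""
    EXTERIOR = "exterior"
    LABEL = "label"
    BARCODE = "barcode"
    CONTENT_SCAN = "content_scan"


def build_photo_arguments(package_id, photo_type=None):
    """Build get_package_photos arguments, rejecting unknown photo types."""
    arguments = {"package_id": package_id}
    if photo_type is not None:
        # PhotoType(...) raises ValueError for anything outside the four values.
        arguments["photo_type"] = PhotoType(photo_type).value
    return arguments


pkg = "123e4567-e89b-12d3-a456-426614174000"  # made-up package UUID
all_photos = build_photo_arguments(pkg)
labels_only = build_photo_arguments(pkg, "label")
print(all_photos, labels_only)
```

Omitting `photo_type` leaves the filter off, which matches the parameter being optional.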

Tool Definition Quality

Grade: A (4.2/5.0)

Behavior: 4/5

Annotations already indicate readOnly, idempotent, and not destructive. The description adds that the tool returns OCR text and confidence scores, which is behavioral detail not in annotations. No contradictions.

Conciseness: 5/5

Two sentences, no wasted words. Front-loaded with core purpose, then filter guidance. Efficient and clear.

Completeness: 4/5

No output schema, but description covers what is returned. Both parameters are described. Lacks pagination details but acceptable for a read-only tool. Overall sufficient for an agent.

Parameters: 3/5

Schema coverage is 100% with parameter descriptions. The description adds context about filtering by photo type and the output (OCR text, confidence scores) but does not significantly enhance parameter semantics beyond the schema.

Purpose: 5/5

The description clearly states the tool retrieves photos for a package, specifically mentioning OCR-extracted text and confidence scores, and distinguishes from sibling tools like get_package or get_scan_results by focusing on photo retrieval with additional metadata.

Usage Guidelines: 4/5

The description explains the optional photo_type filter and gives examples of what each filter value returns, guiding the agent on when to use the parameter. It does not explicitly exclude other tools, but the context is clear enough.


get_scan_results (Grade: A)

Annotations: Read-only, Idempotent

Get document scan results including raw OCR text, structured data fields (addresses, dates, amounts), and confidence scores. Returns empty if scan is still processing.

Parameters

Name	Required	Description
`package_id`	Yes	UUID of the package to get scan results for.
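
Because the tool returns an empty result while a scan is still processing, a caller would typically poll. Below is a minimal polling sketch with exponential backoff; `wait_for_scan`, the retry counts, and the `ocr_text` and `confidence` field names are assumptions, and `fake_get_scan_results` stands in for the real call.

```python
import time


def wait_for_scan(get_results, package_id, attempts=5, delay=1.0, sleep=time.sleep):
    """Poll until scan results are non-empty or the attempts run out."""
    for attempt in range(attempts):
        results = get_results(package_id)
        if results:  # an empty result means the scan is still processing
            return results
        if attempt < attempts - 1:
            sleep(delay * (2 ** attempt))  # back off: delay, 2*delay, 4*delay, ...
    return None


# Fake tool call: empty twice, then a finished scan.
responses = iter([{}, {}, {"ocr_text": "INVOICE #1001", "confidence": 0.97}])
calls = []


def fake_get_scan_results(package_id):
    calls.append(package_id)
    return next(responses)


result = wait_for_scan(fake_get_scan_results, "pkg-1", sleep=lambda seconds: None)
print(len(calls), result)
```

Returning `None` after the last attempt lets the caller distinguish "still processing" from a completed scan without raising.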

Tool Definition Quality

Grade: A (4/5.0)

Behavior: 4/5

Adds behavioral detail beyond annotations: returns empty if scan is still processing. Annotations already indicate safe read operation, so description provides useful edge case context.

Conciseness: 5/5

Two concise sentences front-load the purpose and include a necessary edge case (processing state). No wasted words.

Completeness: 4/5

Covers main content of results and processing edge case. No output schema, but description lists returned fields. Lacks error handling details for invalid package_id.

Parameters: 3/5

Single parameter package_id is fully described in schema (100% coverage). Description does not add new meaning beyond restating the UUID of the package.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states the verb 'Get' and resource 'document scan results' and details included content (raw OCR text, structured data fields, confidence scores), clearly distinguishing from sibling tools like request_scan.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies use after scan request, but lacks explicit when-to-use or when-not-to-use guidance. No direct alternatives mentioned, though siblings include other retrieval tools.

get_usage (A)

Read-only, Idempotent

Get usage summary and billing events for a time period. Returns itemized events (scans, forwards, mail sends) with costs, plus period totals. Defaults to the current billing period if no dates are specified.

Parameters

Name	Required	Description	Default
`period_end`	No	End of the reporting period in ISO 8601 format. Defaults to now.
`period_start`	No	Start of the reporting period in ISO 8601 format. Defaults to current billing period start.
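
Both parameters take ISO 8601 strings, and omitting them falls back to the server defaults. A small sketch of building the arguments in Python follows; the helper name `usage_arguments` is hypothetical, and normalizing to UTC is an assumption, since the listing does not say which timezone the server expects.

```python
from datetime import datetime, timezone
from typing import Optional


def _iso_utc(dt: datetime) -> str:
    """Render a timezone-aware datetime as an ISO 8601 string in UTC."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return dt.astimezone(timezone.utc).isoformat()


def usage_arguments(
    start: Optional[datetime] = None, end: Optional[datetime] = None
) -> dict:
    """Build get_usage arguments; omitted dates are left to the server defaults."""
    if start is not None and end is not None and end <= start:
        raise ValueError("period_end must be after period_start")
    args = {}
    if start is not None:
        args["period_start"] = _iso_utc(start)
    if end is not None:
        args["period_end"] = _iso_utc(end)
    return args
```

Calling `usage_arguments()` with no dates yields an empty dict, which corresponds to the documented default of the current billing period.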

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true, so the agent knows this is a safe, read-only operation. The description adds valuable context: 'Returns itemized events (scans, forwards, mail sends) with costs, plus period totals,' specifying the exact output, which goes beyond the annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences long: the first states the purpose, the second lists what is returned, and the third gives the default period. Every sentence adds value with no redundancy.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains what is returned (itemized events with costs and period totals). It does not mention pagination or timezone, but for a simple summary tool, this is sufficient.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (both parameters have descriptions), but the description adds meaning by explaining the default behavior when dates are omitted: 'Defaults to the current billing period if no dates are specified.' This clarifies usage beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Get usage summary and billing events for a time period,' clearly stating the verb (Get) and resource (usage summary and billing events). It is specific and distinct from sibling tools, which focus on mail, packages, and actions, making it unambiguous.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions the default behavior ('Defaults to the current billing period if no dates are specified'), guiding the agent on when to omit parameters. However, it does not explicitly state when to use this tool versus alternatives, though no direct alternatives exist among siblings.

list_facility_conversations (A)

Read-only, Idempotent

List your active facility conversations with unread message counts and last message preview. Each conversation corresponds to one facility where you have a mailbox.

Parameters

Name	Required	Description	Default
`limit`	No	Maximum number of conversations to return (1-100). Defaults to 20.
`offset`	No	Number of conversations to skip for pagination. Defaults to 0.

Tool Definition Quality

A (3.8/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint and idempotentHint. The description adds limited behavioral context (e.g., unread counts, active conversations) but does not expand on pagination or rate limits.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no wasted words, front-loaded with purpose. The key information is presented efficiently.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high schema coverage and annotation support, the description sufficiently explains the output (unread counts, preview) and scope (active conversations). Minor omissions like sorting or errors are not critical for this type of tool.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all parameters with full descriptions. The description does not add extra meaning beyond what the schema already provides, including defaults and constraints.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists active facility conversations with unread message counts and last message preview, distinguishing it from sibling tools like get_facility_messages.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for overview of conversations but lacks explicit exclusion or alternative guidance, such as when to use get_facility_messages instead.

list_outbound_mail (A)

Read-only, Idempotent

List outbound mail jobs with status tracking. Returns mail ID, recipient, mail class, status, cost, and timestamps. Filter by status to see pending, in-transit, or delivered mail.

Parameters

Name	Required	Description
`limit`	No	Maximum number of mail jobs to return (1-100). Defaults to 20.
`offset`	No	Number of mail jobs to skip for pagination. Defaults to 0.
`status`	No	Filter by mail status. "pending_approval" = awaiting human approval, "submitted" = queued for facility, "ready" = printed and ready to mail, "mailed" = in transit, "delivered" = confirmed delivery, "failed" = delivery failed, "cancelled" = cancelled before mailing.
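
The schema's status values differ from the description's informal wording ("in-transit" corresponds to "mailed", for instance), so it can help to validate the filter before calling. A hedged sketch: the status set is copied from the table above, while the helper name `list_outbound_mail_arguments` is hypothetical.

```python
from typing import Optional

# Status values as listed in the parameter table above.
MAIL_STATUSES = {
    "pending_approval", "submitted", "ready",
    "mailed", "delivered", "failed", "cancelled",
}


def list_outbound_mail_arguments(
    status: Optional[str] = None, limit: int = 20, offset: int = 0
) -> dict:
    """Build list_outbound_mail arguments, rejecting values outside the schema."""
    if status is not None and status not in MAIL_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    args = {"limit": limit, "offset": offset}
    if status is not None:
        args["status"] = status
    return args
```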

Tool Definition Quality

A (3.7/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and idempotentHint=true, so the description adds limited behavioral context beyond confirming it is a list operation with return fields. No contradictions.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no wasted words. The first states the purpose, the second lists the returned fields, and the third suggests a use case. Bullet points might improve it slightly, but it is efficient as written.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a list tool with no output schema, the description covers return fields and a common use case (status filtering). It does not mention ordering or rate limits, but these are less critical for a simple listing endpoint.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed parameter descriptions. The description only re-emphasizes filtering by status without adding new meaning beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List' and the resource 'outbound mail jobs', with specific details on returned fields (mail ID, recipient, etc.). This distinguishes it from sibling 'get_outbound_mail', which fetches a single job.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description suggests filtering by status, but does not explicitly contrast with other tools like 'get_outbound_mail' or provide when-not-to-use guidance. The context for usage is clear but lacks exclusions.

list_packages (A)

Read-only, Idempotent

List inbound packages at your mailbox with optional filters by status, carrier, and date. Returns tracking number, carrier, status, and received timestamp for each package. Use pagination (limit/offset) for large result sets.

Parameters

Name	Required	Description
`limit`	No	Maximum number of packages to return (1-100). Defaults to 20.
`since`	No	Only return packages received after this ISO 8601 date-time.
`offset`	No	Number of packages to skip for pagination. Defaults to 0.
`status`	No	Filter by package lifecycle status. "received" = just arrived, "stored" = in facility storage, "forwarded" = shipped to forwarding address.
`carrier`	No	Filter by shipping carrier.
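
The limit/offset pagination mentioned in the description can be wrapped in a generator. This is a sketch under stated assumptions: `call_tool` is a hypothetical callable from your MCP client, the tool result is assumed to be a plain list of packages, and a page shorter than `limit` is assumed to mark the end. None of these is confirmed by the listing, which has no output schema.

```python
from typing import Any, Callable, Iterator


def iter_packages(
    call_tool: Callable[[str, dict], list],  # hypothetical MCP client hook
    page_size: int = 100,
    **filters: Any,  # e.g. status="stored", carrier=..., since=...
) -> Iterator[Any]:
    """Yield every package by walking list_packages pages with limit/offset."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    offset = 0
    while True:
        page = call_tool(
            "list_packages", {**filters, "limit": page_size, "offset": offset}
        )
        yield from page
        if len(page) < page_size:  # assumption: a short page is the last page
            break
        offset += page_size
```

Without a documented ordering, packages that arrive mid-iteration could shift the offsets, so passing a fixed `since` filter may make repeated runs more predictable.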

Tool Definition Quality

A (4.4/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true. The description adds that it returns specific fields and supports pagination, which is consistent with the annotations. No contradictions.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose and filters; each adds value. No unnecessary words.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With good annotations and full schema coverage, the description is adequate. It mentions return fields and pagination. Lacks info on default ordering or if all packages are returned without filters, but for a list tool this is sufficient.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all parameters. The description reiterates pagination (limit/offset) but adds no new meaning beyond the schema. Baseline is 3.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists inbound packages with optional filters, and specifies the return fields (tracking number, carrier, status, timestamp). It distinguishes from siblings like get_package (single) and list_outbound_mail (outbound).

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using pagination for large result sets, which is a clear usage guideline. It does not explicitly state when not to use this tool or mention alternatives, but the context of sibling tools and the filters imply its appropriate use.

propose_mailbox_md_edit (A)

Propose changes to the renter's MAILBOX.md instructions with reasoning. The renter will see your suggestion in their dashboard and can accept, reject, or modify it. Use this when you observe patterns that could be codified into standing instructions.

Parameters

Name	Required	Description	Default
`reason`	Yes	Why this change is suggested (e.g. "Observed 5 Amazon packages this week, all forwarded manually — adding auto-forward rule").
`suggested_content`	Yes	Full proposed MAILBOX.md content (max 10,000 chars). Must include the complete document, not just the diff.
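
Since `suggested_content` must carry the complete document rather than a diff, an agent has to merge its change into the current text before proposing. A minimal sketch, assuming the current content has already been fetched (for example via get_mailbox_md); the helper name `propose_edit_arguments` and the append-only merge are illustrative choices, not part of the server.

```python
MAX_MAILBOX_MD_CHARS = 10_000  # limit stated in the parameter table


def propose_edit_arguments(current_md: str, addition: str, reason: str) -> dict:
    """Append a new instruction to the full MAILBOX.md and build the arguments."""
    if not reason.strip():
        raise ValueError("reason is required")
    # Send the whole document, not just the new lines.
    suggested = current_md.rstrip("\n") + "\n\n" + addition.strip() + "\n"
    if len(suggested) > MAX_MAILBOX_MD_CHARS:
        raise ValueError("suggested_content exceeds 10,000 characters")
    return {"reason": reason.strip(), "suggested_content": suggested}
```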

Tool Definition Quality

A (4.7/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description explains that the renter will see the suggestion and can accept/reject/modify, aligning with annotations (readOnlyHint=false, destructiveHint=false). Adds important detail about providing full content, not just diff.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first defines the purpose, the second explains the outcome, and the third says when to use the tool. No extraneous information.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the proposal workflow (renter review). Parameters are well-covered. No gaps in understanding tool behavior.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters with descriptions. Description reinforces the 'full content' requirement and provides a concrete example for the reason parameter, adding value beyond schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action (propose changes), the resource (MAILBOX.md), and includes reasoning. Distinguishes from sibling tools like get_mailbox_md (read-only) and update_action (direct update).

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this when you observe patterns that could be codified into standing instructions,' providing clear context for when to use. No explicit alternatives mentioned, but the guidance is sufficient.

register_expected (A)

Pre-register an expected inbound shipment so it is auto-matched when it arrives at the facility. Optionally specify an action to auto-execute on arrival (e.g. forward immediately, scan on receipt).

Parameters

Name	Required	Description
`carrier`	No	Shipping carrier (e.g. "fedex", "ups", "usps").
`auto_action`	No	Action to auto-execute when the package arrives.
`description`	No	Human-readable description of the shipment (e.g. "Replacement laptop from Dell").
`expected_by`	No	Expected arrival date in ISO 8601 format. Used for alerts if the package is late.
`tracking_number`	No	Carrier tracking number for the expected shipment.
`auto_action_params`	No	Parameters for the auto-action (e.g. forwarding address).

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint=false) indicate a write operation, and the description adds valuable behavioral context: auto-matching and optional auto-execution. However, it does not disclose consequences like duplicates or response format, leaving minor gaps.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose and followed by optional details. No unnecessary words; every sentence adds value.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 params, nested objects, no output schema), the description covers the main workflow and key parameter behavior (e.g., 'expected_by' used for alerts). Missing details about the response or required fields are minor given the schema covers all parameters.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the schema already documents each parameter. The description adds value by explaining the 'auto_action' with examples and the concept of auto-matching, slightly enhancing understanding beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Pre-register' and the resource 'expected inbound shipment', with a specific purpose (auto-match on arrival). It effectively distinguishes from sibling tools, none of which offer registration of expected shipments.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implicitly conveys usage: when you want to register an inbound shipment in advance. No explicit alternatives are given, but none exist among siblings, so the context is clear and sufficient.

request_action (A)

Destructive

Request a physical action on a package at the facility. Actions include forwarding to another address, shredding, scanning documents, holding for pickup, disposing, returning to sender, photographing, opening and scanning contents, or recording a video. Some actions (shred, dispose) are irreversible.

Parameters

Name	Required	Description	Default
`action`	Yes	Action to perform. "forward" = ship to another address, "shred" = destroy (irreversible), "scan" = OCR document scan, "hold" = keep in storage, "dispose" = discard (irreversible), "return_to_sender" = send back, "photograph" = take photos, "open_and_scan" = open package and scan contents, "record_video" = video recording of package.
`priority`	No	Processing priority. "urgent" = same-day processing, "high" = next business day, "normal" = standard queue, "low" = when convenient.	normal
`package_id`	Yes	UUID of the package to act on.
`parameters`	No	Action-specific parameters. For "forward": { address, city, state, zip }. For "scan": { scan_type }. For "hold": { until_date }.
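
Because "shred" and "dispose" cannot be undone, a client-side guard before the call may be worthwhile. The sketch below is one way to do that and is not part of the server: the action and priority values come from the table above, while the `confirm_irreversible` flag, the helper name, and the rule that all four forwarding fields are mandatory are assumptions.

```python
from typing import Optional

ACTIONS = {
    "forward", "shred", "scan", "hold", "dispose",
    "return_to_sender", "photograph", "open_and_scan", "record_video",
}
IRREVERSIBLE = {"shred", "dispose"}  # marked irreversible in the schema
PRIORITIES = {"urgent", "high", "normal", "low"}
# Assumption: all four address fields are required when forwarding.
FORWARD_FIELDS = {"address", "city", "state", "zip"}


def request_action_arguments(
    package_id: str,
    action: str,
    parameters: Optional[dict] = None,
    priority: str = "normal",
    confirm_irreversible: bool = False,  # hypothetical client-side safety flag
) -> dict:
    """Validate and build request_action arguments before calling the tool."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority!r}")
    if action in IRREVERSIBLE and not confirm_irreversible:
        raise PermissionError(f"{action!r} is irreversible; confirm explicitly")
    params = dict(parameters or {})
    if action == "forward":
        missing = FORWARD_FIELDS - set(params)
        if missing:
            raise ValueError(f"forward is missing fields: {sorted(missing)}")
    args = {"package_id": package_id, "action": action, "priority": priority}
    if params:
        args["parameters"] = params
    return args
```

Keeping the confirmation outside the tool call means an agent must make a deliberate second decision before anything is destroyed.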

Tool Definition Quality

A (3.5/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint=true and readOnlyHint=false. The description adds that some actions are 'irreversible,' which aligns with destructiveHint. However, it does not disclose whether the action is queued or immediate, what side effects occur (e.g., package removal after shred), or error states.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three concise sentences, front-loaded with the core purpose, followed by the list of actions and a warning about irreversibility. Every sentence adds value without redundancy or verbosity.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains the tool's purpose and action list but lacks details on return value (no output schema), error handling, prerequisites, or operation outcomes. For a tool with 4 parameters including nested objects, this is only partially complete.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers all parameters with descriptions (100% coverage). The description repeats the action list but adds no new parameter semantics beyond the schema. Baseline 3 is appropriate since schema does the heavy lifting.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool requests a physical action on a package, lists nine specific actions, and notes irreversibility. Both title and description differentiate from sibling tools like 'request_scan' and 'update_action' by focusing on a broader set of physical actions.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists possible actions but provides no guidance on when to choose each action or when to use this tool over siblings (e.g., 'request_scan' vs 'scan' action here). It lacks prerequisites, such as package existence or permission requirements.

request_scan (A)

Request document scanning (OCR + structured data extraction) for a package. The facility will scan the document and extract text, addresses, dates, and other structured data. Results are available via get_scan_results after processing.

Parameters

Name	Required	Description	Default
`scan_type`	No	Type of scan. "label" = shipping label only, "envelope" = exterior envelope, "document" = full document OCR, "content" = opened package contents.	document
`package_id`	Yes	UUID of the package to scan.
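
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with method `tools/call`. The sketch below builds such a message for request_scan; the scan types are taken from the table above, while the helper name `request_scan_message` is hypothetical and transport details (session setup, headers, authentication) are left out.

```python
import json

SCAN_TYPES = {"label", "envelope", "document", "content"}


def request_scan_message(
    package_id: str, scan_type: str = "document", request_id: int = 1
) -> str:
    """Serialize a JSON-RPC tools/call request for the request_scan tool."""
    if scan_type not in SCAN_TYPES:
        raise ValueError(f"unknown scan_type: {scan_type!r}")
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "request_scan",
            "arguments": {"package_id": package_id, "scan_type": scan_type},
        },
    }
    return json.dumps(message)
```

Scanning is asynchronous according to the description, so the response to this request is not the scan itself; results are fetched later with get_scan_results.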

Tool Definition Quality

A (4/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate a mutation (readOnlyHint=false, destructiveHint=false). The description adds that processing occurs asynchronously and results are retrieved separately, but it does not disclose potential side effects like costs, idempotency, or whether multiple scan requests are allowed.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with the main action, followed by what the facility extracts and where to retrieve the results. Every sentence adds value, with no redundant information.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple request tool with two parameters and no output schema, the description adequately explains the process and how to obtain results. Minor gaps include error handling and processing time, but overall it is complete enough for an AI agent.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema covers 100% of parameters with descriptions. The tool description adds minimal extra meaning beyond the schema, stating that the facility will scan and extract data, which is already implied.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Request document scanning') and specifies the output ('OCR + structured data extraction') and resource ('for a package'). It distinguishes itself from the sibling tool get_scan_results by mentioning results are available there.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear usage pattern: call this tool, then use get_scan_results for the output. However, it does not explicitly mention when not to use this tool (e.g., if scanning is already in progress) or alternative approaches.

send_facility_message (A)

Send a message to the facility operator managing your mailbox. Messages appear in the shared conversation visible to you, the renter, and the facility. Optionally link the message to a specific package or action request for context.

Parameters

Name	Required	Description
`body`	Yes	Message text (1-5000 characters).
`package_id`	No	Optional: link this message to a specific package for context.
`facility_id`	Yes	The facility to message. Get this from the get_mailbox response.
`action_request_id`	No	Optional: link this message to an action request for context.

Tool Definition Quality

A (4/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate the tool is not read-only (readOnlyHint=false) and not destructive (destructiveHint=false). The description adds that messages appear in a shared conversation visible to the agent, the renter, and the facility, and that messages can be linked to packages or action requests. This provides useful context but does not significantly expand beyond the annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (three sentences) and front-loaded with the primary purpose. Every sentence adds value without redundancy. It avoids unnecessary details and is well-structured.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the core functionality, shared conversation nature, and optional linking. Since there is no output schema, the description adequately implies the outcome (message appears in conversation). It could mention if there is any confirmation or error handling, but it is largely complete for a simple messaging operation.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters are described in the input schema with clear descriptions and types. The tool description does not add any additional parameter details beyond the schema. With 100% schema coverage, the baseline score is 3.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: sending a message to the facility operator managing the user's mailbox. It includes the verb 'send' and the resource 'facility message,' and distinguishes itself from sibling tools like 'send_outbound_mail' and 'get_facility_messages' by specifying the recipient and shared context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use the tool (e.g., messaging facility operator, optionally linking to package or action request) but does not explicitly state when not to use it or mention alternative tools. The sibling list includes related tools, but no direct comparisons are made.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_outbound_mailA

Submit a document for printing and postal mailing by the facility. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. The document is stored securely and printed by the facility operator. IMPORTANT: With a production key (sk_agent_), this immediately charges the member's card on file. Use dry_run=true to preview cost before committing, or requires_approval=true to defer until human approval. Sandbox keys (sk_agent_test_) skip billing entirely.
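Because the sandbox prefix (sk_agent_test_) itself begins with the production prefix (sk_agent_), a naive prefix check can misclassify a sandbox key as a billing key. The following is a minimal sketch of a client-side guard; the helper name `key_mode` and the "unknown" fallback are illustrative assumptions, not part of the server's API.

```python
SANDBOX_PREFIX = "sk_agent_test_"   # skips billing, per the description above
PRODUCTION_PREFIX = "sk_agent_"     # charges the member's card on file


def key_mode(api_key: str) -> str:
    """Classify a key as 'sandbox', 'production', or 'unknown' by its prefix.

    Hypothetical helper: the server does not expose this function.
    """
    # Test the longer sandbox prefix first, since it also matches
    # the shorter production prefix.
    if api_key.startswith(SANDBOX_PREFIX):
        return "sandbox"
    if api_key.startswith(PRODUCTION_PREFIX):
        return "production"
    return "unknown"
```

An agent could call this before send_outbound_mail and force dry_run=true or requires_approval=true whenever the result is "production".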

ParametersJSON Schema

Name	Required	Description	Default
`color`	No	Print in color. Adds a per-page color surcharge.
`duplex`	No	Print double-sided to reduce page count and postage.
`dry_run`	No	Validate inputs and return cost breakdown without creating a record or charging. Use to preview cost before committing.
`metadata`	No	Arbitrary key-value pairs echoed in GET responses and webhooks. Recommended convention: { "workflow_id": "wf_123", "reason": "Customer cancellation", "correlation_id": "abc" }.
`mail_class`	No	USPS mail class. "first_class" = 3-5 days, "priority" = 1-3 days, "certified" = with tracking and proof of mailing, "certified_return_receipt" = certified with signed delivery confirmation.	first_class
`package_id`	No	Link this mail to an inbound package (e.g. replying to received correspondence).
`page_count`	No	Explicit page count for non-PDF documents when exact pagination is known. When supplied for DOCX, TXT, or CSV, it overrides local detection and makes pricing deterministic.
`return_zip`	No	Return address ZIP code. Defaults to member profile if omitted.
`agent_notes`	No	Instructions for the facility operator (e.g. "Time-sensitive — mail today").
`return_city`	No	Return address city. Defaults to member profile if omitted.
`return_name`	No	Return address name. Defaults to the member's profile name if omitted.
`return_line1`	No	Return address line 1. Defaults to member profile if omitted.
`return_line2`	No	Return address line 2 (suite, unit, etc.).
`return_state`	No	Return address state (2-letter code). Defaults to member profile if omitted.
`recipient_zip`	Yes	5 or 5+4 digit ZIP code (e.g. "90210" or "90210-1234").
`max_cost_cents`	No	Cost cap in cents. If the calculated cost exceeds this, the request is rejected with 422 before any charge. Prevents accidental expensive mailings.
`recipient_city`	Yes	Recipient city.
`recipient_name`	Yes	Full name of the mail recipient.
`document_base64`	Yes	Base64-encoded document file. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. Max 10MB decoded.
`recipient_line1`	Yes	Street address line 1 of the recipient.
`recipient_line2`	No	Street address line 2 (apartment, suite, unit, etc.).
`recipient_state`	Yes	2-letter US state code (e.g. CA, NY, TX).
`document_filename`	No	Original filename with extension (e.g. "letter.docx"). Required for reliable non-PDF format detection.
`recipient_country`	No	ISO 3166-1 alpha-2 country code. Defaults to "US".	US
`requires_approval`	No	If true, the renter must approve in their dashboard before the mail is printed and sent.
`mailbox_md_version`	Yes	Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification.
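As a rough illustration of how the parameters above fit together, the sketch below assembles an arguments object for a cost preview. The function name, the sample values, and the reading of "10MB" as 10,000,000 bytes are assumptions; only the field names come from the table. How the arguments are sent depends on your MCP client and is not shown.

```python
import base64

# Required fields, taken from the parameter table above.
REQUIRED_FIELDS = (
    "recipient_name", "recipient_line1", "recipient_city",
    "recipient_state", "recipient_zip", "document_base64",
    "mailbox_md_version",
)
# "Max 10MB decoded": treating MB as 10**6 bytes is an assumption
# (the stricter of the two common readings).
MAX_DOCUMENT_BYTES = 10 * 1000 * 1000


def build_mail_arguments(document, recipient, mailbox_md_version, *,
                         document_filename=None, max_cost_cents=None,
                         dry_run=True):
    """Build send_outbound_mail arguments; defaults to a dry run."""
    if len(document) > MAX_DOCUMENT_BYTES:
        raise ValueError("document exceeds the 10MB decoded limit")
    args = dict(recipient)  # recipient_* fields supplied by the caller
    args["document_base64"] = base64.b64encode(document).decode("ascii")
    args["mailbox_md_version"] = mailbox_md_version
    args["dry_run"] = dry_run
    if document_filename is not None:
        args["document_filename"] = document_filename
    if max_cost_cents is not None:
        args["max_cost_cents"] = max_cost_cents
    # Client-side check only; the server remains the source of truth.
    missing = [n for n in REQUIRED_FIELDS if args.get(n) in (None, "")]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return args
```

Defaulting dry_run to True means a real, billable submission has to be requested explicitly, which matches the description's advice to preview cost before committing.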

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses critical behavioral traits beyond annotations: it charges the member's card on file with production keys, stores documents securely, and explains dry_run and requires_approval. No annotation contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph, front-loaded with the main action, followed by the supported formats and the critical billing note. Every sentence adds essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (26 params, 7 required, no output schema), the description covers billing, sandbox behavior, dry_run, approval, document formats, storage, and sync verification. It leaves little ambiguity for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the billing implications of production vs sandbox keys, which is not in the schema. This enhances parameter understanding despite the schema already being detailed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool submits a document for printing and physical mailing. It distinguishes itself from sibling tools such as get_outbound_mail (retrieval) and send_facility_message (electronic messaging).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on using dry_run=true to preview costs and requires_approval=true to defer billing. It also notes sandbox vs production key behavior. However, it does not explicitly state when not to use this tool versus alternatives like send_facility_message.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_actionA

Idempotent

Push notes, structured data, or a clarification response to an existing action request. Use this to add agent reasoning, attach extracted data, or respond when the facility asks for clarification. Requires mailbox_md_version to prove your MAILBOX.md instructions are in sync.

ParametersJSON Schema

Name	Required	Description
`action_id`	Yes	The action request ID to update.
`agent_data`	No	Structured data to attach (e.g. OCR results, extracted fields, classification labels).
`agent_notes`	No	Free-text notes from the agent (e.g. "Forwarding per standing rule #3").
`decision_context`	No	Link this decision to a specific MAILBOX.md instruction for auditability.
`mailbox_md_version`	Yes	Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification.
`respond_to_clarification`	No	Response text when action status is needs_clarification. Providing this auto-resumes the action to in_progress.
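A minimal sketch of assembling update_action arguments from the table above. The function name and the client-side "nothing to update" check are assumptions; whether the server rejects an update that carries only the two required fields is not stated in the source.

```python
def build_update_action_arguments(action_id, mailbox_md_version, *,
                                  agent_notes=None, agent_data=None,
                                  decision_context=None,
                                  clarification_response=None):
    """Build update_action arguments, omitting fields that were not set."""
    args = {
        "action_id": action_id,
        "mailbox_md_version": mailbox_md_version,
    }
    optional = {
        "agent_notes": agent_notes,
        "agent_data": agent_data,
        "decision_context": decision_context,
        # Per the table, supplying this resumes an action that is in
        # needs_clarification back to in_progress.
        "respond_to_clarification": clarification_response,
    }
    args.update({k: v for k, v in optional.items() if v is not None})
    if len(args) == 2:
        # Assumption: an update with no payload is treated as a caller error.
        raise ValueError("nothing to update")
    return args
```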

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and destructiveHint=false, but the description adds critical behavioral details: it requires mailbox_md_version to prove MAILBOX.md sync, and responding to clarification auto-resumes the action to 'in_progress'. These details go beyond the annotations and help the agent understand side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the first states the core purpose, the second lists use cases, and the third adds a critical constraint (the mailbox_md_version requirement). Every word serves a purpose, with no fluff or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that the tool has 6 parameters (including nested objects) and no output schema, the description adequately covers when to use it and its key behaviors (sync requirement, auto-resume). The only minor gap is that it does not mention return values or how a successful update is confirmed, but the description is still fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the baseline is 3. The description adds extra meaning by contextualizing 'mailbox_md_version' as a sync requirement and explaining that 'respond_to_clarification' auto-resumes the action. This additional context improves parameter understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: pushing notes, structured data, or clarification responses to existing action requests. It uses strong verbs ('push', 'add', 'attach', 'respond') and specifies the resource ('existing action request'), distinguishing it from sibling tools like 'request_action' (create) and query tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists use cases: 'add agent reasoning, attach extracted data, or respond when the facility asks for clarification.' It also notes a prerequisite (mailbox_md_version for sync). However, it does not state when not to use the tool or mention alternatives such as 'add_note'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_webhookA

Idempotent

Inspect

Configure webhook endpoint URL and event subscriptions for real-time notifications. Events include package.received, package.status_changed, action.completed, mail.status_changed, and more. The endpoint must use HTTPS and respond with 2xx within 10 seconds.

ParametersJSON Schema

Name	Required	Description
`enabled`	No	Set to false to pause webhook delivery without removing the URL.
`event_types`	No	Array of event types to subscribe to (e.g. ["package.received", "mail.status_changed"]). Empty array disables all events.
`webhook_url`	No	HTTPS URL to receive webhook POST requests. Must respond with 2xx within 10 seconds.
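The HTTPS requirement can be checked before the call is made. The sketch below builds update_webhook arguments and rejects non-HTTPS URLs locally; the function name is an assumption, and the server may apply further validation that is not described here.

```python
from urllib.parse import urlparse


def build_webhook_arguments(webhook_url=None, event_types=None, enabled=None):
    """Build update_webhook arguments, including only the fields provided."""
    args = {}
    if webhook_url is not None:
        parsed = urlparse(webhook_url)
        # The tool requires an HTTPS endpoint; fail early on anything else.
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError("webhook_url must be an absolute HTTPS URL")
        args["webhook_url"] = webhook_url
    if event_types is not None:
        # Note: per the table, an empty list disables all events.
        args["event_types"] = list(event_types)
    if enabled is not None:
        args["enabled"] = bool(enabled)
    return args
```

Passing only enabled=False pauses delivery without touching the stored URL, which is the behavior the `enabled` parameter describes.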

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide idempotentHint=true and destructiveHint=false. The description adds key behavioral details: endpoint must use HTTPS and respond with 2xx within 10 seconds, and lists event types. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose and key constraints. No unnecessary words; every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and three parameters, the description adequately covers purpose, parameters, and behavioral constraints. Could mention retry or failure behavior, but overall complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters with descriptions. The description adds value by providing example event types and reinforcing the HTTPS/response requirement, which goes beyond the schema's 'format: uri' constraint.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool configures webhook endpoint URL and event subscriptions for real-time notifications. It lists example events, making the purpose specific and distinguishable from sibling tools like create_rule or update_action.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (to configure webhooks) but gives no explicit guidance on when not to use it or on alternatives. The context is clear, but exclusion criteria are missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
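If you prefer to generate the file rather than write it by hand, a short script along these lines would produce the same structure. The function name is an assumption; the schema URL and field names are copied from the snippet above, and where the file is written depends on how your server exposes /.well-known/.

```python
import json


def glama_claim_document(email: str) -> str:
    """Return the glama.json body for the given maintainer email."""
    document = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    # Replace with the email associated with your Glama account.
    print(glama_claim_document("your-email@example.com"))
```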
