Glama
Ownership verified

Server Details

Physical mail API for AI agents. Send letters, certified mail. Sandbox + live keys via MCP.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.2/5 across 29 of 29 tools scored. Lowest: 3.6/5.

Server Coherence: A
Disambiguation: 4/5

Most tools have clearly distinct purposes, though list_inbound_mail and list_packages could be confused without careful reading of descriptions. Overall, the set is well-differentiated.

Naming Consistency: 5/5

All tool names follow a consistent verb_noun pattern with underscores, making the API predictable and easy to navigate.

Tool Count: 2/5

With 29 tools, the server is heavy for a typical MCP server. The domain may justify many operations, but the count exceeds the recommended range for coherence.

Completeness: 3/5

The tool surface covers core workflows (create, read, list, request actions) but lacks updates and deletions for packages, rules, and tags, leaving notable gaps.

Available Tools

29 tools
add_note: A

Add an observation or context note to a package. Notes are visible to the facility operator and the renter. Use for recording decisions, observations, or agent reasoning.
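
As a rough sketch of how an agent might assemble arguments for add_note, the helper below builds the argument object from the three documented parameters (note, package_id, metadata). The helper name `build_add_note_args` and the local checks are assumptions for illustration, not part of the server.

```python
import uuid


def build_add_note_args(package_id, note, metadata=None):
    """Build an arguments dict for add_note (hypothetical helper)."""
    # package_id is documented as a UUID; uuid.UUID raises ValueError otherwise.
    uuid.UUID(package_id)
    # Assumption: an empty note is rejected locally before any call is made.
    if not note or not note.strip():
        raise ValueError("note is required and must not be empty")
    args = {"package_id": package_id, "note": note}
    # metadata is optional, so it is only included when provided.
    if metadata is not None:
        args["metadata"] = dict(metadata)
    return args
```

Validating locally keeps obviously malformed calls out of the package's note history, which is visible to both the facility operator and the renter.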

Parameters
Name | Required | Description | Default
note | Yes | Note text (e.g. "Appears to be the replacement GPU from RMA #4521"). |
metadata | No | Optional structured metadata attached to the note (e.g. { "rma_number": "4521", "vendor": "NVIDIA" }). |
package_id | Yes | UUID of the package to annotate. |

Output Schema

Name | Required | Description
result | Yes | Created package note record.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds visibility scope ('visible to facility operator and renter') beyond annotations, which are minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences, front-loaded, no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Fully covers the creation and usage of notes; output schema exists, so return details are unnecessary.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% so baseline 3; description doesn't add new parameter info beyond what's in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb 'Add' and resource 'note to a package' differentiate from sibling 'add_tag'. Includes visibility context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States 'Use for recording decisions, observations, or agent reasoning' but lacks explicit when-not-to-use or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_tag: A
Idempotent

Add a tag/label to a package for categorization and filtering. Tags are free-form strings. Adding the same tag twice is a no-op.
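
The no-op-on-duplicate and case-sensitive behavior can be pictured with a small in-memory model. This is only a sketch of the documented semantics; the `TagStore` class and its record shape are hypothetical and say nothing about how the server stores tags.

```python
class TagStore:
    """Hypothetical in-memory model of package tags (not the server's code)."""

    def __init__(self):
        self._records = {}  # (package_id, tag) -> tag record

    def add_tag(self, package_id, tag):
        key = (package_id, tag)  # exact string match keeps tags case-sensitive
        if key not in self._records:
            self._records[key] = {"package_id": package_id, "tag": tag}
        # Returning the stored record either way mirrors the documented
        # "created or existing package tag record" result.
        return self._records[key]

    def tags_for(self, package_id):
        """Return the sorted tag names attached to one package."""
        return sorted(t for (p, t) in self._records if p == package_id)
```

Because a repeated call changes nothing, an agent can safely retry add_tag after a timeout without first checking whether the tag already exists.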

Parameters
Name | Required | Description | Default
tag | Yes | Tag name (e.g. "hardware-order", "urgent", "return-needed"). Free-form, case-sensitive. |
package_id | Yes | UUID of the package to tag. |

Output Schema

Name | Required | Description
result | Yes | Created or existing package tag record.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description mentions the free-form nature of tags and the no-op behavior on duplicates, which complements the idempotentHint annotation. However, it does not disclose other behavioral traits such as error handling for invalid package_ids or any side effects beyond the idempotency already annotated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (three sentences), front-loaded with the action, and contains no unnecessary words. Every sentence contributes meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 parameters, output schema exists), the description covers the key aspects: purpose, parameter behavior, and idempotency. It lacks details on error cases or return values, but the output schema likely covers that. Still, it could be slightly more comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the free-form nature of the 'tag' parameter and the no-op behavior, which goes beyond the schema's property descriptions. For 'package_id', no additional information is provided beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Add a tag/label'), the resource ('to a package'), and the purpose ('for categorization and filtering'). It also specifies that tags are free-form strings and that adding the same tag twice is a no-op, which distinguishes it from other sibling tools like add_note.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use the tool (categorization and filtering) and notes the idempotent behavior (no-op on duplicate). However, it does not explicitly compare to alternatives or state when not to use it, leaving room for ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

advance_test_outbound_mail: A

Advance a test_mode outbound mail record one lifecycle step and queue the matching webhook. submitted becomes ready with simulated pages/envelope photos; ready becomes mailed with tracking, carrier, dispatch method, and receipt photo; mailed becomes delivered.
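
The lifecycle described above is a simple linear state machine. A minimal sketch, assuming the status strings are exactly the four named in the description; the function names are hypothetical, and raising on a terminal or unknown status is an assumption, since the description does not say what the tool does in that case.

```python
# Transitions named in the tool description.
NEXT_STATUS = {
    "submitted": "ready",
    "ready": "mailed",
    "mailed": "delivered",
}


def advance_status(status):
    """Return the status one lifecycle step after `status`."""
    if status not in NEXT_STATUS:
        # Assumption: the description does not cover "delivered" or
        # unknown statuses, so this sketch raises instead of guessing.
        raise ValueError(f"cannot advance from status {status!r}")
    return NEXT_STATUS[status]


def steps_to_delivered(status):
    """List the statuses a record passes through until it is delivered."""
    path = []
    while status != "delivered":
        status = advance_status(status)
        path.append(status)
    return path
```

A sandbox run that starts at submitted therefore needs three calls to reach delivered, each of which queues its own webhook.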

Parameters
Name | Required | Description | Default
mail_id | Yes | UUID of the test_mode outbound mail record to advance. |

Output Schema

Name | Required | Description
result | Yes | Advanced sandbox outbound mail job and webhook status.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (readOnlyHint false, destructiveHint false), so the description carries the burden. It details each state transition (e.g., 'submitted becomes ready with simulated pages/envelope photos') and mentions queuing webhooks, adding value beyond structured fields.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first front-loads the primary action, and the second details the state transitions. It is concise, but the transitions could be split into bullet points for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema, the description adequately covers the lifecycle steps and side effect (webhook queue). However, it does not explicitly state the precondition that the record must be in test_mode, which could be inferred but is not explicit.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description of the mail_id parameter. The tool description does not add extra semantic meaning beyond what the schema already provides, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool advances a test_mode outbound mail record one lifecycle step, specifying the verb (advance) and resource (test_mode outbound mail record). It differentiates from siblings like create_test_outbound_mail (creation) or get_outbound_mail (read).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage in testing scenarios but does not provide explicit guidance on when to use this tool versus alternatives. No exclusions or alternatives are mentioned, relying on implied context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_rule: A

Create a standing instruction that auto-triggers actions when incoming packages match conditions. Rules run on every new package and execute the specified action if all conditions match. Use requires_approval to add a human review step before execution.
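
The matching logic can be sketched as follows. The listing does not publish the condition schema, so this example assumes conditions are plain field/value pairs compared by equality; `evaluate_rule`, the `sender` field, and the returned labels are hypothetical.

```python
def evaluate_rule(rule, package):
    """Return "execute", "needs_approval", or None for one package.

    Assumption: conditions are modeled as field -> expected value pairs
    compared by equality; the real condition schema may differ.
    """
    conditions = rule["conditions"]
    # Every condition must match for the rule to trigger.
    matched = all(
        package.get(field) == expected
        for field, expected in conditions.items()
    )
    if not matched:
        return None
    # requires_approval inserts a human review step before execution.
    return "needs_approval" if rule.get("requires_approval") else "execute"
```

Note that under this model an empty conditions object matches every package, so a rule like that is best paired with requires_approval.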

Parameters
Name | Required | Description | Default
name | Yes | Human-readable rule name (e.g. "Forward Amazon packages", "Shred junk mail"). |
conditions | Yes | Conditions that must ALL match for the rule to trigger. |
action_type | Yes | Action to auto-trigger when conditions match. |
action_params | Yes | Parameters for the action (e.g. forwarding address for "forward", scan_type for "scan"). |
requires_approval | No | If true, matched packages require human approval before the action executes. |

Output Schema

Name | Required | Description
result | Yes | Created standing rule record.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explains that rules run on every new package and execute actions, but does not disclose potential side effects, limits, or execution details. Annotations provide readOnlyHint=false, so description adds some context but not extensive transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences: purpose, behavior, and usage tip. No superfluous words, front-loaded with key information. Slightly more could be added but remains concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and the presence of an output schema, the description covers the core purpose and a key feature (requires_approval). It omits details about action_params and return values, but those are likely covered by the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is 100%, so baseline is 3. The description adds minimal parameter information beyond the schema, only mentioning requires_approval. The schema already provides detailed descriptions for all parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool creates a standing instruction that auto-triggers actions on matching packages. It uses specific verbs and resources ('Create a standing instruction') and distinguishes itself from sibling tools by focusing on automated rules rather than manual actions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions when to use requires_approval but does not explicitly provide guidance on when to use this tool versus alternatives like request_action. Usage is implied but not fully clarified.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_test_outbound_mail: A

Create a sandbox outbound mail record without uploading a real document. The record is always test_mode=true, cost_cents=0, includes estimated_live_cost_cents and cost_breakdown, and queues a mail.submitted webhook. Use with a sandbox key to rehearse outbound workflows before sending real physical mail.
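
For orientation, a sandbox call might be wrapped in a standard MCP `tools/call` JSON-RPC request as sketched below. The helper `build_tool_call` is hypothetical, and the value types used for `page_count` (integer) and `color` (boolean) are assumptions, since the listing gives descriptions but not types.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP tools/call JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Every parameter of this tool is optional, so only a few are set here.
request = build_tool_call(
    "create_test_outbound_mail",
    {
        "mail_class": "first_class",
        "page_count": 3,        # assumed to be an integer
        "color": False,         # assumed to be a boolean
        "recipient_zip": "94105",
        "metadata": {"run": "rehearsal-1"},
    },
)
print(json.dumps(request, indent=2))
```

Sending this with a sandbox key should produce a record with test_mode=true and cost_cents=0, per the description above; the transport and authentication details are outside this sketch.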

Parameters
Name | Required | Description | Default
color | No | Whether to include color-print surcharge in the live estimate. |
metadata | No | Arbitrary metadata echoed in responses and webhooks. |
mail_class | No | Mail class to simulate. | first_class
page_count | No | Simulated page count used for pricing. |
agent_notes | No | Optional facility/operator notes for the simulated mailpiece. |
recipient_zip | No | Recipient ZIP code. Affects estimated live postage. | 94105
recipient_city | No | Recipient city. | San Francisco
recipient_name | No | Recipient name for the simulated mailpiece. | Test Recipient
recipient_line1 | No | Recipient street line 1. | 123 Test Street
recipient_state | No | Recipient 2-letter state code. | CA

Output Schema

Name | Required | Description
result | Yes | Created sandbox outbound mail job and webhook status.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Details that records are always test_mode=true, cost_cents=0, include live estimate, and trigger webhooks. Annotations confirm non-destructive, non-readOnly. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the core function, the second lists the record's fixed properties, and the third gives usage guidance. Front-loaded and efficient with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present and description mentioning key outputs (estimated cost, webhook), the tool is fully contextualized for testing purposes.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline 3. Description adds overall context but no parameter-specific details beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a sandbox outbound mail record without uploading a real document, distinguishing it from real mail sending tools like 'send_outbound_mail'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using with a sandbox key to rehearse outbound workflows before real mail, providing clear context. Does not list alternatives but sibling tools imply 'send_outbound_mail' for real use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_facility_messages: A
Read-only, Idempotent

Read the message thread with a specific facility. Returns messages in reverse chronological order with sender role (member, facility, agent). Supports cursor-based pagination. Automatically marks facility messages as read.
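
The cursor-based pagination can be sketched as a loop that feeds the oldest timestamp of each page back in as the next `before` value. The `fetch_page` callable, the `sent_at` field name, and the in-memory fake used for demonstration are all assumptions; only `before` and `limit` come from the tool's parameters.

```python
def read_all_messages(fetch_page, limit=50):
    """Collect a whole thread, newest first, one page at a time."""
    messages = []
    before = None  # no cursor on the first request
    while True:
        page = fetch_page(before=before, limit=limit)
        if not page:
            break
        messages.extend(page)
        if len(page) < limit:
            break  # a short page means nothing older remains
        # Pages are newest first, so the last item is the oldest one.
        before = page[-1]["sent_at"]  # "sent_at" is an assumed field name
    return messages


def make_fake_fetcher(all_messages):
    """In-memory stand-in for the tool, for demonstration only."""
    ordered = sorted(all_messages, key=lambda m: m["sent_at"], reverse=True)

    def fetch_page(before=None, limit=50):
        eligible = [m for m in ordered
                    if before is None or m["sent_at"] < before]
        return eligible[:limit]

    return fetch_page
```

Two caveats apply to this sketch: messages sharing an identical timestamp at a page boundary could be skipped by a strict "before" comparison, and every call marks facility messages as read, so paging through a thread is not free of side effects.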

Parameters
Name | Required | Description | Default
limit | No | Maximum number of messages to return (1-100). Defaults to 50. |
before | No | Cursor: only return messages sent before this ISO 8601 timestamp. Use the oldest message timestamp from the previous page. |
facility_id | Yes | UUID of the facility whose conversation to read. |

Output Schema

Name | Required | Description
result | Yes | Messages exchanged with a facility.

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, implying no state mutation, but the description states 'Automatically marks facility messages as read,' which is a mutation. This is a direct contradiction. Without this issue, the description would add useful behavior details, but the contradiction severely undermines transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each informative, with no redundancy. Front-loaded with purpose, then details on order, roles, pagination, and side effect. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and clear parameter descriptions, the description covers key aspects: purpose, pagination, ordering, sender roles, and a side effect. The mark-as-read behavior could use more detail (e.g., reversibility), but the output schema likely fills gaps. Slightly above adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions. The description adds context about pagination (cursor-based) but does not significantly expand on parameter meaning beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states a specific verb ('Read') and resource ('message thread with a specific facility'). It distinguishes from sibling tools like 'send_facility_message' (write) and 'list_facility_conversations' (list, not messages). Explicitly details return order and sender role information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use (read a facility's message thread) and provides usage details like cursor-based pagination and automatic marking as read. It does not explicitly exclude alternatives or state when not to use, but the sibling context clarifies the landscape.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_inbound_mail: A
Read-only, Idempotent

Get one forwarded inbound mail item with compact draft_context by default. Use this before drafting an outbound reply when you need sender context, reply contact candidates, deadline clues, source files, and thread linkage in one stable payload.

Parameters
Name | Required | Description | Default
include | No | Optional expansions. Defaults to ["drafting"]. Add signed_urls only when the agent truly needs temporary file access. |
signed_urls | No | If true, return short-lived signed URLs for stored files. |
inbound_mail_id | Yes | UUID of the inbound mail item to retrieve. |

Output Schema

Name | Required | Description
result | Yes | One forwarded inbound mail item.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint as safe. The description adds behavioral nuance: it returns a 'compact draft_context by default' and describes the payload as 'stable' (reinforcing idempotency). This adds value beyond annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, each earning its place. The first sentence states the core function and default behavior. The second provides actionable usage context. No wasted words; critical information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (optional expansions, default behavior) and the presence of an output schema (which documents return structure), the description covers all essential aspects: purpose, when to use, parameter guidance, and behavioral expectations. It is fully adequate for an agent to select and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all three parameters. The description adds extra guidance: 'Add signed_urls only when the agent truly needs temporary file access' and implies the default include is ['drafting']. This goes beyond what the schema provides, adding usage context for parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get one forwarded inbound mail item with compact draft_context by default.' This specifies the verb (get), resource (inbound mail item), and scope (one, forwarded). It effectively distinguishes from sibling tools like list_inbound_mail (for multiple) and get_outbound_mail (different direction).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly recommends using this tool 'before drafting an outbound reply when you need sender context, reply contact candidates, deadline clues, source files, and thread linkage in one stable payload.' It provides clear context-specific guidance and implies when not to use it (if those elements are unneeded). This is excellent usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_mailbox: A
Read-only, Idempotent

Get your agent's real mailing address beta endpoint when the account has explicit beta access: street address + mailbox number for approved accounts. For generally available inbound context, use list_inbound_forwarding_addresses instead; that returns a private intake alias for scans, PDFs, photos, provider notices, and notes from addresses the operator already uses.

Parameters

No parameters

Output Schema

Name | Required | Description
result | Yes | Mailbox address, facility, and status details.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the safety profile is clear. The description adds that it's a beta endpoint requiring explicit beta access and approved accounts, which is valuable behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose, then alternative. No wasted words. Highly concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has no parameters and an output schema exists. Description explains what is returned and the access condition, which is sufficient for a simple read tool with good annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so schema coverage is 100%. Description does not need to add param info; baseline of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the agent's real mailing address (street address + mailbox number) for approved accounts with explicit beta access. It distinguishes from sibling list_inbound_forwarding_addresses by specifying the alternative use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (beta access, real address) and when not to (generally available inbound context), directing to list_inbound_forwarding_addresses as the alternative. No ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_mailbox_md: A
Read-only, Idempotent

Get the renter's MAILBOX.md standing instructions for this agent. Returns the full instruction text, version number, content hash, and last update timestamp. Call this on startup and cache the version — you must pass it to send_outbound_mail and update_action for sync verification.
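
The "fetch on startup, cache the version, pass it along" pattern might look like the sketch below. The listing does not name the result fields or the parameter that carries the version, so `version`, `mailbox_md_version`, and the `MailboxMdCache` class are all assumptions for illustration.

```python
class MailboxMdCache:
    """Cache the MAILBOX.md version fetched at startup (illustrative)."""

    def __init__(self, fetch_mailbox_md):
        # fetch_mailbox_md is a hypothetical callable wrapping get_mailbox_md.
        self._fetch = fetch_mailbox_md
        self.version = None

    def refresh(self):
        """Fetch the instructions again and remember their version."""
        result = self._fetch()
        self.version = result["version"]  # assumed field name
        return self.version

    def with_version(self, arguments):
        """Return a copy of `arguments` carrying the cached version."""
        if self.version is None:
            self.refresh()
        # "mailbox_md_version" is an assumed parameter name.
        return {**arguments, "mailbox_md_version": self.version}
```

An agent would call `refresh()` again whenever a sync check fails, since a rejected version suggests the renter has edited the instructions.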

Parameters

No parameters

Output Schema

Name | Required | Description
result | Yes | Current MAILBOX.md standing instructions.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, and non-destructive. The description adds beyond that by specifying return fields and caching behavior, but does not mention any other behavioral traits like authorization needs or failure modes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: the first states the purpose, the second lists the return fields, and the third gives usage instructions. No redundant words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, return fields, and usage guidance. With no parameters, rich annotations, and an output schema, the description is complete enough for the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so schema coverage is 100%. The description adds no parameter details because none are needed. Baseline of 4 is appropriate for a parameterless tool.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get'), the resource ('renter's MAILBOX.md'), and lists specific return fields. It distinguishes from siblings like 'get_mailbox' by focusing on standing instructions in markdown format.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use ('on startup'), what to do with the result ('cache the version'), and why ('must pass it to send_outbound_mail and update_action for sync verification'). Provides clear context and alternatives implicitly.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
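
The startup-and-cache guidance in the description above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the server's API: `call_tool` stands in for whatever MCP client function invokes a tool by name, the tool name is supplied by the caller, and both the `version` result field and the `mailbox_md_version` argument name are hypothetical. Check the input schemas of send_outbound_mail and update_action for the real parameter name.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for an MCP client call: (tool name, arguments) -> result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


class InstructionCache:
    """Fetch the standing instructions once and reuse the cached version."""

    def __init__(self, call_tool: ToolCaller, tool_name: str) -> None:
        self._call_tool = call_tool
        self._tool_name = tool_name
        self._cached: Optional[Dict[str, Any]] = None

    def load(self, force: bool = False) -> Dict[str, Any]:
        # The tool takes no parameters, so the arguments dict is empty.
        if self._cached is None or force:
            self._cached = self._call_tool(self._tool_name, {})
        return self._cached

    def with_version(
        self,
        arguments: Dict[str, Any],
        version_key: str = "mailbox_md_version",  # assumed argument name
    ) -> Dict[str, Any]:
        # Copy the arguments and attach the cached version ("version" is an
        # assumed result field name).
        merged = dict(arguments)
        merged[version_key] = self.load()["version"]
        return merged
```

Calling `load(force=True)` refetches the instructions, which may be useful if a write call reports that the cached version is out of date.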

get_outbound_mail (A)
Read-only, Idempotent

Get full details of an outbound mail job including recipient address, mail class, page count, cost breakdown, current status, fulfillment photos, and a time-limited signed URL to download the original PDF.

Parameters
Name | Required | Description | Default
mail_id | Yes | UUID of the outbound mail job to retrieve. |

Output Schema

Name | Required | Description
result | Yes | Full outbound mail job details with signed document URL.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false; description adds valuable specifics about the response contents (e.g., time-limited signed URL, fulfillment photos), providing context beyond the safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence front-loaded with purpose and listing key details; no superfluous words. Efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool is simple (one required param), output schema exists (so return format not needed), and description covers all relevant aspects of the response and behavior. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description for mail_id; the tool description does not add extra meaning for the parameter but lists what the result includes, which indirectly informs the parameter's role. Baseline score applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description starts with 'Get full details of an outbound mail job,' clearly stating verb and resource, then enumerates specific attributes like recipient, cost, status, and photos, distinguishing it from get_inbound_mail or list_outbound_mail.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when a mail_id is available and full job details are needed, but it does not explicitly state when to use this tool instead of siblings like list_outbound_mail or get_inbound_mail, nor does it name scenarios where it should not be used.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_package (A)
Read-only, Idempotent

Get full package details including photos, tracking events, shipping label data (carrier, addresses, weight), forwarding status, storage location, and action history.

Parameters
Name | Required | Description | Default
package_id | Yes | UUID of the package to retrieve. |

Output Schema

Name | Required | Description
result | Yes | Package details with photos, events, and extracted label data.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds context on what data is retrieved but does not contradict annotations. However, it does not go beyond what annotations provide in terms of behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single clear sentence that efficiently lists the included data categories. It is appropriately sized but could be slightly more concise by grouping related items.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema and the tool's clear retrieval nature, the description adequately communicates what the tool returns. However, it does not mention that other tools exist for partial data (e.g., get_package_photos).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter package_id, which is well-described in the schema. The description adds no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'Get' and resource 'full package details' and lists the included data categories (photos, tracking events, etc.), clearly distinguishing it from siblings like list_packages and get_package_photos.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when comprehensive package details are needed, but does not explicitly state when not to use it or mention alternatives such as get_package_photos for photo-only retrieval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_package_photos (A)
Read-only, Idempotent

Get photos for a package with OCR-extracted text and confidence scores. Filter by photo type to get only exterior shots, label closeups, barcode scans, or content scans.

Parameters
Name | Required | Description | Default
package_id | Yes | UUID of the package to get photos for. |
photo_type | No | Filter by photo type. "exterior" = package exterior, "label" = shipping label closeup, "barcode" = barcode scan, "content_scan" = opened package contents. |

Output Schema

Name | Required | Description
result | Yes | Package photo records with OCR metadata.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the description's disclosure of OCR text and confidence scores adds some context but no additional behavioral traits. With annotations covering safety, a score of 3 is appropriate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with no wasted words. The main action is front-loaded, and the structure is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 params, output schema exists), the description covers the core functionality and filtering. It doesn't elaborate on return format or pagination, but the output schema handles that. Adequate for the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the description's mention of filtering by photo type adds minimal new meaning. The description slightly rephrases enum options but does not significantly enhance understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves photos for a package, including OCR-extracted text and confidence scores. It specifies filtering by photo type, making it distinct from sibling tools like get_package or get_scan_results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide guidance on when to use this tool versus alternatives (e.g., get_package, get_scan_results). No explicit when-not-to-use or differentiation from siblings is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
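
As an illustration of the optional photo_type filter for get_package_photos, the sketch below builds an arguments dictionary and rejects values outside the four photo types listed in the parameter table. The helper name `build_photo_args` is hypothetical, and validating on the client side is a design choice for this example rather than something the server is known to require.

```python
import uuid
from typing import Dict, Optional

# Photo type values as listed in the get_package_photos parameter table.
PHOTO_TYPES = ("exterior", "label", "barcode", "content_scan")


def build_photo_args(package_id: str, photo_type: Optional[str] = None) -> Dict[str, str]:
    """Build the arguments for a get_package_photos call (hypothetical helper)."""
    # uuid.UUID raises ValueError if the string is not a valid UUID.
    uuid.UUID(package_id)
    args = {"package_id": package_id}
    if photo_type is not None:
        if photo_type not in PHOTO_TYPES:
            raise ValueError(f"photo_type must be one of {PHOTO_TYPES}")
        args["photo_type"] = photo_type
    return args
```

Leaving `photo_type` unset omits the filter entirely, which matches its optional status in the schema.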

get_postal_thread (A)
Read-only, Idempotent

Get one physical-mail thread with optional timeline events. Use this to explain how a generated outbound mail piece relates back to prior inbound scans and review decisions.

Parameters
Name | Required | Description | Default
include | No | Optional expansions. Add events to include inbound/outbound timeline references. |
thread_id | Yes | UUID of the postal mail thread to retrieve. |

Output Schema

Name | Required | Description
result | Yes | One postal mail workflow thread.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, destructiveHint, and idempotentHint, so safety is clear. The description adds behavioral context by explaining that the tool can include optional timeline events and how the data relates inbound and outbound mail, beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Exactly two sentences, no wasted words. The first sentence defines the tool's core functionality, and the second provides usage context. Information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (2 parameters, output schema exists), the description fully covers what the tool does and when to use it. It is complete and leaves no ambiguity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds semantic meaning by mentioning 'timeline events' and the relationship between outbound and inbound mail, which enriches the parameter 'include' beyond its schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the verb 'Get' and the resource 'physical-mail thread', and explicitly distinguishes it from listing tools by stating 'one'. It also provides a specific use case, differentiating it from sibling tools like get_inbound_mail or get_outbound_mail.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'Use this to explain how a generated outbound mail piece relates back to prior inbound scans and review decisions.' It provides clear context, though it does not explicitly mention when not to use alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_scan_results (A)
Read-only, Idempotent

Get document scan results including raw OCR text, structured data fields (addresses, dates, amounts), and confidence scores. Returns empty if scan is still processing.

Parameters
Name | Required | Description | Default
package_id | Yes | UUID of the package to get scan results for. |

Output Schema

Name | Required | Description
result | Yes | Document scan records and OCR results.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds the critical behavior that an empty result indicates ongoing processing, which is valuable for agent decision-making beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first concisely states what the tool returns, and the second notes the empty response during processing. No superfluous words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, output schema exists), the description covers all essential aspects: what is returned, the processing state, and implicit polling use case. It is complete for a read-only retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already provides a description for the sole parameter (package_id), achieving 100% coverage. The description adds no additional parameter semantics, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get document scan results' and lists the specific data types included (raw OCR text, structured data fields, confidence scores). It distinguishes from siblings like request_scan (which initiates scans) and get_package (which retrieves package details).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage after requesting a scan by noting 'returns empty if scan is still processing', providing clear context for polling. However, it does not explicitly mention when not to use or suggest alternatives like request_scan for initiating.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
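
The polling use case implied by "Returns empty if scan is still processing" could look something like the sketch below. It assumes a hypothetical `call_tool` function that invokes a tool by name and returns its result payload, and it treats any falsy payload (empty list, empty dict, None) as "still processing". The exact shape of an empty result is not stated in the listing, so that check is an assumption, as are the retry count and delay.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for an MCP client call: (tool name, arguments) -> result payload.
ToolCaller = Callable[[str, Dict[str, Any]], Any]


def wait_for_scan(
    call_tool: ToolCaller,
    package_id: str,
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Poll get_scan_results until it returns a non-empty payload or attempts run out."""
    for attempt in range(attempts):
        result = call_tool("get_scan_results", {"package_id": package_id})
        if result:  # assumed: an empty payload means the scan is still processing
            return result
        if attempt < attempts - 1:
            sleep(delay_seconds)  # no sleep after the final attempt
    return None
```

Because the tool is annotated as read-only and idempotent, repeating the call while waiting should be safe. The `sleep` parameter is injectable so the loop can be tested without real delays.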

get_usage (A)
Read-only, Idempotent

Get usage summary and billing events for a time period. Returns itemized events (scans, forwards, mail sends) with costs, plus period totals. Defaults to the current billing period if no dates are specified.

Parameters
Name | Required | Description | Default
period_end | No | End of the reporting period in ISO 8601 format. Defaults to now. |
period_start | No | Start of the reporting period in ISO 8601 format. Defaults to current billing period start. |

Output Schema

Name | Required | Description
result | Yes | Usage and billing event records.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by detailing the returned data (itemized events, costs, totals) and default period behavior, providing useful context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, no waste: the main purpose comes first, then the returned data, then the default behavior. Highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (not shown but indicated), the description adequately covers what the tool does, what it returns, and defaults. No gaps for the intended use case.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for both parameters. The description adds the default behavior (current billing period) but doesn't add significant new semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves usage summary and billing events, specifying itemized events with costs and period totals. It uniquely addresses usage/billing, distinguishing it from sibling tools which focus on mail, packages, and other tasks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the usage context but gives no explicit guidance on when not to use the tool or which alternatives exist. However, the sibling tools all serve different purposes, so there is little risk of confusion. Clear enough for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
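
To make the ISO 8601 period parameters of get_usage concrete, here is a small sketch that builds the arguments from Python datetimes. The helper name `build_usage_args` is hypothetical. Omitting a bound leaves the server-side default in place (current billing period start, or now). Requiring timezone-aware datetimes and normalizing to UTC are choices made for this example, not documented requirements.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def _to_iso(moment: datetime) -> str:
    # Reject naive datetimes so the resulting timestamp is unambiguous.
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).isoformat()


def build_usage_args(
    period_start: Optional[datetime] = None,
    period_end: Optional[datetime] = None,
) -> Dict[str, str]:
    """Build get_usage arguments; omitted bounds fall back to server defaults."""
    if period_start is not None and period_end is not None and period_start > period_end:
        raise ValueError("period_start must not be after period_end")
    args: Dict[str, str] = {}
    if period_start is not None:
        args["period_start"] = _to_iso(period_start)
    if period_end is not None:
        args["period_end"] = _to_iso(period_end)
    return args
```

For example, passing `datetime(2024, 1, 1, tzinfo=timezone.utc)` as `period_start` yields the string `2024-01-01T00:00:00+00:00`.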

list_facility_conversations (A)
Read-only, Idempotent

List your active facility conversations with unread message counts and last message preview. Each conversation corresponds to one facility where you have a mailbox.

Parameters
Name | Required | Description | Default
limit | No | Maximum number of conversations to return (1-100). Defaults to 20. |
offset | No | Number of conversations to skip for pagination. Defaults to 0. |

Output Schema

Name | Required | Description
result | Yes | Facility conversations plus pagination.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds that it lists 'active' conversations and includes unread counts and last message preview, but does not elaborate on sorting or what 'active' means. Moderate additional context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the action and key outputs. Every word adds value; no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description need not detail return values. It covers the main purpose and content (unread counts, preview). However, it omits sorting or what 'active' entails, leaving minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with descriptions for both limit and offset. The description does not add any parameter-specific information beyond the schema defaults and limits, so it meets the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists active facility conversations, including unread counts and last message preview. It distinguishes from sibling tools like get_facility_messages (which retrieves messages within a conversation) and list_inbound_mail (individual mail items).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for viewing conversations across facilities but provides no explicit guidance on when to use this tool versus alternatives (e.g., get_facility_messages, list_inbound_mail). No exclusions or when-not-to-use context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
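
The limit/offset parameters used by list_facility_conversations (and the other list tools) suggest a paging loop like the one sketched below. Two things are assumptions: `call_tool` is a hypothetical function that invokes a tool by name and returns a dictionary, and `items_key` names the field holding the page of records, since the listing does not show the result field names. The loop stops at the first page shorter than `limit` rather than relying on any pagination metadata.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical stand-in for an MCP client call: (tool name, arguments) -> result dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_items(
    call_tool: ToolCaller,
    tool_name: str,
    items_key: str,  # assumed name of the list field in the result
    limit: int = 20,
    base_args: Optional[Dict[str, Any]] = None,
) -> Iterator[Any]:
    """Yield every record from a limit/offset paginated list tool."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        args = dict(base_args or {})
        args.update(limit=limit, offset=offset)
        page = call_tool(tool_name, args).get(items_key, [])
        yield from page
        if len(page) < limit:  # a short (or empty) page means there is nothing left
            return
        offset += limit
```

Extra filters, such as a status value for list_outbound_mail, can be passed through `base_args` and are sent unchanged with every page request.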

list_inbound_forwarding_addresses (A)
Read-only, Idempotent

List the renter’s private inbound forwarding aliases on forward.mailbox.bot. These are the unique intake email addresses an operator, assistant, provider, or external agent can forward scans, PDFs, photos, provider notices, notes, and other context-aware documents to so mailbox.bot can build OCR-backed inbound context. Forwarding/emailing attachments here initiates OCR/extraction; this tool discovers the address and does not upload files directly into OCR. The alias is member-scoped, so live and sandbox agent keys for the same member resolve to the same intake address.

Parameters

No parameters

Output Schema

Name | Required | Description
result | Yes | Private inbound forwarding email aliases.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, destructiveHint. The description adds valuable context: the tool discovers addresses, does not initiate OCR, is member-scoped, and resolves to the same address across live/sandbox keys. This goes well beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative without being overly verbose. Each sentence adds meaning: purpose, use case, key differentiator, and scoping behavior. It is front-loaded and efficient, though slightly longer than necessary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and an existing output schema, the description covers all necessary aspects: purpose, use case, behavioral nuance, and member scoping. It is fully complete for an agent to decide when and how to use this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so schema coverage is 100%. The description does not need to add parameter info; its value is in explaining the tool's functionality. Baseline of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists inbound forwarding aliases, defines their purpose (intake addresses for OCR), and distinguishes it from uploading files directly. The verb 'list' combined with specific resource 'inbound forwarding aliases' makes the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the tool's use case (discovering addresses for forwarding documents) and explicitly says it does not upload files, differentiating it from other tools. However, it does not list when not to use it or name specific alternatives, though the context is clear given the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_inbound_mail (A)
Read-only, Idempotent

List forwarded inbound mail items captured from private forwarding aliases. Default output includes compact draft_context so an LLM or external agent can reason about OCR context, reply contact candidates, deadlines, and thread linkage before generating outbound mail.

Parameters
Name | Required | Description | Default
limit | No | Maximum number of inbound items to return (1-100). |
offset | No | Number of inbound items to skip for pagination. |
status | No | Optional inbound status filter. |
include | No | Optional expansions. Defaults to ["drafting"]. Add ocr/lineage only when deeper provenance is needed. |
category | No | Optional category filter such as "Needs review" or "Loan / Mortgage". |
thread_id | No | Only return inbound items linked to this postal mail thread. |

Output Schema

Name | Required | Description
result | Yes | Forwarded inbound mail items plus pagination.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds that the default output includes compact draft_context for LLM reasoning, which provides mild behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences: first defines the primary function, second explains the default output and its purpose. No wasted words, front-loaded with essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With a fully described input schema and existence of an output schema, the description is largely complete. However, it does not mention pagination behavior or error cases, and the context around the 'include' parameter's default could be clearer. Still quite good.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds minimal parameter meaning beyond the schema, only hinting at the default include behavior. No extra semantic value for individual parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (List) and the resource (forwarded inbound mail items captured from private forwarding aliases), distinguishing it from sibling tools like list_outbound_mail or get_inbound_mail.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide guidance on when to use this tool versus alternatives, nor does it mention scenarios to avoid. It only explains the default output purpose, lacking explicit usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_outbound_mail (grade A)
Read-only, Idempotent

List outbound mail jobs with status tracking. Returns mail ID, recipient, mail class, status, cost, and timestamps. Filter by status to see pending, in-transit, or delivered mail.

Parameters
Name | Required | Description | Default
limit | No | Maximum number of mail jobs to return (1-100). Defaults to 20. |
offset | No | Number of mail jobs to skip for pagination. Defaults to 0. |
status | No | Filter by mail status. "pending_approval" = awaiting human approval, "submitted" = queued for facility, "ready" = printed and ready to mail, "mailed" = in transit, "delivered" = confirmed delivery, "failed" = delivery failed, "cancelled" = cancelled before mailing. |
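
The limit/offset parameters above suggest ordinary offset pagination. A minimal sketch of paging through all jobs follows, assuming a hypothetical `call_tool(name, arguments)` function supplied by your MCP client that returns the list of job summaries for one page. The actual shape of `result` is not shown on this page, so treat that return type as an assumption.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical client hook: takes a tool name and arguments, returns one page of jobs.
CallTool = Callable[[str, Dict[str, Any]], List[Dict[str, Any]]]


def iter_outbound_mail(
    call_tool: CallTool,
    status: Optional[str] = None,
    page_size: int = 20,
) -> Iterator[Dict[str, Any]]:
    """Yield outbound mail jobs page by page using limit/offset."""
    if not 1 <= page_size <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        args: Dict[str, Any] = {"limit": page_size, "offset": offset}
        if status is not None:
            args["status"] = status
        page = call_tool("list_outbound_mail", args)
        yield from page
        if len(page) < page_size:  # a short page means there is nothing further to fetch
            break
        offset += page_size
```

Stopping on a short page costs one extra call when the total is an exact multiple of the page size, but it avoids relying on any pagination metadata that the output schema may or may not provide.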

Output Schema

Parameters
Name | Required | Description
result | Yes | Outbound mail job summaries.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true. The description adds context on return fields and filtering, consistent with a safe read operation. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the purpose, the second lists the return fields, and the third gives filtering guidance. No fluff; every word contributes value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (3 optional parameters, no required params, output schema exists), the description covers key aspects. Could mention pagination explicitly but offset/limit are in schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed parameter descriptions. Description mentions filtering by status but doesn't add new semantic detail beyond enum values. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists outbound mail jobs with status tracking, and specifies returned fields (mail ID, recipient, mail class, status, cost, timestamps). Differentiates from sibling tools like list_inbound_mail and get_outbound_mail.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides filtering guidance by status (pending, in-transit, delivered). Does not explicitly state when to use vs alternatives, but the context of listing versus getting individual items is implied by sibling names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_packages (grade A)
Read-only, Idempotent

List inbound mail or packages for approved real mailing address/package beta accounts with optional filters by status, carrier, and date. Returns tracking number, carrier, status, and received timestamp where available. For generally available inbound postal context, use list_inbound_mail with forwarded scans/PDFs/notes instead.

Parameters
Name | Required | Description | Default
limit | No | Maximum number of packages to return (1-100). Defaults to 20. |
since | No | Only return packages received after this ISO 8601 date-time. |
offset | No | Number of packages to skip for pagination. Defaults to 0. |
status | No | Filter by package lifecycle status. "received" = just arrived, "stored" = in facility storage, "forwarded" = shipped to forwarding address. |
carrier | No | Filter by shipping carrier. |
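
As an illustration of the `since` filter, the sketch below builds a list_packages argument object for "packages from the last N days". The helper name is hypothetical, and the carrier value is passed through unchecked because the accepted carrier values are not listed on this page.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Optional

# Status values as listed in the schema description above.
PACKAGE_STATUSES = {"received", "stored", "forwarded"}


def build_list_packages_args(
    now: datetime,
    days_back: int,
    status: Optional[str] = None,
    carrier: Optional[str] = None,
    limit: int = 20,
    offset: int = 0,
) -> Dict[str, Any]:
    """Build list_packages arguments for packages received in the last `days_back` days."""
    if now.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime so `since` is unambiguous")
    if status is not None and status not in PACKAGE_STATUSES:
        raise ValueError(f"unknown status: {status}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    args: Dict[str, Any] = {
        "since": (now - timedelta(days=days_back)).isoformat(),  # ISO 8601 date-time
        "limit": limit,
        "offset": offset,
    }
    if status is not None:
        args["status"] = status
    if carrier is not None:
        args["carrier"] = carrier  # accepted values are an assumption
    return args
```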

Output Schema

Parameters
Name | Required | Description
result | Yes | Inbound package summaries.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds value by specifying return fields (tracking number, carrier, status, received timestamp) and the scope (approved accounts). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at three sentences. It front-loads the purpose and filters, then lists the return fields, and closes with the alternative tool. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists (not shown but noted), description covers return fields and filter options. It doesn't mention pagination explicitly, but the schema covers limit/offset. For a read-only list tool with good schema and annotations, it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description mentions optional filters by status, carrier, and date, but does not add detail beyond the schema. No additional parameter semantics provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists inbound mail/packages for specific accounts, with filters and return fields. It explicitly distinguishes from the sibling tool list_inbound_mail by noting the beta context and alternative features.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: use this for approved real mailing address/package beta accounts, and for generally available inbound postal context, use list_inbound_mail instead. This clearly indicates when and when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_postal_threads (grade A)
Read-only, Idempotent

List physical-mail threads that group inbound mail context, human review, and outbound sends. Use this to understand which inbound items and outbound documents belong to the same business workflow.

Parameters
Name | Required | Description | Default
limit | No | Maximum number of threads to return (1-100). |
offset | No | Number of threads to skip for pagination. |
status | No | Optional thread status filter. |
include | No | Optional expansions. Add events to include inbound/outbound timeline references. |
category | No | Optional category filter. |

Output Schema

Parameters
Name | Required | Description
result | Yes | Postal mail workflow threads plus pagination.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, so the agent knows it's a safe read-only operation. The description adds context about grouping but no additional behavioral traits (e.g., pagination, rate limits). Given annotations cover the safety profile, a score of 3 is appropriate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long with zero wasted words. The first sentence defines the tool's function, and the second provides usage guidance. It is front-loaded and concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, an output schema, and 28 sibling tools, the description adequately explains the core concept of grouping threads, which is essential for the agent. It does not cover edge cases or specific filter behaviors, but it is sufficient for a list operation with rich annotations and schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with all parameters having descriptions in the input schema. The tool description does not add extra meaning beyond what the schema already provides, so it meets the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'list' and resource 'physical-mail threads', and explains that these threads group inbound mail context, human review, and outbound sends. This differentiates it from siblings like list_inbound_mail and list_outbound_mail, which list individual items.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use this to understand which inbound items and outbound documents belong to the same business workflow', providing clear context for when to use the tool. However, it does not explicitly state when not to use it or mention alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

propose_mailbox_md_edit (grade A)

Propose changes to the renter's MAILBOX.md instructions with reasoning. The renter will see your suggestion in their dashboard and can accept, reject, or modify it. Use this when you observe patterns that could be codified into standing instructions.

Parameters
Name | Required | Description | Default
reason | Yes | Why this change is suggested (e.g. "Observed 5 Amazon packages this week, all forwarded manually — adding auto-forward rule"). |
suggested_content | Yes | Full proposed MAILBOX.md content (max 10,000 chars). Must include the complete document, not just the diff. |
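
Because suggested_content must carry the complete document rather than a diff, a caller would typically start from the current MAILBOX.md text and append or edit before submitting. The sketch below shows one way to do that while enforcing the 10,000-character limit; the helper name and the "append a section" strategy are illustrative assumptions, not part of the server.

```python
from typing import Dict

MAX_MAILBOX_MD_CHARS = 10_000  # limit stated in the suggested_content description


def build_mailbox_md_proposal(current_md: str, new_section: str, reason: str) -> Dict[str, str]:
    """Return propose_mailbox_md_edit arguments containing the full updated document."""
    if not reason.strip():
        raise ValueError("reason is required")
    # Send the whole document: existing text plus the new section, not only the change.
    suggested = current_md.rstrip("\n") + "\n\n" + new_section.strip() + "\n"
    if len(suggested) > MAX_MAILBOX_MD_CHARS:
        raise ValueError(
            f"suggested_content is {len(suggested)} chars; max is {MAX_MAILBOX_MD_CHARS}"
        )
    return {"reason": reason.strip(), "suggested_content": suggested}
```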

Output Schema

Parameters
Name | Required | Description
result | Yes | Created MAILBOX.md suggestion record.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that the suggestion is displayed to the renter for approval, and explains the interactive nature. With annotations providing no behavioral hints (all false), the description effectively communicates the non-destructive, proposal-based workflow.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the core action and outcome. No wasted words. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists (not shown), the description adequately covers the tool's purpose, usage, and behavior. The tool is simple with two parameters, and the description is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for both parameters. The description adds a concrete example for 'reason' but does not significantly extend beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Propose changes' on the specific resource 'MAILBOX.md instructions', and distinguishes from siblings by emphasizing that the renter must accept/reject/modify the proposal.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage context: 'Use this when you observe patterns that could be codified into standing instructions'. It does not explicitly list alternatives, but the context is clear and helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_expected (grade A)

Pre-register an expected inbound shipment so it is auto-matched when it arrives at the facility. Optionally specify an action to auto-execute on arrival (e.g. forward immediately, scan on receipt).

Parameters
Name | Required | Description | Default
carrier | No | Shipping carrier (e.g. "fedex", "ups", "usps"). |
auto_action | No | Action to auto-execute when the package arrives. |
description | No | Human-readable description of the shipment (e.g. "Replacement laptop from Dell"). |
expected_by | No | Expected arrival date in ISO 8601 format. Used for alerts if the package is late. |
tracking_number | No | Carrier tracking number for the expected shipment. |
auto_action_params | No | Parameters for the auto-action (e.g. forwarding address). |
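
All six parameters are optional, so a small builder that drops unset fields keeps the request minimal. This sketch also rejects auto_action_params without an auto_action, which is an assumption about how the two parameters interact rather than documented server behavior; the helper name is hypothetical, and valid auto_action values are not listed on this page.

```python
from datetime import date
from typing import Any, Dict, Optional


def build_register_expected_args(
    tracking_number: Optional[str] = None,
    carrier: Optional[str] = None,
    description: Optional[str] = None,
    expected_by: Optional[date] = None,
    auto_action: Optional[str] = None,
    auto_action_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build register_expected arguments, omitting anything left unset."""
    if auto_action_params is not None and auto_action is None:
        # Assumed rule: parameters only make sense alongside an action.
        raise ValueError("auto_action_params requires auto_action")
    candidates: Dict[str, Any] = {
        "tracking_number": tracking_number,
        "carrier": carrier,
        "description": description,
        "expected_by": expected_by.isoformat() if expected_by else None,  # ISO 8601 date
        "auto_action": auto_action,
        "auto_action_params": auto_action_params,
    }
    return {key: value for key, value in candidates.items() if value is not None}
```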

Output Schema

Parameters
Name | Required | Description
result | Yes | Created expected shipment record.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide basic safety hints (not read-only, not destructive). The description adds behavioral context: auto-matching on arrival and optional auto-execution, though it could detail side effects like confirmation or updates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with no redundant information. The first sentence states the primary purpose, the second adds optional detail. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 6 parameters (0 required) and an output schema, the description covers the core functionality. It does not mention that all parameters are optional, which could help agents, but it is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents each parameter. The description only adds a brief example ('e.g. forward immediately, scan on receipt') for auto_action, not significantly enhancing meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Pre-register' and resource 'expected inbound shipment', with specific outcome 'auto-matched when it arrives'. This distinguishes it from sibling tools like list_packages or create_rule.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies when to use (before arrival) and what optional actions can be set, but does not explicitly mention when not to use or alternatives among siblings (e.g., using rules for auto-actions).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request_action (grade A)
Destructive

Request a physical action on a package at the facility. Actions include forwarding to another address, shredding, scanning documents, holding for pickup, disposing, returning to sender, photographing, opening and scanning contents, or recording a video. Some actions (shred, dispose) are irreversible.

Parameters
Name | Required | Description | Default
action | Yes | Action to perform. "forward" = ship to another address, "shred" = destroy (irreversible), "scan" = OCR document scan, "hold" = keep in storage, "dispose" = discard (irreversible), "return_to_sender" = send back, "photograph" = take photos, "open_and_scan" = open package and scan contents, "record_video" = video recording of package. |
priority | No | Processing priority. "urgent" = same-day processing, "high" = next business day, "normal" = standard queue, "low" = when convenient. | normal
package_id | Yes | UUID of the package to act on. |
parameters | No | Action-specific parameters. For "forward": { address, city, state, zip }. For "scan": { scan_type }. For "hold": { until_date }. |
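
Since "shred" and "dispose" are irreversible, an agent-side guard that demands explicit confirmation before building such a request is a cheap safety net. The sketch below is one possible guard; the `allow_irreversible` flag and the helper name are hypothetical and exist only on the client side.

```python
from typing import Any, Dict, Optional

# Values copied from the action and priority descriptions above.
ACTIONS = {
    "forward", "shred", "scan", "hold", "dispose",
    "return_to_sender", "photograph", "open_and_scan", "record_video",
}
IRREVERSIBLE_ACTIONS = {"shred", "dispose"}
PRIORITIES = {"urgent", "high", "normal", "low"}


def build_request_action_args(
    package_id: str,
    action: str,
    parameters: Optional[Dict[str, Any]] = None,
    priority: str = "normal",
    allow_irreversible: bool = False,
) -> Dict[str, Any]:
    """Build request_action arguments, refusing irreversible actions unless confirmed."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority}")
    if action in IRREVERSIBLE_ACTIONS and not allow_irreversible:
        raise PermissionError(f"'{action}' is irreversible; confirm before requesting it")
    args: Dict[str, Any] = {"package_id": package_id, "action": action, "priority": priority}
    if parameters:
        args["parameters"] = dict(parameters)
    return args
```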

Output Schema

Parameters
Name | Required | Description
result | Yes | Created facility action request record.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description supplements annotations by noting that 'shred' and 'dispose' are irreversible, aligning with destructiveHint=true. It does not elaborate on other behavioral traits (e.g., authorization, rate limits), but annotations already cover key safety hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences cover all essential information: the tool's purpose, the available actions, and a critical behavioral note (irreversibility). Every sentence earns its place without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (not shown), the description appropriately omits return value details. It covers key aspects: action types, irreversibility, and parameter structure. Minor gap: no mention of prerequisites (e.g., package ownership) but acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with each parameter already well-described (action enum with explanations, priority levels, package_id format, parameters object with examples). The tool description restates action types but adds minimal new meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: requesting a physical action on a package at a facility, listing nine specific actions. It distinguishes itself from sibling tools like 'request_scan' (which is specific to scanning) and other read/mutation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists available actions but gives no guidance on when to use this tool versus alternatives (e.g., request_scan for scan-only requests), nor on how to choose among the actions themselves. No explicit when-to-use or when-not-to-use context is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

request_scan (grade A)

Request document scanning (OCR + structured data extraction) for a package. The facility will scan the document and extract text, addresses, dates, and other structured data. Results are available via get_scan_results after processing.

Parameters
Name | Required | Description | Default
scan_type | No | Type of scan. "label" = shipping label only, "envelope" = exterior envelope, "document" = full document OCR, "content" = opened package contents. | document
package_id | Yes | UUID of the package to scan.
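
Scanning is asynchronous: request_scan creates the request, and get_scan_results returns the outcome after processing. A rough sketch of that request-then-poll flow is below. It assumes a hypothetical `call_tool(name, arguments)` client function and assumes get_scan_results accepts a package_id argument; because the result format is not shown here, the caller supplies its own `is_ready` check.

```python
import time
from typing import Any, Callable, Dict, Optional

# Scan types as listed in the scan_type description above.
SCAN_TYPES = {"label", "envelope", "document", "content"}


def scan_and_wait(
    call_tool: Callable[[str, Dict[str, Any]], Any],  # hypothetical MCP client hook
    is_ready: Callable[[Any], bool],                  # caller decides what "done" looks like
    package_id: str,
    scan_type: str = "document",
    attempts: int = 5,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Request a scan, then poll get_scan_results until is_ready() or attempts run out."""
    if scan_type not in SCAN_TYPES:
        raise ValueError(f"unknown scan_type: {scan_type}")
    call_tool("request_scan", {"package_id": package_id, "scan_type": scan_type})
    for attempt in range(attempts):
        # Assumption: get_scan_results is keyed by package_id.
        results = call_tool("get_scan_results", {"package_id": package_id})
        if is_ready(results):
            return results
        if attempt < attempts - 1:
            sleep(delay_s)
    return None  # still processing after the final attempt
```

Physical scanning at a facility may take far longer than a few seconds, so in practice the delay and attempt count would likely be much larger, or the poll would be scheduled as a later task.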

Output Schema

Parameters
Name | Required | Description
result | Yes | Created scan request record.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses asynchronous behavior ('Results are available via get_scan_results after processing'), adding value beyond annotations which only indicate non-read-only and non-destructive. It could mention idempotency or timing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three focused sentences: the first states the action, the second explains what the facility extracts, and the third names the next step (get_scan_results). No unnecessary words; every sentence is valuable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists, the description appropriately omits return details. It covers the tool's purpose, process (async), and result retrieval, making it complete for a two-parameter request tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the description adds meaning by explaining what the scan extracts (text, addresses, dates). This goes beyond the schema's type definitions, justifying a score above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool requests document scanning with OCR and structured data extraction. It distinguishes itself from sibling tools like get_scan_results, which retrieves results, and other package-related tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage before get_scan_results, providing a clear workflow. However, it does not explicitly state when not to use this tool or compare to alternatives beyond the result retrieval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_facility_message (grade A)

Send a message to the facility operator managing your mailbox. Messages appear in the shared conversation visible to you, the renter, and the facility. Optionally link the message to a specific package or action request for context.

Parameters
Name | Required | Description | Default
body | Yes | Message text (1-5000 characters). |
package_id | No | Optional: link this message to a specific package for context. |
facility_id | Yes | The facility to message. Get this from the get_mailbox response. |
action_request_id | No | Optional: link this message to an action request for context. |

Output Schema

Parameters
Name | Required | Description
result | Yes | Sent facility message identifiers and body.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate a write operation (readOnlyHint=false) and non-destructive (destructiveHint=false). The description adds that messages appear in a shared conversation visible to renter and facility, and can be linked to packages/actions. This provides useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, no wasted words. The first states the core action, the second explains who can see the message, and the third adds optional linking context. Well-structured and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main purpose, optional parameters, and visibility. It does not mention character limits or error cases, but the schema handles character limit and output schema exists for return values. Adequate for a messaging tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all parameters. The description adds context about optional linking and visibility, but the schema already explains the parameters well. The added value is marginal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it sends a message to the facility operator for the mailbox, specifies the audience and visibility, and mentions optional linking to packages/action requests. This distinguishes it from sibling tools like get_facility_messages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for communicating with the facility operator about packages or actions, but does not explicitly state when to use this tool versus alternatives like add_note or when not to use it. No exclusions or comparisons provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_outbound_mail (grade A)

Submit a document for printing and postal mailing by the facility. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. The document is stored securely and printed by the facility operator. IMPORTANT: With a production key (sk_agent_), this immediately charges the member's card on file. Use dry_run=true to preview cost before committing, or requires_approval=true to defer until human approval. Sandbox keys (sk_agent_test_) skip billing entirely. Optionally attach the outbound mail to inbound context with inbound_capture_id and postal_mail_thread_id so lineage stays explicit.

ParametersJSON Schema
NameRequiredDescriptionDefault
colorNoPrint in color. Adds a per-page color surcharge.
duplexNoPrint double-sided to reduce page count and postage.
dry_run | No | Validate inputs and return cost breakdown without creating a record or charging. Use to preview cost before committing. |
metadata | No | Arbitrary key-value pairs echoed in GET responses and webhooks. Recommended convention: { "workflow_id": "wf_123", "reason": "Customer cancellation", "correlation_id": "abc" }. |
mail_class | No | USPS mail class. "first_class" = 3-5 days, "priority" = 1-3 days, "certified" = with tracking and proof of mailing, "certified_return_receipt" = certified with signed delivery confirmation. | first_class
package_id | No | Link this mail to an inbound package (e.g. replying to received correspondence). |
page_count | No | Explicit page count for non-PDF documents when exact pagination is known. When supplied for DOCX, TXT, or CSV, it overrides local detection and makes pricing deterministic. |
return_zip | No | Return address ZIP code. Defaults to member profile if omitted. |
agent_notes | No | Instructions for the facility operator (e.g. "Time-sensitive: mail today"). |
return_city | No | Return address city. Defaults to member profile if omitted. |
return_name | No | Return address name. Defaults to the member's profile name if omitted. |
return_line1 | No | Return address line 1. Defaults to member profile if omitted. |
return_line2 | No | Return address line 2 (suite, unit, etc.). |
return_state | No | Return address state (2-letter code). Defaults to member profile if omitted. |
recipient_zip | Yes | 5 or 5+4 digit ZIP code (e.g. "90210" or "90210-1234"). |
max_cost_cents | No | Cost cap in cents. If the calculated cost exceeds this, the request is rejected with 422 before any charge. Prevents accidental expensive mailings. |
recipient_city | Yes | Recipient city. |
recipient_name | Yes | Full name of the mail recipient. |
document_base64 | Yes | Base64-encoded document file. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. Max 10MB decoded. |
recipient_line1 | Yes | Street address line 1 of the recipient. |
recipient_line2 | No | Street address line 2 (apartment, suite, unit, etc.). |
recipient_state | Yes | 2-letter US state code (e.g. CA, NY, TX). |
document_filename | No | Original filename with extension (e.g. "letter.docx"). Required for reliable non-PDF format detection. |
recipient_country | No | ISO 3166-1 alpha-2 country code. Defaults to "US". | US
requires_approval | No | If true, the renter must approve in their dashboard before the mail is printed and sent. |
inbound_capture_id | No | Optional inbound mail item this outbound piece is replying to. Recommended when drafting from OCR/forwarded-mail context. |
mailbox_md_version | Yes | Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification. |
postal_mail_thread_id | No | Optional physical-mail thread to attach this outbound mail to. Lets agents keep inbound and outbound activity in one durable workflow. |
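
The required fields and the dry_run and max_cost_cents safeguards in the table above can be combined into a single argument payload. The following is a minimal sketch, not the server's own client code: the helper name `build_mail_arguments`, the shape of the `recipient` dict, the string type of `mailbox_md_version`, and the reading of "10MB" as 10,000,000 bytes are all assumptions made for illustration. Only the parameter names come from the table.

```python
import base64
import re
from typing import Any, Dict, Optional

# "5 or 5+4 digit ZIP code" per the parameter table.
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")
# Assumption: "Max 10MB decoded" is read conservatively as 10,000,000 bytes.
MAX_DECODED_BYTES = 10_000_000


def build_mail_arguments(
    document: bytes,
    filename: str,
    recipient: Dict[str, str],
    mailbox_md_version: str,  # assumed to be a string; obtain it from get_mailbox_md
    max_cost_cents: Optional[int] = None,
    dry_run: bool = True,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble tool arguments and run cheap local checks."""
    if len(document) > MAX_DECODED_BYTES:
        raise ValueError("document exceeds the 10MB decoded limit")
    if not ZIP_RE.match(recipient["zip"]):
        raise ValueError("recipient_zip must be 5 or 5+4 digits")
    if not re.fullmatch(r"[A-Z]{2}", recipient["state"]):
        raise ValueError("recipient_state must be a 2-letter code")

    args: Dict[str, Any] = {
        "recipient_name": recipient["name"],
        "recipient_line1": recipient["line1"],
        "recipient_city": recipient["city"],
        "recipient_state": recipient["state"],
        "recipient_zip": recipient["zip"],
        "document_base64": base64.b64encode(document).decode("ascii"),
        "document_filename": filename,
        "mailbox_md_version": mailbox_md_version,
        # Default to a preview so nothing is created or charged by accident.
        "dry_run": dry_run,
    }
    if max_cost_cents is not None:
        args["max_cost_cents"] = max_cost_cents
    return args
```

A cautious agent might call the tool once with `dry_run=True` to read the cost breakdown, then repeat the call with `dry_run=False` and a `max_cost_cents` value slightly above the previewed cost.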

Output Schema

Parameters
Name | Required | Description
result | Yes | Submitted outbound mail job or dry-run cost preview.

Behavior 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false), the description discloses that with a production key the tool immediately charges the member's card, and that documents are stored securely and printed by facility operators. It also explains behavior for dry_run, requires_approval, and max_cost_cents. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, front-loaded with the main action, then format support, then important billing/approval details, and finally optional threading. Every sentence contributes essential information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (28 parameters, 7 required, nested objects, output schema exists), the description covers the key behavioral aspects: billing, format, approval, and threading. It does not detail the output, but the output schema covers that. This is adequate for an agent to understand when and how to use the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by explaining how key parameters (dry_run, requires_approval, inbound_capture_id, postal_mail_thread_id) are used in context, beyond what the schema provides. For example, it ties sandbox keys to skipped billing and dry_run to a cost preview.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Submit a document for printing and postal mailing by the facility' and lists supported formats. The title 'Send Outbound Mail' reinforces this. It distinguishes from sibling tools like create_test_outbound_mail by emphasizing production vs. sandbox key behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use sandbox keys (testing) vs. production keys (real charges), and how to use dry_run for preview or requires_approval for approval workflows. However, it does not explicitly differentiate from sibling tools like advance_test_outbound_mail, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_action A
Idempotent

Push notes, structured data, or a clarification response to an existing action request. Use this to add agent reasoning, attach extracted data, or respond when the facility asks for clarification. Requires mailbox_md_version to prove your MAILBOX.md instructions are in sync.

Parameters
Name | Required | Description | Default
action_id | Yes | The action request ID to update. |
agent_data | No | Structured data to attach (e.g. OCR results, extracted fields, classification labels). |
agent_notes | No | Free-text notes from the agent (e.g. "Forwarding per standing rule #3"). |
decision_context | No | Link this decision to a specific MAILBOX.md instruction for auditability. |
mailbox_md_version | Yes | Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification. |
respond_to_clarification | No | Response text when action status is needs_clarification. Providing this auto-resumes the action to in_progress. |
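
The interaction between respond_to_clarification and the action status can be sketched as a small argument builder. This is an illustrative sketch only: the helper name, the `current_status` argument, and the client-side guard are assumptions, and whether the server itself rejects a clarification response in other states is not stated on this page.

```python
from typing import Any, Dict, Optional


def build_update_action_arguments(
    action_id: str,
    mailbox_md_version: str,  # assumed to be a string; obtain it from get_mailbox_md
    current_status: str,      # hypothetical: the status the agent last observed
    clarification_response: Optional[str] = None,
    agent_notes: Optional[str] = None,
    agent_data: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble update_action arguments."""
    # Client-side guard (an assumption): only answer when a question is pending.
    if clarification_response is not None and current_status != "needs_clarification":
        raise ValueError(
            "respond_to_clarification only applies when status is needs_clarification"
        )

    args: Dict[str, Any] = {
        "action_id": action_id,
        "mailbox_md_version": mailbox_md_version,
    }
    # Optional fields are omitted entirely when unset rather than sent as null.
    if clarification_response is not None:
        args["respond_to_clarification"] = clarification_response
    if agent_notes is not None:
        args["agent_notes"] = agent_notes
    if agent_data is not None:
        args["agent_data"] = agent_data
    return args
```

Because the tool is marked idempotent, resending the same arguments after a timeout should be safe, though the page does not spell out the exact retry semantics.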

Output Schema

Parameters
Name | Required | Description
result | Yes | Updated facility action request record.

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and non-destructive. The description adds valuable behavioral context: the need for mailbox_md_version to prove sync, and the auto-resume behavior when respond_to_clarification is provided. This goes beyond the annotations without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences: the core purpose comes first, followed by the usage scenarios and then the critical mailbox_md_version prerequisite. Every sentence provides necessary information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, nested objects, output schema), the description covers the essential context: purpose, usage scenarios, and a key requirement. Return values are not described, but an output schema exists to fill that gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all 6 parameters. The description adds extra meaning by explaining how parameters are used (e.g., decision_context links to MAILBOX.md sections, respond_to_clarification auto-resumes the action). This adds value beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: pushing notes, structured data, or clarification responses to an existing action request. It also provides specific use cases (add agent reasoning, attach extracted data, respond to clarification), distinguishing it from siblings like request_action (which creates actions) and add_note (which adds notes to packages).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (to add reasoning, attach data, or respond to clarification) and includes a critical requirement (mailbox_md_version for sync verification). While it does not list alternative tools or say when not to use it, the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_webhook A
Idempotent

Configure webhook endpoint URL and event subscriptions for real-time notifications. Events include package.received, package.status_changed, action.completed, mail.status_changed, and more. The endpoint must use HTTPS and respond with 2xx within 10 seconds.

Parameters
Name | Required | Description | Default
enabled | No | Set to false to pause webhook delivery without removing the URL. |
event_types | No | Array of event types to subscribe to (e.g. ["package.received", "mail.status_changed"]). Empty array disables all events. |
webhook_url | No | HTTPS URL to receive webhook POST requests. Must respond with 2xx within 10 seconds. |
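
Since all three parameters are optional and an empty event_types array has a distinct meaning, it helps to separate "not provided" from "empty". The sketch below is an assumption-laden illustration: the helper name is hypothetical, and the local HTTPS check only mirrors the stated requirement rather than reproducing whatever validation the server performs.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlparse


def build_update_webhook_arguments(
    webhook_url: Optional[str] = None,
    event_types: Optional[List[str]] = None,
    enabled: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble update_webhook arguments."""
    args: Dict[str, Any] = {}
    if webhook_url is not None:
        parsed = urlparse(webhook_url)
        # The page states the endpoint must use HTTPS; fail early on anything else.
        if parsed.scheme != "https" or not parsed.netloc:
            raise ValueError("webhook_url must be an HTTPS URL")
        args["webhook_url"] = webhook_url
    # None means "leave subscriptions alone"; [] is sent as-is and,
    # per the table, disables all events.
    if event_types is not None:
        args["event_types"] = list(event_types)
    # enabled=False pauses delivery without removing the URL.
    if enabled is not None:
        args["enabled"] = enabled
    return args
```

For example, `build_update_webhook_arguments(enabled=False)` produces a payload that pauses delivery while leaving the URL and subscriptions untouched.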

Output Schema

Parameters
Name | Required | Description
result | Yes | Webhook configuration status.

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide idempotentHint and non-destructive hints. The description adds the constraint that the endpoint must use HTTPS and respond within 10 seconds, which is valuable beyond annotations. However, it does not detail other behaviors like rate limits or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the purpose, followed by example events and the key endpoint constraints. Every sentence adds value with no wasted words, which is ideal conciseness for a tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With an output schema and full annotations, the description is mostly complete. It covers purpose, key constraints, and event examples. Minor gap: does not explain if the tool creates or updates, but the idempotent hint and name imply upsert.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description does not add significant meaning beyond the schema; it repeats event examples already implied by the event_types description. No additional parameter semantics are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool configures webhook endpoint URL and event subscriptions, with specific examples of events. It distinguishes from sibling tools by being the only webhook-related tool, and the verb 'configure' aligns with the tool name 'update_webhook'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives no guidance on when to use this tool versus alternatives, or when not to use it. Although there are no sibling webhook tools to confuse it with, the agent still lacks context on whether the call creates or updates the configuration, and no prerequisites or scenarios are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Resources