mailbox
Server Details
Physical mail API for AI agents. Send letters, certified mail. Sandbox + live keys via MCP.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.2/5 across 29 of 29 tools scored. Lowest: 3.6/5.
Most tools have clearly distinct purposes, though list_inbound_mail and list_packages could be confused without careful reading of descriptions. Overall, the set is well-differentiated.
All tool names follow a consistent verb_noun pattern with underscores, making the API predictable and easy to navigate.
With 29 tools, the server is heavy for a typical MCP server. The domain may justify many operations, but the count exceeds the recommended range for coherence.
The tool surface covers core workflows (create, read, list, request actions) but lacks updates and deletions for packages, rules, and tags, leaving notable gaps.
Available Tools
29 tools
add_note (quality grade: A)
Add an observation or context note to a package. Notes are visible to the facility operator and the renter. Use for recording decisions, observations, or agent reasoning.
| Name | Required | Description | Default |
|---|---|---|---|
| note | Yes | Note text (e.g. "Appears to be the replacement GPU from RMA #4521"). | |
| metadata | No | Optional structured metadata attached to the note (e.g. { "rma_number": "4521", "vendor": "NVIDIA" }). | |
| package_id | Yes | UUID of the package to annotate. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created package note record. |
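A minimal sketch of what an add_note call might look like on the wire, assuming the server is reached with a standard MCP `tools/call` JSON-RPC request. The helper names, the placeholder package UUID, and the empty-note check are illustrative assumptions, not part of the server's documented behavior.

```python
import json
import uuid
from typing import Optional


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def add_note_arguments(package_id: str, note: str,
                       metadata: Optional[dict] = None) -> dict:
    """Build add_note arguments; package_id and note are required."""
    uuid.UUID(package_id)  # raises ValueError if not a well-formed UUID
    if not note.strip():
        # Assumption: an empty note is not useful; the server may differ.
        raise ValueError("note must not be empty")
    args = {"package_id": package_id, "note": note}
    if metadata is not None:
        args["metadata"] = metadata  # optional structured metadata
    return args


# Placeholder package ID, for illustration only.
request = build_tool_call(
    "add_note",
    add_note_arguments(
        "00000000-0000-0000-0000-000000000000",
        "Appears to be the replacement GPU from RMA #4521",
        {"rma_number": "4521", "vendor": "NVIDIA"},
    ),
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would build this envelope for you; the sketch only shows which fields the table above maps to.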
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds visibility scope ('visible to facility operator and renter') beyond annotations, which are minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Fully covers the creation and usage of notes; output schema exists, so return details are unnecessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% so baseline 3; description doesn't add new parameter info beyond what's in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Add' and resource 'note to a package' differentiate from sibling 'add_tag'. Includes visibility context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States 'Use for recording decisions, observations, or agent reasoning' but lacks explicit when-not-to-use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_tag (quality grade: A; idempotent)
Add a tag/label to a package for categorization and filtering. Tags are free-form strings. Adding the same tag twice is a no-op.
| Name | Required | Description | Default |
|---|---|---|---|
| tag | Yes | Tag name (e.g. "hardware-order", "urgent", "return-needed"). Free-form, case-sensitive. | |
| package_id | Yes | UUID of the package to tag. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created or existing package tag record. |
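The duplicate-is-a-no-op and case-sensitive behavior described above can be modeled locally as a set per package. This is only an in-memory illustration of the semantics; `TagStore` and the package ID are hypothetical names, and the real server's storage is not documented here.

```python
from collections import defaultdict


class TagStore:
    """In-memory model of idempotent, case-sensitive package tags."""

    def __init__(self) -> None:
        self._tags = defaultdict(set)

    def add_tag(self, package_id: str, tag: str) -> bool:
        """Return True if the tag was newly added, False if it already existed."""
        before = len(self._tags[package_id])
        self._tags[package_id].add(tag)  # adding an existing tag changes nothing
        return len(self._tags[package_id]) > before

    def tags(self, package_id: str) -> list:
        return sorted(self._tags[package_id])


store = TagStore()
first = store.add_tag("pkg-1", "urgent")    # new tag
second = store.add_tag("pkg-1", "urgent")   # same tag again: no-op
third = store.add_tag("pkg-1", "Urgent")    # different case: a distinct tag
print(first, second, third, store.tags("pkg-1"))
```

Because tags are case-sensitive, an agent that wants consistent filtering would likely need to normalize tag casing itself before calling the tool.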
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description mentions the free-form nature of tags and the no-op behavior on duplicates, which complements the idempotentHint annotation. However, it does not disclose other behavioral traits such as error handling for invalid package_ids or any side effects beyond the idempotency already annotated.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise (three sentences), front-loaded with the action, and contains no unnecessary words. Every sentence contributes meaning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 parameters, output schema exists), the description covers the key aspects: purpose, parameter behavior, and idempotency. It lacks details on error cases or return values, but the output schema likely covers that. Still, it could be slightly more comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining the free-form nature of the 'tag' parameter and the no-op behavior, which goes beyond the schema's property descriptions. For 'package_id', no additional information is provided beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add a tag/label'), the resource ('to a package'), and the purpose ('for categorization and filtering'). It also specifies that tags are free-form strings and that adding the same tag twice is a no-op, which distinguishes it from other sibling tools like add_note.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context for when to use the tool (categorization and filtering) and notes the idempotent behavior (no-op on duplicate). However, it does not explicitly compare to alternatives or state when not to use it, leaving room for ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
advance_test_outbound_mail (quality grade: A)
Advance a test_mode outbound mail record one lifecycle step and queue the matching webhook. submitted becomes ready with simulated pages/envelope photos; ready becomes mailed with tracking, carrier, dispatch method, and receipt photo; mailed becomes delivered.
| Name | Required | Description | Default |
|---|---|---|---|
| mail_id | Yes | UUID of the test_mode outbound mail record to advance. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Advanced sandbox outbound mail job and webhook status. |
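The lifecycle described above (submitted, then ready, then mailed, then delivered) can be sketched as a small state table. This is a local model for reasoning about how many calls a test needs, not the server's implementation; raising on `delivered` is an assumption, since the source does not say what the tool returns for a record that is already delivered.

```python
# One lifecycle step per call, as stated in the tool description.
NEXT_STATUS = {
    "submitted": "ready",     # adds simulated pages/envelope photos
    "ready": "mailed",        # adds tracking, carrier, dispatch method, receipt photo
    "mailed": "delivered",
}


def advance(status: str) -> str:
    """Return the status that follows `status` in the sandbox lifecycle."""
    if status not in NEXT_STATUS:
        # Assumption: terminal or unknown statuses cannot be advanced.
        raise ValueError(f"cannot advance from status {status!r}")
    return NEXT_STATUS[status]


def steps_to_delivered(status: str) -> int:
    """Count how many advance calls are needed to reach 'delivered'."""
    steps = 0
    while status != "delivered":
        status = advance(status)
        steps += 1
    return steps


print(steps_to_delivered("submitted"))
```

A test harness could use the step count to decide how many times to call advance_test_outbound_mail before asserting on the final webhook.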
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are minimal (readOnlyHint false, destructiveHint false), so the description carries the burden. It details each state transition (e.g., 'submitted becomes ready with simulated pages/envelope photos') and mentions queuing webhooks, adding value beyond structured fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first front-loads the primary action and the second details the state transitions. It is concise, but the second sentence could be split into bullet points for improved readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema, the description adequately covers the lifecycle steps and side effect (webhook queue). However, it does not explicitly state the precondition that the record must be in test_mode, which could be inferred but is not explicit.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description of the mail_id parameter. The tool description does not add extra semantic meaning beyond what the schema already provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool advances a test_mode outbound mail record one lifecycle step, specifying the verb (advance) and resource (test_mode outbound mail record). It differentiates from siblings like create_test_outbound_mail (creation) or get_outbound_mail (read).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage in testing scenarios but does not provide explicit guidance on when to use this tool versus alternatives. No exclusions or alternatives are mentioned, relying on implied context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_rule (quality grade: A)
Create a standing instruction that auto-triggers actions when incoming packages match conditions. Rules run on every new package and execute the specified action if all conditions match. Use requires_approval to add a human review step before execution.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Human-readable rule name (e.g. "Forward Amazon packages", "Shred junk mail"). | |
| conditions | Yes | Conditions that must ALL match for the rule to trigger. | |
| action_type | Yes | Action to auto-trigger when conditions match. | |
| action_params | Yes | Parameters for the action (e.g. forwarding address for "forward", scan_type for "scan"). | |
| requires_approval | No | If true, matched packages require human approval before the action executes. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created standing rule record. |
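A rough sketch of the matching semantics described above: a rule fires only when all conditions match, and requires_approval inserts a review step before execution. The shape of `conditions` (a flat field-to-value mapping), the package fields, and the returned outcome strings are assumptions for illustration; the real condition schema is not shown in this listing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    name: str
    conditions: dict            # assumed shape: {package_field: expected_value}
    action_type: str
    action_params: dict = field(default_factory=dict)
    requires_approval: bool = False


def evaluate(rule: Rule, package: dict) -> Optional[str]:
    """Return None if the rule does not match, else what should happen next."""
    all_match = all(package.get(k) == v for k, v in rule.conditions.items())
    if not all_match:
        return None
    # Hypothetical outcome labels, not server-defined statuses.
    return "await_approval" if rule.requires_approval else "execute"


rule = Rule(
    name="Forward Amazon packages",
    conditions={"sender": "Amazon", "type": "package"},
    action_type="forward",
    action_params={"address": "hypothetical forwarding address"},
    requires_approval=True,
)

matched = evaluate(rule, {"sender": "Amazon", "type": "package"})
partial = evaluate(rule, {"sender": "Amazon", "type": "letter"})
print(matched, partial)
```

Setting requires_approval to true is the conservative choice for actions that are hard to undo, such as shredding.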
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description explains that rules run on every new package and execute actions, but does not disclose potential side effects, limits, or execution details. Annotations provide readOnlyHint=false, so description adds some context but not extensive transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences: purpose, behavior, and usage tip. No superfluous words, front-loaded with key information. Slightly more could be added but remains concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and the presence of an output schema, the description covers the core purpose and a key feature (requires_approval). It omits details about action_params and return values, but those are likely covered by the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema coverage is 100%, so baseline is 3. The description adds minimal parameter information beyond the schema, only mentioning requires_approval. The schema already provides detailed descriptions for all parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool creates a standing instruction that auto-triggers actions on matching packages. It uses specific verbs and resources ('Create a standing instruction') and distinguishes itself from sibling tools by focusing on automated rules rather than manual actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions when to use requires_approval but does not explicitly provide guidance on when to use this tool versus alternatives like request_action. Usage is implied but not fully clarified.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_test_outbound_mail (quality grade: A)
Create a sandbox outbound mail record without uploading a real document. The record is always test_mode=true, cost_cents=0, includes estimated_live_cost_cents and cost_breakdown, and queues a mail.submitted webhook. Use with a sandbox key to rehearse outbound workflows before sending real physical mail.
| Name | Required | Description | Default |
|---|---|---|---|
| color | No | Whether to include color-print surcharge in the live estimate. | |
| metadata | No | Arbitrary metadata echoed in responses and webhooks. | |
| mail_class | No | Mail class to simulate. | first_class |
| page_count | No | Simulated page count used for pricing. | |
| agent_notes | No | Optional facility/operator notes for the simulated mailpiece. | |
| recipient_zip | No | Recipient ZIP code. Affects estimated live postage. | 94105 |
| recipient_city | No | Recipient city. | San Francisco |
| recipient_name | No | Recipient name for the simulated mailpiece. | Test Recipient |
| recipient_line1 | No | Recipient street line 1. | 123 Test Street |
| recipient_state | No | Recipient 2-letter state code. | CA |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created sandbox outbound mail job and webhook status. |
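Since every parameter is optional, it can help to preview what a call resolves to once the documented defaults are filled in. The sketch below merges caller overrides with the defaults from the table above; `preview_arguments` is a hypothetical local helper, treating ZIP codes as strings is an assumption, and the server applies its own defaults regardless.

```python
# Defaults copied from the parameter table above.
DOCUMENTED_DEFAULTS = {
    "mail_class": "first_class",
    "recipient_zip": "94105",
    "recipient_city": "San Francisco",
    "recipient_name": "Test Recipient",
    "recipient_line1": "123 Test Street",
    "recipient_state": "CA",
}

# Parameters listed above without a documented default.
NO_DEFAULT = {"color", "metadata", "page_count", "agent_notes"}


def preview_arguments(**overrides) -> dict:
    """Merge overrides with the documented defaults, rejecting unlisted names."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS) - NO_DEFAULT
    if unknown:
        raise ValueError(f"parameters not listed above: {sorted(unknown)}")
    return {**DOCUMENTED_DEFAULTS, **overrides}


preview = preview_arguments(recipient_zip="10001", page_count=3, color=True)
print(preview)
```

Only the overrides need to be sent in the actual tool call; the merged view is useful for logging what the estimated live cost was based on.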
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Details that records are always test_mode=true, cost_cents=0, include live estimate, and trigger webhooks. Annotations confirm non-destructive, non-readOnly. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences: the first states the core function, the second lists the record's fixed properties, and the third gives usage guidance. Front-loaded and efficient with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With output schema present and description mentioning key outputs (estimated cost, webhook), the tool is fully contextualized for testing purposes.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline 3. Description adds overall context but no parameter-specific details beyond what schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates a sandbox outbound mail record without uploading a real document, distinguishing it from real mail sending tools like 'send_outbound_mail'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using with a sandbox key to rehearse outbound workflows before real mail, providing clear context. Does not list alternatives but sibling tools imply 'send_outbound_mail' for real use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_facility_messages (quality grade: A; read-only, idempotent)
Read the message thread with a specific facility. Returns messages in reverse chronological order with sender role (member, facility, agent). Supports cursor-based pagination. Automatically marks facility messages as read.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of messages to return (1-100). Defaults to 50. | |
| before | No | Cursor: only return messages sent before this ISO 8601 timestamp. Use the oldest message timestamp from the previous page. | |
| facility_id | Yes | UUID of the facility whose conversation to read. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Messages exchanged with a facility. |
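The cursor pattern described above (pass the oldest timestamp from the previous page as `before`) can be sketched against a local stand-in for the tool. `fetch_page` simulates the server with an in-memory list, and the `sent_at` field name is an assumption; the real response fields are defined by the output schema, which is not reproduced here.

```python
from typing import Optional

# Simulated thread; timestamps are ISO 8601 in UTC so string order is time order.
THREAD = [{"sent_at": f"2024-01-01T00:00:0{i}Z", "body": f"message {i}"}
          for i in range(5)]


def fetch_page(limit: int = 50, before: Optional[str] = None) -> list:
    """Local stand-in for get_facility_messages: newest first, strictly before cursor."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    eligible = [m for m in THREAD if before is None or m["sent_at"] < before]
    eligible.sort(key=lambda m: m["sent_at"], reverse=True)
    return eligible[:limit]


def read_all(limit: int) -> list:
    """Walk the whole thread page by page using the oldest timestamp as cursor."""
    collected, cursor = [], None
    while True:
        page = fetch_page(limit=limit, before=cursor)
        if not page:
            break
        collected.extend(page)
        cursor = page[-1]["sent_at"]  # oldest message on this page
    return collected


messages = read_all(limit=2)
print([m["body"] for m in messages])
```

One caveat of timestamp cursors in general: if two messages share the exact same timestamp and fall on a page boundary, a strict "before" comparison could skip one, so a client may want to deduplicate or overlap pages defensively.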
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, implying no state mutation, but the description states 'Automatically marks facility messages as read,' which is a mutation. This is a direct contradiction. Without this issue, the description would add useful behavior details, but the contradiction severely undermines transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each informative, with no redundancy. Front-loaded with purpose, then details on order, roles, pagination, and side effect. Highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and clear parameter descriptions, the description covers key aspects: purpose, pagination, ordering, sender roles, and a side effect. The mark-as-read behavior could use more detail (e.g., reversibility), but the output schema likely fills gaps. Slightly above adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions. The description adds context about pagination (cursor-based) but does not significantly expand on parameter meaning beyond what the schema provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states a specific verb ('Read') and resource ('message thread with a specific facility'). It distinguishes from sibling tools like 'send_facility_message' (write) and 'list_facility_conversations' (list, not messages). Explicitly details return order and sender role information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates when to use (read a facility's message thread) and provides usage details like cursor-based pagination and automatic marking as read. It does not explicitly exclude alternatives or state when not to use, but the sibling context clarifies the landscape.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_inbound_mail (quality grade: A; read-only, idempotent)
Get one forwarded inbound mail item with compact draft_context by default. Use this before drafting an outbound reply when you need sender context, reply contact candidates, deadline clues, source files, and thread linkage in one stable payload.
| Name | Required | Description | Default |
|---|---|---|---|
| include | No | Optional expansions. Defaults to ["drafting"]. Add signed_urls only when the agent truly needs temporary file access. | |
| signed_urls | No | If true, return short-lived signed URLs for stored files. | |
| inbound_mail_id | Yes | UUID of the inbound mail item to retrieve. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | One forwarded inbound mail item. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint as safe. The description adds behavioral nuance: it returns a 'compact draft_context by default' and describes the payload as 'stable' (reinforcing idempotency). This adds value beyond annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, each earning its place. The first sentence states the core function and default behavior. The second provides actionable usage context. No wasted words; critical information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (optional expansions, default behavior) and the presence of an output schema (which documents return structure), the description covers all essential aspects: purpose, when to use, parameter guidance, and behavioral expectations. It is fully adequate for an agent to select and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all three parameters. The description adds extra guidance: 'Add signed_urls only when the agent truly needs temporary file access' and implies the default include is ['drafting']. This goes beyond what the schema provides, adding usage context for parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get one forwarded inbound mail item with compact draft_context by default.' This specifies the verb (get), resource (inbound mail item), and scope (one, forwarded). It effectively distinguishes from sibling tools like list_inbound_mail (for multiple) and get_outbound_mail (different direction).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends using this tool 'before drafting an outbound reply when you need sender context, reply contact candidates, deadline clues, source files, and thread linkage in one stable payload.' It provides clear context-specific guidance and implies when not to use it (if those elements are unneeded). This is excellent usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_mailbox (quality grade: A; read-only, idempotent)
Get your agent's real mailing address beta endpoint when the account has explicit beta access: street address + mailbox number for approved accounts. For generally available inbound context, use list_inbound_forwarding_addresses instead; that returns a private intake alias for scans, PDFs, photos, provider notices, and notes from addresses the operator already uses.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Mailbox address, facility, and status details. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the safety profile is clear. The description adds that it's a beta endpoint requiring explicit beta access and approved accounts, which is valuable behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, then alternative. No wasted words. Highly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has no parameters and an output schema exists. Description explains what is returned and the access condition, which is sufficient for a simple read tool with good annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so schema coverage is 100%. Description does not need to add param info; baseline of 4 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns the agent's real mailing address (street address + mailbox number) for approved accounts with explicit beta access. It distinguishes from sibling list_inbound_forwarding_addresses by specifying the alternative use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (beta access, real address) and when not to (generally available inbound context), directing to list_inbound_forwarding_addresses as the alternative. No ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_mailbox_md (quality grade: A; read-only, idempotent)
Get the renter's MAILBOX.md standing instructions for this agent. Returns the full instruction text, version number, content hash, and last update timestamp. Call this on startup and cache the version — you must pass it to send_outbound_mail and update_action for sync verification.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Current MAILBOX.md standing instructions. |
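The "call on startup and cache the version" guidance can be sketched as a small cache object. The fetch function is injected so the example runs without a server; the result field names (`version`, `content`) and the `mailbox_md_version` argument name are hypothetical, since this listing does not show the actual schema for send_outbound_mail or update_action.

```python
from typing import Callable, Optional


class MailboxMdCache:
    """Holds the most recently fetched MAILBOX.md record."""

    def __init__(self, fetch: Callable[[], dict]) -> None:
        self._fetch = fetch          # e.g. a wrapper around the get_mailbox_md tool
        self._doc: Optional[dict] = None

    def load(self) -> dict:
        """Fetch and cache the instructions; call once on startup."""
        self._doc = self._fetch()
        return self._doc

    @property
    def version(self):
        if self._doc is None:
            raise RuntimeError("call load() before reading the version")
        return self._doc["version"]  # assumed field name

    def with_version(self, arguments: dict) -> dict:
        """Attach the cached version to another tool's arguments."""
        # 'mailbox_md_version' is a hypothetical argument name.
        return {**arguments, "mailbox_md_version": self.version}


def fake_fetch() -> dict:
    # Stand-in for the real tool call, with made-up values.
    return {"version": 7, "content": "Forward anything from the IRS."}


cache = MailboxMdCache(fake_fetch)
cache.load()
outbound_args = cache.with_version({"mail_class": "first_class"})
print(outbound_args)
```

If a later call is rejected because the cached version is out of date, reloading the cache and retrying would be the natural recovery path, though the exact error shape is not documented here.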
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnly, idempotent, and non-destructive. The description adds beyond that by specifying return fields and caching behavior, but does not mention any other behavioral traits like authorization needs or failure modes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences: the first states the purpose, the second lists the return fields, and the third gives usage instructions. No redundant words, front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, return fields, and usage guidance. With no parameters, rich annotations, and an output schema, the description is complete enough for the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so schema coverage is 100%. The description adds no parameter details because none are needed. Baseline of 4 is appropriate for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get'), the resource ('renter's MAILBOX.md'), and lists specific return fields. It distinguishes from siblings like 'get_mailbox' by focusing on standing instructions in markdown format.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool ('on startup'), what to do with the result ('cache the version'), and why ('must pass it to send_outbound_mail and update_action for sync verification'). The context is clear, and alternatives are implied rather than named.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_outbound_mail (A, Read-only, Idempotent)
Get full details of an outbound mail job including recipient address, mail class, page count, cost breakdown, current status, fulfillment photos, and a time-limited signed URL to download the original PDF.
| Name | Required | Description | Default |
|---|---|---|---|
| mail_id | Yes | UUID of the outbound mail job to retrieve. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Full outbound mail job details with signed document URL. |
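As a rough illustration of how an agent might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind the MCP specification describes. The helper names (`build_tool_call`, `get_outbound_mail_request`) and the local UUID check are assumptions made for this example, not part of the server; only the tool name and the `mail_id` parameter come from the listing above.

```python
import uuid


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request envelope (MCP convention)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def get_outbound_mail_request(mail_id: str) -> dict:
    """Hypothetical helper: prepare a get_outbound_mail call for one job."""
    # Check locally that mail_id parses as a UUID before sending anything;
    # uuid.UUID raises ValueError on malformed input.
    uuid.UUID(mail_id)
    return build_tool_call("get_outbound_mail", {"mail_id": mail_id})
```

Since the listing says the document URL in the result is time-limited, a client would probably want to download the PDF soon after the call rather than store the URL for later.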
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false; description adds valuable specifics about the response contents (e.g., time-limited signed URL, fulfillment photos), providing context beyond the safety profile.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence front-loaded with purpose and listing key details; no superfluous words. Efficient and well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple (one required parameter) and has an output schema, so the description need not document the return format; it still covers the relevant aspects of the response and behavior. No gaps identified.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description for mail_id; the tool description does not add extra meaning for the parameter but lists what the result includes, which indirectly informs the parameter's role. Baseline score applies.
Does the description clearly state what the tool does and how it differs from similar tools?
Description starts with 'Get full details of an outbound mail job,' clearly stating verb and resource, then enumerates specific attributes like recipient, cost, status, and photos, distinguishing it from get_inbound_mail or list_outbound_mail.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when a mail_id is available and full job details are needed, but it does not explicitly state when to use this tool instead of siblings like list_outbound_mail or get_inbound_mail, nor does it rule out any scenarios.
get_package (A, Read-only, Idempotent)
Get full package details including photos, tracking events, shipping label data (carrier, addresses, weight), forwarding status, storage location, and action history.
| Name | Required | Description | Default |
|---|---|---|---|
| package_id | Yes | UUID of the package to retrieve. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Package details with photos, events, and extracted label data. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds context on what data is retrieved but does not contradict annotations. However, it does not go beyond what annotations provide in terms of behavioral traits.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single clear sentence that efficiently lists the included data categories. It is appropriately sized but could be slightly more concise by grouping related items.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of an output schema and the tool's clear retrieval nature, the description adequately communicates what the tool returns. However, it does not mention that other tools exist for partial data (e.g., get_package_photos).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter package_id, which is well-described in the schema. The description adds no additional meaning beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Get' and resource 'full package details' and lists the included data categories (photos, tracking events, etc.), clearly distinguishing it from siblings like list_packages and get_package_photos.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when comprehensive package details are needed, but does not explicitly state when not to use it or mention alternatives such as get_package_photos for photo-only retrieval.
get_package_photos (A, Read-only, Idempotent)
Get photos for a package with OCR-extracted text and confidence scores. Filter by photo type to get only exterior shots, label closeups, barcode scans, or content scans.
| Name | Required | Description | Default |
|---|---|---|---|
| package_id | Yes | UUID of the package to get photos for. | |
| photo_type | No | Filter by photo type. "exterior" = package exterior, "label" = shipping label closeup, "barcode" = barcode scan, "content_scan" = opened package contents. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Package photo records with OCR metadata. |
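The optional `photo_type` filter could be assembled client-side along the following lines. This is a minimal sketch: the four allowed values are taken from the parameter table above, while the function name `photo_args` and the choice to reject unknown values before calling the server are assumptions for illustration.

```python
from typing import Optional

# Values listed in the photo_type parameter description.
PHOTO_TYPES = ("exterior", "label", "barcode", "content_scan")


def photo_args(package_id: str, photo_type: Optional[str] = None) -> dict:
    """Hypothetical helper: build the arguments object for get_package_photos."""
    args = {"package_id": package_id}
    if photo_type is not None:
        if photo_type not in PHOTO_TYPES:
            raise ValueError(
                f"photo_type must be one of {PHOTO_TYPES}, got {photo_type!r}"
            )
        args["photo_type"] = photo_type
    # When photo_type is omitted, no filter is sent and all photo types apply.
    return args
```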
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the description's disclosure of OCR text and confidence scores adds some context but no additional behavioral traits. With annotations covering safety, a score of 3 is appropriate.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no wasted words. The main action is front-loaded, and the structure is efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, output schema exists), the description covers the core functionality and filtering. It doesn't elaborate on return format or pagination, but the output schema handles that. Adequate for the complexity.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the description's mention of filtering by photo type adds minimal new meaning. The description slightly rephrases enum options but does not significantly enhance understanding beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves photos for a package, including OCR-extracted text and confidence scores. It specifies filtering by photo type, making it distinct from sibling tools like get_package or get_scan_results.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide guidance on when to use this tool versus alternatives (e.g., get_package, get_scan_results). No explicit when-not-to-use or differentiation from siblings is given.
get_postal_thread (A, Read-only, Idempotent)
Get one physical-mail thread with optional timeline events. Use this to explain how a generated outbound mail piece relates back to prior inbound scans and review decisions.
| Name | Required | Description | Default |
|---|---|---|---|
| include | No | Optional expansions. Add events to include inbound/outbound timeline references. | |
| thread_id | Yes | UUID of the postal mail thread to retrieve. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | One postal mail workflow thread. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, destructiveHint, and idempotentHint, so safety is clear. The description adds behavioral context by explaining that the tool can include optional timeline events and how the data relates inbound and outbound mail, beyond what annotations provide.
Is the description appropriately sized, front-loaded, and free of redundancy?
Exactly two sentences, no wasted words. The first sentence defines the tool's core functionality, and the second provides usage context. Information is front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 parameters, output schema exists), the description fully covers what the tool does and when to use it. It is complete and leaves no ambiguity.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds semantic meaning by mentioning 'timeline events' and the relationship between outbound and inbound mail, which enriches the parameter 'include' beyond its schema description.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the verb 'Get' and the resource 'physical-mail thread', and explicitly distinguishes it from listing tools by stating 'one'. It also provides a specific use case, differentiating it from sibling tools like get_inbound_mail or get_outbound_mail.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'Use this to explain how a generated outbound mail piece relates back to prior inbound scans and review decisions.' It provides clear context, though it does not explicitly mention when not to use alternatives.
get_scan_results (A, Read-only, Idempotent)
Get document scan results including raw OCR text, structured data fields (addresses, dates, amounts), and confidence scores. Returns empty if scan is still processing.
| Name | Required | Description | Default |
|---|---|---|---|
| package_id | Yes | UUID of the package to get scan results for. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Document scan records and OCR results. |
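Because the tool returns an empty result while a scan is still processing, a caller will likely need some form of polling. The sketch below shows one way to do that with exponential backoff. It assumes that "empty" means a falsy value (for example an empty list), and `fetch` is a hypothetical callable that wraps the actual get_scan_results call; neither detail is confirmed by the listing.

```python
import time
from typing import Any, Callable, Optional


def poll_scan_results(
    fetch: Callable[[str], Any],
    package_id: str,
    attempts: int = 5,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Any]:
    """Call fetch(package_id) until it returns a non-empty result.

    Returns the first non-empty result, or None if every attempt came back
    empty. The wait doubles after each empty response.
    """
    delay = initial_delay
    for attempt in range(attempts):
        result = fetch(package_id)
        if result:  # assumption: an empty (falsy) result means "still processing"
            return result
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    return None
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting. The retry count and delays are placeholders and should be tuned to however long scans take in practice.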
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds the critical behavior that an empty result indicates ongoing processing, which is valuable for agent decision-making beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first concisely states what the tool returns, and the second notes the empty response during processing. No superfluous words, front-loaded with key information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single parameter, output schema exists), the description covers all essential aspects: what is returned, the processing state, and implicit polling use case. It is complete for a read-only retrieval tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides a description for the sole parameter (package_id), achieving 100% coverage. The description adds no additional parameter semantics, so baseline score of 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get document scan results' and lists the specific data types included (raw OCR text, structured data fields, confidence scores). It distinguishes from siblings like request_scan (which initiates scans) and get_package (which retrieves package details).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage after requesting a scan by noting 'returns empty if scan is still processing', providing clear context for polling. However, it does not explicitly mention when not to use or suggest alternatives like request_scan for initiating.
get_usage (A, Read-only, Idempotent)
Get usage summary and billing events for a time period. Returns itemized events (scans, forwards, mail sends) with costs, plus period totals. Defaults to the current billing period if no dates are specified.
| Name | Required | Description | Default |
|---|---|---|---|
| period_end | No | End of the reporting period in ISO 8601 format. Defaults to now. | |
| period_start | No | Start of the reporting period in ISO 8601 format. Defaults to current billing period start. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Usage and billing event records. |
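Both period parameters expect ISO 8601 timestamps and fall back to server defaults when omitted. A sketch of building them from Python `datetime` objects follows. The helper name `usage_args` is hypothetical, and whether the server accepts a `+00:00` offset (as produced here) rather than a trailing `Z` is an assumption that should be checked against the live API.

```python
from datetime import datetime, timezone
from typing import Optional


def usage_args(
    period_start: Optional[datetime] = None,
    period_end: Optional[datetime] = None,
) -> dict:
    """Hypothetical helper: build the arguments object for get_usage."""
    args = {}
    for key, value in (("period_start", period_start), ("period_end", period_end)):
        if value is None:
            continue  # omit the key so the server-side default applies
        if value.tzinfo is None:
            raise ValueError(f"{key} must be timezone-aware")
        # Normalize to UTC and serialize as ISO 8601.
        args[key] = value.astimezone(timezone.utc).isoformat()
    if period_start is not None and period_end is not None and period_start > period_end:
        raise ValueError("period_start must not be after period_end")
    return args
```

Calling `usage_args()` with no arguments yields an empty object, which corresponds to the documented default of the current billing period.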
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by detailing the returned data (itemized events, costs, totals) and default period behavior, providing useful context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no waste, front-loaded with main purpose, then default behavior. Highly concise and well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (not shown but indicated), the description adequately covers what the tool does, what it returns, and defaults. No gaps for the intended use case.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. The description adds the default behavior (current billing period) but doesn't add significant new semantics beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves usage summary and billing events, specifying itemized events with costs and period totals. It uniquely addresses usage/billing, distinguishing it from sibling tools which focus on mail, packages, and other tasks.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies a usage context but gives no explicit guidance on when not to use the tool or on alternatives. Because no sibling tool covers usage or billing, there is little risk of confusion, and the description is clear enough for tool selection.
list_facility_conversations (A, Read-only, Idempotent)
List your active facility conversations with unread message counts and last message preview. Each conversation corresponds to one facility where you have a mailbox.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of conversations to return (1-100). Defaults to 20. | |
| offset | No | Number of conversations to skip for pagination. Defaults to 0. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Facility conversations plus pagination. |
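The `limit` and `offset` parameters suggest conventional offset pagination. The sketch below walks all pages under that assumption. `fetch_page` is a hypothetical callable that wraps the tool call and returns one page as a list; the actual shape of the "pagination" data in the result is not shown in the listing, so the stop condition used here (a page shorter than `limit`) is a guess rather than the server's documented contract.

```python
from typing import Callable, Iterator, List


def iter_conversations(
    fetch_page: Callable[[int, int], List[dict]],
    limit: int = 20,
) -> Iterator[dict]:
    """Yield every conversation by advancing offset in steps of limit."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            return  # a short (or empty) page is treated as the last one
        offset += limit
```

If the result exposes a total count or a "has more" flag, using that would be more robust than inferring the end from page length.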
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds that it lists 'active' conversations and includes unread counts and last message preview, but does not elaborate on sorting or what 'active' means. Moderate additional context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the action and key outputs. Every word adds value; no redundancy or filler.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description need not detail return values. It covers the main purpose and content (unread counts, preview). However, it omits sorting or what 'active' entails, leaving minor gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions for both limit and offset. The description does not add any parameter-specific information beyond the schema defaults and limits, so it meets the baseline.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists active facility conversations, including unread counts and last message preview. It distinguishes from sibling tools like get_facility_messages (which retrieves messages within a conversation) and list_inbound_mail (individual mail items).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for viewing conversations across facilities but provides no explicit guidance on when to use this tool versus alternatives (e.g., get_facility_messages, list_inbound_mail). No exclusions or when-not-to-use context.
list_inbound_forwarding_addresses (A, Read-only, Idempotent)
List the renter’s private inbound forwarding aliases on forward.mailbox.bot. These are the unique intake email addresses an operator, assistant, provider, or external agent can forward scans, PDFs, photos, provider notices, notes, and other context-aware documents to so mailbox.bot can build OCR-backed inbound context. Forwarding/emailing attachments here initiates OCR/extraction; this tool discovers the address and does not upload files directly into OCR. The alias is member-scoped, so live and sandbox agent keys for the same member resolve to the same intake address.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Private inbound forwarding email aliases. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds valuable context: the tool only discovers addresses and does not upload files into OCR, the alias is member-scoped, and live and sandbox keys for the same member resolve to the same address. This goes well beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is informative without being overly verbose. Each sentence adds meaning: purpose, use case, key differentiator, and scoping behavior. It is front-loaded and efficient, though slightly longer than necessary.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and an existing output schema, the description covers all necessary aspects: purpose, use case, behavioral nuance, and member scoping. It is fully complete for an agent to decide when and how to use this tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so schema coverage is 100%. The description does not need to add parameter info; its value is in explaining the tool's functionality. Baseline of 4 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists inbound forwarding aliases, defines their purpose (intake addresses for OCR), and distinguishes it from uploading files directly. The verb 'list' combined with specific resource 'inbound forwarding aliases' makes the purpose unambiguous.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the tool's use case (discovering addresses for forwarding documents) and explicitly says it does not upload files, differentiating it from other tools. However, it does not list when not to use it or name specific alternatives, though the context is clear given the sibling tools.
list_inbound_mail (A, Read-only, Idempotent)
List forwarded inbound mail items captured from private forwarding aliases. Default output includes compact draft_context so an LLM or external agent can reason about OCR context, reply contact candidates, deadlines, and thread linkage before generating outbound mail.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of inbound items to return (1-100). | |
| offset | No | Number of inbound items to skip for pagination. | |
| status | No | Optional inbound status filter. | |
| include | No | Optional expansions. Defaults to ["drafting"]. Add ocr/lineage only when deeper provenance is needed. | |
| category | No | Optional category filter such as "Needs review" or "Loan / Mortgage". | |
| thread_id | No | Only return inbound items linked to this postal mail thread. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Forwarded inbound mail items plus pagination. |
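The `include` parameter defaults to `["drafting"]`, with `ocr` and `lineage` meant only for cases that need deeper provenance. One way a client might encode that guidance is sketched below. The function name `inbound_mail_args` and the `deep_provenance` flag are hypothetical, and the exact set of accepted `include` values beyond the three named in the table is not known.

```python
from typing import Optional

DEFAULT_INCLUDE = ["drafting"]        # default stated in the parameter table
DEEP_EXPANSIONS = ["ocr", "lineage"]  # add only when deeper provenance is needed


def inbound_mail_args(
    limit: Optional[int] = None,
    offset: int = 0,
    deep_provenance: bool = False,
    thread_id: Optional[str] = None,
) -> dict:
    """Hypothetical helper: build the arguments object for list_inbound_mail."""
    if limit is not None and not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    include = list(DEFAULT_INCLUDE)
    if deep_provenance:
        include += DEEP_EXPANSIONS
    args = {"offset": offset, "include": include}
    if limit is not None:
        args["limit"] = limit  # the server default is not documented, so omit by default
    if thread_id is not None:
        args["thread_id"] = thread_id
    return args
```

Keeping the deeper expansions behind an explicit flag mirrors the table's advice and should keep default responses compact.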
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds that the default output includes compact draft_context for LLM reasoning, which provides mild behavioral context beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first defines the primary function, second explains the default output and its purpose. No wasted words, front-loaded with essential information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With a fully described input schema and existence of an output schema, the description is largely complete. However, it does not mention pagination behavior or error cases, and the context around the 'include' parameter's default could be clearer. Still quite good.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds minimal parameter meaning beyond the schema, only hinting at the default include behavior. No extra semantic value for individual parameters.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (List) and the resource (forwarded inbound mail items captured from private forwarding aliases), distinguishing it from sibling tools like list_outbound_mail or get_inbound_mail.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide guidance on when to use this tool versus alternatives, nor does it mention scenarios to avoid. It only explains the default output purpose, lacking explicit usage context.
list_outbound_mail (A, Read-only, Idempotent)
List outbound mail jobs with status tracking. Returns mail ID, recipient, mail class, status, cost, and timestamps. Filter by status to see pending, in-transit, or delivered mail.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of mail jobs to return (1-100). Defaults to 20. | |
| offset | No | Number of mail jobs to skip for pagination. Defaults to 0. | |
| status | No | Filter by mail status. "pending_approval" = awaiting human approval, "submitted" = queued for facility, "ready" = printed and ready to mail, "mailed" = in transit, "delivered" = confirmed delivery, "failed" = delivery failed, "cancelled" = cancelled before mailing. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Outbound mail job summaries. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true. Description adds context on return fields and filtering, consistent with safe read operation. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences: the first states the purpose, the second lists the return fields, and the third provides filtering guidance. No fluff, every word contributes value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 optional parameters, no required params, output schema exists), the description covers key aspects. Could mention pagination explicitly but offset/limit are in schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed parameter descriptions. Description mentions filtering by status but doesn't add new semantic detail beyond enum values. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it lists outbound mail jobs with status tracking, and specifies returned fields (mail ID, recipient, mail class, status, cost, timestamps). Differentiates from sibling tools like list_inbound_mail and get_outbound_mail.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides filtering guidance by status (pending, in-transit, delivered). Does not explicitly state when to use vs alternatives, but the context of listing versus getting individual items is implied by sibling names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_packages · A · Read-only · Idempotent
List inbound mail or packages for approved real mailing address/package beta accounts with optional filters by status, carrier, and date. Returns tracking number, carrier, status, and received timestamp where available. For generally available inbound postal context, use list_inbound_mail with forwarded scans/PDFs/notes instead.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of packages to return (1-100). Defaults to 20. | |
| since | No | Only return packages received after this ISO 8601 date-time. | |
| offset | No | Number of packages to skip for pagination. Defaults to 0. | |
| status | No | Filter by package lifecycle status. "received" = just arrived, "stored" = in facility storage, "forwarded" = shipped to forwarding address. | |
| carrier | No | Filter by shipping carrier. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Inbound package summaries. |
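The `since` filter expects an ISO 8601 date-time, which is easy to get subtly wrong. The sketch below shows one way to compute it from a timezone-aware clock value. The function name is hypothetical, and the carrier value `"ups"` is borrowed from the examples given for `register_expected`; the accepted carrier values for this tool are not listed on this page.

```python
from datetime import datetime, timedelta, timezone

# Lifecycle statuses copied from the parameter table above.
PACKAGE_STATUSES = {"received", "stored", "forwarded"}


def list_packages_arguments(now, days_back, status=None, carrier=None,
                            limit=20, offset=0):
    """Build list_packages arguments for packages received in the last N days."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware so the timestamp is unambiguous")
    if status is not None and status not in PACKAGE_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    arguments = {
        "since": (now - timedelta(days=days_back)).isoformat(),
        "limit": limit,
        "offset": offset,
    }
    if status is not None:
        arguments["status"] = status
    if carrier is not None:
        arguments["carrier"] = carrier  # accepted values are an assumption
    return arguments


now = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
print(list_packages_arguments(now, days_back=7, status="stored", carrier="ups"))
```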
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds value by specifying return fields (tracking number, carrier, status, received timestamp) and the scope (approved accounts). No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at three sentences. It front-loads the purpose and filters, then lists the return fields, and closes with the alternative tool. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (not shown but noted), description covers return fields and filter options. It doesn't mention pagination explicitly, but the schema covers limit/offset. For a read-only list tool with good schema and annotations, it is sufficiently complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description mentions optional filters by status, carrier, and date, but does not add detail beyond the schema. No additional parameter semantics provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists inbound mail/packages for specific accounts, with filters and return fields. It explicitly distinguishes from the sibling tool list_inbound_mail by noting the beta context and alternative features.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: use this for approved real mailing address/package beta accounts, and for generally available inbound postal context, use list_inbound_mail instead. This clearly indicates when and when not to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_postal_threads · A · Read-only · Idempotent
List physical-mail threads that group inbound mail context, human review, and outbound sends. Use this to understand which inbound items and outbound documents belong to the same business workflow.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of threads to return (1-100). | |
| offset | No | Number of threads to skip for pagination. | |
| status | No | Optional thread status filter. | |
| include | No | Optional expansions. Add events to include inbound/outbound timeline references. | |
| category | No | Optional category filter. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Postal mail workflow threads plus pagination. |
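Because `limit` is capped at 100, reading every thread requires paging with `offset`. The following is a sketch only: `call_tool` stands in for whatever MCP client function you use, the `"threads"` key in the result is an assumed shape (the page only says the result contains threads plus pagination), and passing `include` as a list of strings is likewise an assumption.

```python
def iter_threads(call_tool, page_size=50, include_events=False):
    """Yield threads page by page until a short page signals the end."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    offset = 0
    while True:
        arguments = {"limit": page_size, "offset": offset}
        if include_events:
            arguments["include"] = ["events"]  # assumed list form
        page = call_tool("list_postal_threads", arguments)
        threads = page["threads"]  # assumed result key
        yield from threads
        if len(threads) < page_size:
            break
        offset += page_size


# Stand-in client that serves five fake threads, for demonstration only.
def fake_call_tool(name, arguments):
    data = [{"id": f"thread_{i}"} for i in range(5)]
    start = arguments["offset"]
    return {"threads": data[start:start + arguments["limit"]]}


print([t["id"] for t in iter_threads(fake_call_tool, page_size=2)])
```

Stopping on a short page avoids depending on the exact pagination fields in the response, at the cost of one extra request when the total is an exact multiple of the page size.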
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, so the agent knows it's a safe read-only operation. The description adds context about grouping but no additional behavioral traits (e.g., pagination, rate limits). Given annotations cover the safety profile, a score of 3 is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long with zero wasted words. The first sentence defines the tool's function, and the second provides usage guidance. It is front-loaded and concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters, an output schema, and 28 sibling tools, the description adequately explains the core concept of grouping threads, which is essential for the agent. It does not cover edge cases or specific filter behaviors, but it is sufficient for a list operation with rich annotations and schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with all parameters having descriptions in the input schema. The tool description does not add extra meaning beyond what the schema already provides, so it meets the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list' and resource 'physical-mail threads', and explains that these threads group inbound mail context, human review, and outbound sends. This differentiates it from siblings like list_inbound_mail and list_outbound_mail, which list individual items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this to understand which inbound items and outbound documents belong to the same business workflow', providing clear context for when to use the tool. However, it does not explicitly state when not to use it or mention alternative tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
propose_mailbox_md_edit · A
Propose changes to the renter's MAILBOX.md instructions with reasoning. The renter will see your suggestion in their dashboard and can accept, reject, or modify it. Use this when you observe patterns that could be codified into standing instructions.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | Yes | Why this change is suggested (e.g. "Observed 5 Amazon packages this week, all forwarded manually — adding auto-forward rule"). | |
| suggested_content | Yes | Full proposed MAILBOX.md content (max 10,000 chars). Must include the complete document, not just the diff. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created MAILBOX.md suggestion record. |
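Since `suggested_content` must be the complete document rather than a diff, a client typically merges its proposed section into the current text before calling the tool. A minimal sketch, assuming the current content has already been fetched (for example via `get_mailbox_md`); the helper name and the append-at-end strategy are illustrative choices, not part of the API.

```python
MAX_MAILBOX_MD_CHARS = 10_000  # limit stated in the parameter table


def propose_edit_arguments(current_content, new_section, reason):
    """Append a section to the current MAILBOX.md and return tool arguments."""
    if not reason.strip():
        raise ValueError("reason is required")
    # Send the whole document, not just the added section.
    suggested = current_content.rstrip("\n") + "\n\n" + new_section.strip() + "\n"
    if len(suggested) > MAX_MAILBOX_MD_CHARS:
        raise ValueError("suggested_content exceeds 10,000 characters")
    return {"reason": reason, "suggested_content": suggested}


args = propose_edit_arguments(
    "# MAILBOX.md\n\nHold all packages.\n",
    "## Amazon\nForward automatically.",
    "Observed 5 Amazon packages this week, all forwarded manually",
)
print(args["suggested_content"])
```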
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that the suggestion is displayed to the renter for approval, and explains the interactive nature. With annotations providing no behavioral hints (all false), the description effectively communicates the non-destructive, proposal-based workflow.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with the core action and outcome. No wasted words. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (not shown), the description adequately covers the tool's purpose, usage, and behavior. The tool is simple with two parameters, and the description is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters, including a concrete example for 'reason'. The description does not significantly extend beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Propose changes' on the specific resource 'MAILBOX.md instructions', and distinguishes from siblings by emphasizing that the renter must accept/reject/modify the proposal.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit usage context: 'Use this when you observe patterns that could be codified into standing instructions'. It does not explicitly list alternatives, but the context is clear and helpful.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
register_expected · A
Pre-register an expected inbound shipment so it is auto-matched when it arrives at the facility. Optionally specify an action to auto-execute on arrival (e.g. forward immediately, scan on receipt).
| Name | Required | Description | Default |
|---|---|---|---|
| carrier | No | Shipping carrier (e.g. "fedex", "ups", "usps"). | |
| auto_action | No | Action to auto-execute when the package arrives. | |
| description | No | Human-readable description of the shipment (e.g. "Replacement laptop from Dell"). | |
| expected_by | No | Expected arrival date in ISO 8601 format. Used for alerts if the package is late. | |
| tracking_number | No | Carrier tracking number for the expected shipment. | |
| auto_action_params | No | Parameters for the auto-action (e.g. forwarding address). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created expected shipment record. |
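All six parameters are optional, so a client mostly needs to drop unset fields and keep `auto_action_params` paired with `auto_action`. The sketch below is illustrative: the helper name is hypothetical, the `"forward"` action value and the shape of `auto_action_params` are assumptions (the accepted values are not listed on this page), and the pairing check is a client-side convention rather than documented server behavior.

```python
from datetime import date


def register_expected_arguments(tracking_number=None, carrier=None,
                                description=None, expected_by=None,
                                auto_action=None, auto_action_params=None):
    """Build register_expected arguments, omitting anything left unset."""
    if auto_action_params is not None and auto_action is None:
        raise ValueError("auto_action_params only makes sense with auto_action")
    candidate = {
        "tracking_number": tracking_number,
        "carrier": carrier.lower() if carrier else None,
        "description": description,
        # date.isoformat() yields an ISO 8601 calendar date such as 2025-03-01.
        "expected_by": expected_by.isoformat() if expected_by else None,
        "auto_action": auto_action,
        "auto_action_params": auto_action_params,
    }
    return {key: value for key, value in candidate.items() if value is not None}


print(register_expected_arguments(
    tracking_number="1Z999AA10123456784",
    carrier="UPS",
    description="Replacement laptop from Dell",
    expected_by=date(2025, 3, 1),
    auto_action="forward",  # assumed value
    auto_action_params={"address": "1 Main St", "city": "Austin",
                        "state": "TX", "zip": "78701"},  # assumed shape
))
```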
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide basic safety hints (not read-only, not destructive). The description adds behavioral context: auto-matching on arrival and optional auto-execution, though it could detail side effects like confirmation or updates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no redundant information. The first sentence states the primary purpose, the second adds optional detail. Every word adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters (0 required) and an output schema, the description covers the core functionality. It does not mention that all parameters are optional, which could help agents, but it is otherwise complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents each parameter. The description only adds a brief example ('e.g. forward immediately, scan on receipt') for auto_action, not significantly enhancing meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Pre-register' and resource 'expected inbound shipment', with specific outcome 'auto-matched when it arrives'. This distinguishes it from sibling tools like list_packages or create_rule.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies when to use (before arrival) and what optional actions can be set, but does not explicitly mention when not to use or alternatives among siblings (e.g., using rules for auto-actions).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_action · A · Destructive
Request a physical action on a package at the facility. Actions include forwarding to another address, shredding, scanning documents, holding for pickup, disposing, returning to sender, photographing, opening and scanning contents, or recording a video. Some actions (shred, dispose) are irreversible.
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | Action to perform. "forward" = ship to another address, "shred" = destroy (irreversible), "scan" = OCR document scan, "hold" = keep in storage, "dispose" = discard (irreversible), "return_to_sender" = send back, "photograph" = take photos, "open_and_scan" = open package and scan contents, "record_video" = video recording of package. | |
| priority | No | Processing priority. "urgent" = same-day processing, "high" = next business day, "normal" = standard queue, "low" = when convenient. | normal |
| package_id | Yes | UUID of the package to act on. | |
| parameters | No | Action-specific parameters. For "forward": { address, city, state, zip }. For "scan": { scan_type }. For "hold": { until_date }. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created facility action request record. |
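Because "shred" and "dispose" cannot be undone, an agent may want a local guard before it ever sends the request. The following sketch requires an explicit `confirm_irreversible` flag for those two actions. The flag, the helper name, and the check that "forward" carries all four address keys are client-side assumptions layered on top of the documented parameters.

```python
# Values copied from the parameter table above.
ACTIONS = {
    "forward", "shred", "scan", "hold", "dispose",
    "return_to_sender", "photograph", "open_and_scan", "record_video",
}
IRREVERSIBLE_ACTIONS = {"shred", "dispose"}
PRIORITIES = {"urgent", "high", "normal", "low"}
FORWARD_KEYS = {"address", "city", "state", "zip"}


def request_action_arguments(package_id, action, parameters=None,
                             priority="normal", confirm_irreversible=False):
    """Build request_action arguments, refusing unconfirmed destructive actions."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if priority not in PRIORITIES:
        raise ValueError(f"unknown priority: {priority!r}")
    if action in IRREVERSIBLE_ACTIONS and not confirm_irreversible:
        raise PermissionError(f"{action} is irreversible; confirm explicitly")
    if action == "forward":
        missing = FORWARD_KEYS - set(parameters or {})
        if missing:
            raise ValueError(f"forward is missing: {sorted(missing)}")
    arguments = {"package_id": package_id, "action": action, "priority": priority}
    if parameters:
        arguments["parameters"] = dict(parameters)
    return arguments


PACKAGE_ID = "00000000-0000-0000-0000-000000000000"  # placeholder UUID
print(request_action_arguments(PACKAGE_ID, "hold", {"until_date": "2025-03-01"}))
```

Keeping the guard on the client side means a mistaken "shred" fails before any network call, independent of how the server handles confirmation.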
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description supplements annotations by noting that 'shred' and 'dispose' are irreversible, aligning with destructiveHint=true. It does not elaborate on other behavioral traits (e.g., authorization, rate limits), but annotations already cover key safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences cover all essential information: the tool's purpose, the available actions, and a critical behavioral note (irreversibility). Every sentence earns its place without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema (not shown), the description appropriately omits return value details. It covers key aspects: action types, irreversibility, and parameter structure. Minor gap: no mention of prerequisites (e.g., package ownership) but acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with each parameter already well-described (action enum with explanations, priority levels, package_id format, parameters object with examples). The tool description restates action types but adds minimal new meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: requesting a physical action on a package at a facility, listing nine specific actions. It distinguishes itself from sibling tools like 'request_scan' (which is specific to scanning) and other read/mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists available actions but gives no guidance on when to use this tool versus alternatives (e.g., when to use 'forward' vs 'scan' vs 'shred'). No explicit when-to-use or when-not-to-use context is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
request_scan · A
Request document scanning (OCR + structured data extraction) for a package. The facility will scan the document and extract text, addresses, dates, and other structured data. Results are available via get_scan_results after processing.
| Name | Required | Description | Default |
|---|---|---|---|
| scan_type | No | Type of scan. "label" = shipping label only, "envelope" = exterior envelope, "document" = full document OCR, "content" = opened package contents. | document |
| package_id | Yes | UUID of the package to scan. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Created scan request record. |
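Scanning is asynchronous: the request returns immediately and results arrive later via `get_scan_results`. The sketch below pairs an argument builder with a bounded polling loop. The arguments and result shape of `get_scan_results` are not shown on this page, so the loop takes a caller-supplied `fetch_results` callable that returns `None` until results are ready; that convention, the helper names, and the retry settings are all assumptions.

```python
import time

# Scan types copied from the parameter table above.
SCAN_TYPES = {"label", "envelope", "document", "content"}


def request_scan_arguments(package_id, scan_type="document"):
    """Build request_scan arguments with the documented default scan type."""
    if scan_type not in SCAN_TYPES:
        raise ValueError(f"unknown scan_type: {scan_type!r}")
    return {"package_id": package_id, "scan_type": scan_type}


def wait_for_scan(fetch_results, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Poll fetch_results() until it returns something other than None."""
    for attempt in range(attempts):
        results = fetch_results()
        if results is not None:
            return results
        if attempt < attempts - 1:
            sleep(delay_seconds)  # wait before the next poll
    raise TimeoutError(f"scan not ready after {attempts} attempts")


# Demonstration with a stand-in that becomes ready on the third poll.
responses = iter([None, None, {"text": "Invoice #1001"}])
print(wait_for_scan(lambda: next(responses), delay_seconds=0.0))
```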
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses asynchronous behavior ('Results are available via get_scan_results after processing'), adding value beyond annotations which only indicate non-read-only and non-destructive. It could mention idempotency or timing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three focused sentences: the first states the action, the second explains what is extracted, and the third names the next step. No unnecessary words; every sentence is valuable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description appropriately omits return details. It covers the tool's purpose, process (async), and result retrieval, making it complete for a two-parameter request tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the description adds meaning by explaining what the scan extracts (text, addresses, dates). This goes beyond the schema's type definitions, justifying a score above baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool requests document scanning with OCR and structured data extraction. It distinguishes itself from sibling tools like get_scan_results, which retrieves results, and other package-related tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage before get_scan_results, providing a clear workflow. However, it does not explicitly state when not to use this tool or compare to alternatives beyond the result retrieval.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_facility_message · A
Send a message to the facility operator managing your mailbox. Messages appear in the shared conversation visible to you, the renter, and the facility. Optionally link the message to a specific package or action request for context.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message text (1-5000 characters). | |
| package_id | No | Optional: link this message to a specific package for context. | |
| facility_id | Yes | The facility to message. Get this from the get_mailbox response. | |
| action_request_id | No | Optional: link this message to an action request for context. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Sent facility message identifiers and body. |
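The `body` limit of 1-5000 characters means long agent output has to be split before sending. A minimal sketch, assuming plain fixed-width chunking is acceptable (it may cut mid-word) and that each chunk is sent as a separate message; both helper names are hypothetical.

```python
MAX_BODY_CHARS = 5000  # limit stated in the parameter table


def split_message(text, limit=MAX_BODY_CHARS):
    """Split text into consecutive chunks of at most `limit` characters."""
    if not text:
        raise ValueError("message body must contain at least 1 character")
    return [text[start:start + limit] for start in range(0, len(text), limit)]


def facility_message_arguments(facility_id, body, package_id=None,
                               action_request_id=None):
    """Build send_facility_message arguments for a single message."""
    if not 1 <= len(body) <= MAX_BODY_CHARS:
        raise ValueError("body must be 1-5000 characters")
    arguments = {"facility_id": facility_id, "body": body}
    if package_id is not None:
        arguments["package_id"] = package_id
    if action_request_id is not None:
        arguments["action_request_id"] = action_request_id
    return arguments


chunks = split_message("x" * 12000)
messages = [facility_message_arguments("facility_123", chunk) for chunk in chunks]
print([len(m["body"]) for m in messages])
```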
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate a write operation (readOnlyHint=false) and non-destructive (destructiveHint=false). The description adds that messages appear in a shared conversation visible to renter and facility, and can be linked to packages/actions. This provides useful behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, no wasted words. The first states the core action, the second explains visibility, and the third adds optional context. Well-structured and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the main purpose, optional parameters, and visibility. It does not mention character limits or error cases, but the schema handles character limit and output schema exists for return values. Adequate for a messaging tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all parameters. The description adds context about optional linking and visibility, but the schema already explains the parameters well. The added value is marginal.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it sends a message to the facility operator for the mailbox, specifies the audience and visibility, and mentions optional linking to packages/action requests. This distinguishes it from sibling tools like get_facility_messages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for communicating with the facility operator about packages or actions, but does not explicitly state when to use this tool versus alternatives like add_note or when not to use it. No exclusions or comparisons provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_outbound_mail · A
Submit a document for printing and postal mailing by the facility. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. The document is stored securely and printed by the facility operator. IMPORTANT: With a production key (sk_agent_), this immediately charges the member's card on file. Use dry_run=true to preview cost before committing, or requires_approval=true to defer until human approval. Sandbox keys (sk_agent_test_) skip billing entirely. Optionally attach the outbound mail to inbound context with inbound_capture_id and postal_mail_thread_id so lineage stays explicit.
| Name | Required | Description | Default |
|---|---|---|---|
| color | No | Print in color. Adds a per-page color surcharge. | |
| duplex | No | Print double-sided to reduce page count and postage. | |
| dry_run | No | Validate inputs and return cost breakdown without creating a record or charging. Use to preview cost before committing. | |
| metadata | No | Arbitrary key-value pairs echoed in GET responses and webhooks. Recommended convention: { "workflow_id": "wf_123", "reason": "Customer cancellation", "correlation_id": "abc" }. | |
| mail_class | No | USPS mail class. "first_class" = 3-5 days, "priority" = 1-3 days, "certified" = with tracking and proof of mailing, "certified_return_receipt" = certified with signed delivery confirmation. | first_class |
| package_id | No | Link this mail to an inbound package (e.g. replying to received correspondence). | |
| page_count | No | Explicit page count for non-PDF documents when exact pagination is known. When supplied for DOCX, TXT, or CSV, it overrides local detection and makes pricing deterministic. | |
| return_zip | No | Return address ZIP code. Defaults to member profile if omitted. | |
| agent_notes | No | Instructions for the facility operator (e.g. "Time-sensitive — mail today"). | |
| return_city | No | Return address city. Defaults to member profile if omitted. | |
| return_name | No | Return address name. Defaults to the member's profile name if omitted. | |
| return_line1 | No | Return address line 1. Defaults to member profile if omitted. | |
| return_line2 | No | Return address line 2 (suite, unit, etc.). | |
| return_state | No | Return address state (2-letter code). Defaults to member profile if omitted. | |
| recipient_zip | Yes | 5 or 5+4 digit ZIP code (e.g. "90210" or "90210-1234"). | |
| max_cost_cents | No | Cost cap in cents. If the calculated cost exceeds this, the request is rejected with 422 before any charge. Prevents accidental expensive mailings. | |
| recipient_city | Yes | Recipient city. | |
| recipient_name | Yes | Full name of the mail recipient. | |
| document_base64 | Yes | Base64-encoded document file. Supported formats: PDF, DOCX, JPG, PNG, TXT, CSV. Max 10MB decoded. | |
| recipient_line1 | Yes | Street address line 1 of the recipient. | |
| recipient_line2 | No | Street address line 2 (apartment, suite, unit, etc.). | |
| recipient_state | Yes | 2-letter US state code (e.g. CA, NY, TX). | |
| document_filename | No | Original filename with extension (e.g. "letter.docx"). Required for reliable non-PDF format detection. | |
| recipient_country | No | ISO 3166-1 alpha-2 country code. Defaults to "US". | US |
| requires_approval | No | If true, the renter must approve in their dashboard before the mail is printed and sent. | |
| inbound_capture_id | No | Optional inbound mail item this outbound piece is replying to. Recommended when drafting from OCR/forwarded-mail context. | |
| mailbox_md_version | Yes | Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification. | |
| postal_mail_thread_id | No | Optional physical-mail thread to attach this outbound mail to. Lets agents keep inbound and outbound activity in one durable workflow. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Submitted outbound mail job or dry-run cost preview. |
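Since a production key charges the card immediately, a cautious client can preview the cost with `dry_run` and then commit with `max_cost_cents` as a backstop. The sketch below illustrates that two-step flow. `call_tool` stands in for your MCP client, the `"total_cents"` field in the dry-run result is an assumed name (the cost breakdown shape is not shown on this page), and the 10,000,000-byte limit is a conservative reading of "10MB".

```python
import base64
import re

ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?", re.ASCII)
MAX_DOCUMENT_BYTES = 10 * 1000 * 1000  # conservative reading of "Max 10MB decoded"


def send_outbound_mail_arguments(document, filename, recipient,
                                 mailbox_md_version, **options):
    """Build the seven required arguments plus any optional ones."""
    if len(document) > MAX_DOCUMENT_BYTES:
        raise ValueError("document exceeds the 10MB limit")
    if not ZIP_PATTERN.fullmatch(recipient["zip"]):
        raise ValueError("recipient_zip must be 5 or 5+4 digits")
    arguments = {
        "document_base64": base64.b64encode(document).decode("ascii"),
        "document_filename": filename,
        "recipient_name": recipient["name"],
        "recipient_line1": recipient["line1"],
        "recipient_city": recipient["city"],
        "recipient_state": recipient["state"],
        "recipient_zip": recipient["zip"],
        "mailbox_md_version": mailbox_md_version,
    }
    arguments.update(options)  # e.g. mail_class="certified", duplex=True
    return arguments


def preview_then_send(call_tool, arguments, budget_cents):
    """Dry-run first; send only if the previewed cost fits the budget."""
    preview = call_tool("send_outbound_mail", {**arguments, "dry_run": True})
    cost = preview["total_cents"]  # assumed field name
    if cost > budget_cents:
        return None
    # max_cost_cents makes the server reject the job if pricing changed.
    return call_tool("send_outbound_mail",
                     {**arguments, "max_cost_cents": budget_cents})
```

A caller would pass something like `{"name": "Jane Doe", "line1": "1 Main St", "city": "Austin", "state": "TX", "zip": "78701"}` as `recipient`. Setting `requires_approval=True` through `options` is the alternative when a human should make the final decision.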
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=false), the description discloses that with a production key the tool immediately charges the member's card, and that documents are stored securely and printed by facility operators. It also explains behavior for dry_run, requires_approval, and max_cost_cents. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, front-loaded with the main action, then format support, then important billing/approval details, and finally optional threading. Every sentence contributes essential information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (28 parameters, 7 required, nested objects, output schema exists), the description covers key behavioral aspects: billing, format, approval, threading. It does not detail the output but the output schema exists. Adequate for an agent to understand when and how to use the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining how key parameters (dry_run, requires_approval, inbound_capture_id, postal_mail_thread_id) are used in context, beyond what the schema provides. For example, it links sandbox keys to billing skip and dry_run to cost preview.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Submit a document for printing and postal mailing by the facility' and lists supported formats. The title 'Send Outbound Mail' reinforces this. It distinguishes from sibling tools like create_test_outbound_mail by emphasizing production vs. sandbox key behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use sandbox keys (testing) vs. production keys (real charges), and how to use dry_run for preview or requires_approval for approval workflows. However, it does not explicitly differentiate from sibling tools like advance_test_outbound_mail, leaving some ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_action (A, Idempotent)
Push notes, structured data, or a clarification response to an existing action request. Use this to add agent reasoning, attach extracted data, or respond when the facility asks for clarification. Requires mailbox_md_version to prove your MAILBOX.md instructions are in sync.
| Name | Required | Description | Default |
|---|---|---|---|
| action_id | Yes | The action request ID to update. | |
| agent_data | No | Structured data to attach (e.g. OCR results, extracted fields, classification labels). | |
| agent_notes | No | Free-text notes from the agent (e.g. "Forwarding per standing rule #3"). | |
| decision_context | No | Link this decision to a specific MAILBOX.md instruction for auditability. | |
| mailbox_md_version | Yes | Your current MAILBOX.md version (from get_mailbox_md). Required for sync verification. | |
| respond_to_clarification | No | Response text when action status is needs_clarification. Providing this auto-resumes the action to in_progress. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Updated facility action request record. |
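
To make the clarification flow concrete, here is a minimal Python sketch that builds the arguments for a clarification reply and wraps them in an MCP `tools/call` JSON-RPC request. The helper names `tools_call_request` and `build_clarification_reply` are hypothetical, and the example values (action ID, version string) are placeholders rather than real identifiers.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Wrap tool arguments in a JSON-RPC 2.0 request using the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def build_clarification_reply(action_id, mailbox_md_version, response_text, agent_notes=None):
    """Build update_action arguments that answer a needs_clarification request."""
    if not response_text.strip():
        raise ValueError("respond_to_clarification must not be empty")
    args = {
        "action_id": action_id,
        "mailbox_md_version": mailbox_md_version,  # value obtained from get_mailbox_md
        "respond_to_clarification": response_text,  # resumes the action to in_progress
    }
    if agent_notes:
        args["agent_notes"] = agent_notes
    return args


if __name__ == "__main__":
    reply = build_clarification_reply(
        "act_123",  # placeholder action ID
        "v7",       # placeholder MAILBOX.md version
        "Forward to the address on file.",
        agent_notes="Forwarding per standing rule #3",
    )
    print(json.dumps(tools_call_request(1, "update_action", reply), indent=2))
```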
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate idempotentHint=true and non-destructive. The description adds valuable behavioral context: the need for mailbox_md_version to prove sync, and the auto-resume behavior when respond_to_clarification is provided. This goes beyond the annotations without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loading the core purpose, then the usage scenarios, and finally the critical prerequisite. Every sentence provides necessary information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters, nested objects, output schema), the description covers the essential context: purpose, usage scenarios, and a key requirement. Return values are not described, but an output schema exists to fill that gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 6 parameters. The description adds extra meaning by explaining how parameters are used (e.g., decision_context links to MAILBOX.md sections, respond_to_clarification auto-resumes the action). This adds value beyond the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: pushing notes, structured data, or clarification responses to an existing action request. It also provides specific use cases (add agent reasoning, attach extracted data, respond to clarification), distinguishing it from siblings like request_action (which creates actions) and add_note (which adds notes to packages).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool (to add reasoning, attach data, or respond to clarification) and includes a critical requirement (mailbox_md_version for sync verification). While it does not list alternative tools or when not to use, the context is clear enough for an agent to decide.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_webhook (A, Idempotent)
Configure webhook endpoint URL and event subscriptions for real-time notifications. Events include package.received, package.status_changed, action.completed, mail.status_changed, and more. The endpoint must use HTTPS and respond with 2xx within 10 seconds.
| Name | Required | Description | Default |
|---|---|---|---|
| enabled | No | Set to false to pause webhook delivery without removing the URL. | |
| event_types | No | Array of event types to subscribe to (e.g. ["package.received", "mail.status_changed"]). Empty array disables all events. | |
| webhook_url | No | HTTPS URL to receive webhook POST requests. Must respond with 2xx within 10 seconds. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | Webhook configuration status. |
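
The 10-second response requirement suggests acknowledging each delivery immediately and doing any slow processing afterwards. The Python sketch below shows one way to do that with the standard library: the handler queues the event and returns a 2xx status right away. It assumes the webhook body is JSON, which this listing does not confirm, and the `type` field used in the usage example is hypothetical. TLS is not handled here; the sketch assumes an HTTPS-terminating proxy in front of it, and `send_test_event` is only a local smoke-test helper.

```python
import json
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Received events wait here for a worker, so the HTTP response is never delayed.
events: "queue.Queue[object]" = queue.Queue()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", "0"))
        raw = self.rfile.read(length)
        try:
            payload = json.loads(raw or b"{}")  # JSON body is an assumption
        except ValueError:
            self.send_response(400)
            self.end_headers()
            return
        events.put(payload)      # defer slow work to a consumer of the queue
        self.send_response(204)  # any 2xx acknowledges the delivery
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def start_receiver(port=0):
    """Start the receiver on localhost in a background thread (port 0 picks a free port)."""
    server = HTTPServer(("127.0.0.1", port), WebhookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_test_event(port, event):
    """POST a JSON event to the local receiver and return the HTTP status code."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/",
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

A local check could call `start_receiver()`, pass the chosen port to `send_test_event(port, {"type": "package.received"})`, and then read the event back with `events.get(timeout=5)`.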
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide idempotentHint and non-destructive hints. The description adds the constraint that the endpoint must use HTTPS and respond within 10 seconds, which is valuable beyond annotations. However, it does not detail other behaviors like rate limits or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose, then event examples and key constraints. Every sentence adds value; no wasted words. Ideal conciseness for a tool description.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema and full annotations, the description is mostly complete. It covers purpose, key constraints, and event examples. Minor gap: does not explain if the tool creates or updates, but the idempotent hint and name imply upsert.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description does not add significant meaning beyond the schema; it repeats event examples already implied by the event_types description. No additional parameter semantics are provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool configures webhook endpoint URL and event subscriptions, with specific examples of events. It distinguishes from sibling tools by being the only webhook-related tool, and the verb 'configure' aligns with the tool name 'update_webhook'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives no guidance on when to use this tool versus alternatives, or when not to use it. There are no sibling webhook tools to confuse it with, but the agent still lacks context on whether the call creates or updates a configuration, and no prerequisites or scenarios are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.