EmailMCP

Server Details

AI agent email — 23 tools free to read inbox, 28 paid to send. $9.99/mo. Token auto-provisioned.

Status: Healthy
Transport: Streamable HTTP
Repository: nfodor/emailmcp
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.5/5 across 51 of 51 tools scored. Lowest: 2.7/5.

Server Coherence: A

Disambiguation: 3/5

While each tool has a distinct purpose, the large number of tools (51) creates potential confusion, especially among similar-sounding tools like check_email_config, verify_email_setup, and test_email_config, or among the various list/get tools. The descriptions help, but the sheer volume makes disambiguation challenging for an agent.

Naming Consistency: 5/5

All tool names follow a consistent snake_case verb_noun pattern (e.g., add_sending_domain, list_email_accounts, start_health_monitor). There are no mixed conventions or irregular names, making the naming highly predictable and coherent.

Tool Count: 2/5

The server has 51 tools, which is far above the typical well-scoped range of 3-15. While the tools cover a broad email management domain, the count feels excessive and could overwhelm users. The scope is too broad for a single server, indicating a need for modularization.

Completeness: 4/5

The tool set covers the core lifecycle of email management: domains, accounts, mailboxes, rules, sending, receiving, monitoring, provisioning, and security. Minor gaps exist, such as the lack of a bulk suppress-all tool or a way to update mailbox configurations, but overall the surface is impressively complete.

Available Tools

51 tools
add_sending_domain: A

Add a customer sending domain. Generates DKIM keys and returns the 4 DNS records the customer needs to add. Works like SendGrid domain authentication — self-hosted.

Parameters
- domain (required): Customer domain (e.g. company.com)
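To make the call shape concrete, here is a minimal sketch of assembling the arguments an MCP client would pass to this tool. The argument name comes from the table above; the helper function and its domain sanity check are hypothetical client-side conveniences, not part of the server.

```python
def build_add_sending_domain_args(domain: str) -> dict:
    """Assemble the single required argument for add_sending_domain.

    The plausibility check is a hypothetical client-side guard; the
    server's own validation rules are not documented here.
    """
    if "." not in domain or " " in domain or "@" in domain:
        raise ValueError(f"not a plausible domain: {domain!r}")
    return {"domain": domain}

args = build_add_sending_domain_args("company.com")
print(args)  # {'domain': 'company.com'}
```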
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must disclose behavior. It reveals key actions (generating DKIM keys, returning DNS records) but omits details like idempotency, error handling, or prerequisites. It's adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the core action, no unnecessary words. Efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity (1 param, no output schema), the description is reasonably complete: it states what it does and what it returns. It lacks prerequisites or side-effect details, but for a straightforward setup tool, it's mostly sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter 'domain' described as 'Customer domain (e.g. company.com)'. The description adds no further semantic nuance beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool adds a sending domain, generates DKIM keys, and returns 4 DNS records. It distinguishes from siblings like list_sending_domains and remove_sending_domain by specifying the action and output.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for setting up a new sending domain, but lacks explicit when-to-use, when-not-to-use, or alternative recommendations. Siblings include other domain-related tools, but no guidance is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_email_config: B

Check if email providers are configured, guide setup if needed

Parameters: none

Behavior: 2/5

No annotations are provided, and the description lacks detail on behavioral aspects such as whether the tool modifies state, requires authentication, or returns specific data. The phrase 'guide setup if needed' is vague about what actions the tool takes.

Conciseness: 4/5

The description is very concise with a single sentence. It communicates the core purpose without unnecessary words. However, it could benefit from better structure separating the check and guide actions.

Completeness: 2/5

Despite having no parameters and no output schema, the description does not specify what the tool returns (e.g., boolean, configuration details, or setup instructions). The agent lacks information to interpret the result correctly.

Parameters: 4/5

The input schema has zero parameters with 100% coverage, so the baseline is 4. The description adds meaning by clarifying the tool's purpose (check and guide), even though no parameters need explanation.

Purpose: 4/5

The description clearly states the action ('Check if email providers are configured') and mentions guiding setup if needed. It distinguishes from sibling tools like test_email_config, which likely tests functionality rather than configuration status.

Usage Guidelines: 2/5

No guidance on when to use this tool versus alternatives (e.g., test_email_config). The description does not provide context for when the guide setup feature is triggered or what prerequisites exist.

check_ip_status: A

Check the status of a pending WireGuard IP provisioning request

Parameters
- request_id (optional): Request ID from request_email_ip (omit to check saved request)
Behavior: 3/5

With no annotations provided, the description carries the full burden but only states it checks status of 'pending' requests. It does not disclose behavior for completed, failed, or missing requests, nor any side effects or return format.

Conciseness: 5/5

The description is a single, concise sentence with no superfluous words. It is efficiently front-loaded with the essential information.

Completeness: 3/5

Given it's a simple status check with one optional parameter, the description is minimally adequate but omits information about the response format, which is important since no output schema is provided.

Parameters: 3/5

Schema description coverage is 100% and the schema already includes the 'omit to check saved request' detail. The description adds no additional meaning beyond what the schema provides, so baseline of 3 is appropriate.

Purpose: 5/5

The description clearly states the action (check status), the resource (pending WireGuard IP provisioning request), and distinguishes from sibling tools like request_email_ip or provision_status by specifying the exact context.

Usage Guidelines: 3/5

The description implies usage when a pending request ID exists or to check a saved request, but does not explicitly state when to use this tool over alternatives, nor does it provide when-not scenarios or prerequisites.

configure_mailgun: C

Configure Mailgun email provider with step-by-step guidance

Parameters
- domain (required): Mailgun domain (e.g., mg.yourdomain.com)
- api_key (required): Mailgun API Key (starts with "key-")
- from_email (required): From email address (e.g., noreply@yourdomain.com)
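As a sketch of how an agent might use the parameter hints above, here are pre-flight checks that mirror them (the "key-" prefix, the email-shaped from_email). The helper is hypothetical and client-side only; the server's actual validation is undocumented.

```python
def check_mailgun_args(domain: str, api_key: str, from_email: str) -> dict:
    """Hypothetical pre-flight checks before calling configure_mailgun."""
    problems = []
    if not api_key.startswith("key-"):
        problems.append("api_key should start with 'key-'")
    if "@" not in from_email:
        problems.append("from_email must be an email address")
    if "." not in domain:
        problems.append("domain looks malformed (expected e.g. mg.yourdomain.com)")
    if problems:
        raise ValueError("; ".join(problems))
    return {"domain": domain, "api_key": api_key, "from_email": from_email}

args = check_mailgun_args("mg.example.com", "key-abc123", "noreply@example.com")
```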
Behavior: 2/5

No annotations are provided, so the description carries full burden. It does not disclose behavioral traits like side effects, idempotency, or required prerequisites (e.g., domain verification). The verb 'configure' is vague.

Conciseness: 5/5

Single sentence with no filler. Efficiently conveys the core action.

Completeness: 2/5

No output schema and no annotations. The description is too brief for a configuration tool with 3 required params. Missing details on validation, effects, or workflow.

Parameters: 3/5

Schema has 100% description coverage for all three parameters, so baseline is 3. Description adds no extra semantics beyond what the schema already provides.

Purpose: 4/5

The description clearly states the tool configures Mailgun, a specific provider, implying initial setup. However, it does not differentiate from siblings like 'check_email_config' or 'verify_email_setup', which could overlap.

Usage Guidelines: 2/5

No guidance on when to use this tool versus alternatives. The phrase 'step-by-step guidance' hints at a guided process, but there is no when-to-use or when-not-to-use context.

create_email_account: A

Create an email account with SMTP credentials. Returns SMTP password once — save it. The account can then send/receive via this server.

Parameters
- domain (optional): Sending domain (auto-derived from address if omitted)
- address (required): Email address (e.g. sales@company.com)
- display_name (optional): Display name for the account
Behavior: 3/5

With no annotations, the description provides important behavioral traits: the SMTP password is returned only once ('save it'), and the account becomes operational. However, it does not disclose idempotency, destructive actions, auth requirements, or what happens if the address already exists.

Conciseness: 5/5

Three short sentences with no redundancy, each adding value: purpose and output, the critical one-time password warning, and the resulting capability. Efficient and front-loaded.

Completeness: 3/5

Adequate for a creation tool: mentions output (password), the need to save it, and the resulting capability. However, it lacks details on error handling, prerequisites (e.g., domain configuration), and whether the tool is idempotent. No output schema, so return value context is minimal.

Parameters: 3/5

Schema coverage is 100% so the baseline is 3. The description adds no additional meaning beyond the schema; it mentions 'SMTP credentials' but does not elaborate on parameter formats or relationships.

Purpose: 4/5

The description clearly states the action ('Create an email account') and the key output ('SMTP credentials'). It mentions the account can send/receive, which distinguishes from other creation tools like create_mailbox, but does not explicitly differentiate from siblings.

Usage Guidelines: 3/5

No explicit guidance on when to use this tool vs alternatives or prerequisites. The description implies usage for setting up a sending/receiving account, but lacks context about when not to use or what conditions are needed.

create_email_rule: B

Create a rule that triggers actions when incoming emails match conditions. Actions: forward, auto_reply, webhook, flag, move, mcp_tool, log, reject.

Parameters
- name (required): Rule name (unique)
- stop (optional): Stop processing further rules after this one matches (default: false)
- match (required): Match conditions (ALL must match). Use from, domain, subject, subject_regex, body_contains, has_attachments, to, header.
- actions (required): Actions to execute when rule matches
- priority (optional): Priority (lower = evaluated first, default 100)
- description (optional): What this rule does
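To show how these parameters fit together, here is an illustrative payload. Top-level field names come from the parameter table; the inner shape of the match entries and the actions list (e.g. the "type"/"to" keys) is an assumption, not a documented schema.

```python
# Hypothetical create_email_rule payload built from the parameter table.
rule = {
    "name": "forward-invoices",  # must be unique
    "match": {                   # ALL conditions must match
        "from": "billing@vendor.com",
        "subject": "invoice",
        "has_attachments": True,
    },
    "actions": [                 # action shapes are assumed, not documented
        {"type": "forward", "to": "finance@company.com"},
        {"type": "flag"},
    ],
    "priority": 50,   # lower = evaluated first (default 100)
    "stop": True,     # skip later rules once this one matches
    "description": "Forward vendor invoices to finance and stop processing",
}

# Only the condition keys listed in the tool description are used.
allowed = {"from", "domain", "subject", "subject_regex", "body_contains",
           "has_attachments", "to", "header"}
assert set(rule["match"]) <= allowed
```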
Behavior: 2/5

No annotations provided, so description carries full burden. It only states 'Create a rule' without disclosing side effects (e.g., uniqueness enforcement, overwrite behavior, rate limits). Minimal transparency for a mutation tool.

Conciseness: 4/5

Single sentence plus list of actions; very concise. Front-loaded with key action types. Could be slightly more structured but efficient.

Completeness: 3/5

Complex tool with nested objects (match, actions) and no output schema. Description lacks high-level explanation of rule evaluation order or stop parameter behavior, though schema covers details. Adequate but not comprehensive.

Parameters: 3/5

Schema description coverage is 100%, so baseline is 3. Description lists action types (already in schema enum), adding minimal extra value. Does not elaborate on match conditions beyond what schema states.

Purpose: 5/5

Description clearly states 'Create a rule that triggers actions when incoming emails match conditions' – specific verb+resource+function. Lists action types, distinguishing it from update/delete/list siblings.

Usage Guidelines: 2/5

No guidance on when to use this tool vs update_email_rule or delete_email_rule. No preconditions or exclusions mentioned. Implicit purpose but lacking explicit context.

create_mailbox: B

Create a new local mailbox for receiving email (autonomous mode). Each address gets its own inbox accessible via MCP tools.

Parameters
- address (required): Email address for the mailbox (e.g. sales@company.com)
- display_name (optional): Display name for the mailbox
Behavior: 2/5

With no annotations, the description must disclose behavioral traits. It only mentions creation and inbox access, but omits side effects, idempotency, error handling, or required permissions. No output schema further limits transparency.

Conciseness: 5/5

Two concise sentences, front-loaded with the core action. Every word adds value with no redundancy.

Completeness: 2/5

For a creation tool with no output schema and two parameters, the description is incomplete. It does not explain return values, idempotency, uniqueness of address, or relationship to other mailbox tools like 'list_local_mailboxes'.

Parameters: 3/5

Schema coverage is 100%, so the description does not need to compensate. It adds no extra meaning beyond the schema's parameter descriptions, so baseline 3 is appropriate.

Purpose: 5/5

The description clearly states the verb 'Create' and the resource 'local mailbox for receiving email'. It distinguishes this tool from siblings by specifying 'autonomous mode' and that each address gets its own inbox.

Usage Guidelines: 2/5

No explicit guidance on when to use this tool versus alternatives like 'create_email_account' or 'list_local_mailboxes'. The mention of 'autonomous mode' provides some context but no exclusion criteria or prerequisites.

delete_email_account: B

Delete an email account and revoke its SMTP credentials

Parameters
- address (required): Email address of the account to delete
Behavior: 2/5

No annotations provided, and the description only says 'delete and revoke SMTP credentials' without disclosing side effects, irreversibility, or impact on related data (e.g., rules, mailboxes).

Conciseness: 4/5

Single sentence, efficient and to the point, but could include more context.

Completeness: 2/5

No output schema and no details on response or error conditions; for a destructive tool, more context is needed.

Parameters: 3/5

Input schema has 100% coverage with one parameter described; the description adds no extra meaning beyond schema.

Purpose: 5/5

The description clearly states the action (delete) and the resource (email account), and distinguishes it from sibling tools like create_email_account or list_email_accounts.

Usage Guidelines: 2/5

No guidance on when to use this tool or when not to; no prerequisites or alternatives mentioned.

delete_email_rule: A

Delete an email rule by ID

Parameters
- rule_id (required): Rule ID to delete
Behavior: 2/5

No annotations provided, and the description lacks any behavioral context such as side effects, permissions, or irreversibility. For a delete operation, critical details are missing.

Conciseness: 5/5

Single sentence, front-loaded with the action and resource. No unnecessary words.

Completeness: 3/5

Minimal description lacking details on success/error responses or idempotency. Adequate for a simple operation but not fully complete.

Parameters: 3/5

Schema coverage is 100% with one parameter (rule_id). The description adds no further meaning beyond what the schema already provides.

Purpose: 5/5

Clearly states the action (delete), resource (email rule), and identifier (by ID). Distinguishes from siblings like create_email_rule.

Usage Guidelines: 3/5

No explicit guidance on when to use or not use this tool. While sibling context implies its role, there is no mention of prerequisites or alternatives.

delete_mailbox: A

Delete a local mailbox and all its stored emails (autonomous mode)

Parameters
- address (required): Email address of the mailbox to delete
Behavior: 4/5

The description explicitly states that the tool deletes all stored emails, which is crucial behavioral information. However, with no annotations, it does not mention permissions, reversibility, or side effects. The destructive nature is clear.

Conciseness: 5/5

The description is a single sentence that efficiently conveys the action and scope, with no wasted words. It is front-loaded with the key information.

Completeness: 4/5

For a simple deletion tool with one parameter and no output schema, the description adequately covers the action and impact. It could mention prerequisites or confirmation but is fairly complete.

Parameters: 3/5

Schema coverage is 100% with one parameter 'address' that is self-explanatory. The description adds no extra meaning beyond the schema. Baseline 3 is appropriate.

Purpose: 5/5

The description clearly states the action (delete) and resource (local mailbox and its emails). The phrase 'autonomous mode' adds context. It distinguishes from siblings like create_mailbox and list_mailboxes.

Usage Guidelines: 3/5

The description does not explicitly state when to use or when not to use this tool, nor does it mention alternatives. The 'autonomous mode' hint provides some context, but no direct guidance is given.

get_conversation_context: B

Get full conversation context for a specific thread

Parameters
- threadId (required): Thread ID to get conversation for
- maxEmails (optional): Maximum number of emails to include in context
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should disclose behavioral traits. It mentions 'full context' but does not explain that maxEmails defaults to 10, limiting the context, nor does it describe side effects or the scope of 'full.' This is a significant gap for a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no fluff, making it very concise. However, it sacrifices valuable detail for brevity, preventing a perfect score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so the description should hint at return format or structure. It does not. Additionally, given the presence of sibling tools with similar purposes, the description lacks differentiation and practical usage context, making it incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema covers 100% of parameters with descriptions, so the description does not need to add parameter meaning. It adds no extra context beyond what the schema provides, achieving the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'full conversation context' for a specific thread, using a specific verb and resource. This distinguishes it from siblings like 'get_conversation_threads' which likely list threads rather than retrieve full context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when full thread context is needed, but provides no explicit guidance on when to use this tool versus alternatives (e.g., get_conversation_threads, get_threads_for_sender) or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_conversation_threads (C)

Get conversation threads with forwarded emails

Parameters (JSON Schema)
- limit (optional): Maximum number of threads to return
- since (optional): Filter threads active since this date (ISO string)
- subject (optional): Filter threads by subject content
- participant (optional): Filter threads by participant email address
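The 'since' filter takes an ISO-8601 string; a minimal client-side sketch of the filtering that parameter implies, assuming a hypothetical 'lastActive' field on each thread (the tool's real output shape is undocumented):

```python
from datetime import datetime

def filter_since(threads, since_iso):
    """Keep threads active at or after the ISO-8601 cutoff.
    'lastActive' is an assumed field name, not from the tool's schema."""
    cutoff = datetime.fromisoformat(since_iso)
    return [t for t in threads
            if datetime.fromisoformat(t["lastActive"]) >= cutoff]

threads = [
    {"threadId": "t1", "lastActive": "2024-01-05T10:00:00+00:00"},
    {"threadId": "t2", "lastActive": "2024-03-01T09:30:00+00:00"},
]
recent = filter_since(threads, "2024-02-01T00:00:00+00:00")
```

A description that pinned down this behavior (inclusive vs. exclusive cutoff, which timestamp is compared) would remove exactly the ambiguity the scores below flag.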
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations and only a sparse description, the tool lacks behavioral transparency. The description does not disclose important traits such as whether results are paginated, sorted, or limited to a user context. It also omits any side effects, authentication needs, or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (6 words), which is concise but at the expense of informativeness. Important details that would help an agent are omitted, making it under-specified rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the domain (email threads) and the lack of an output schema, the description should clarify what constitutes a thread, the default ordering, and the behavior of parameters like 'since'. It provides none of this context, leaving significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter description coverage, so the schema itself provides adequate meaning for each parameter. The description adds no extra context beyond the schema, earning the baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'conversation threads with forwarded emails', making the primary action and subject clear. However, it does not distinguish this tool from siblings like get_conversation_context or get_threads_for_sender, which might have similar names or functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool versus its many siblings (e.g., get_conversation_context, get_forwarded_emails, read_inbox). The description fails to provide any context about use cases, prerequisites, or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_dns_records (A)

Generate all required DNS records (A, MX, SPF, DKIM, DMARC) for autonomous email. Outputs copyable records for your DNS provider.

Parameters (JSON Schema)
- ipv4 (required): Your dedicated IPv4 address (from WireGuard provisioning)
- domain (required): Your email domain (e.g. company.com)
- dmarc_policy (optional): DMARC policy (default: reject)
- dkim_selector (optional): DKIM selector (default: emailmcp)
- mail_hostname (optional): Mail server hostname (default: mail.domain)
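Given the schema defaults above (dmarc_policy=reject, dkim_selector=emailmcp, mail_hostname=mail.domain), the output the description promises is easy to picture. A minimal sketch, assuming illustrative record values rather than the server's actual output format:

```python
def build_dns_records(ipv4, domain, dmarc_policy="reject",
                      dkim_selector="emailmcp", mail_hostname=None):
    """Return the five record types the description lists, as
    (name, type, value) tuples ready to paste into a DNS provider.
    Record values here are illustrative assumptions."""
    mail_host = mail_hostname or f"mail.{domain}"
    return [
        (mail_host, "A", ipv4),
        (domain, "MX", f"10 {mail_host}."),
        (domain, "TXT", f"v=spf1 ip4:{ipv4} -all"),
        (f"{dkim_selector}._domainkey.{domain}", "TXT",
         "v=DKIM1; k=rsa; p=<public-key-from-provisioning>"),
        (f"_dmarc.{domain}", "TXT", f"v=DMARC1; p={dmarc_policy}"),
    ]

for name, rtype, value in build_dns_records("203.0.113.7", "company.com"):
    print(f"{name}  {rtype}  {value}")
```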
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states 'generate' but does not disclose whether the operation has side effects, requires specific permissions, or is safe. For a verb that suggests mutation, more transparency about non-destructiveness or prerequisites is needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently communicate purpose and output format. No redundant information; every word serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description only says 'Outputs copyable records' but does not specify format or structure. For a tool with multiple optional parameters, additional details on return value or error handling would enhance completeness. Still adequate for the task.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions. The description adds context that all records are for autonomous email, which ties parameters together, but does not add new meaning beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates DNS records for autonomous email, listing specific record types (A, MX, SPF, DKIM, DMARC) and that output is copyable for DNS providers. This is a specific verb+resource+context, distinguishing it from sibling tools like get_domain_dns_records which likely fetch existing records.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for setting up autonomous email, but does not explicitly mention when not to use or alternative tools. However, the context of sibling tools (e.g., provision_start) provides enough clarity. A brief exclusion or alternative mention would improve it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_domain_dns_records (B)

Get the DNS records a customer needs to add for a sending domain

Parameters (JSON Schema)
- domain (required): Customer domain
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose any behavioral traits (e.g., idempotency, safety, side effects). The 'get' verb implies read-only but is not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is front-loaded and efficient, with no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one parameter and no output schema. The description is adequate but lacks details on return format or prerequisites for the domain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description does not add extra meaning beyond the schema; the parameter description 'Customer domain' is basic.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the specific resource 'DNS records a customer needs to add for a sending domain'. It differentiates from sibling tools like 'get_dns_records' by specifying the context of a sending domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., get_dns_records). No prerequisites or context are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_email_usage (B)

Get email usage statistics — delivery rate, bounce rate, per-domain breakdown, top bounce reasons. Reads from persistent stats.

Parameters (JSON Schema)
- domain (optional): Filter stats by domain
- period (optional): Time period for usage statistics (default: day)
- account (optional): Filter stats by account address
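The metrics the description names (delivery rate, bounce rate, top bounce reasons) are simple ratios over send events. A hedged sketch, assuming a hypothetical event-log shape rather than the server's actual persistent stats format:

```python
from collections import Counter

def summarize_usage(events):
    """events: list of dicts like {"status": "delivered"|"bounced",
    "domain": ..., "reason": ...} -- an assumed shape, not the
    server's documented format."""
    sent = len(events)
    bounced = [e for e in events if e["status"] == "bounced"]
    return {
        "sent": sent,
        "delivery_rate": (sent - len(bounced)) / sent if sent else 0.0,
        "bounce_rate": len(bounced) / sent if sent else 0.0,
        "top_bounce_reasons":
            Counter(e["reason"] for e in bounced).most_common(3),
    }

events = [
    {"status": "delivered", "domain": "a.com", "reason": None},
    {"status": "delivered", "domain": "a.com", "reason": None},
    {"status": "delivered", "domain": "b.com", "reason": None},
    {"status": "bounced", "domain": "b.com", "reason": "mailbox full"},
]
stats = summarize_usage(events)
```

Documenting even this much about the return structure would address the Completeness gap noted for this tool.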
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden for behavioral context. It only states 'Reads from persistent stats', hinting at read-only but lacking details on data freshness, access requirements, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence that front-loads key purpose and a behavioral hint. No fluff, though a brief mention of output structure could enhance it.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description does not explain the output format or how the parameters (domain, period, account) affect the results. Given no output schema, more detail on return values and on how 'period' shapes the aggregation is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover all parameters fully. The tool description mentions 'per-domain breakdown' but does not add new semantic constraints or examples beyond the schema, so a baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves email usage statistics and lists specific metrics (delivery rate, bounce rate, per-domain breakdown, top bounce reasons), making it distinct from sibling tools like get_health_status or list_email_accounts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it is for statistics, but does not exclude scenarios where other tools might be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_forwarded_emails (C)

Retrieve forwarded emails with optional filtering

Parameters (JSON Schema)
- limit (optional): Maximum number of emails to return
- since (optional): Filter emails since this date (ISO string)
- threadId (optional): Filter by conversation thread ID
- originalSender (optional): Filter by original sender email address
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It only says 'Retrieve', implying read-only, but lacks details on authentication, rate limits, or side effects. The description adds minimal value beyond the tool name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is concise and front-loaded. It contains no unnecessary words, though it could benefit from slightly more structure to improve readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With four parameters, no output schema, and no annotations, the description should provide more context about return format, filtering behavior, and prerequisites. The current description is too sparse to fully inform an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with descriptions for all parameters. The tool description does not add additional meaning beyond what the schema already provides, so a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Retrieve' and the resource 'forwarded emails', and the optional filtering is mentioned. However, it does not explicitly distinguish from similar sibling tools like 'read_inbox' or 'get_conversation_context', so it earns a 4.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, nor any exclusions. It simply states what it does without context for decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_health_status (A)

Get current health status of the email server — tunnel, SMTP, DNS, IP blocklist checks

Parameters (JSON Schema)

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the checks performed but does not mention read-only nature, output format, side effects, or rate limits. This is adequate but not rich.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence that efficiently conveys purpose and scope. No extraneous words, front-loaded with the verb and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters, no output schema, and no annotations, the description is functionally complete for a simple health check. However, it could benefit from mentioning output format or how to interpret results. Still, it covers the core intent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters with 100% coverage, so there is no need for parameter descriptions. The description adds value by clarifying the tool's scope beyond the empty schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves the current health status of the email server and lists specific components checked (tunnel, SMTP, DNS, IP blocklist), making the purpose precise and distinguishing it from sibling tools like check_email_config or get_mailbox_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies a diagnostic use case but does not explicitly state when to use this tool versus alternatives like check_ip_status or get_smtp_receiver_status. No exclusions or sibling comparisons are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_idle_status (A)

Get status of IMAP IDLE watchers — which accounts are being watched, connection status, new email counts

Parameters (JSON Schema)

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description fully discloses what the tool does (returns status, connection status, new email counts) and implies it is a read operation. With no annotations, it carries the burden of transparency, and it does so adequately without hiding any side effects or assumptions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that immediately identifies the tool's function and scope. Every word adds value with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with no parameters and no output schema, the description adequately covers what the tool does and what it returns. It could be slightly improved by explicitly stating it is a read-only operation, but it is substantially complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so schema coverage is 100%. The description adds no parameter information, but the baseline for zero parameters is 4, and the description effectively explains the tool's purpose without needing param details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves the status of IMAP IDLE watchers, specifying exactly what is included (watched accounts, connection status, new email counts). It distinguishes from sibling tools like start_idle_watch and stop_idle_watch.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is used to check the current state of IMAP IDLE monitoring, but it does not explicitly mention when to use it versus alternatives or provide any exclusions or prerequisites. While the context of sibling tools suggests its purpose, explicit guidance is absent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_mailbox_status (C)

Get status information for an email mailbox. If IMAP is configured in .env, no credentials needed. Use "account" to pick a named account.

Parameters (JSON Schema)
- account (optional): Named account (e.g. "work", "personal"). Omit for default.
- mailbox (optional): Mailbox to check (default: INBOX)
- imap_config (optional): (no description provided)
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It briefly mentions IMAP configuration but does not explain what 'status information' means (e.g., connection state, message count, errors), nor does it state that the operation is read-only or what happens on failure. This is insufficient for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short at three sentences, with the key action in the first. However, the later sentences mix two pieces of information (credentials and account selection) that could be better separated or reordered. Still, it is mostly concise and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema and three parameters (including a nested object), yet the description does not specify what the tool returns (only 'status information'). It also leaves ambiguity about how the account parameter interacts with the .env configuration. The description feels incomplete given the tool's complexity and lack of output specification.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 67% description coverage, meaning some parameters lack descriptions. The description adds value by noting the .env fallback for credentials, which relates to the imap_config parameter. However, it does not explain the mailbox parameter's behavior (what 'INBOX' default means) or the structure of imap_config in more detail, so it provides only marginal improvement over the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'status information for an email mailbox', making the primary purpose obvious. However, among many sibling tools like 'check_email_config' or 'get_health_status', it does not differentiate what makes this tool unique, so it misses full clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description offers some usage context ('If IMAP is configured in .env, no credentials needed. Use "account" to pick a named account.') but fails to specify when to use this tool versus siblings. For example, no mention of when to use get_mailbox_status instead of check_email_config or get_health_status.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_smtp_receiver_status (B)

Get the current status of the SMTP forward receiver

Parameters (JSON Schema)

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility for behavioral disclosure. It only states that it 'gets status', without indicating whether the operation is safe and read-only, what side effects exist, or what the response contains. Although the verb suggests a read-only query, nothing in the description guarantees it.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is front-loaded and contains no extraneous information. It efficiently communicates the tool's purpose without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple, parameterless status-check tool, the description is adequate but lacks detail. It does not specify what 'status' entails (e.g., enabled/disabled, errors, last activity) or the expected return format. Given the existence of sibling tools like start_smtp_receiver, some context about the lifecycle would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Since there are no parameters, the input schema coverage is 100%. The description does not need to add parameter semantics, but it also does not provide any additional context about the tool's input or output. Baseline 3 is appropriate as the description is neutral and schema handles everything.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'current status of the SMTP forward receiver', which is specific and distinguishes this tool from sibling tools like start_smtp_receiver and stop_smtp_receiver that perform actions rather than queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, such as after starting the receiver or for troubleshooting. There is no mention of prerequisites or context, making it less helpful for an agent deciding between this and other status tools like get_health_status.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_suppression_list (A)

View all addresses on the bounce suppression list — these addresses will be rejected on send

Parameters (JSON Schema)

No parameters
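The claim that suppressed addresses 'will be rejected on send' implies a pre-send gate roughly like the following sketch; the set-based, case-insensitive matching here is an assumption, not documented server behavior:

```python
def can_send(recipient, suppression_list):
    """Return False if the recipient appears on the bounce suppression
    list. Case-insensitive comparison is an assumption; the server's
    actual matching rules are not documented."""
    suppressed = {addr.lower() for addr in suppression_list}
    return recipient.lower() not in suppressed

suppressed_addrs = ["bounced@example.com"]
```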

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description covers the read-only nature ('View'). It does not disclose any additional behavioral traits such as authentication needs, rate limits, or potential side effects; however, for a simple read operation with no parameters, the description is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that immediately conveys the purpose, with no wasted words. It is well front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no parameters and no output schema, the description provides sufficient context: it lists the resource and its significance. It could be slightly enhanced by mentioning typical use cases or response format, but it is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist in the input schema, and schema coverage is 100%. The description does not need to add parameter semantics; baseline for zero parameters is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('View all addresses') and the specific resource ('bounce suppression list'), and adds context that these addresses are rejected on send. It distinguishes from sibling tools like remove_from_suppression.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for viewing the suppression list but does not explicitly state when to use this tool versus alternatives like remove_from_suppression. No direct guidance on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_threads_for_sender (A)

Get all conversation threads involving a specific sender

Parameters
- senderAddress (required): Email address of the sender
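
Since the tool takes a single required argument, a call is easy to picture. Below is a sketch of the JSON-RPC `tools/call` request an MCP client would send; the `id` and the email address are placeholder values.

```python
import json

# Illustrative MCP "tools/call" request for get_threads_for_sender.
# The method/params shape follows the MCP specification; the id and
# sender address are placeholders.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_threads_for_sender",
        "arguments": {"senderAddress": "alice@example.com"},
    },
}

print(json.dumps(request, indent=2))
```
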
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It implies a read operation but does not clarify permission needs, idempotency, or rate limits. It is adequate for a simple read tool but lacks detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, front-loaded sentence with no unnecessary words. Efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with no output schema, the description is adequate, but it could clarify whether 'involving' means sent by the sender or also received by them. The return format is not hinted at.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with senderAddress fully described. The description adds minimal value beyond the schema, just reiterating the filter concept. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (get), resource (conversation threads), and scope (involving a specific sender). It distinguishes from sibling tools like get_conversation_threads which likely retrieves all threads without filtering.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., get_conversation_threads). No exclusion criteria or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_cloud_providers (A)

List supported cloud providers with pricing, regions, and PTR support info

Parameters

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full responsibility. It discloses what information is returned (pricing, regions, PTR support), which covers the key behavioral aspects for a read-only list. It omits details such as whether the list is static or dynamic, but this is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence of 11 words that immediately conveys the tool's purpose and scope. No fluff; every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and no output schema, the description provides the essential information: what is listed and what fields are included. It could mention sorting or filtering but is largely complete for this simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are zero parameters and 100% schema coverage. The description adds value beyond the schema by specifying the fields returned (pricing, regions, PTR support), which is not in the schema. Baseline 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses the specific verb 'List' and identifies the resource 'supported cloud providers' along with attributes (pricing, regions, PTR support info). It clearly distinguishes itself from sibling tools that are email or DNS related.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies it is the tool to list cloud providers; no alternative is named, but the context (no parameters) suggests a straightforward listing. It offers no explicit when-not-to-use guidance or exclusions, but this is sufficient for a simple list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_email_accounts (A)

List all configured IMAP email accounts. Shows default and named accounts (work, personal, etc.).

Parameters

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the tool lists accounts and mentions showing default and named accounts, giving some insight into output. However, it does not disclose potential side effects (none expected for a read), authentication needs, or any rate limits, which is acceptable for a simple list operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences with no redundant words. Every sentence contributes to understanding: the first states the action and resource, the second provides additional detail about the output scope.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no params, no output schema) and multiple sibling tools with overlapping names, the description could be clearer about what 'configured IMAP' specifically means compared to 'managed' or 'local' accounts. It is adequate but leaves room for confusion.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, and schema description coverage is 100%. Per guidelines, baseline is 4. The description adds value by explaining what the list contains (default and named accounts), enriching the semantic understanding beyond the empty schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'List' and identifies the resource as 'configured IMAP email accounts'. It further clarifies that it shows both default and named accounts (work, personal, etc.), which distinguishes it from the sibling tool 'list_email_accounts_managed' and other list tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no explicit guidance on when to use this tool versus alternatives like 'list_email_accounts_managed' or 'list_mailboxes'. It does not indicate prerequisites or contexts where this tool is preferred, requiring the agent to infer from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_email_accounts_managed (A)

List all email accounts with SMTP credentials (optionally filtered by domain)

Parameters
- domain (optional): Filter by domain
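
Because `domain` is optional, the tool supports two call shapes. A hedged sketch of the argument payloads only; the domain value is a placeholder:

```python
# Two illustrative argument payloads for list_email_accounts_managed:
# the "domain" filter is optional, so an empty arguments object is valid.
all_accounts = {"name": "list_email_accounts_managed", "arguments": {}}
one_domain = {
    "name": "list_email_accounts_managed",
    "arguments": {"domain": "example.com"},  # optional: restrict to one domain
}

print(all_accounts["arguments"].get("domain"))  # None: no filter applied
print(one_domain["arguments"]["domain"])
```
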
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states a read operation (list) but does not disclose rate limits, authentication requirements, or the behavior when no accounts exist. It is minimal but not misleading.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence with front-loaded action and scope: 'List all email accounts with SMTP credentials (optionally filtered by domain)'. No wasted words; efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with one optional parameter and no output schema or annotations, the description adequately covers purpose and filtering. However, it lacks mention of pagination or limits, which would be beneficial for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description adds 'optionally filtered by domain', which mirrors the schema's parameter description. It offers no semantic depth beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists all email accounts with SMTP credentials and optionally filters by domain. This distinguishes it from sibling tools like list_email_accounts (which may not include credentials) and other list tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when needing SMTP credentials, but provides no explicit guidance on when to use this tool versus alternatives like list_email_accounts. No exclusions or context are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_email_rules (A)

List all email rules with their match conditions, actions, and hit counts

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It implies a read-only operation but does not state its non-destructive nature, authentication needs, rate limits, or any other behavioral traits. Disclosure is minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single 12-word sentence, front-loaded with verb and resource. No extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple list tool with no parameters and no output schema. It covers the result contents. It could mention ordering or the lack of filtering, but that is not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters are defined, so schema coverage is trivially 100%. The baseline for zero parameters is 4. The description adds value by explaining what is listed, exceeding the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'list' and the resource 'email rules', and specifies the information included (match conditions, actions, hit counts). It distinguishes the tool from siblings like create_email_rule and delete_email_rule.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. Despite the many sibling tools, there is no discussion of prerequisites, context, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_local_mailboxes (A)

List all local mailboxes on this server (autonomous mode)

Parameters

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It indicates a read-only listing but does not disclose any potential side effects, permissions, or performance considerations. For a simple list tool, this is adequate but not exceptional.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that covers the essential information without any unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter list tool without an output schema, the description provides the minimum viable information. However, additional context like the difference from 'list_mailboxes' or what 'local mailboxes' means could improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, and schema coverage is 100%. The description adds no parameter information, which is acceptable given there are no parameters. Baseline 4 applies, per the guidelines for zero-parameter tools.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (list), the resource (local mailboxes), and the context (autonomous mode). It distinguishes from sibling tools like 'list_mailboxes' by specifying 'local' and 'autonomous mode'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (when needing to list local mailboxes in autonomous mode), but provides no explicit guidance on when not to use or which alternative tools to consider.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_mailboxes (A)

List all available mailboxes for an email account. If IMAP is configured in .env, no credentials needed. Use "account" to pick a named account.

Parameters
- account (optional): Named account (e.g. "work", "personal"). Omit for default.
- imap_config (optional)
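
The `account` parameter selects a named account, while omitting it uses the default. Illustrative payloads only; the account name is a placeholder:

```python
# Illustrative argument payloads for list_mailboxes: with IMAP configured
# in .env, no credentials are needed, and "account" picks a named account.
default_call = {"name": "list_mailboxes", "arguments": {}}
work_call = {"name": "list_mailboxes", "arguments": {"account": "work"}}

print(default_call["arguments"].get("account"))  # None: default account used
print(work_call["arguments"]["account"])
```
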
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided. The description implies a read-only operation but does not detail behavior, such as whether authentication is required or what 'available' means. Some transparency is provided, but gaps remain.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences that front-load the primary purpose and immediately provide useful guidance. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema or annotations, the description is minimal and does not mention the return format or error conditions. It is adequate for a simple list operation but could be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (only 'account' has a description). The description adds context about IMAP configuration and named accounts, partially compensating for the missing schema description on 'imap_config'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'List all available mailboxes for an email account', specifying the verb and resource. However, it does not differentiate from sibling tools like 'list_local_mailboxes'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides context on using the 'account' parameter and on when IMAP credentials are needed, but it lacks explicit guidance on when to use this tool versus alternatives and offers no when-not-to-use advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_sending_domains (A)

List all configured sending domains with their verification status

Parameters

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must convey behavioral traits. It accurately indicates a read-only list operation, but does not disclose potential pagination, rate limits, or the format of the verification status. The scope 'all configured sending domains' is transparent, but additional details are missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that conveys the essential information with no wordiness. Every word contributes to the purpose, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (no parameters, no output schema, no annotations), the description provides the core functionality and result content (verification status). It is complete enough for a basic list operation, though it could optionally mention that the output lists all domains with their status.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters and 100% schema coverage, so the description does not need to add parameter info. The baseline for zero parameters is 4, and the description meets this standard without needing to elaborate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'list' and the resource 'sending domains', adding the specific detail of 'with their verification status'. This is unambiguous and distinguishes it from sibling tools like add_sending_domain or check_email_config.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description does not mention when to use this tool versus alternatives such as check_email_config or verify_email_setup. There is no guidance on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

provision_resume (A)

Resume provisioning from where you left off. Automatically detects the current phase and advances: WireGuard connection → DNS setup → TLS → verification. Run this after each user action (connecting WireGuard, adding DNS records).

Parameters

No parameters
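
The phase sequence the description lists (WireGuard connection → DNS setup → TLS → verification) suggests a resume loop that finds the first incomplete phase. A hypothetical sketch of that logic, not the server's code:

```python
# Hypothetical sketch of the resume logic the description implies:
# detect the first incomplete phase and advance from there.
PHASES = ["wireguard", "dns", "tls", "verification"]

def next_phase(completed):
    """Return the first phase not yet complete, or None when all are done."""
    for phase in PHASES:
        if phase not in completed:
            return phase
    return None

print(next_phase({"wireguard"}))  # resumes at DNS setup
print(next_phase(set(PHASES)))    # None: provisioning finished
```

Calling the tool after each user action, as the description advises, maps onto re-running a loop like this until no phase remains.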

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the burden. It discloses automated phase detection and advancement but lacks detail on error handling or on what happens if the tool is run out of order. Adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: purpose, process, usage hint. No redundant information; every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no parameters and no output schema, the description provides enough context to understand its role and when to invoke it. Lacks explicit mention of prerequisites (e.g., start with provision_start), but the usage hint implies the flow.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters; the description adds no parameter info, which is acceptable. Baseline 4 applies, per the guidelines for zero-parameter tools.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool resumes provisioning, specifies the phase sequence, and implicitly distinguishes from siblings like provision_start and provision_status by focusing on resuming after user actions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It explicitly says 'Run this after each user action', providing clear context. It does not explicitly mention when not to use the tool or name alternatives, but the sibling tools are distinct enough to infer the right choice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

provision_start (A)

Phase 1: Create a cloud VPS with dedicated IP. Generates WireGuard config. Returns the IP and instructions to connect. Supports DigitalOcean, Vultr, Hetzner, OVH.

Parameters
- region (optional): Cloud region (e.g. nyc1, ewr, nbg1, GRA11). Omit for default.
- cloud_api_key (required): Cloud provider API key
- cloud_provider (required): Cloud provider
- service_domain (required): Domain you own for the email service (e.g. emailrelay.xyz). NS will point here.
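
Putting the four parameters together, a call might carry arguments like the following. Every value is a placeholder, and the exact spelling expected for `cloud_provider` is an assumption based on the listed providers:

```python
# Illustrative tools/call arguments for provision_start. All values are
# placeholders; the schema does not publish an enum for cloud_provider.
args = {
    "cloud_provider": "hetzner",         # DigitalOcean, Vultr, Hetzner, or OVH
    "cloud_api_key": "REDACTED",         # placeholder credential
    "service_domain": "emailrelay.xyz",  # example domain from the schema docs
    "region": "nbg1",                    # optional; omit for the default
}

required = {"cloud_api_key", "cloud_provider", "service_domain"}
print(required.issubset(args))  # True: all required keys present
```
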
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses that the tool creates a VPS, generates config, and returns IP/instructions. It does not mention destructive behavior or side effects, but the creation nature is clear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences, front-loaded with 'Phase 1', then action, then output. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a provisioning tool with no output schema, the description adequately states what is returned (IP and instructions). It does not cover error conditions or detailed WireGuard config, but is sufficient for the first phase.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description does not add significant meaning beyond the schema; it merely restates that cloud_provider has supported values and service_domain is for email.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it creates a cloud VPS with dedicated IP and generates WireGuard config, listing supported providers. It distinguishes itself from sibling tools like provision_resume and provision_status by explicitly calling it 'Phase 1'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies it is the first step in provisioning ('Phase 1'), and the sibling list includes provision_resume and provision_status, suggesting a workflow. However, it does not explicitly state when to use or not use this tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

provision_status (A)

Show current provisioning state — which phases are complete, what to do next

Parameters (JSON Schema)

No parameters

Behavior 2/5

No annotations are provided, so the description carries the full burden. It describes a read-only operation but does not disclose behavioral traits such as the need for active provisioning, rate limits, or any side effects. The description is vague and fails to provide adequate transparency.

Conciseness 5/5

The description is a single, well-structured sentence that conveys the core purpose with no unnecessary words. It is front-loaded with the primary action and resource.

Completeness 3/5

Given the tool has no parameters and no output schema, the description is adequate but lacks context about the output format or typical usage scenarios. It could be improved by noting what the response includes or providing examples.

Parameters 4/5

The input schema has 0 parameters with 100% schema coverage, so there is no parameter information to convey. The description does not add parameter-specific details, but since none are needed, the baseline score of 4 is appropriate.

Purpose 5/5

The description clearly states the tool's function: showing the current provisioning state, including completed phases and next steps. It uses a specific verb 'show' and resource 'provisioning state', and implicitly distinguishes it from siblings like provision_resume and provision_start.

Usage Guidelines 2/5

The description provides no guidance on when to use this tool versus alternatives like get_health_status or check_ip_status. It does not mention prerequisites or scenarios where it should not be used.

read_inbox (B)

Read emails from an inbox via IMAP with filtering options. If IMAP is configured in .env, no credentials needed. Use "account" to pick a named account (work, personal, etc.).

Parameters (JSON Schema)

filter (optional)
account (optional): Named account to read from (e.g. "work", "personal"). Omit for default account.
options (optional)
imap_config (optional): Optional if IMAP is configured in .env (IMAP_HOST, IMAP_EMAIL, IMAP_PASSWORD)
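
As a sketch of how an agent might populate these parameters, the call below assumes hypothetical shapes for the undescribed filter and options fields (the schema rows above leave them blank), so the field names inside them are illustrative only:

```python
# Hypothetical read_inbox arguments. "account" picks a named account;
# imap_config can be omitted when IMAP_HOST/IMAP_EMAIL/IMAP_PASSWORD
# are already set in .env. The filter/options shapes are assumptions.
read_inbox_args = {
    "account": "work",                          # omit for the default account
    "filter": {"from": "billing@example.com"},  # assumed filter shape
    "options": {"limit": 10},                   # assumed options shape
    # "imap_config" omitted: credentials come from .env
}

assert "imap_config" not in read_inbox_args
```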
Behavior 2/5

With no annotations provided, the description must disclose behavioral traits fully. It mentions the IMAP protocol and credential options but omits details like whether emails are marked as read (option exists), pagination, rate limits, or error handling. This leaves significant gaps for the agent.

Conciseness 5/5

The description is three sentences that are direct and front-loaded. The first sentence states the main purpose, the second covers credential options, and the third explains account selection. No filler or repetition.

Completeness 3/5

For a tool with multiple nested parameters and no output schema, the description should summarize key capabilities like filtering, options, and IMAP configuration. It covers accounts and credentials but not the richness of filter/options. The description is adequate but not fully comprehensive.

Parameters 3/5

The input schema already describes most filter and option parameters. The description adds value by explaining the account parameter and the optional imap_config linked to .env. However, with 50% schema description coverage, the description does not compensate for all undocumented parameters.

Purpose 4/5

The description clearly states the tool reads emails from an inbox via IMAP with filtering options, which is specific and informative. However, it does not explicitly distinguish itself from sibling tools like get_conversation_context or get_threads_for_sender, although the core purpose is distinct.

Usage Guidelines 3/5

The description provides guidance on when to use the default account vs a named account and notes that IMAP credentials can be omitted if configured in .env. It does not, however, give any other when-to-use or when-not-to-use advice, nor does it compare to alternative tools for reading emails.

regenerate_zone_file (A)

Regenerate the complete DNS zone file for the service domain including all customer subdomains. Use after adding/removing domains.

Parameters (JSON Schema)

No parameters
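
To make the description concrete, here is a minimal sketch of what "regenerating a zone file including all customer subdomains" could look like. The function name, record values, and TTLs are invented for illustration; the tool's actual output format is not documented here:

```python
# Illustrative only: rebuild a zone file from the current list of
# customer subdomains, so removed domains drop out of DNS.
def regenerate_zone(service_domain, customer_subdomains, mail_ip):
    lines = [f"{service_domain}. 3600 IN MX 10 mail.{service_domain}."]
    for sub in customer_subdomains:
        lines.append(f"{sub}.{service_domain}. 3600 IN A {mail_ip}")
    return "\n".join(lines)

zone = regenerate_zone("emailmcp.example", ["acme", "globex"], "203.0.113.7")
```

Because the whole file is rebuilt from current state, running it after adding or removing a domain (as the description advises) keeps DNS consistent without tracking individual edits.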

Behavior 2/5

With no annotations, the description carries the full burden. It implies a write operation that regenerates the zone file but does not disclose effects like overwriting custom records, required permissions, or whether the operation is destructive. Lacks key behavioral context.

Conciseness 5/5

Two sentences, front-loaded with the core action, no fluff. Every sentence serves a purpose.

Completeness 4/5

For a zero-parameter tool with no output schema, the description is mostly complete: it states purpose and usage context. Missing details like prerequisites or synchronous behavior, but overall adequate for selecting this tool.

Parameters 3/5

No parameters exist, so schema coverage is effectively 100%. Baseline 3 applies. Description adds no parameter info, which is acceptable given the empty input schema.

Purpose 5/5

Clearly states the action (regenerate) and the resource (complete DNS zone file for service domain including customer subdomains). Distinguishes from sibling tools like get_dns_records by focusing on regeneration.

Usage Guidelines 4/5

Explicitly says 'Use after adding/removing domains,' providing clear context for when to invoke. Does not list alternatives or when not to use, but the guidance is direct and actionable.

remove_from_suppression (A)

Remove an address from the bounce suppression list (manual override to allow sending again)

Parameters (JSON Schema)

address (required): Email address to unblock
Behavior 3/5

No annotations are provided, so the description carries the full burden. It discloses the mutation (remove) and purpose but does not detail side effects, permissions, or rate limits. Adequate but not comprehensive.

Conciseness 5/5

Single sentence, no redundancy, front-loaded with verb and resource. Every word adds value.

Completeness 4/5

For a simple tool with one parameter and no output schema, the description is sufficient. It explains the purpose and context without overloading.

Parameters 4/5

Schema coverage is 100% with parameter 'address' described. The description adds context ('bounce suppression list', 'manual override') beyond the schema's 'Email address to unblock', enhancing understanding.

Purpose 5/5

The description clearly states the action ('Remove an address'), the resource ('bounce suppression list'), and the effect ('manual override to allow sending again'). It is specific and distinct from siblings like get_suppression_list.

Usage Guidelines 4/5

The description implies usage context (manual override for re-enabling sending after bounce) but does not explicitly state when not to use or provide alternatives. It is clear but lacks exclusions.

remove_sending_domain (B)

Remove a sending domain and its DKIM keys

Parameters (JSON Schema)

domain (required): Customer domain to remove
Behavior 2/5

No annotations are present, so the description alone must convey behavioral traits. It implies a destructive action but does not detail consequences, required permissions, or prerequisites such as the domain needing to exist. Basic transparency is present but insufficient.

Conciseness 5/5

The description is a single, well-structured sentence with no superfluous words. It is front-loaded with the action and resource, making it highly efficient.

Completeness 4/5

For a simple single-parameter removal tool with no output schema, the description is fairly complete. It covers the main action and parameter, though it could optionally add more detail about the effect on DKIM keys.

Parameters 3/5

The schema has 100% coverage for the 'domain' parameter, and the description adds minimal context by linking it to DKIM keys. Baseline is 3; no significant additional meaning beyond the schema.

Purpose 5/5

The description clearly states the action 'Remove' and the resource 'sending domain' with additional detail about DKIM keys. It is specific and distinguishes from sibling tools like add_sending_domain.

Usage Guidelines 2/5

No guidance is provided on when to use this tool versus alternatives, nor are there prerequisites or conditions mentioned. The description lacks explicit usage context.

request_email_ip (B)

Request a dedicated WireGuard IP for email sending. Submits to the provisioning queue.

Parameters (JSON Schema)

account_id (optional): Optional account identifier
email_domain (required): Your email domain (e.g. company.com)
Behavior 3/5

With no annotations, the description carries the full burden. It reveals that the request is submitted to a provisioning queue (implying asynchronous processing), but lacks details on authentication, side effects, or what happens after submission.

Conciseness 5/5

Two sentences, no redundancy. The key action and resource are front-loaded, making it efficient and easy to parse.

Completeness 2/5

Given the lack of output schema and annotations, the description should provide more context about the queue, expected response, and how to check status. It omits critical information for an async request tool.

Parameters 3/5

The input schema has 100% parameter descriptions, so the baseline is 3. The tool description adds little beyond the schema; it merely restates the domain parameter implicitly. No meaningful additional semantics.

Purpose 5/5

The description clearly states the action ('Request') and the specific resource ('a dedicated WireGuard IP for email sending'), which uniquely identifies its purpose among sibling tools. It also mentions submission to the provisioning queue, adding context.

Usage Guidelines 2/5

No guidance on when to use this tool versus alternatives like 'provision_start' or 'provision_status'. It does not explain prerequisites or when not to use it, leaving the AI agent without decision support.

reset_account_password (A)

Reset the SMTP password for an email account. Returns new password once.

Parameters (JSON Schema)

address (required): Email address of the account
Behavior 3/5

No annotations are provided, so the description carries the full burden. It discloses the main behavior (reset and return password) but does not mention potential side effects (e.g., invalidating existing connections) or error states. It is not misleading but lacks depth.

Conciseness 5/5

The description is a single, front-loaded sentence with no extraneous information. Every word contributes to understanding the tool's purpose and output.

Completeness 4/5

The tool is simple with one parameter. The description sufficiently explains the core function, though it could mention what happens on failure (e.g., account not found) or the return format. Still, it feels complete for the context.

Parameters 3/5

The input schema covers the only parameter ('address') with a clear description. The tool description adds no additional meaning beyond the schema, so it performs at the baseline level given 100% schema coverage.

Purpose 5/5

The description explicitly states the action ('reset'), the resource ('SMTP password for an email account'), and the outcome ('Returns new password once'). It is specific and distinguishes from sibling tools like 'create_email_account' or 'list_email_accounts'.

Usage Guidelines 3/5

The description implies usage for password reset but provides no explicit guidance on when to use this tool versus alternatives, no prerequisites, and no 'when not to use' advice. It is adequate but lacks situational guidance.

send_email (B)

Send an email with text or HTML content

Parameters (JSON Schema)

to (required): Array of recipient email addresses
cc (optional): Array of CC recipient email addresses
bcc (optional): Array of BCC recipient email addresses
subject (required): Email subject line
text (required): Plain text content of the email
html (optional): HTML content of the email
provider (optional): Send via a specific provider instead of the default (e.g. smtp, mailgun, sendgrid, ses, gmail, resend). Requires that provider's credentials in env.
bulk_mode (optional): Required when sending to multiple recipients. "individual": sends a separate email to each recipient (safest). "bcc": puts first recipient in TO:, rest in BCC. "to": puts all in TO: (exposes addresses — use only when intended, e.g. team threads).
attachments (optional): Array of file attachments
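
The bulk_mode semantics documented in the schema can be sketched as a small recipient-expansion step. This is an illustration of the stated behavior, not the server's actual implementation:

```python
# Sketch of the documented bulk_mode values:
# "individual" -> one message per recipient (safest),
# "bcc"        -> first recipient in TO:, the rest in BCC:,
# "to"         -> everyone in TO: (exposes addresses).
def expand_recipients(to, bulk_mode):
    if bulk_mode == "individual":
        return [{"to": [addr], "bcc": []} for addr in to]
    if bulk_mode == "bcc":
        return [{"to": [to[0]], "bcc": to[1:]}]
    if bulk_mode == "to":
        return [{"to": list(to), "bcc": []}]
    raise ValueError("bulk_mode is required for multiple recipients")

msgs = expand_recipients(["a@x.com", "b@x.com", "c@x.com"], "bcc")
```

Under "bcc", a single message is produced with one visible recipient and the rest hidden, which matches the privacy note in the schema.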
Behavior 2/5

No annotations exist, so the description must disclose behavior. It merely states the content types but omits critical details like required authentication, side effects (sending email), error handling, or rate limits.

Conciseness 4/5

The description is a single sentence with no waste. However, it could be slightly expanded for clarity without becoming verbose.

Completeness 2/5

With 9 parameters and no output schema, the description is too minimal. It fails to explain return values, error handling, bulk_mode usage, or provider selection, leaving significant gaps.

Parameters 3/5

The input schema has 100% description coverage, so baseline is 3. The description adds no extra meaning beyond 'text or HTML content', which is already in the schema.

Purpose 5/5

The description clearly states 'Send an email with text or HTML content', which is a specific verb+resource combination. It distinguishes from the sibling tool 'send_template_email' which sends a predefined template.

Usage Guidelines 2/5

No guidance is provided on when to use this tool versus alternatives like 'send_template_email'. There is no mention of prerequisites, context, or when not to use.

send_template_email (C)

Send an email using a predefined template

Parameters (JSON Schema)

to (required): Array of recipient email addresses
template (required): Template name or path
subject (optional): Email subject (can override template subject)
variables (optional): Template variables to replace
provider (optional): Send via a specific provider instead of the default (e.g. smtp, mailgun, sendgrid, ses, gmail, resend)
bulk_mode (optional): Required when sending to multiple recipients. "individual": sends a separate email to each recipient (safest). "bcc": puts first recipient in TO:, rest in BCC. "to": puts all in TO: (exposes addresses — use only when intended).
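
The variables parameter implies placeholder substitution in the template body. The server's real placeholder syntax is not documented here, so the {{name}} style below is purely an assumption used to illustrate the idea:

```python
# Hypothetical sketch of template variable replacement. The {{key}}
# placeholder syntax is assumed, not taken from the tool's docs.
def render_template(template_text, variables):
    for key, value in variables.items():
        template_text = template_text.replace("{{" + key + "}}", str(value))
    return template_text

body = render_template(
    "Hi {{name}}, your plan renews on {{date}}.",
    {"name": "Ada", "date": "2025-07-01"},
)
```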
Behavior 1/5

No annotations are provided, and the description does not disclose any behavioral traits such as side effects (e.g., whether it mutates state), authentication requirements, rate limits, or error behavior. The one-sentence description is insufficient.

Conciseness 3/5

The description is very concise (one sentence), but lacks structural elements like bullet points or additional context. It is not overly verbose but could benefit from more structured information.

Completeness 1/5

Given 6 parameters, no annotations, and no output schema, the description is far too brief. It fails to explain return values, error cases, or conditions for success. For a tool that sends emails, critical context is missing.

Parameters 3/5

Input schema coverage is 100% with detailed descriptions, especially for 'bulk_mode'. The description adds no additional value beyond the schema, so baseline 3 is appropriate.

Purpose 5/5

The description clearly states the action ('Send an email') and the specific resource ('using a predefined template'). It distinguishes from sibling tools like 'send_email' which likely sends without a template.

Usage Guidelines 2/5

No guidance on when to use this tool vs alternatives (e.g., 'send_email'), nor any prerequisites or conditions for use. The description is purely definitional.

setup_autonomous_security (A)

Auto-configure DKIM, SPF, and DMARC for autonomous email. Generates DKIM keys, creates DNS records, and outputs everything needed.

Parameters (JSON Schema)

ipv4 (required): Your dedicated IPv4 address
domain (required): Your email domain
dmarc_policy (optional): DMARC policy (default: reject)
dkim_selector (optional): DKIM selector (default: emailmcp)
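
To show what these parameters and defaults imply, here is a sketch of the kinds of DNS TXT records such a setup step would emit (standard SPF/DMARC/DKIM record syntax). The exact record contents the tool generates are not documented, so treat the values as illustrative:

```python
# Build SPF, DMARC, and DKIM TXT records from the tool's parameters,
# using its documented defaults (policy "reject", selector "emailmcp").
# The rua address and the DKIM public-key placeholder are assumptions.
def security_records(domain, ipv4, dmarc_policy="reject", dkim_selector="emailmcp"):
    return {
        domain: f"v=spf1 ip4:{ipv4} -all",
        f"_dmarc.{domain}": f"v=DMARC1; p={dmarc_policy}; rua=mailto:postmaster@{domain}",
        f"{dkim_selector}._domainkey.{domain}": "v=DKIM1; k=rsa; p=<public-key>",
    }

records = security_records("company.com", "203.0.113.7")
```

The default p=reject is the strictest DMARC policy; a caller who wants a softer rollout would pass dmarc_policy="none" or "quarantine".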
Behavior 3/5

No annotations are provided, so the description carries the full burden. It discloses key activities (generates keys, creates DNS records, outputs), but lacks detail on potential destructive actions (e.g., overwriting existing DNS records) or required permissions.

Conciseness 4/5

The description is a single, well-structured sentence that efficiently conveys the tool's purpose. However, it could be improved by front-loading the most critical information.

Completeness 2/5

No output schema exists, yet the description only vaguely mentions 'outputs everything needed' without specifying return format or content. For a complex setup tool, this leaves the agent lacking crucial information to parse results.

Parameters 3/5

Input schema has 100% coverage with descriptions for all 4 parameters. The description adds no extra meaning beyond the schema, maintaining the baseline score of 3.

Purpose 5/5

The description clearly states it auto-configures DKIM, SPF, and DMARC for autonomous email, specifies actions (generates keys, creates DNS records, outputs needed info), and distinguishes from sibling tools like verify_autonomous_setup.

Usage Guidelines 3/5

The description implies it should be used for initial autonomous email setup but does not provide explicit guidance on when to use it versus alternatives like configure_mailgun or verify_autonomous_setup, nor does it mention prerequisites or conditions.

start_health_monitor (A)

Start monitoring the provisioned email server — checks tunnel, SMTP, DNS, and IP reputation every 60 seconds. Auto-restarts on failure.

Parameters (JSON Schema)

No parameters
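
The described behavior (check tunnel, SMTP, DNS, and IP reputation; auto-restart on failure) can be sketched as a single monitoring cycle. The check and restart callables are placeholders, not the server's API:

```python
# One cycle of the described health monitor: run each check, and if
# any fails, trigger the restart hook ("auto-restarts on failure").
# In the real tool this cycle would repeat every 60 seconds.
def monitor_cycle(checks, restart):
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        restart()
    return failed

restarted = []
failed = monitor_cycle(
    {"tunnel": lambda: True, "smtp": lambda: False,
     "dns": lambda: True, "ip_reputation": lambda: True},
    restart=lambda: restarted.append(True),
)
```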

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses key behaviors (60-second checks, auto-restart on failure). However, it omits details on idempotency (what if already running), side effects, or required permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, front-loaded with the action verb. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers what the tool does and its behavioral traits (frequency, auto-restart). It lacks information on how to stop monitoring, whether it is persistent across sessions, and what the return value indicates. Given no output schema, a bit more context would be helpful.

Parameters 4/5

There are no parameters, so the schema fully covers them. The description adds no extra parameter information, but the baseline is 4 given 100% coverage and zero parameters.

Purpose 5/5

The description clearly states the verb 'Start monitoring', specifies the resource 'provisioned email server', and details the checks (tunnel, SMTP, DNS, IP reputation) and frequency (every 60 seconds). It also includes auto-restart behavior, distinguishing it from other tools.

Usage Guidelines 3/5

The description implies usage for continuous monitoring but does not explicitly state when to use it versus alternatives like 'get_health_status', or when to stop monitoring. No prerequisites or exclusions are provided.

start_idle_watch (A)

Start real-time IMAP IDLE watching on email accounts. Get push notifications when new emails arrive. Works with Gmail, iCloud, Office365, etc.

Parameters (JSON Schema)

- account (optional): Account name to watch (e.g. "work", "personal"). Omit to watch all configured accounts.
Behavior 2/5

No annotations are provided, and the description does not disclose side effects, permissions needed, or whether the connection persists in the background, leaving behavioral gaps.

Conciseness 5/5

Two concise sentences front-load the action and benefit, with no filler words. Every sentence adds value.

Completeness 3/5

While the description explains the function and supported accounts, it lacks details on the behavior after starting (e.g., background process, how to stop) and has no output schema. This is adequate but not complete.

Parameters 3/5

The input schema already has a clear description for the only parameter (account), and the tool description adds no additional semantic value beyond the schema, so the baseline of 3 is appropriate.

Purpose 5/5

The description clearly states the action (start IMAP IDLE watching) and the resource (email accounts), and distinguishes it from sibling tools like stop_idle_watch and get_idle_status by specifying that it enables real-time push notifications.

Usage Guidelines 3/5

The description mentions that the tool works with Gmail, iCloud, Office365, etc., suggesting compatibility, but does not explicitly state when to use this tool over alternatives or provide prerequisites or exclusions.

start_smtp_receiver (B)

Start the SMTP server to receive forwarded corporate emails

Parameters (JSON Schema)

- port (optional): Port to listen on (default: 2525)
- enableThreading (optional): Enable conversation threading
- receivingDomains (optional): Domains to accept emails for (empty = accept all)
- allowedForwarders (optional): IP addresses allowed to forward emails (empty = allow all)
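The receivingDomains and allowedForwarders parameters imply an acceptance policy in which an empty list means allow all. A sketch of that policy as a pure function (the function name is hypothetical):

```python
def accept_message(sender_ip: str, rcpt_domain: str,
                   receiving_domains: list[str],
                   allowed_forwarders: list[str]) -> bool:
    """Apply the documented empty-list-means-all semantics."""
    if allowed_forwarders and sender_ip not in allowed_forwarders:
        return False  # forwarder IP not whitelisted
    if receiving_domains and rcpt_domain not in receiving_domains:
        return False  # not a domain this receiver accepts mail for
    return True

print(accept_message("203.0.113.7", "corp.example", [], []))          # True
print(accept_message("198.51.100.1", "corp.example",
                     ["corp.example"], ["203.0.113.7"]))              # False
```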
Behavior 2/5

No annotations are provided, so the description must disclose behavioral traits. It fails to mention side effects (e.g., server process starts, port binding), required permissions, or behavior if already running. Without annotations, this is a critical gap.

Conciseness 4/5

A single sentence that is concise and front-loaded with the action. It contains no fluff, but could be slightly improved by adding more context without becoming verbose.

Completeness 2/5

No output schema and no description of return values, success indicators, or errors. The agent lacks information about what happens after calling this tool, such as whether it returns a status or blocks.

Parameters 3/5

Schema coverage is 100% with clear parameter descriptions. The tool description adds no additional parameter context, but the schema already provides sufficient detail. The baseline score of 3 applies.

Purpose 5/5

The description clearly states the action (Start) and the resource (SMTP server to receive forwarded corporate emails), distinguishing it from siblings like stop_smtp_receiver and get_smtp_receiver_status.

Usage Guidelines 2/5

No guidance on when to use this tool versus alternatives such as stop_smtp_receiver or other start tools. No prerequisites, expected preconditions, or postconditions are mentioned.

stop_idle_watch (B)

Stop IMAP IDLE watching on an account (or all accounts)

Parameters (JSON Schema)

- account (optional): Account to stop watching. Omit to stop all.
Behavior 2/5

No annotations are provided, so the description must carry the behavioral burden. It only states the basic stop action without disclosing side effects, authorization needs, or what happens to existing connections. This is insufficient for a state-changing tool.

Conciseness 4/5

The description is a single, front-loaded sentence with no wasted words. It is efficient but could be slightly expanded with no harm to conciseness.

Completeness 3/5

Given the tool's simplicity and the lack of an output schema, the description provides what is necessary to understand the core action and parameter. However, it does not mention return values or success/failure indicators, leaving some gap for completeness.

Parameters 3/5

Schema description coverage is 100% (one parameter fully described). The description adds no extra meaning beyond the schema, as it simply restates the 'account' parameter's behavior. A baseline of 3 is appropriate.

Purpose 5/5

The description clearly states the action 'Stop', the technology 'IMAP IDLE', and the resource 'on an account (or all accounts)', providing a specific verb and resource that distinguishes it from sibling tools like start_idle_watch and get_idle_status.

Usage Guidelines 3/5

The description implies using this tool when you want to stop watching, but it does not explicitly state when to use it, when not to, or mention alternatives. It gives clear context but lacks exclusions or comparative guidance.

stop_smtp_receiver (B)

Stop the SMTP forward receiver server

Parameters (JSON Schema)

No parameters

Behavior 2/5

With no annotations, the description must disclose behavior, but it only says 'stop' without explaining effects, safety, or prerequisites.

Conciseness 4/5

An extremely concise single sentence with no wasted words, though it could include more detail without harming conciseness.

Completeness 3/5

Minimal but workable for a simple tool; however, it never relates the tool to siblings like start_smtp_receiver, leaving the agent without context for proper use.

Parameters 4/5

No parameters exist, so the description need not add parameter information; the baseline of 4 applies.

Purpose 4/5

The description clearly states the action (Stop) and the resource (SMTP forward receiver server), making the purpose obvious.

Usage Guidelines 2/5

No guidance on when to use this tool versus alternatives like start_smtp_receiver or get_smtp_receiver_status; it lacks context for proper sequencing.

test_email_config (B)

Test email configuration by sending a test email

Parameters (JSON Schema)

- test_email (optional): Email address to send the test email to (defaults to from_email)
Behavior 2/5

No annotations are provided, and the description does not disclose side effects, such as whether a real email is actually sent or whether a valid configuration is required. It lacks transparency.

Conciseness 4/5

A single, extremely concise sentence. However, it is somewhat vague and could benefit from slightly more detail.

Completeness 2/5

The description is missing details on what constitutes success or failure, what happens when the email is sent, and any prerequisites, leaving it incomplete for an AI agent.

Parameters 3/5

The input schema covers the single parameter with a description and mentions the default behavior, but the tool description adds no further semantics beyond the schema.

Purpose 5/5

The description clearly states the purpose: testing email configuration by sending a test email. It distinguishes the tool from siblings like send_email and check_email_config.

Usage Guidelines 2/5

No guidance on when to use this tool versus alternatives such as check_email_config or send_email, and no prerequisites or typical use cases are specified.

update_email_rule (B)

Update an existing email rule (enable/disable, change match/actions/priority)

Parameters (JSON Schema)

- rule_id (required): Rule ID to update
- enabled (optional): Enable or disable the rule
- match (optional): New match conditions
- actions (optional): New actions
- priority (optional): New priority
- stop (optional): Stop after match
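All fields except rule_id are optional, which suggests partial (PATCH-style) updates, though the description does not confirm whether updates are partial or full. A sketch of the partial-merge reading, with a hypothetical rule shape:

```python
def apply_rule_update(rule: dict, **changes) -> dict:
    """Overwrite only the supplied fields; None means 'leave unchanged'."""
    updated = dict(rule)
    for field, value in changes.items():
        if value is not None:
            updated[field] = value
    return updated

rule = {"rule_id": "r1", "enabled": True, "priority": 10,
        "match": {"from": "*@corp.example"}, "actions": ["archive"]}

# Disable the rule; priority is omitted (None) and keeps its old value.
patched = apply_rule_update(rule, enabled=False, priority=None)
print(patched["enabled"], patched["priority"])  # False 10
```

One caveat of using None as the sentinel: a field could never be explicitly set to None, which is why the real tool's semantics would need documenting.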
Behavior 2/5

No annotations are provided, and the description does not disclose behavioral traits such as whether updates are partial or full, idempotency, or required permissions. The description relies solely on the schema, missing an opportunity to add value.

Conciseness 5/5

The description is a single, well-structured sentence that front-loads the core action and key parameters. No extraneous information.

Completeness 2/5

Given the lack of output schema and annotations, plus the complexity of the tool (6 parameters, nested objects), the description is insufficient. It does not mention return values, error handling, or any behavioral context beyond the basic update operation.

Parameters 3/5

The input schema has 100% description coverage, so the baseline is 3. The description adds a brief list of changeable aspects, but this only partially overlaps with the six parameters (e.g., 'stop' is not mentioned). Overall, marginal added value beyond the schema.

Purpose 5/5

The description explicitly states the verb 'Update' and the resource 'email rule', and lists specific changeable aspects (enable/disable, match, actions, priority). This clearly distinguishes it from siblings like create_email_rule and delete_email_rule.

Usage Guidelines 2/5

No guidance on when to use this tool versus creating a new rule or deleting one. The description lacks any contextual cues for the agent to decide between update and other operations.

validate_email (B)

Validate email addresses and check deliverability

Parameters (JSON Schema)

- emails (required): Array of email addresses to validate
- check_mx (optional): Check MX records for domain
- check_smtp (optional): Perform SMTP validation
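Of the checks implied by the parameters, only syntax validation works offline; the check_mx and check_smtp options would require live DNS and mail-server access. A sketch of the offline layer (the regex is a pragmatic approximation, not full RFC 5322):

```python
import re

# Pragmatic pattern: local part, "@", dotted domain with a 2+ letter TLD.
_EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

def validate_syntax(emails: list[str]) -> dict[str, bool]:
    """Map each address to whether it passes the syntax check."""
    return {addr: bool(_EMAIL_RE.match(addr)) for addr in emails}

results = validate_syntax(["ok@example.com", "broken@@example"])
print(results)
```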
Behavior 2/5

No annotations are provided, so the description bears full responsibility for disclosing behavioral traits. It only mentions validation and deliverability checking without detailing side effects, rate limits, authentication requirements, or error states. This is insufficient for an agent to understand the tool's behavior fully.

Conciseness 4/5

The description is a single, efficient sentence with no wasted words. While it is very short, it conveys the core purpose without verbosity. However, it could be slightly improved by adding more context without sacrificing conciseness.

Completeness 2/5

Given the tool has three parameters and no output schema, the description is incomplete. It does not explain what 'check deliverability' means in practice (e.g., MX or SMTP checks are only hinted via parameters), nor does it describe the return format or potential outcomes, leaving gaps for the agent.

Parameters 3/5

Schema coverage is 100%, with all three parameters having descriptions in the input schema. The description does not add additional meaning beyond what the schema already provides, so a baseline score of 3 is appropriate.

Purpose 5/5

The description clearly states the function: validating email addresses and checking deliverability. It uses a specific verb ('validate') and resource ('email addresses'), and the addition of 'check deliverability' distinguishes it from sibling tools like check_email_config or test_email_config that focus on configuration or testing.

Usage Guidelines 2/5

The description provides no guidance on when to use this tool versus its siblings, such as check_email_config or test_email_config. It does not specify prerequisites, exclusions, or scenarios where alternative tools would be more appropriate, leaving the agent without context for decision-making.

verify_autonomous_setup (C)

Verify that autonomous email is properly configured: DNS records, DKIM, SMTP send/receive

Parameters (JSON Schema)

- domain (required): Your email domain to verify
Behavior 2/5

No annotations exist, and the description fails to disclose whether the tool is read-only or has side effects. It lists what is checked but not behavioral traits like idempotency or permissions.

Conciseness 4/5

A single sentence, front-loaded with the verb and resource. Efficient, though it could be more structured (e.g., with bullet points) for readability.

Completeness 2/5

The tool lacks an output schema, and the description does not mention return values or success/failure indicators. In the context of numerous similar siblings, more detail is needed for full completeness.

Parameters 3/5

Schema coverage is 100% with a clear description for the domain parameter. The tool description adds no additional parameter context beyond the schema, meeting the baseline.

Purpose 4/5

The description clearly states that the tool verifies autonomous email configuration, including DNS records, DKIM, and SMTP. However, it does not distinguish itself from the sibling 'verify_email_setup', which could cause confusion.

Usage Guidelines 2/5

No guidance on when to use this tool versus alternatives like 'check_email_config' or 'test_email_config'. No prerequisites or when-not-to-use information is provided.

verify_email_setup (C)

Verify IP whitelisting and email provider configuration

Parameters (JSON Schema)

- provider (required): Email provider to verify, or "all" for all configured providers
- domain (optional): Domain to check for DNS records (required if include_dns is true)
- test_email (optional): Optional test email address to verify delivery
- include_dns (optional): Include DNS record verification
Behavior 2/5

No annotations are provided, so the description carries the full burden. It does not disclose whether the tool performs a read-only check or if it modifies configurations (e.g., by sending a test email). Safety, mutability, and side effects are absent.

Conciseness 5/5

The description is a single sentence that is direct and free of fluff. It is appropriately front-loaded with the core action.

Completeness 1/5

With no output schema and no annotations, the description fails to explain what the verification result looks like (e.g., pass/fail, detailed report). It also omits any context about prerequisites, such as having a provider configured first. The tool's behavior is under-specified.

Parameters 3/5

All 4 parameters are described in the input schema (100% coverage), so the schema already provides meaning. The description does not add extra insight beyond the schema, earning a baseline score of 3.

Purpose 4/5

The description specifies the verb 'Verify' and the resources 'IP whitelisting and email provider configuration', indicating a validation operation. However, it doesn't differentiate itself from sibling tools like 'check_email_config' or 'test_email_config', which likely have overlapping functionality.

Usage Guidelines 2/5

No explicit guidance is given on when to use this tool versus alternatives such as 'check_ip_status' or 'test_email_config'. The description does not mention prerequisites or conditions for use.
