Glama
Ownership verified

Server Details

Your unified inbox — everything that reaches you, understood and actionable from your AI assistant.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.4/5 across 38 of 38 tools scored. Lowest: 2.9/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have clearly distinct purposes, especially with specific naming and detailed descriptions. However, the high number of email access tools (get_feed, search_emails, deep_search_emails) and task management tools could cause minor confusion for an agent, though the descriptions mitigate this.

Naming Consistency: 5/5

All tool names follow a consistent verb_noun pattern (e.g., create_task, get_feed, list_events), with verbs like add, check, complete, create, delete, get, list, mark, pause, remove, resume, save, search, send, set, snooze, start, verify. Even 'about_mailopoly' follows this pattern. No mixing of conventions.

Tool Count: 3/5

With 38 tools, the server covers a broad range of functionality (email, tasks, lists, sync, invoices, etc.), but it feels slightly over-scoped. Some tools could be consolidated (e.g., search_emails and deep_search_emails), and the count is notably higher than typical MCP servers, though still manageable for a comprehensive email assistant.

Completeness: 3/5

The tool surface covers core email and task operations well, but notable gaps exist: no delete_email, delete_draft, or update_task (only complete/snooze). Calendar event management is limited to listing (no create/update/delete), and widget management only supports viewing. These gaps may require workarounds.

Available Tools

39 tools
about_mailopoly: About Mailopoly & how to get started (A)
Read-only, Idempotent

Explain what Mailopoly is, how the free trial works, what an @mly.life address is, and exactly where to sign up or finish setup. Call this whenever the user asks "what is Mailopoly?" / "what is this?", how the trial or pricing works, what an @mly.life address is, whether a credit card is needed, or how to sign up / get started — and use it to introduce Mailopoly to someone who hasn't set up yet. Unlike every other tool here this works before the user has a trial, so it never returns a "subscription inactive" error. Relay get_started_url verbatim.

Parameters

No parameters

Output Schema

Name                 Required  Description
message              No
privacy              No
website              No
free_trial           No        How the free trial works (no credit card to start).
what_it_is           No
get_started_url      No        Where to sign up / finish setup; relay verbatim.
mly_life_address     No        What the user's own @mly.life address is and does.
supported_providers  No        Mailboxes that can be connected (Gmail, Outlook, IMAP).

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds behavioral context beyond annotations: 'works before the user has a trial, so it never returns a "subscription inactive" error.' This is useful extra transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that is front-loaded with the main purpose, uses clear language, and includes every necessary detail without extraneous content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters, no required parameters, full schema coverage, and presence of annotations and output schema, the description provides complete context: purpose, usage triggers, a distinctive behavioral trait, and an instruction for output handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are zero parameters, so the input schema is trivial. The description adds context about the output (get_started_url) which is not in the schema, fulfilling the role of parameter semantics by clarifying the tool's return value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to explain Mailopoly, its free trial, @mly.life addresses, and sign-up process. It provides specific triggers (e.g., user asking 'what is Mailopoly?') and distinguishes itself from siblings by noting it works before a trial.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit when-to-call instructions (when the user asks certain questions) and when-not-to-call (other tools that require a subscription). It also provides a specific instruction to relay the get_started_url verbatim.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

add_email_to_list: Add an email to a list (A)
Idempotent

File an email into one of the user's email lists (ids from list_email_lists / the email tools). Idempotent: if it's already in the list this reports already_in_list instead of duplicating. Manual adds are never removed by rule re-evaluation.
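
As a minimal sketch of how a client might invoke this tool, the snippet below builds an MCP `tools/call` JSON-RPC request body. The `build_tool_call` helper and the `list_123` / `email_456` ids are hypothetical placeholders; real ids come from list_email_lists and the email tools, and how the request is transported is left to the MCP client in use.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Placeholder ids, for illustration only.
request = build_tool_call(
    "add_email_to_list",
    {"list_id": "list_123", "email_id": "email_456"},
)
print(json.dumps(request, indent=2))
```

Because the tool is described as idempotent, resending the same request after a timeout should be safe: the second call is expected to report already_in_list rather than create a duplicate.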

Parameters

Name      Required  Description  Default
list_id   Yes
email_id  Yes

Output Schema

Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 4/5

Adds value beyond annotations: describes the specific idempotent response (already_in_list) and the permanence of manual adds. No contradiction with annotations.

Conciseness: 5/5

Three short sentences with no wasted words; front-loaded with the primary action. Highly concise and well-structured.

Completeness: 4/5

Covers the key behaviors (idempotency, immunity to rule re-evaluation) for a simple tool with two parameters and an output schema. Missing parameter details slightly lower completeness.

Parameters: 3/5

With 0% schema description coverage, the description provides only indirect context for list_id (sourced from other tools) and no details on constraints, format, or email_id. Baseline 3 is appropriate.

Purpose: 5/5

Clearly states the action (file an email into a list) and resource (email lists), with specific context that list IDs come from list_email_lists or the email tools. Distinguishes it from siblings like remove_email_list.

Usage Guidelines: 4/5

Provides guidance on idempotent behavior and notes that manual adds are not removed by rule re-evaluation, but lacks explicit when-not-to-use guidance or alternatives beyond the implied sibling separation.

check_email_sync: Check email is up to date (A)

Check whether the mail Mailopoly holds is up to date with what's actually at the email provider right now — use this when a user says "my emails aren't coming through", "is my inbox synced?", "am I missing emails?". It lists the account's most recent messages straight from Gmail/Outlook/IMAP and compares them to what we've stored. Each account returns provider_recent (newest emails at the provider) and mailopoly_recent (newest we hold) — present these two lists side by side so the user can see they match, then the verdict (up_to_date or behind_count + missing_preview). account is a connected email address (omit to check every syncable account). Set force=true to also START pulling the missing mail when an account is behind (the result's status becomes 'syncing_started'); leave force=false to just report. force is rate limited per account.
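
The presentation step described above (two lists side by side, then the verdict) could be sketched as follows. This is only an illustration: the field names are taken from the output schema on this page, the `sample` account is invented test data, and the column layout is an arbitrary choice.

```python
from itertools import zip_longest


def side_by_side(account, width=32):
    """Return text rows showing provider_recent next to mailopoly_recent."""
    provider = account.get("provider_recent") or []
    held = account.get("mailopoly_recent") or []
    rows = [f"{'At provider':<{width}} | Held by Mailopoly"]
    # The lists are shown in parallel by position, not matched by content.
    for p, m in zip_longest(provider, held):
        left = p["subject"] if p else ""
        right = m["subject"] if m else ""
        rows.append(f"{left:<{width}} | {right}")
    return rows


def verdict(account):
    """Summarise one account's sync state in a short phrase."""
    if account.get("up_to_date"):
        return "up to date"
    status = account.get("status", "unknown")
    if status == "behind":
        return f"behind by {account.get('behind_count', 0)} message(s)"
    return status


# Invented sample data shaped like one entry of `accounts`.
sample = {
    "email": "user@example.com",
    "status": "behind",
    "up_to_date": False,
    "behind_count": 1,
    "provider_recent": [
        {"subject": "Invoice for March", "sender": "billing@example.com", "date": "2024-03-02"},
        {"subject": "Welcome", "sender": "hello@example.com", "date": "2024-03-01"},
    ],
    "mailopoly_recent": [
        {"subject": "Welcome", "sender": "hello@example.com", "date": "2024-03-01"},
    ],
}

rows = side_by_side(sample)
print("\n".join(rows))
print(verdict(sample))
```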

Parameters

Name     Required  Description  Default
force    No
account  No

Output Schema

Name          Required  Description
plan          No        trial | subscribed | none. On 'trial' only a recent window of mail is downloaded.
count         No
message       No
accounts      No        Per-account sync verdict. Common keys: id, email, provider, status (up_to_date|behind|syncing_started|needs_reconnect|not_syncing|paused|couldnt_check|throttled), up_to_date (bool|null), behind_count, provider_latest_at, db_latest_at, last_synced_at, live_checked, pull_triggered. provider_recent = the account's newest emails live from the provider, mailopoly_recent = the newest we hold; each is a list of {subject, sender, date}. SHOW these two lists side by side so the user can see they line up. missing_preview = newer messages we don't have yet (same shape). message = human-readable summary.
up_to_date    No        True only if EVERY checked account is up to date.
history_note  No        Present (and worth relaying) when the user is on the trial: explains that only recent mail is downloaded, subscribing gets more, and ALL email stays searchable regardless.

Behavior: 5/5

The description discloses the full behavior: it compares the provider's recent emails to the stored emails, returns a verdict (up_to_date or behind_count + missing_preview), and explains that force=true initiates syncing, subject to rate limiting. It also describes the output structure (provider_recent, mailopoly_recent, verdict). This adds significant context beyond the annotations, which only carry false hints.

Conciseness: 4/5

The description is front-loaded with the purpose and use cases, but it is somewhat lengthy (~150 words). Every sentence adds value, and the structure flows logically. A slight reduction could improve conciseness, but it remains effective.

Completeness: 5/5

Given the tool's complexity (comparing sync status, optional syncing) and the presence of an output schema, the description is complete. It covers purpose, parameters, side effects (force triggers sync), output structure, and even rate limits. It fully enables an AI agent to select and invoke the tool correctly.

Parameters: 5/5

The input schema has 0% description coverage, so the description must detail the parameters. It explains 'account' as an optional connected email address and 'force' as a boolean that starts syncing when true, including a note on rate limiting. This adds essential meaning beyond the schema's type and default values.

Purpose: 5/5

The description clearly states that the tool checks whether stored mail is up to date with the provider, using specific verbs like 'check' and 'compare'. It distinguishes itself from sibling tools like 'start_email_account_sync' by implying this is for verification, not initiation. The examples of user queries ('my emails aren't coming through') add clarity.

Usage Guidelines: 4/5

The description explicitly gives example contexts for when to use the tool ('use this when a user says...'). It explains the parameters and their effects (force=true starts syncing). However, it does not explicitly recommend an alternative tool like 'start_email_account_sync' for scenarios where the user wants to sync without checking first.

complete_task: Complete or reopen a task (A)

Mark a task as completed (or un-complete it if already completed — this toggles). Use ids from list_tasks / get_my_day.
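
Because the call toggles, invoking it on a task that is already completed reopens it. A cautious client might therefore compare the current state with the desired state before calling. The sketch below assumes the current completion state is available from list_tasks / get_my_day; the field carrying it is not shown on this page, so it is passed in as a plain boolean.

```python
def should_call_complete_task(currently_completed: bool, want_completed: bool) -> bool:
    """Return True only when a toggle would move the task to the desired state."""
    return currently_completed != want_completed


# An open task the user wants done: call the tool.
print(should_call_complete_task(currently_completed=False, want_completed=True))
# An already completed task the user wants done: skip, or the toggle reopens it.
print(should_call_complete_task(currently_completed=True, want_completed=True))
```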

Parameters

Name     Required  Description  Default
task_id  Yes

Output Schema

Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 4/5

The description adds the key detail that the action toggles between completed and uncompleted, which is not indicated by the annotations (readOnlyHint=false, destructiveHint=false).

Conciseness: 5/5

Two concise sentences deliver all necessary information without redundancy; each earns its place.

Completeness: 5/5

Given the low complexity, output schema presence, and annotations, the description provides enough context for an AI agent to use the tool correctly.

Parameters: 3/5

With 0% schema description coverage, the description partially compensates by explaining that task_id should come from list_tasks / get_my_day, but it lacks details on format or constraints.

Purpose: 5/5

The description explicitly states the action ('Mark a task as completed') and the toggling behavior, clearly distinguishing it from siblings like create_task or list_tasks.

Usage Guidelines: 4/5

It specifies that IDs should come from list_tasks or get_my_day, providing clear context for use, though it does not explicitly mention when not to use it.

create_email_list: Create an email list from a description (A)

Create a smart email list from a plain-language description of what belongs in it (e.g. a brand's emails, messages from a connected app, a topic, emails containing invoices). The rules are derived from the user's actual data — real sender domains, connected apps, categories — and existing matching emails are filed in immediately; future emails auto-file. Returns the created list with the generated rules, the reasoning, and how many emails matched, so you can confirm it captured the intent (browse it with get_feed(list_id=...)). name overrides the generated list name. exclude_from_cleanbox=true also hides matching emails from the main feed (only on explicit user request).
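
One way to assemble the arguments is sketched below. The helper name and its `user_asked_to_hide` flag are hypothetical; the flag exists only to encode the rule that exclude_from_cleanbox should be sent on explicit user request and omitted otherwise.

```python
def build_create_email_list_args(description, name=None, user_asked_to_hide=False):
    """Assemble arguments for create_email_list, omitting unset optionals."""
    if not description or not description.strip():
        raise ValueError("description is required")
    args = {"description": description.strip()}
    if name:
        # Overrides the list name the server would otherwise generate.
        args["name"] = name
    if user_asked_to_hide:
        # Only set when the user explicitly asked to hide these emails.
        args["exclude_from_cleanbox"] = True
    return args


default_args = build_create_email_list_args("emails containing invoices")
hidden_args = build_create_email_list_args(
    "newsletters from example.com", name="Newsletters", user_asked_to_hide=True
)
print(default_args)
print(hidden_args)
```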

Parameters

Name                   Required  Description  Default
name                   No
description            Yes
exclude_from_cleanbox  No

Output Schema

Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 4/5

Annotations provide readOnlyHint=false and destructiveHint=false. The description discloses that creation is immediate, existing matching emails are filed, and future emails auto-file. It also clarifies the effect of exclude_from_cleanbox. No contradictions.

Conciseness: 4/5

The description is moderately long but front-loaded with the primary purpose. Every sentence adds value, detailing behavior, parameters, and returns. It is slightly verbose because of the repeated 'exclude_from_cleanbox' explanation, but still efficient.

Completeness: 5/5

Given the presence of annotations, output schema, and sibling tools, the description provides comprehensive context: it covers intent, parameter semantics, and behavioral outcomes, and even suggests following up with get_feed to verify results. No gaps.

Parameters: 5/5

Schema description coverage is 0%, but the description fully compensates: it explains the main 'description' parameter with examples, notes that 'name' overrides the generated name, and clarifies that 'exclude_from_cleanbox' hides emails from the main feed and should be set only on explicit request.

Purpose: 5/5

The description clearly states that the tool creates a smart email list from a plain-language description, with specific examples ('brand's emails, messages from a connected app, a topic, emails containing invoices'). This distinguishes it from sibling tools like add_email_to_list or list_email_lists.

Usage Guidelines: 4/5

The description explains when to use the tool (to create a list from a description) and mentions optional overrides. While it lacks explicit 'when not to use' guidance, the context is clear and adequate for selection among siblings.

create_task: Create a task or meeting (A)

Create a task, reminder, or meeting in the user's task manager / My Day.

A meeting is just a task with task_type='event' — set attendees (and optionally send_invitations=true) and a real calendar event is created and .ics invites are emailed to each attendee.

  • due_date: when the task is due, OR the start time for an event.

  • reminder_date: when to remind the user about it. Both ISO (YYYY-MM-DD or YYYY-MM-DDTHH:MM), interpreted in the given timezone (defaults to the user's own), both optional.

  • priority: low | medium | high.

  • task_type: action | event | invoice | reply.

  • event_end: ISO end time, only meaningful for task_type='event'.

  • location: meeting location or URL (events).

  • attendees: list of {"email": "...", "name": "..." (optional), "role": "required"|"optional" (optional)} for an event. Pass real email addresses — NEVER invent one.

  • send_invitations: true to email .ics invites to the attendees now.

  • email_id: optionally link the task to an email.
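
The meeting case above can be sketched as an argument builder. This is an illustration under stated assumptions, not the server's validation logic: `build_meeting_args` is a hypothetical helper, the end-after-start check and the simple `@` check are extra client-side guards, and the `Europe/London` timezone string assumes IANA-style names are accepted, which this page does not confirm.

```python
from datetime import datetime

ISO_MINUTES = "%Y-%m-%dT%H:%M"  # the YYYY-MM-DDTHH:MM form named above
ROLES = {"required", "optional"}


def build_meeting_args(title, start, end, attendees, timezone=None,
                       location=None, send_invitations=False):
    """Assemble create_task arguments for a meeting (task_type='event')."""
    if end <= start:
        raise ValueError("event_end must be after the start time")
    cleaned = []
    for person in attendees:
        email = person.get("email", "")
        if "@" not in email:
            # Attendee addresses must be real and supplied by the user.
            raise ValueError(f"not an email address: {email!r}")
        role = person.get("role")
        if role is not None and role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        cleaned.append({k: v for k, v in person.items() if k in ("email", "name", "role")})
    args = {
        "title": title,
        "task_type": "event",
        "due_date": start.strftime(ISO_MINUTES),  # start time for an event
        "event_end": end.strftime(ISO_MINUTES),
        "attendees": cleaned,
        "send_invitations": send_invitations,
    }
    if timezone:
        args["timezone"] = timezone
    if location:
        args["location"] = location
    return args


meeting = build_meeting_args(
    "Quarterly review",
    datetime(2025, 1, 15, 10, 0),
    datetime(2025, 1, 15, 11, 0),
    [{"email": "alex@example.com", "name": "Alex", "role": "required"}],
    timezone="Europe/London",
)
print(meeting)
```

Leaving send_invitations at False by default keeps the side effect (emailing .ics invites) opt-in.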

Parameters

Name              Required  Description  Default
title             Yes
due_date          No
email_id          No
location          No
priority          No                     medium
timezone          No
attendees         No
event_end         No
task_type         No                     action
description       No
reminder_date     No
send_invitations  No

Output Schema

Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 5/5

Beyond the annotations (readOnly and destructive hints both false), the description discloses that creating a meeting with send_invitations=true sends .ics invites, and that due_date doubles as the start time for events. It also clarifies that attendees must have real email addresses, adding important behavioral context.

Conciseness: 5/5

The description is well-structured with a clear opening statement followed by logically grouped bullet points. Every sentence adds value, and there is no redundancy.

Completeness: 5/5

Given the tool's complexity (12 parameters, multiple task types, integration with calendar and email), the description covers the essential aspects: date formats, attendee structure, invitation behavior, and linking to email. The presence of an output schema further reduces the need to describe return values.

Parameters: 4/5

With 0% schema description coverage, the description compensates by explaining most parameters (due_date, reminder_date, priority, task_type, event_end, location, attendees, send_invitations, email_id, timezone). However, it omits the 'description' parameter, leaving a small gap.

Purpose: 5/5

The description clearly identifies the tool's purpose: creating tasks, reminders, or meetings. It distinguishes different task types (action, event, invoice, reply) and explains that events are tasks with task_type='event'. This differentiates it from siblings like complete_task or snooze_task.

Usage Guidelines: 4/5

The description provides context on when to use different task types (e.g., a meeting requires attendees) and warns against inventing email addresses. However, it does not explicitly state when not to use this tool or compare it to alternatives like create_task_rule.

create_task_rule: Create a task-suppression rule (A)
Idempotent

Create a rule that hides matching tasks from the user's task manager (the emails themselves stay in the inbox). Provide at least one of: sender_email (exact address), sender_domain (e.g. 'example.com'), or subject_contains (case-insensitive phrase). Optional task_type narrows the rule to one of: reply | invoice | event | action | shipment. Rules are reversible — see list_task_rules / delete_task_rule.
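
The "at least one of" constraint and the task_type values listed above can be checked on the client before calling the tool. The helper below is a hypothetical sketch; the server may validate differently, and the `reason` parameter is passed through untouched because this page does not document it.

```python
TASK_TYPES = {"reply", "invoice", "event", "action", "shipment"}
MATCHERS = ("sender_email", "sender_domain", "subject_contains")


def build_task_rule_args(**fields):
    """Validate and assemble arguments for create_task_rule."""
    unknown = set(fields) - set(MATCHERS) - {"task_type", "reason"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # Drop empty values so they do not count as matchers.
    args = {key: value for key, value in fields.items() if value}
    if not any(key in args for key in MATCHERS):
        raise ValueError("provide at least one of: " + ", ".join(MATCHERS))
    if "task_type" in args and args["task_type"] not in TASK_TYPES:
        raise ValueError(f"task_type must be one of {sorted(TASK_TYPES)}")
    return args


rule = build_task_rule_args(sender_domain="example.com", task_type="invoice")
print(rule)
```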

Parameters

Name              Required  Description  Default
reason            No
task_type         No
sender_email      No
sender_domain     No
subject_contains  No

Output Schema

Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 4/5

Annotations provide idempotentHint=true and destructiveHint=false. The description adds behavioral context: rules hide tasks but not emails, and are reversible. However, it does not explain behavior on duplicate rules or the permissions needed, which would enhance transparency.

Conciseness: 5/5

The description is concise (four short sentences) with no redundancy. It front-loads the core purpose and action, then provides parameter guidance efficiently.

Completeness: 4/5

The tool has 5 parameters and an output schema. The description covers the main parameters and behavioral context well, but omits the 'reason' parameter. It also does not mention output details, though the output schema presumably covers that. Overall, it is nearly complete for a rule-creation tool.

Parameters: 4/5

With 0% schema description coverage, the description adds meaning for 4 out of 5 parameters: it explains that sender_email, sender_domain, and subject_contains are the filtering criteria and that task_type has specific allowed values. However, the 'reason' parameter is completely undocumented.

Purpose: 5/5

The description clearly states the tool's purpose: 'Create a rule that hides matching tasks from the user's task manager'. It distinguishes itself from sibling tools like create_task and delete_task_rule by explaining the effect on tasks versus emails and referencing related tools.

Usage Guidelines: 5/5

Explicitly states that at least one of sender_email, sender_domain, or subject_contains must be provided. The optional task_type is described with its possible values. It also notes reversibility and refers to list_task_rules / delete_task_rule for managing rules.

deep_search_emails: Deep search full email history (A)
Read-only, Idempotent

Search the user's COMPLETE email history by querying their connected mail providers live — Gmail, Outlook AND IMAP accounts (iCloud, Yahoo and other IMAP mailboxes) — reaching years beyond Mailopoly's indexed window, and including sent mail. This is also how you reach mail a free trial hasn't imported yet: the trial fully processes only recent mail, but the rest still lives in the user's mailbox and this tool finds it. Use it when search_emails returns few or no results, or when the question concerns emails older than the indexed history (search_emails responses include indexed_history_start). Speed: Gmail/Outlook are typically 5-45 seconds; IMAP accounts (iCloud, Yahoo, …) are slower — up to a minute or two while their folders are walked — so tell the user you're searching their full history and it may take a moment, then run it (never refuse just because an account is iCloud/IMAP). start_date/end_date (YYYY-MM-DD) may span multiple years; omit both to search ALL history. If the response contains truncated_providers, the OLDEST matches may be missing — page deeper by re-running with end_date set to that provider's oldest_returned_date, or narrow the query. Returned email_id values (some of the form 'gmail::' or 'imap::') work directly in get_email.
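
The paging advice above (re-run with end_date set to a truncated provider's oldest_returned_date) might be sketched as the loop below. Several things here are assumptions: `call_tool` is a hypothetical callable standing in for whatever MCP client is used, truncated_providers is assumed to be a list of objects each carrying oldest_returned_date, and the canned `PAGES` responses are invented test data.

```python
def search_full_history(call_tool, query, max_pages=5):
    """Page through deep_search_emails until no provider reports truncation."""
    seen_ids = set()
    results = []
    end_date = None
    for _ in range(max_pages):
        arguments = {"query": query}
        if end_date is not None:
            arguments["end_date"] = end_date
        response = call_tool("deep_search_emails", arguments)
        for item in response.get("results", []):
            # Pages can overlap on the boundary date, so de-duplicate by id.
            if item["email_id"] not in seen_ids:
                seen_ids.add(item["email_id"])
                results.append(item)
        truncated = response.get("truncated_providers") or []
        if not truncated:
            break
        # Use the most recent cut-off so no provider's gap is skipped;
        # YYYY-MM-DD strings compare correctly as text.
        next_end = max(p["oldest_returned_date"] for p in truncated)
        if next_end == end_date:
            break  # no progress; narrow the query instead
        end_date = next_end
    return results


# Invented responses standing in for two live calls.
PAGES = [
    {"results": [{"email_id": "gmail::b", "date": "2020-05-01"}],
     "truncated_providers": [{"provider": "gmail", "oldest_returned_date": "2020-05-01"}]},
    {"results": [{"email_id": "gmail::b", "date": "2020-05-01"},
                 {"email_id": "gmail::a", "date": "2019-01-01"}]},
]
calls = []


def fake_call_tool(name, arguments):
    calls.append(dict(arguments))
    return PAGES[len(calls) - 1]


found = search_full_history(fake_call_tool, "lease agreement")
print([item["email_id"] for item in found])
```

The `max_pages` cap is a design choice to bound slow IMAP searches; a client could instead ask the user before paging further.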

Parameters
Name | Required | Description | Default
limit | No | - | -
query | Yes | - | -
sender | No | - | -
end_date | No | - | -
timezone | No | - | -
start_date | No | - | -

Output Schema

Name | Required | Description
results | No | Email summary objects. Common keys: email_id, subject, sender_name, sender_email, date, snippet, personal_or_ad, categories, source. Pass email_id to get_email / get_action_links.
total_count | No | Number of matches returned.
accounts_searched | No | Connected accounts queried live for this search.
indexed_history_start | No | Oldest locally-indexed date (YYYY-MM-DD).
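
The paging advice in the description (re-run with end_date set to a truncated provider's oldest_returned_date) can be sketched as a small loop. This is only an illustration under stated assumptions: call_tool is a hypothetical helper that sends a tool call to the server and returns the parsed response, and truncated_providers is assumed to be a list of objects carrying an oldest_returned_date string, which the listing does not confirm.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: (tool_name, arguments) -> parsed response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def deep_search_all(call_tool: ToolCaller, query: str, max_pages: int = 5) -> List[Dict[str, Any]]:
    """Collect deep_search_emails results, paging while providers report truncation."""
    collected: Dict[str, Dict[str, Any]] = {}
    args: Dict[str, Any] = {"query": query}  # no dates given: search all history
    for _ in range(max_pages):
        response = call_tool("deep_search_emails", args)
        for item in response.get("results", []):
            # De-duplicate by email_id, since consecutive pages may overlap.
            collected.setdefault(item["email_id"], item)
        # Assumed shape: [{"provider": "...", "oldest_returned_date": "YYYY-MM-DD"}, ...]
        truncated = response.get("truncated_providers") or []
        dates = [p["oldest_returned_date"] for p in truncated if p.get("oldest_returned_date")]
        if not dates:
            break  # nothing was truncated, so the oldest matches are present
        next_end = max(dates)  # ISO dates sort correctly as strings
        if args.get("end_date") == next_end:
            break  # no progress; narrowing the query is the better option
        args = {"query": query, "end_date": next_end}
    return list(collected.values())
```

Taking the latest of the reported dates is a cautious choice when several providers are truncated: it may re-fetch some mail, which the email_id de-duplication absorbs, but it should not skip any provider's unseen range.
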
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent), the description adds critical behavioral info: live querying, speed variations, support for multiple providers (including sent mail), and pagination via truncated_providers. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the main purpose, then covers usage, speed, and pagination in a logical order. Each sentence is informative, and the length is justified by the tool's complexity. Slightly verbose but well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple providers, date ranges, pagination, speed), the description thoroughly covers what a user needs to know. An output schema exists, so return value details are not required here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description compensates by explaining start_date/end_date format and semantics, and how to use them for deeper paging. However, it does not describe query, sender, limit, or timezone explicitly, leaving some gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches the user's complete email history across all connected providers, including sent mail and older emails beyond the indexed window. It distinguishes itself from the sibling search_emails by emphasizing its live, full-history search capability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: when search_emails returns few results or for older emails. Provides guidance on speed, informing the user, and handling truncated results. Also advises not to refuse IMAP accounts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_task_rule: Delete a task-suppression rule (grade A)
Destructive, Idempotent

Delete a task-suppression rule by id (from list_task_rules). The previously hidden tasks reappear in the task manager.

Parameters
Name | Required | Description | Default
rule_id | Yes | - | -

Output Schema

Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint=true and readOnlyHint=false. The description adds value by explaining that hidden tasks reappear, which is a behavioral consequence not conveyed by annotations. This is a useful addition.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no filler. Every sentence adds value: first states action and source, second explains consequence. Efficiently front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (1 parameter, output schema exists), the description covers purpose, consequence, and source of ID. It could mention error conditions or permissions, but is largely adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions deletion by id and references list_task_rules for obtaining the ID, but does not describe the rule_id parameter format, constraints, or provide examples. Minimal guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (delete a task-suppression rule) and the resource (by id from list_task_rules). It distinguishes from siblings like create_task_rule and list_task_rules by specifying the deletion and source of ID.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when wanting to delete a rule, but does not provide explicit guidance on when not to use it or alternatives. It lacks context about prerequisites or side effects beyond the reappearance of tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_account_overview: Account overview (grade A)
Read-only, Idempotent

Overview of the authenticated Mailopoly account: name, email, connected mail accounts, connected messaging apps (Slack etc. — their messages appear in the feed with a 'source' field and are replied to via send_email's reply_to_email_id), and inbox/task counts. Call this ONLY for identity / connection / setup questions — who this account is, which mailboxes and apps are connected, or whether the mailbox is still importing. Do NOT call it as a warm-up before other tools; for "what's in my inbox / Cleanbox" go straight to get_feed(personal_or_ad='personal').

Parameters

No parameters

Output Schema

Name | Required | Description
name | No | -
email | No | -
scopes | No | Granted scopes for this connection.
timezone | No | -
total_tasks | No | -
total_emails | No | -
connected_apps | No | Connected messaging apps (e.g. Slack): app, status, capabilities, how_to_reply.
connected_accounts | No | Connected mail accounts: email, provider, status.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds behavioral context about what data is returned and mentions connected messaging apps that affect the feed. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is slightly long but well-structured: front-loads overview then gives usage guidance. Every sentence adds value. Could be slightly more concise but effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has zero parameters and an output schema exists. Description covers purpose and usage comprehensively. It mentions key return fields without over-explaining the output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has no parameters, so the description correctly adds no parameter information. The baseline score of 4 applies because schema coverage is trivially 100% and no parameter detail is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
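
The "schema coverage" percentages quoted in the Parameters assessments can be read as the share of input parameters whose JSON Schema entry carries a description. The page does not state how the figure is computed, so the following is a minimal sketch under that assumed definition. It treats a tool with no parameters as 100% covered, matching the get_account_overview assessment; the function name is hypothetical.

```python
from typing import Any, Dict


def schema_description_coverage(input_schema: Dict[str, Any]) -> float:
    """Percentage of input properties that have a non-empty description."""
    properties = input_schema.get("properties") or {}
    if not properties:
        return 100.0  # nothing to describe, so coverage is trivially complete
    described = sum(
        1
        for spec in properties.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return 100.0 * described / len(properties)
```

Under this reading, a schema such as {"properties": {"rule_id": {"type": "string"}}} scores 0%, which is why the tool description itself is expected to explain the parameter.
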

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool returns an overview of the authenticated Mailopoly account, listing specific data (name, email, connected accounts, apps, counts). It also distinguishes from siblings by specifying when NOT to use (not as warm-up, use get_feed for inbox).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says call only for identity/connection/setup questions, and gives clear alternatives (e.g., get_feed for inbox). Provides both when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_catch_up: Catch-up briefing (grade A)
Read-only, Idempotent

The user's Catch-Up ("since I was gone") briefing: an AI summary of new emails since their last briefing, plus the new emails grouped by sender with unread counts. Use this ONLY when the user explicitly asks to catch up / what did I miss / for a briefing — it runs a slower server-side AI summary. For a plain "what's in my Cleanbox / inbox" use get_feed (faster) and summarise it yourself. filter_type: 'all', 'personal' (Cleanbox only) or 'other' (promotional only). By default the window starts where the last briefing ended (capped); pass since_hours to force a specific look-back window. Recent briefings are cached server-side, so repeat calls are cheap.

Parameters
Name | Required | Description | Default
timezone | No | - | -
filter_type | No | - | all
since_hours | No | - | -

Output Schema

Name | Required | Description
since | No | -
until | No | -
filter | No | -
briefing | No | AI summary of what was missed.
report_id | No | -
timeframe | No | -
other_count | No | -
total_emails | No | -
sender_groups | No | New emails grouped by sender, with unread counts.
personal_count | No | -
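
The filter_type and since_hours rules from the description can be checked before a call is made. A minimal sketch, assuming since_hours is a positive number (its type is not documented on this page); catch_up_arguments is a hypothetical helper, not part of the server.

```python
from typing import Any, Dict, Optional

# Values listed in the tool description.
VALID_FILTERS = ("all", "personal", "other")


def catch_up_arguments(filter_type: str = "all", since_hours: Optional[float] = None) -> Dict[str, Any]:
    """Build an arguments dict for get_catch_up, rejecting undocumented values."""
    if filter_type not in VALID_FILTERS:
        raise ValueError(f"filter_type must be one of {VALID_FILTERS}, got {filter_type!r}")
    args: Dict[str, Any] = {"filter_type": filter_type}
    if since_hours is not None:
        if since_hours <= 0:
            raise ValueError("since_hours must be positive")
        # Only sent when the caller wants to override the default window,
        # which otherwise starts where the last briefing ended.
        args["since_hours"] = since_hours
    return args
```

Leaving since_hours out keeps the default window described above; passing it forces a specific look-back.
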
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, and destructiveHint. The description adds that it runs a slower server-side AI summary and that recent briefings are cached. There is no contradiction. Some additional details about return format beyond the output schema are omitted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficient and front-loaded with purpose, followed by usage and parameter details. It is slightly verbose but every sentence adds value. Could be tightened without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main behavior and parameter usage. The output schema exists so return values are covered. Sibling tools are many, but the primary alternative get_feed is explicitly compared. Minor gap on timezone parameter.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description compensates by explaining filter_type options and since_hours behavior. However, the timezone parameter is not described. The main parameters are well covered.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as providing a Catch-Up briefing (AI summary of new emails). It uses specific verbs and resources, distinguishing it from sibling get_feed.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states to use ONLY when user asks to catch up or for a briefing, and contrasts with get_feed for plain inbox queries. Provides clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_connect_instructions: How to connect an address (grade A)
Read-only, Idempotent

Given an email address or domain, return the best way to connect it and the exact steps. Prefers one-click OAuth (oauth_available / oauth_provider) when we run a connector for that host — no password needed. Otherwise returns imap_suggestion with the host/port, the provider's help_url, and the app-password steps (app_password_note / instructions). Use this to walk a user through getting connected — especially IMAP users who need an app-specific password. This returns GUIDANCE only; it never fetches or receives a password.

Parameters
Name | Required | Description | Default
email_or_domain | Yes | - | -

Output Schema

Name | Required | Description
domain | No | -
message | No | -
confidence | No | -
oauth_provider | No | One of: google, microsoft.
imap_suggestion | No | IMAP details + app-password steps when method=imap: provider_name, host, port, help_url, app_password_required, app_password_note, instructions.
oauth_available | No | -
detected_provider | No | -
recommended_method | No | One of: oauth, imap, unsupported, unknown.
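
The OAuth-first, IMAP-otherwise logic described above could be turned into user-facing steps roughly as follows. This is a sketch built on the field names in the output schema; the type of instructions (string or list of strings) is an assumption, and connection_steps is a hypothetical helper.

```python
from typing import Any, Dict, List


def connection_steps(response: Dict[str, Any]) -> List[str]:
    """Turn a get_connect_instructions response into a list of steps for the user."""
    method = response.get("recommended_method", "unknown")
    if method == "oauth" and response.get("oauth_available"):
        provider = response.get("oauth_provider", "the provider")
        return [f"Connect with one-click OAuth via {provider}; no password is needed."]
    if method == "imap":
        imap = response.get("imap_suggestion") or {}
        steps = [f"Use IMAP server {imap.get('host')} on port {imap.get('port')}."]
        if imap.get("app_password_required"):
            steps.append(imap.get("app_password_note") or "Create an app-specific password first.")
        instructions = imap.get("instructions")
        # Assumption: instructions may arrive as one string or a list of strings.
        if isinstance(instructions, str):
            steps.append(instructions)
        elif isinstance(instructions, list):
            steps.extend(str(step) for step in instructions)
        if imap.get("help_url"):
            steps.append(f"Provider help: {imap['help_url']}")
        return steps
    # 'unsupported' or 'unknown': fall back to the server's own message.
    return [response.get("message") or "No supported connection method was detected."]
```

The helper only formats guidance; in line with the description, nothing in it asks for or handles a password.
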
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that it returns guidance only, never fetches or receives a password, and explains the two possible response types (OAuth or IMAP). This adds context beyond the readOnlyHint and idempotentHint annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with the main purpose first, followed by details, usage advice, and a behavioral note. It is slightly verbose but each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description does not need to detail return values. It covers core behavior, distinctions between OAuth and IMAP, and boundary conditions (never fetches password), making it complete for an information retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description thoroughly explains the single parameter 'email_or_domain' by stating it can be an email address or domain, and uses it throughout the description to clarify its role.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns the best way to connect an email address or domain and the exact steps. It distinguishes between OAuth and IMAP scenarios, providing specific verb and resource.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises to use this tool to walk users through getting connected, especially IMAP users needing app-specific passwords. Implicitly indicates when not to use (e.g., for fetching actual passwords) and differentiates from sibling tools like start_email_connection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_draft: Read a draft (grade A)
Read-only, Idempotent

Get a draft's full content (to, cc, bcc, subject, body).

Parameters
Name | Required | Description | Default
draft_id | Yes | - | -

Output Schema

No output parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so safety is clear. The description adds the specific fields returned, which is helpful but not extensive behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with no wasted words. It efficiently conveys the tool's purpose and output.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with annotations, the description adequately covers the necessary information: it lists the returned fields itself, which matters because the output schema above declares no output parameters. No gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'draft_id' is self-explanatory and does not require additional description. Schema coverage is 0% but the parameter name and purpose are obvious.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a draft's full content and lists the fields (to, cc, bcc, subject, body). This distinguishes it from siblings like list_drafts (listing only) and save_draft (writing).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for reading full draft content, but does not explicitly state when to use this tool vs alternatives like get_email or list_drafts. No exclusions or prerequisites are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_email: Read an email (grade A)
Read-only, Idempotent

Read a single email in full: subject, sender, recipients, date, the complete message text, attachments (each with its extracted text content when available — read these for invoice/proposal/PDF details that aren't in the body), and (optionally) actionable links found in it (pay, log in, book, track…). Accepts ids from search_emails, get_feed AND deep_search_emails — provider-history ids (the 'gmail::' form) are fetched live from the mail provider.

Parameters
Name | Required | Description | Default
email_id | Yes | - | -
include_action_links | No | - | -

Output Schema

Name | Required | Description
cc | No | -
to | No | -
body | No | Full message text (clipped).
date | No | -
source | No | 'email' or a connected app (e.g. 'slack').
snippet | No | -
subject | No | -
email_id | No | -
categories | No | -
app_context | No | For app messages: workspace, channel, is_dm, sender_handle.
attachments | No | Attachment digests: name, type, and attachment_text (the extracted text content, when available) for both provider and @mly.life attachments. Read attachment_text for invoice/proposal details not present in the body.
sender_name | No | -
action_links | No | Actionable links (pay, log in, book, track…).
sender_email | No | Null for connected-app messages (no real address).
fetched_live_from | No | Provider name when fetched live (deep-search ids).
reply_instructions | No | -
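
Because invoice or proposal details may live only in attachment_text, a caller will usually want the body and the attachment text together. A minimal sketch, assuming the response is a dict with the body and attachments keys listed above; readable_text is a hypothetical helper.

```python
from typing import Any, Dict, List


def readable_text(email: Dict[str, Any]) -> str:
    """Join the message body with any extracted attachment text."""
    parts: List[str] = [email.get("body") or ""]
    for attachment in email.get("attachments") or []:
        text = attachment.get("attachment_text")
        if text:  # extracted text is only present when available
            name = attachment.get("name", "unnamed")
            parts.append(f"[attachment: {name}]\n{text}")
    # Drop empty parts so a body-less message does not start with blank lines.
    return "\n\n".join(part for part in parts if part)
```

Attachments without extracted text are skipped rather than reported, so an empty result does not prove that an email has no attachments.
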
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds valuable behavioral context beyond annotations: it fetches live from the mail provider for certain ID forms, includes attachment text extraction, and optionally provides actionable links. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately long but well-structured, with a clear first sentence stating the purpose followed by details. It could be slightly more concise, but every sentence adds value and there is no unnecessary repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists (indicated by 'Has output schema: true'), the description appropriately explains return values (subject, sender, recipients, date, message text, attachments with text, and action links). No gaps remain for a read operation tool with existing annotations and output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, meaning the description must explain parameter semantics fully. It does so by explaining that email_id accepts IDs from specific search tools and that include_action_links controls whether actionable links are returned. This adds significant meaning beyond the schema's minimal definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Read a single email in full' and lists all included components (subject, sender, recipients, date, message text, attachments with extracted text, and optional action links). It distinguishes itself from sibling tools like search_emails, get_feed, and deep_search_emails by specifying that it accepts IDs from those tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use (to read a single email) and details the acceptable ID formats (provider-history ids for live fetching). It implies that for browsing multiple emails, other tools are more appropriate, though it does not explicitly list when not to use or provide direct alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_feed: Browse email feed (grade A)
Read-only, Idempotent

List the user's email feed (most recent first) without a search query. personal_or_ad='personal' is the Cleanbox (real correspondence); 'advertising' is promotional. email_type is 'received' or 'sent'. account narrows to one connected mailbox (the address as shown in get_account_overview); combine it with personal_or_ad for that account's Cleanbox/Other. list_id shows one of the user's custom email lists (ids from list_email_lists). source narrows to a connected app's messages (e.g. 'slack'; see get_account_overview) or 'email' for mail only. sender filters by sender name or address fragment. Filters combine freely, e.g. source='slack' + sender='<name>' + last_x_days=1 answers "what did <name> send me on Slack today?".

Parameters
Name | Required | Description | Default
limit | No | - | -
offset | No | - | -
sender | No | - | -
source | No | - | -
account | No | - | -
list_id | No | - | -
timezone | No | - | -
email_type | No | - | -
last_x_days | No | - | -
personal_or_ad | No | - | -

Output Schema

Name | Required | Description
offset | No | Offset of this page.
results | No | Email summary objects. Common keys: email_id, subject, sender_name, sender_email, date, snippet, personal_or_ad, categories, source. Pass email_id to get_email / get_action_links.
total_count | No | Total items in this feed.
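
The filter combinations described for get_feed can be assembled with a small builder that sends only the filters that are set and checks the two enumerated values. This is an illustrative sketch: feed_arguments is a hypothetical helper, and the value sets come from the description rather than from a published schema.

```python
from typing import Any, Dict, Optional

PERSONAL_OR_AD = ("personal", "advertising")
EMAIL_TYPES = ("received", "sent")


def feed_arguments(
    personal_or_ad: Optional[str] = None,
    email_type: Optional[str] = None,
    account: Optional[str] = None,
    list_id: Optional[str] = None,
    source: Optional[str] = None,
    sender: Optional[str] = None,
    last_x_days: Optional[int] = None,
    limit: Optional[int] = None,
    offset: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a get_feed arguments dict containing only the filters that were set."""
    if personal_or_ad is not None and personal_or_ad not in PERSONAL_OR_AD:
        raise ValueError(f"personal_or_ad must be one of {PERSONAL_OR_AD}")
    if email_type is not None and email_type not in EMAIL_TYPES:
        raise ValueError(f"email_type must be one of {EMAIL_TYPES}")
    candidates = {
        "personal_or_ad": personal_or_ad,
        "email_type": email_type,
        "account": account,
        "list_id": list_id,
        "source": source,
        "sender": sender,
        "last_x_days": last_x_days,
        "limit": limit,
        "offset": offset,
    }
    # Unset filters are omitted so the server applies its own defaults.
    return {key: value for key, value in candidates.items() if value is not None}
```

For the Slack example in the description, feed_arguments(source="slack", sender="Dana", last_x_days=1) yields the three-filter call, with "Dana" standing in for the sender's name.
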
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false, so safety profile is clear. Description adds order (most recent first) and filtering behavior, which is valuable but not critical beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear first sentence stating purpose, followed by parameter explanations. While verbose, every sentence adds value given the complexity of filtering options. Could be slightly more concise but is appropriate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 10 parameters and an output schema existing, the description covers all filtering aspects and usage patterns. It does not need to describe return format due to output schema. Complex tool with many sibling tools, yet the description is thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description has to compensate. It explains personal_or_ad, email_type, account, list_id, source and sender, gives valid values for the key ones, and shows last_x_days in a combination example; limit, offset and timezone are not described.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'List the user's email feed (most recent first) without a search query', specifying the verb 'list' and resource 'email feed'. It clearly distinguishes from search tools (e.g., search_emails) by explicitly stating 'without a search query'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides extensive usage guidance: differentiates personal_or_ad values ('personal' for Cleanbox, 'advertising' for promotional), email_type options, account narrowing, list_id retrieval from list_email_lists, source narrowing (including 'slack'), sender filter, and combination examples (e.g., source='slack' + sender='<name>' + last_x_days=1).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_my_day: My Day overview (grade B)
Read-only, Idempotent

The user's My Day view: today's tasks, events and commitments.

Parameters
Name | Required | Description | Default
timezone | No | - | -

Output Schema

Name | Required | Description
count | No | -
tasks | No | Task objects aggregated from manual tasks, email actions, invoices, events, shipments and replies. Common keys: id/task_id, title, task_type, due_date, status, priority, email_id.
timeframe | No | -
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, destructiveHint, so the core safety profile is clear. The description adds the scope ('today's') and content types, which is useful but not extensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short and to the point. While it could include more detail without becoming verbose, the current length is appropriate for a simple read tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite having an output schema, the description fails to differentiate from similar sibling tools like get_catch_up and does not address the timezone parameter. The context is incomplete for an agent to confidently choose this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, timezone, is not mentioned in the description. With 0% schema coverage, the description should explain its purpose or effect, but it does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool provides today's tasks, events, and commitments in the user's My Day view. It clearly distinguishes from siblings like list_tasks (all tasks) and list_events (all events) by scoping to 'today'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like get_catch_up or list_tasks. The description lacks context about prerequisites or scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_widgets: Get dashboard widgets (Grade: A)
Read-only, Idempotent

The user's My Data dashboard widgets (weather, news, stocks, custom) with their latest cached data. refresh=true re-fetches any widget whose rate limit allows it (e.g. weather every 10 min) before returning. include_catalog=true also lists the available widget types.

Parameters
Name | Required | Description | Default
refresh | No | - | -
include_catalog | No | - | -

Output Schema
Name | Required | Description
catalog | No | Available widget types (only when include_catalog=true).
widgets | No | My Data widgets (weather/news/stocks/custom) with cached data. Common keys: widget_id, widget_type, title, enabled, data.
refreshed_now | No | Widget ids re-fetched on this call (refresh=true).
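
As a rough illustration of the two flags described above, the sketch below builds the JSON-RPC `tools/call` request an MCP client could send for get_widgets. The helper name `build_get_widgets_call` and the request id are hypothetical; only the tool name and the `refresh` / `include_catalog` arguments come from the listing. It also assumes both flags are booleans that default to false, which the listing does not state.

```python
import json


def build_get_widgets_call(refresh: bool = False,
                           include_catalog: bool = False,
                           request_id: int = 1) -> dict:
    """Build an MCP tools/call request for get_widgets (hypothetical helper)."""
    arguments = {}
    # Both flags are optional in the listing, so only send the ones switched on.
    if refresh:
        arguments["refresh"] = True
    if include_catalog:
        arguments["include_catalog"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_widgets", "arguments": arguments},
    }


# Ask for widgets and let the server re-fetch any widget whose rate limit allows it.
request = build_get_widgets_call(refresh=True)
print(json.dumps(request, indent=2))
```

Because refresh is rate-limited per widget, a client would presumably check `refreshed_now` in the result to see which widgets were actually re-fetched rather than assume all of them were.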
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, non-destructive. The description adds behavioral context: widgets returned with latest cached data, refresh respects rate limits (e.g., weather every 10 min), and include_catalog lists available types. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (three sentences) and front-loaded. However, the first sentence is a noun-phrase fragment ('The user's My Data dashboard widgets... with their latest cached data.') with no verb, so it could be phrased as a full sentence. Overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists, the description explains the tool's purpose and parameter behavior. It covers key aspects: cached data, rate limits, optional catalog listing. Could mention default behavior or error handling, but sufficient for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description carries full burden. It explains both parameters: refresh re-fetches respecting rate limits, include_catalog lists available widget types. This adds meaning beyond schema defaults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves the user's dashboard widgets (weather, news, stocks, custom) with cached data. The title 'Get dashboard widgets' reinforces this. It distinguishes from siblings by specifying 'My Data dashboard widgets', which is unique among listed tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the refresh and include_catalog parameters, providing context such as rate limits for refresh and listing widget types for include_catalog. It does not explicitly state when not to use the tool or mention alternatives, but the specific nature of dashboard widgets makes the choice clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hide_emails: Hide emails from feed (Grade: A)
Idempotent

Hide one or more emails from the user's feed (does not delete them). Optional reason, e.g. 'spam', 'not_interested', 'never_show_this_sender'.

Parameters
Name | Required | Description | Default
reason | No | - | -
email_ids | Yes | - | -

Output Schema
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.
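
To make the output fields above concrete, here is a minimal sketch of how a client might reduce a `{success, message, error}` result to one status line. The helper `summarize_action_result`, the sample ids, and the sample results are all hypothetical. Since every output field is marked optional, the code treats any of them as possibly missing. It also assumes `email_ids` is a list of string ids, which the listing only implies.

```python
def summarize_action_result(result: dict) -> str:
    """Turn a {success, message, error} result into one status line."""
    if result.get("success"):
        return result.get("message") or "ok"
    # Prefer the explicit error, fall back to the message, then to a placeholder.
    detail = result.get("error") or result.get("message") or "unknown error"
    return "failed: " + detail


# Arguments as the listing describes them: required ids plus an optional reason.
# The id values are made up for illustration.
hide_args = {"email_ids": ["email-123", "email-456"], "reason": "spam"}

# Hypothetical results shaped like the output schema above.
ok = summarize_action_result({"success": True, "message": "2 emails hidden"})
bad = summarize_action_result({"success": False, "error": "email not found"})
print(ok)
print(bad)
```

Since the tool is annotated as idempotent, repeating the same call after an ambiguous failure should be safe.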
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true, destructiveHint=false, and readOnlyHint=false. The description adds that hide does not delete, clarifying the non-destructive behavior. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with action ('Hide one or more emails'), no fluff. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with 2 parameters. The description fully covers purpose, behavior, and parameter semantics. An output schema exists but is not needed to explain return values. For this complexity, the description is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description bears full burden. It explains the 'reason' parameter with concrete examples and implies 'email_ids' is a list of email identifiers. This adds meaningful context beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the action 'hide' and the resource 'emails', and clearly distinguishes from deletion. Among sibling tools like 'mark_email_read' or 'remove_email_from_list', this tool's purpose is unique and well-defined.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides optional reason examples (e.g., 'spam'), which guides usage. However, it does not explicitly state when to use this tool versus alternatives like 'mark_email_read' or 'delete' operations, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_drafts: List email drafts (Grade: B)
Read-only, Idempotent

List the user's email drafts (created in Mailopoly), newest first.

Parameters
Name | Required | Description | Default
limit | No | - | -
search | No | - | -

Output Schema
Name | Required | Description
drafts | No | Draft summaries (newest first).
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint, destructiveHint) already declare a safe read operation. The description adds ordering ('newest first') but omits other behaviors like pagination or filtering limits. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, concise sentence with no redundant words. The purpose is front-loaded and the entire description is easily scannable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

An output schema exists, so return values are not needed. However, the description does not explain how to use the listed parameters, which are essential for effective querying. The tool's complexity is low, but the missing parameter context limits completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description does not explain the 'limit' or 'search' parameters. It fails to compensate for the lack of schema descriptions, leaving the agent to guess their usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('List') and resource ('the user's email drafts') with additional context ('created in Mailopoly, newest first'), clearly distinguishing it from siblings like 'get_draft' or 'search_emails'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs. alternatives (e.g., 'get_draft' for a single draft) or when not to use it. The description implies it's for Mailopoly drafts but offers no exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_email_accounts: List connected email accounts (Grade: A)
Read-only, Idempotent

List the user's connected email accounts with their connection health and onboarding state. Use this to diagnose "I can't connect" / "my email isn't showing up" problems. Each account has a status: active, onboarding, pending, not_syncing (connected but never activated), reauthorization_required (needs reconnecting), or inactive (paused). Also returns mailbox_verdict and last_check_error when present. Connected messaging apps (Slack etc.) appear with kind='app'.

Parameters
No parameters

Output Schema
Name | Required | Description
count | No | -
accounts | No | Connected accounts with lifecycle detail. Common keys: id, email, provider, kind (mailbox|app), status (active|onboarding|pending|not_syncing|reauthorization_required|inactive|connected), is_syncing, reauthorization_required, mailbox_verdict, last_check_error.
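
The status values above lend themselves to a simple triage step when diagnosing connection problems. The sketch below is one way to do that, not the server's own logic: the helper `accounts_needing_attention`, the suggested next steps, and the sample accounts are all assumptions for illustration. Only the status names and the `email` / `id` / `kind` keys come from the listing.

```python
# Statuses the description flags as needing user action, mapped to a suggested
# next step. The wording of each step is an assumption, not from the listing.
NEEDS_ATTENTION = {
    "reauthorization_required": "reconnect the account",
    "not_syncing": "activate syncing",
    "inactive": "resume the paused account",
}


def accounts_needing_attention(accounts: list[dict]) -> list[tuple[str, str]]:
    """Return (account label, suggested step) for accounts that need action."""
    flagged = []
    for account in accounts:
        step = NEEDS_ATTENTION.get(account.get("status"))
        if step:
            label = account.get("email") or account.get("id") or "unknown"
            flagged.append((label, step))
    return flagged


# Hypothetical accounts shaped like the output schema above.
accounts = [
    {"email": "a@example.com", "kind": "mailbox", "status": "active"},
    {"email": "b@example.com", "kind": "mailbox", "status": "reauthorization_required"},
    {"id": "slack-1", "kind": "app", "status": "connected"},
]

flagged = accounts_needing_attention(accounts)
for label, step in flagged:
    print(f"{label}: {step}")
```

A fuller version would probably also surface `mailbox_verdict` and `last_check_error` when they are present, since the description says they carry the failure detail.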
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly/idempotent. Description adds useful behavioral info: statuses, mailbox_verdict, last_check_error, and kind='app' for messaging apps. Does not mention any performance implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five concise, front-loaded sentences. Uses inline enumeration for statuses. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers main purpose, status meanings, additional fields. Might be improved by mentioning if pagination applies or if only connected accounts are returned, but overall complete for a list tool with output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No input parameters; schema coverage 100%. Description explains output fields meaningfully, which indirectly aids understanding of what the tool does. Baseline 3, but extra detail on statuses elevates it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists connected email accounts with health and onboarding state. Uses specific verb and resource, distinguishes from siblings like get_account_overview and check_email_sync.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use for diagnosing connection issues ('I can't connect' / 'my email isn't showing up' problems). Lacks explicit 'when not to use' but context makes it clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_email_lists: List email lists (Grade: A)
Read-only, Idempotent

The user's custom email lists (smart folders): name, rules, unread and total counts. Browse a list's emails with get_feed(list_id=...); file/unfile specific emails with add_email_to_list / remove_email_from_list.

Parameters
No parameters

Output Schema
Name | Required | Description
lists | No | Custom email lists (smart folders). Common keys: list_id, name, match_rules, exclude_from_cleanbox, unread_count, total_count.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds behavioral context by revealing the specific fields returned (name, rules, unread/total counts). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, each serving a distinct purpose: first defines the tool's result, second provides usage context. No redundant words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and an existing output schema, the description is complete. It names the output fields and links to related tools, providing sufficient context for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters (100% coverage). The description adds meaning by stating the output includes name, rules, unread and total counts, which compensates for the empty schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists the user's custom email lists (smart folders) and specifies the output fields: name, rules, unread and total counts. This differentiates it from sibling tools like get_feed, add_email_to_list, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides guidance on related actions: browsing a list's emails with get_feed and filing/unfiling emails with add_email_to_list/remove_email_from_list. However, it doesn't explicitly state when to use this tool versus alternatives like list_drafts or deep_search_emails.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_events: List calendar events (Grade: C)
Read-only, Idempotent

List the user's calendar events extracted from their email (meetings, bookings, appointments). Dates are YYYY-MM-DD; defaults to upcoming.

Parameters
Name | Required | Description | Default
limit | No | - | -
search | No | - | -
end_date | No | - | -
timezone | No | - | -
start_date | No | - | -

Output Schema
Name | Required | Description
count | No | -
events | No | Calendar events from email. Common keys: id, title, event_start, event_end, location, participants, email_id.
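
The only parameter format the description states is that dates are YYYY-MM-DD. A minimal sketch of building a date window in that format follows; the helper `next_days_window` is hypothetical, and whether `end_date` is inclusive is not stated in the listing, so the window here is only an approximation of "the next N days".

```python
from datetime import date, timedelta


def next_days_window(days: int, today: date) -> dict:
    """Arguments for list_events covering today through today + days."""
    return {
        # date.isoformat() yields YYYY-MM-DD, the format the description asks for.
        "start_date": today.isoformat(),
        "end_date": (today + timedelta(days=days)).isoformat(),
    }


# A fixed date keeps the example reproducible; a client would pass date.today().
args = next_days_window(7, date(2025, 3, 28))
print(args)
```

The `timezone` parameter is left out because the listing gives no hint about its expected format.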
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, destructiveHint; description adds that events are extracted from email and defaults to upcoming. No additional behavioral traits (e.g., pagination, data freshness) are disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two front-loaded sentences with no fluff. Could be slightly more structured but is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters and 0% schema description coverage, the description is too minimal. Missing details on pagination, sorting, filter semantics, though output schema covers return values. Incomplete for complex usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. The description only mentions date format and default behavior but does not explain any parameter meanings or usage, leaving agents without sufficient guidance for parameters like search, limit, timezone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists calendar events from email, specifies date format (YYYY-MM-DD) and default behavior (upcoming). However, it does not explicitly differentiate from sibling tools like list_tasks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings; no exclusions, prerequisites, or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_invoices: List invoices and payments (Grade: A)
Read-only, Idempotent

List invoices, receipts, bills and payments extracted from the user's email. Backed by the SAME schema-aware finance query Poly uses, so results are COMPLETE across every vendor (not just the first one).

  • invoice_or_payment: the RECORD TYPE to filter by — e.g. "receipt", "invoice", "payment", "refund", "statement", "order". Matched fuzzily, so "receipt" also catches "payment receipt" / "order". Pass the TYPE here, NOT a vendor name. Omit for all types.

  • search: a VENDOR / payee / description / category substring (a shop or supplier name). Omit to list across all vendors.

  • time_range examples: this_month, last_month, this_year, last_year, or last_30_days; or pass start_date / end_date as YYYY-MM-DD.

Returns the most recent limit records, the total matched, a per-vendor vendor_breakdown and grand/paid/outstanding totals — all computed over the FULL matching set, not just the returned page. To answer 'which vendors' or 'any receipts other than X', read vendor_breakdown instead of paging.

Parameters
Name | Required | Description | Default
limit | No | - | -
search | No | - | -
end_date | No | - | -
timezone | No | - | -
start_date | No | - | -
time_range | No | - | -
invoice_or_payment | No | - | -

Output Schema
Name | Required | Description
count | No | Records returned in this page.
summary | No | Totals over the FULL matching set (every vendor, not just the returned page): grand_total, total_paid, total_outstanding, record_count, vendor_count, capped.
invoices | No | Invoice/payment records. Common keys: email_id, due_date, amount, amount_paid, payee, category, invoice_or_payment.
total_matching | No | Total matching records (across all vendors). If `summary.capped` is true there may be even more.
vendor_breakdown | No | Per-vendor rollup over the full matching set, biggest first: [{vendor, count, total}]. Use this to answer 'which vendors' / 'anyone other than X' without paging every record.
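
The advice to read vendor_breakdown instead of paging can be sketched as follows. This is an illustration under stated assumptions: the helper `other_vendors`, the vendor names, and the response fragment are made up, and the case-insensitive comparison is a design choice of this sketch rather than documented server behavior.

```python
def other_vendors(response: dict, excluded_vendor: str) -> list[str]:
    """Vendors in vendor_breakdown other than excluded_vendor (case-insensitive)."""
    excluded = excluded_vendor.casefold()
    return [
        row["vendor"]
        for row in response.get("vendor_breakdown", [])
        if row.get("vendor") and row["vendor"].casefold() != excluded
    ]


# The record type goes in invoice_or_payment; a vendor name would go in search.
args = {"invoice_or_payment": "receipt", "time_range": "last_month"}

# Hypothetical response fragment shaped like the output schema above.
response = {
    "count": 2,
    "total_matching": 5,
    "vendor_breakdown": [
        {"vendor": "Acme Hosting", "count": 3, "total": 150.0},
        {"vendor": "City Power", "count": 2, "total": 40.5},
    ],
}

# Answers "any receipts other than Acme Hosting?" without paging the records.
print(other_vendors(response, "acme hosting"))
```

Note that `count` (2) is smaller than `total_matching` (5) in the sample: the returned page is partial, yet the breakdown still covers the full matching set.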
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, destructiveHint=false, and idempotentHint=true. The description adds valuable behavioral context: returns most recent records, totals computed over full set, and how to use vendor_breakdown. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat lengthy but well-structured with bullet points and front-loaded purpose. Each sentence adds value, though minor redundancy could be trimmed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, no required ones, and an output schema, the description fully covers usage, return values (limit, total, vendor_breakdown, totals), and how to interpret results. Complete for a listing tool with such complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description largely compensates by detailing nearly every parameter: invoice_or_payment (fuzzy matching, examples), search (substring match), time_range (examples), start_date/end_date (format), and limit (caps the returned page to the most recent records). Only timezone is left unexplained. Adds significant meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists invoices, receipts, bills, and payments from the user's email, distinguishing it from sibling tools like search_emails or list_drafts. It uses specific verbs and resources and emphasizes completeness across vendors.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly explains when to use the tool: to query financial records from email. It provides parameter guidance (e.g., invoice_or_payment filters by record type, not vendor name) and gives examples. It implicitly contrasts with other tools by highlighting its schema-aware, complete results.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_task_rules: List task-suppression rules (Grade: A)
Read-only, Idempotent

List the user's task-suppression rules (rules that hide tasks from the task manager by sender, domain or subject). Returns rule ids usable with delete_task_rule.

Parameters
No parameters

Output Schema
Name | Required | Description
count | No | -
rules | No | Task-suppression rules; ids usable with delete_task_rule.
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only and idempotent. Description adds that it returns rule IDs and explains what suppression rules are, but does not detail additional behavioral aspects like user-scoping or result ordering. Adequate but not extensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, highly efficient, no unnecessary words. All information is front-loaded and essential.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With zero parameters, informative annotations, and the presence of an output schema, the description is fully adequate. It explains the tool's output and ties to a sibling tool, leaving no gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, so baseline 4 applies. Description does not need to add parameter info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists task-suppression rules, defines what they are (hide tasks by sender/domain/subject), and distinguishes from siblings like delete_task_rule and create_task_rule.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions that returned IDs are usable with delete_task_rule, providing a clear use case. No explicit when-not guidance, but the tool's purpose is self-evident given the sibling set.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_tasks: List tasks (Grade: C)
Read-only, Idempotent

List the user's tasks (aggregated from manual tasks, email actions, invoices, events, shipments and replies — the same data as the app's task manager / My Day). timeframe: relevant | today | tomorrow | this_week | this_month | overdue | last_7_days | next_30_days | all. task_type: all | action | event | invoice | shipment | reply.

Parameters
Name | Required | Description | Default
limit | No | - | -
search | No | - | -
status | No | - | -
timezone | No | - | -
task_type | No | - | all
timeframe | No | - | relevant
include_completed | No | - | -

Output Schema
Name | Required | Description
count | No | -
tasks | No | Task objects aggregated from manual tasks, email actions, invoices, events, shipments and replies. Common keys: id/task_id, title, task_type, due_date, status, priority, email_id.
timeframe | No | -
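
Because the description enumerates the allowed timeframe and task_type values, a client can check them before calling the tool. The sketch below does that with the values and defaults taken from the listing. The helper `build_list_tasks_args` is hypothetical, and rejecting unknown values client-side is an assumption of this sketch: the listing does not say how the server treats them.

```python
# Allowed values as enumerated in the list_tasks description.
TIMEFRAMES = {
    "relevant", "today", "tomorrow", "this_week", "this_month",
    "overdue", "last_7_days", "next_30_days", "all",
}
TASK_TYPES = {"all", "action", "event", "invoice", "shipment", "reply"}


def build_list_tasks_args(timeframe: str = "relevant",
                          task_type: str = "all") -> dict:
    """Validate the two documented enums and return list_tasks arguments."""
    if timeframe not in TIMEFRAMES:
        raise ValueError(f"unknown timeframe: {timeframe!r}")
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    return {"timeframe": timeframe, "task_type": task_type}


# Overdue invoices only.
args = build_list_tasks_args(timeframe="overdue", task_type="invoice")
print(args)
```

The remaining parameters (limit, search, status, timezone, include_completed) are omitted because the listing does not describe their accepted values.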
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false, so the tool's safe behavior is clear. The description adds that it aggregates tasks from various sources but does not specify pagination, ordering, or other behavioral details. With annotations, a score of 3 is appropriate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loads the main purpose. It presents the parameter lists inline without excess words. However, it could be more structured, e.g., using bullet points for parameter options.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters with no schema descriptions and an output schema, the description is incomplete. It covers only two parameters and does not explain filtering, search, or limit behavior. The tool is complex, and the description leaves significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description explicitly defines only two parameters (timeframe and task_type) with their allowed values, but ignores the other five parameters (limit, search, status, timezone, include_completed). With schema description coverage at 0%, the description does not sufficiently compensate for the missing parameter meanings.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists the user's tasks aggregated from multiple sources, equivalent to the app's My Day. It is specific about the data included but does not differentiate from the sibling tool 'get_my_day', which likely serves a similar purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool over alternatives like 'get_my_day' or other list tools. Does not specify prerequisites or context for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

mark_email_read: Mark email read or unread (grade A)
Idempotent

Mark an email as read (or unread with read=false).

Parameters
Name | Required | Description | Default
read | No | - | -
email_id | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotentHint=true and destructiveHint=false, informing the agent that this mutation is safe to repeat and not destructive. The description adds the default behavior (read=true) but nothing beyond that, so the added value is modest.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence that is concise and front-loaded. Every word serves a purpose, and there is no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple mutation with two parameters and an output schema, the description covers the core functionality. It lacks mention of side effects on other views or confirmation of idempotency, but overall it is reasonably complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains the read parameter's effect (read or unread) and its default value (true), but the required email_id parameter is not described, leaving a small gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool marks an email as read or unread, with the verb 'mark' and resource 'email'. The phrase 'or unread with read=false' adds specificity and distinguishes it from sibling tools like search_emails or hide_emails.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide explicit guidance on when to use this tool versus alternatives like hide_emails or complete_task. Usage is implied by the name and function, but no when-not or context is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pause_email_account: Pause a connected account (grade A)
Idempotent

Pause (turn off) a connected account: stop downloading new mail while KEEPING everything already brought in. account is the connected email address. Reversible with resume_email_account. Confirm with the user before pausing — it stops their email from updating.
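
The confirmation step described above could be wired up roughly as follows. This is only a sketch: `confirm` and `call_tool` are hypothetical callables standing in for whatever prompt mechanism and MCP client the agent uses; only the tool name and its `account` argument come from the listing.

```python
def pause_account(account, confirm, call_tool):
    """Pause a connected account only after the user agrees.

    `confirm` and `call_tool` are hypothetical stand-ins (assumptions):
    confirm(prompt) -> bool, call_tool(name, arguments) -> result dict.
    """
    prompt = f"Pause {account}? New mail stops downloading until you resume it."
    if not confirm(prompt):
        return None  # user declined, so nothing is sent to the server
    return call_tool("pause_email_account", {"account": account})
```

Keeping the confirmation outside the tool call means a declined prompt never reaches the server, which matches the description's request to ask before pausing.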

Parameters
Name | Required | Description | Default
account | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotent and non-destructive. Description adds that it stops email from updating and keeps existing data, providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each serving a purpose: core action, parameter definition, reversibility, and a confirmation note. Front-loaded and no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, parameter, side effects, reversibility, and user confirmation required. Output schema exists, so return values need not be described.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description compensates by explaining 'account is the connected email address', giving meaning to the sole parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Pause (turn off) a connected account') and the effect ('stop downloading new mail while KEEPING everything already brought in'). It distinguishes from sibling tools like resume_email_account and start_email_account_sync.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states reversibility via resume_email_account and instructs to confirm with user before pausing. Provides clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

remove_email_from_list: Remove an email from a list (grade A)
Idempotent

Remove an email from one of the user's email lists. Note: if the email matches the list's rules it may be re-added when the rules are re-evaluated — edit the list's rules in the app for a permanent exclusion.

Parameters
Name | Required | Description | Default
list_id | Yes | - | -
email_id | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses important behavioral details (non-permanent removal due to rule re-evaluation) that go beyond the annotations (which mark idempotentHint=true but no destructive hint), adding significant context for the agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first states the core function, the second adds a critical caveat. No redundant words, perfectly front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's purpose, its non-permanent nature, and suggests an alternative workflow. With an output schema present, return values need not be described. Complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%. The parameters list_id and email_id are reasonably clear from the tool name and description ('Remove an email from one of the user's email lists'), but the description does not explicitly define them, leaving slight ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Remove an email') and the target resource ('from one of the user's email lists'), distinguishing it from siblings like 'add_email_to_list' or 'create_email_list'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly warns that removal may be temporary if rules re-add the email, and advises to edit list rules for permanent exclusion, providing clear guidance on when to use this tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resume_email_account: Resume a paused account (grade A)
Idempotent

Resume a previously paused account so it starts syncing again. account is the connected email address. After resuming you may need start_email_account_sync if it still shows as not syncing.

Parameters
Name | Required | Description | Default
account | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotency; description adds that resuming may not immediately start syncing, advising a separate tool if needed. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences efficiently convey purpose, parameter, and usage guidance without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and existing output schema, description fully covers usage, parameter, and follow-up steps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Description defines the sole parameter 'account' as 'the connected email address', compensating for 0% schema description coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool resumes a paused account to start syncing, distinguishing it from siblings like pause_email_account and start_email_account_sync.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates when to use (resuming paused account) and provides follow-up guidance (start_email_account_sync if not syncing), but lacks explicit 'when not to use'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_draft: Save an email draft (grade A)

Create an email draft (or update an existing one by draft_id). The draft appears in the user's Mailopoly drafts and can later be sent with send_email (requires the 'send' scope). to/cc/bcc are email addresses, comma-separated for multiple recipients. For a reply, pass reply_to_email_id (the original's email_id) — to is then optional and the reply is routed automatically when sent, including replies to connected-app messages (Slack etc.), which have no email address. body is plain text or HTML.
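
To make the recipient and reply rules above concrete, here is a minimal Python sketch that assembles an arguments object for save_draft. The helper name `build_draft_args` and the choice to accept recipients as Python lists are assumptions for illustration; the parameter names and the comma-separated format come from the description.

```python
def build_draft_args(subject, body, to=None, cc=None, bcc=None,
                     draft_id=None, reply_to_email_id=None):
    """Assemble a hypothetical arguments dict for save_draft."""
    args = {"subject": subject, "body": body}
    # Recipient fields are comma-separated strings when there are several.
    for name, addresses in (("to", to), ("cc", cc), ("bcc", bcc)):
        if addresses:
            args[name] = ",".join(a.strip() for a in addresses)
    if draft_id is not None:
        args["draft_id"] = draft_id  # update an existing draft
    if reply_to_email_id is not None:
        # For replies the description says `to` may be omitted.
        args["reply_to_email_id"] = reply_to_email_id
    return args
```

A reply draft can therefore carry only `subject`, `body`, and `reply_to_email_id`, leaving routing to the server when the draft is eventually sent.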

Parameters
Name | Required | Description | Default
cc | No | - | -
to | No | - | -
bcc | No | - | -
body | Yes | - | -
subject | Yes | - | -
draft_id | No | - | -
reply_to_email_id | No | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and destructiveHint=false, which the description supports by stating it creates/updates. The description adds behavioral context: draft appears in Mailopoly drafts, requires 'send' scope for later sending, and automatic reply routing for connected-app messages. This goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At five sentences, the description is moderately sized and front-loads the main purpose. Each sentence adds value: purpose, scope mention, recipient formats, reply behavior, body format. No waste, but could be slightly more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters and the added complexity of reply routing and connected-app messages, the description covers the core workflow well. It mentions the scope required for sending but does not describe return values. Because an output schema exists, that omission is acceptable; completeness is high but not exhaustive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description explicitly covers six of the 7 parameters: to/cc/bcc as comma-separated email addresses, reply_to_email_id for replies (making 'to' optional), draft_id for updates, and body as plain text or HTML. Only subject goes unexplained. This adds significant meaning beyond the schema's types and titles.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Create an email draft (or update an existing one by draft_id)', providing a specific verb+resource. It distinguishes from sibling tool 'send_email' by noting the draft can be sent later, and from 'list_drafts' by its creation/update function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explains when to use for new drafts or updates, and when replying (using reply_to_email_id). It mentions sibling 'send_email' for sending, implying when not to use this tool. However, it lacks explicit exclusions or comparison with other siblings like 'hide_emails'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_emails: Search emails (grade A)
Read-only, Idempotent

Search the user's emails by free text (subjects, senders, bodies, attachment text). Use 2-4 plain keywords (e.g. 'AWS cost anomaly'); do NOT pass Boolean operators or long OR-lists of synonyms — search requires ALL words to appear, so extra alternatives make it match nothing. Optional filters: sender (name or address fragment), start_date/end_date (YYYY-MM-DD), last_x_days, email_type ('received' or 'sent'), personal_or_ad ('personal' or 'advertising'). Returns matching emails with ids for use in get_email / get_action_links. Covers Mailopoly's indexed history only — for older mail beyond the indexed window (the response's indexed_history_start), use deep_search_emails.
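
As a rough illustration of the query guidance above, the following Python sketch builds an arguments object for search_emails and rejects inputs the description warns against. The helper name `build_search_args`, the operator list, and the decision to treat "2-4 keywords" as a hard check are assumptions made for this example; only the parameter names and allowed values come from the listing.

```python
import re
from datetime import date

BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}  # assumed list, for illustration
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_search_args(query, sender=None, start_date=None, end_date=None,
                      email_type=None, personal_or_ad=None):
    """Assemble a hypothetical arguments dict for search_emails."""
    words = query.split()
    if not 2 <= len(words) <= 4:
        raise ValueError("use 2-4 plain keywords")
    if any(word in BOOLEAN_OPERATORS for word in words):
        raise ValueError("Boolean operators are not supported")
    if email_type not in (None, "received", "sent"):
        raise ValueError("email_type must be 'received' or 'sent'")
    if personal_or_ad not in (None, "personal", "advertising"):
        raise ValueError("personal_or_ad must be 'personal' or 'advertising'")

    args = {"query": query}
    for name, value in (("start_date", start_date), ("end_date", end_date)):
        if value is not None:
            if not DATE_RE.fullmatch(value):
                raise ValueError(f"{name} must be YYYY-MM-DD")
            date.fromisoformat(value)  # also rejects impossible dates
            args[name] = value
    optional = {"sender": sender, "email_type": email_type,
                "personal_or_ad": personal_or_ad}
    args.update({k: v for k, v in optional.items() if v is not None})
    return args
```

Because the search requires all words to match, failing fast on long or operator-laden queries avoids a round trip that would most likely return nothing.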

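Parameters
Name | Required | Description | Default
limit | No | - | -
query | Yes | - | -
offset | No | - | -
sender | No | - | -
end_date | No | - | -
timezone | No | - | -
email_type | No | - | -
start_date | No | - | -
last_x_days | No | - | -
personal_or_ad | No | - | -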

Output Schema

Parameters
Name | Required | Description
offset | No | Offset of this page.
results | No | Email summary objects. Common keys: email_id, subject, sender_name, sender_email, date, snippet, personal_or_ad, categories, source. Pass email_id to get_email / get_action_links.
total_count | No | Total matches available (may exceed the returned page).
indexed_history_start | No | Oldest locally-indexed date (YYYY-MM-DD). Older mail needs deep_search_emails. Present when results are sparse or a start_date predates the index.

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint and idempotentHint; the description adds that results include ids for use in other tools and that the search covers only indexed history, with the boundary reported in the response's indexed_history_start field.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear front-loading of purpose, usage guidance, filter list, and boundary note. Each sentence adds value, though slightly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers essential aspects: query, filters, return value (IDs), and indexed window limitation. Omits limit/offset/timezone but retains completeness for typical usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds meaning for most parameters (query, sender, dates, email_type, personal_or_ad) beyond schema, but does not describe limit, offset, or timezone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it searches user emails by free text across subjects, senders, bodies, and attachment text. Distinguishes from siblings like deep_search_emails and get_email.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using 2-4 plain keywords, warns against Boolean operators and OR-lists, and specifies when to use deep_search_emails instead.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_email: Send an email or app reply (grade A)
Destructive

Send an email as the user through their connected account (Gmail, Outlook, @mly.life or IMAP), or reply to any message by passing reply_to_email_id (an email_id from search/feed/get_email).

Replies: with reply_to_email_id set, to/subject are optional — the recipient and threading come from the original. If the original is a message from a connected app (source != 'email', e.g. Slack), the reply is delivered back through that app as the user (same thread/DM) — do NOT pass an email address for app messages; their senders have none.

If draft_id is given, the draft's content fills any field not explicitly provided. from_account selects which connected address to send from (defaults to the account the original arrived on for replies, else the primary). content_type: 'HTML' or 'TEXT'. Subscription limits apply.
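
The draft-fill and app-reply rules above can be illustrated with a short Python sketch. It only mimics the described behavior on the client side and is not the server's implementation; the helper names and the `"slack"` source value are hypothetical, while `email_id`, `source`, and the `HTML` default come from the listing.

```python
def resolve_send_fields(explicit, draft=None):
    """Illustrate the described rule: a draft fills fields not explicitly given."""
    fields = dict(draft or {})
    # Explicit values win; None means "not provided".
    fields.update({k: v for k, v in explicit.items() if v is not None})
    fields.setdefault("content_type", "HTML")  # default shown in the parameter table
    return fields


def build_reply_args(original, body, to=None):
    """Assemble hypothetical send_email arguments for a reply.

    `original` is an email summary with `email_id` and `source` keys.
    """
    if original.get("source", "email") != "email" and to is not None:
        raise ValueError("app messages have no email address; omit `to`")
    args = {"reply_to_email_id": original["email_id"], "body": body}
    if to is not None:
        args["to"] = to
    return args
```

For example, replying to a connected-app message passes only `reply_to_email_id` and `body`, and the reply is routed back through that app as the description states.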

Parameters
Name | Required | Description | Default
cc | No | - | -
to | No | - | -
bcc | No | - | -
body | No | - | -
subject | No | - | -
draft_id | No | - | -
content_type | No | - | HTML
from_account | No | - | -
reply_to_email_id | No | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly=false, destructive=true. The description adds useful behavioral context (reply threading, draft filling, subscription limits) but does not detail error handling, rate limits, or exact consequences of destructive actions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-organized into paragraphs with clear separation of concepts. Each sentence adds value. Slightly verbose but front-loaded with essential purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers critical aspects: reply handling, draft integration, app replies, and account selection. Given the tool's complexity and the presence of an output schema, the description sufficiently prepares the agent for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description explains key parameters (reply_to_email_id, draft_id, from_account, content_type) and their interactions. However, it does not explicitly describe cc, bcc, body, or to/subject semantics beyond the reply context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool sends an email or app reply, distinguishing between new sends and replies. It highlights the specific verb 'send' and resource 'email/app reply', and differentiates from sibling tools like save_draft by explaining separate functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use guidance for reply_to_email_id, draft_id, and from_account. Explains that to/subject are optional for replies and cautions about app messages. Lacks explicit exclusions for alternative tools but gives enough context for correct invocation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_language: Set the user's language (grade A)
Idempotent

Set the user's preferred language for the emails, reports and app content Mailopoly sends them — a BCP-47 code like 'es', 'fr', 'de', 'pt', 'ja', 'ar', 'zh', 'ru'. Call this ONCE, early, as soon as you can tell from the conversation which language the user actually speaks/writes (e.g. they message you in German → call set_language('de'); their first request is in Arabic → set_language('ar')). This does NOT change how you reply in this chat — it makes their Mailopoly emails and website render natively, which they otherwise can't tell you because they never see English here. If the user clearly uses English, just skip it (English is the default). Takes effect immediately and never overrides a language the user picked themselves in Mailopoly's settings.
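
The "call once, skip for English" guidance above might be sketched as a small decision helper. The function name `language_call` and the `already_set` flag are assumptions for illustration; how an agent detects the user's language is outside this sketch.

```python
def language_call(detected, already_set=False):
    """Return hypothetical set_language arguments, or None when no call is needed.

    `detected` is a BCP-47 code inferred from the conversation, e.g. 'de'.
    """
    primary = detected.split("-")[0].lower()
    if already_set or primary == "en":
        return None  # English is the default, and the tool is meant to be called once
    return {"language": detected}
```

Returning `None` rather than an empty arguments object makes the "no call needed" case explicit to the caller.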

Parameters
Name | Required | Description | Default
language | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations show idempotent and non-destructive. Description adds context: takes effect immediately, never overrides user-set language in Mailopoly settings, and scopes effect to Mailopoly emails/website only.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with front-loaded purpose and clear usage guidance. Slightly verbose but every sentence adds value; could be trimmed slightly without losing meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple single-parameter tool and presence of output schema, the description covers purpose, usage, behavior, and parameter format completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage. The description fully compensates by specifying the BCP-47 format, listing example codes, and explaining when to use each based on user language.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (set) and resource (user's preferred language for emails, reports, and app content), with specific examples of BCP-47 codes. It distinguishes itself from the chat's reply language, which is not affected.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance: call once, early, based on user's language; skip if English (default); does not change chat replies. Provides concrete examples like 'message in German → set_language('de')'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_timezone: Set the user's timezone (grade A)
Idempotent

Update the user's timezone. timezone must be an IANA name like 'America/New_York', 'Europe/London' or 'Asia/Kolkata' — NOT an abbreviation (EST) or a UTC offset. Call this whenever the user says they've moved or are travelling, gives their location/timezone, or tells you the times you're showing are off by a fixed number of hours: the server localizes every timestamp it returns to this zone, so fixing it here corrects all of them. The change takes effect immediately.
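
A client could pre-screen the value before calling set_timezone. The sketch below is a heuristic only, written under the assumption that acceptable names follow the Area/Location pattern; the function name is hypothetical and the server remains the authority on which zones it accepts.

```python
import re

# Matches UTC offsets such as '+05:30', 'UTC+5' or 'GMT-0800'.
OFFSET_RE = re.compile(r"(?:UTC|GMT)?\s*[+-]\d{1,2}(?::?\d{2})?", re.IGNORECASE)
# Matches Area/Location names such as 'America/New_York'.
AREA_LOCATION_RE = re.compile(r"[A-Za-z_]+(?:/[A-Za-z0-9_+\-]+)+")


def looks_like_iana_zone(name):
    """Heuristic pre-check: reject offsets and bare abbreviations like 'EST'."""
    if OFFSET_RE.fullmatch(name):
        return False
    return AREA_LOCATION_RE.fullmatch(name) is not None
```

A pattern check is used here instead of a time zone database lookup because some abbreviations, including `EST`, exist as legacy entries in the tz database and would pass a plain lookup.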

Parameters
Name | Required | Description | Default
timezone | Yes | - | -

Output Schema

Parameters
Name | Required | Description
error | No | Error message when it failed.
message | No | Human-readable status/result.
success | No | Whether the action succeeded.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint true and destructiveHint false. The description adds that the change takes effect immediately and that the server localizes every timestamp it returns to this zone, without contradicting any annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, efficiently conveying purpose, parameter format, usage context, and when the change takes effect. No extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With a single parameter and an existing output schema, the description gives all necessary details: parameter constraints, when to use, and immediate effect. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The timezone parameter is fully explained: it must be an IANA name, not an abbreviation or UTC offset. This compensates for the schema's 0% description coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool updates the user's timezone, specifies the required format (IANA name), and distinguishes it from sibling tools, none of which deal with timezone settings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists scenarios when to call the tool: user moved, traveling, provided location, or noticed timestamps are off. It also explains the effect on all returned timestamps.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

snooze_task: Snooze a task (Grade: A)

Snooze a task. Either pass snooze_until (ISO datetime, in the given timezone — defaults to the user's own timezone) or a relative duration (duration_value + duration_unit, e.g. 3 + 'days', 2 + 'hours', 1 + 'weeks').

Parameters
Name            Required  Description  Default
task_id         Yes
timezone        No
snooze_until    No
duration_unit   No
duration_value  No
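
The two timing modes in the description can be sketched as a small argument builder. This is a minimal sketch, not the server's own validation: the helper name `build_snooze_args` is hypothetical, `task_id` is assumed to be a string, and the requirement that exactly one mode be used is a reading of the word "Either" in the description. The set of accepted `duration_unit` values beyond the three examples is not documented, so it is not checked here.

```python
from typing import Any, Dict, Optional


def build_snooze_args(
    task_id: str,
    snooze_until: Optional[str] = None,
    duration_value: Optional[int] = None,
    duration_unit: Optional[str] = None,
    timezone: Optional[str] = None,
) -> Dict[str, Any]:
    """Build snooze_task arguments using exactly one timing mode (hypothetical helper)."""
    has_absolute = snooze_until is not None
    has_relative = duration_value is not None or duration_unit is not None

    # Assumption: the two modes are mutually exclusive and one of them is required.
    if has_absolute == has_relative:
        raise ValueError("pass either snooze_until or duration_value + duration_unit")

    args: Dict[str, Any] = {"task_id": task_id}
    if has_absolute:
        args["snooze_until"] = snooze_until  # ISO datetime string
        if timezone is not None:
            # Per the description, timezone controls how snooze_until is read;
            # when omitted, the user's own timezone applies.
            args["timezone"] = timezone
        return args

    # Relative mode needs both halves of the duration; timezone is not sent here.
    if duration_value is None or duration_unit is None:
        raise ValueError("relative snooze needs both duration_value and duration_unit")
    args["duration_value"] = duration_value
    args["duration_unit"] = duration_unit
    return args
```

For example, `build_snooze_args("t1", duration_value=3, duration_unit="days")` produces the relative form, while passing `snooze_until` with an optional `timezone` produces the absolute form.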

Output Schema

Parameters
Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=false and destructiveHint=false. The description adds context about timezone handling and the two ways to specify snooze time, but does not detail side effects (e.g., whether notifications are suppressed or task state changes). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the action. It is concise and to the point, though a slightly more structured presentation (e.g., bullet points) could improve readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema (which presumably defines return values), the description covers the core inputs adequately. It lacks mention of confirmation or state changes, but for a simple tool this is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It effectively explains snooze_until (ISO datetime) and the duration parameters (duration_value + duration_unit with examples), adding meaning beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('snooze a task') and explains the two input modes for timing (ISO datetime or relative duration). It distinguishes from sibling tools like 'complete_task' by specifying a deferral operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the two ways to specify the snooze time (an absolute datetime or a relative duration) and mentions the default timezone behavior. It does not explicitly state when not to use this tool or provide comparisons to alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

start_email_account_sync: Start syncing a connected account (Grade: A)

Start (or retry) syncing a connected account that isn't syncing yet — e.g. one with status 'not_syncing' (connected during signup but never activated) or 'pending'. account is the connected email address. This pulls in the account's mail, tasks, bills and events. If activating it would use one of the plan's paid email-account slots, the first call returns requires_confirmation with a message; confirm with the user, then call again with confirm_billing=true.

Parameters
Name             Required  Description  Default
account          Yes
confirm_billing  No
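
The two-call billing confirmation described above can be sketched as follows. This is an illustration under stated assumptions: `call_tool` and `ask_user` are hypothetical callables supplied by the client, and the check for a truthy `requires_confirmation` key is a guess at how that response is shaped, since the output schema lists only `error`, `message`, and `success`.

```python
from typing import Any, Callable, Dict

# Hypothetical client hook: takes a tool name and arguments, returns the tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def start_sync_with_confirmation(
    call_tool: ToolCaller,
    ask_user: Callable[[str], bool],
    account: str,
) -> Dict[str, Any]:
    """Start syncing an account, asking the user before a paid slot is used."""
    first = call_tool("start_email_account_sync", {"account": account})

    # Assumption: a pending billing confirmation appears as a truthy
    # "requires_confirmation" key; the real response shape may differ.
    if not first.get("requires_confirmation"):
        return first

    # Show the server's message to the user and only proceed on an explicit yes.
    if not ask_user(first.get("message", "")):
        return {"success": False, "message": "User declined the billing change."}

    return call_tool(
        "start_email_account_sync",
        {"account": account, "confirm_billing": True},
    )
```

Keeping the user prompt between the two calls means `confirm_billing=true` is never sent without the user having seen the server's message.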

Output Schema

Parameters
Name     Required  Description
error    No        Error message when it failed.
message  No        Human-readable status/result.
success  No        Whether the action succeeded.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses the effect of the tool: it pulls in mail, tasks, bills, and events. It also explains the requires_confirmation response and how to proceed, which is beyond the annotations (readOnlyHint=false, destructiveHint=false). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (roughly 70 words), front-loaded with the core purpose, and every sentence adds value. It structures information logically: purpose, conditions, effects, billing note. No redundant text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all essential aspects: purpose, when to use, side effects, parameter meanings, and the confirmation flow. Since an output schema exists, the lack of return value description is acceptable. The tool's complexity is fully addressed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description explains both parameters: account is 'the connected email address', and confirm_billing is used in the billing confirmation context. This adds meaning beyond the schema's minimal titles.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: start or retry syncing a connected account that isn't syncing yet, with specific statuses mentioned ('not_syncing', 'pending'). It distinguishes from sibling tools like check_email_sync, which merely checks sync status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (for accounts with 'not_syncing' or 'pending' status) and provides a clear workflow for the billing confirmation scenario, including instructions to call again with confirm_billing=true after user confirmation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

start_email_connection: Start connecting an email account (Grade: A)

Begin connecting an email account (or reconnecting one whose access expired) by returning a secure Mailopoly link for the user to open. Pass email_or_provider (the address or provider they want to add) for a NEW connection, or account (an existing connected address) to RECONNECT one flagged reauthorization_required. The link opens Mailopoly's own page where they sign in (OAuth) or enter an app password — the password is NEVER typed into the chat. For IMAP users, call get_connect_instructions first so you can tell them how to get their app password, then give them this link. Relay the returned url to the user.

Parameters
Name               Required  Description  Default
account            No
email_or_provider  No
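
The choice between a new connection and a reconnection, and the step of relaying the returned link, might look like this on the client side. This is a rough sketch: both helper names are hypothetical, and treating the two parameters as mutually exclusive is an interpretation of the description, not something the schema states.

```python
from typing import Any, Dict, Optional


def build_connection_args(
    email_or_provider: Optional[str] = None,
    account: Optional[str] = None,
) -> Dict[str, str]:
    """Pick the start_email_connection argument for a new or existing account."""
    # Assumption: exactly one of the two parameters should be sent.
    if (email_or_provider is None) == (account is None):
        raise ValueError("pass exactly one of email_or_provider or account")
    if account is not None:
        # Reconnect an existing address flagged reauthorization_required.
        return {"account": account}
    # New connection: an address or a provider name.
    return {"email_or_provider": email_or_provider}


def link_to_relay(result: Dict[str, Any]) -> str:
    """Return the url to show the user, or raise with the server's explanation."""
    url = result.get("url")
    if result.get("success") is False or not url:
        detail = result.get("error") or result.get("message") or "no link returned"
        raise RuntimeError(detail)
    return url
```

The client only ever handles the link; any password entry happens on the page the link opens, which matches the description's rule that the password is never typed into the chat.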

Output Schema

Parameters
Name          Required  Description
url           No        In-app URL for the user to open.
error         No
method        No        Always 'link'.
reason        No        connect | reconnect.
message       No
success       No
instructions  No

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses important behavioral traits beyond the annotations: the tool returns a secure Mailopoly link, the user signs in on Mailopoly's own page (OAuth or an app password), and the password is never typed into the chat. The annotations provide only basic hints (readOnlyHint=false, etc.), so the description carries most of the burden and does so well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately long, but every sentence adds essential value. It is well structured: purpose first, then parameter roles, the security note, the IMAP prerequisite, and finally the instruction to relay the URL. No redundant or irrelevant content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (2 optional parameters, an existing output schema, and only basic annotations to lean on), the description covers all necessary aspects: what the tool does, how to use it for a new connection versus a reconnection, the prerequisite step for IMAP, and the nature of the output (a url to relay). It is fully adequate for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description fully explains both parameters: email_or_provider for new connections and account for reconnections, including their mutually exclusive use. This adds significant meaning beyond the schema's minimal type information.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool begins connecting an email account, distinguishing between new connections (using email_or_provider) and reconnections (using account). It uses specific verbs and resources, and differentiates from sibling tools like verify_email_account or get_connect_instructions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use each parameter (new vs reconnection) and advises calling get_connect_instructions for IMAP users first. However, it does not explicitly state when not to use this tool or list alternatives beyond that, which reduces the score from a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_email_account: Verify a connected mailbox (Grade: A)
Read-only, Idempotent

Run a live check on one connected account and explain what's wrong, if anything. account is the connected email address (from list_email_accounts). Returns a status: ok, wrong_mailbox (signed in but the real mail is hosted elsewhere — see imap_suggestion), provider_mismatch, empty_mailbox, token_expired (needs reconnecting via start_email_connection), or wrong_provider. Use this when a user says their email isn't syncing to tell them precisely why and what to do next.

Parameters
Name     Required  Description  Default
account  Yes
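
One way a client might turn the returned status into a next step is sketched below. The helper `next_step` is hypothetical, the wording of each message is illustrative, and only the three statuses whose handling the description spells out get their own branch; every other status falls back on the `suggested_action` or `message` field from the output schema. The type of `imap_suggestion` is not documented, so it is simply formatted as text.

```python
from typing import Any, Dict


def next_step(result: Dict[str, Any]) -> str:
    """Summarize what to do after a verify_email_account call (hypothetical helper)."""
    status = result.get("status")
    account = result.get("email_address") or "the account"

    if status == "ok":
        return f"No problem found with {account}."
    if status == "token_expired":
        # The description says this status needs reconnecting.
        return f"Reconnect {account} via start_email_connection."
    if status == "wrong_mailbox":
        # Signed in, but the real mail is hosted elsewhere; see imap_suggestion.
        suggestion = result.get("imap_suggestion")
        hint = f" Suggested IMAP setup: {suggestion}." if suggestion else ""
        return f"{account} is signed in, but its mail is hosted elsewhere.{hint}"

    # Other statuses: prefer the server's own guidance when it is present.
    return (
        result.get("suggested_action")
        or result.get("message")
        or f"Unhandled status: {status}"
    )
```

Falling back on the server's own fields keeps the client from guessing at statuses such as `unsupported_provider` or `no_token`, which appear in the output schema but are not explained in the description.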

Output Schema

Parameters
Name               Required  Description
status             No        ok | wrong_mailbox | provider_mismatch | empty_mailbox | token_expired | wrong_provider | unsupported_provider | no_token | not_found.
message            No
verified           No
email_address      No
message_count      No
imap_suggestion    No
suggested_action   No
detected_provider  No

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses behavioral traits beyond annotations: readOnlyHint=true is supported by 'live check', and the description details possible return statuses and their implications, including next steps. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Compact (~70 words) yet packed with essential information: purpose, parameter, return values, and usage context. Front-loaded with the verb and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (1 param, output schema exists), the description fully covers behavior: what it does, what it returns, and how to interpret results. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single parameter 'account' is explained: '`account` is the connected email address (from list_email_accounts).' This adds meaning beyond the schema's 'Account' and provides a sourcing hint.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Run a live check on one connected account and explain what's wrong, if anything.' It specifies the verb (verify/check) and resource (connected mailbox/account), and distinguishes from siblings like check_email_sync by detailing the specific statuses returned.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit when to use: 'Use this when a user says their email isn't syncing.' Also mentions alternative action for token_expired (reconnect via start_email_connection). Lacks explicit 'when not to use' but provides clear context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


Resources