Courier
Server Details
Send notifications, manage templates, and configure integrations with Courier.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: trycourier/courier-mcp
- GitHub Stars: 1
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average score of 3/5 across 57 of the 59 tools scored.
Most tools have distinct purposes with clear boundaries, such as separate operations for users, messages, lists, and automations. However, some overlap exists between similar tools like send_message and send_message_template, or create_or_merge_user and replace_profile, which could cause minor confusion in selection.
Tool names follow a highly consistent verb_noun pattern throughout, such as create_brand, get_message, list_audiences, and update_translation. There are no deviations in naming conventions, making the set predictable and easy to parse.
With 59 tools, the count is excessive for the domain of a notification and messaging platform. This large number suggests over-fragmentation, such as having separate tools for similar operations (e.g., multiple send and list variations), which can overwhelm agents and reduce usability.
The tool set provides comprehensive coverage for the notification and user management domain, including CRUD operations for users, messages, lists, tenants, and automations, as well as specialized functions like JWT generation and audit tracking. No obvious gaps are present, enabling full lifecycle management.
Available Tools
59 tools

add_bulk_users (Grade: C)
Add users to an existing bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| users | Yes | Array of user objects to add | |
| job_id | Yes | The bulk job ID | |
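As a sketch, a call using the two required parameters above might look like the following. The job ID and the fields inside each user object are hypothetical placeholders; the actual user-object shape is not documented in the schema.

```python
# Hypothetical add_bulk_users payload. Only "job_id" and "users" come from
# the documented schema; the per-user field names are illustrative.
payload = {
    "job_id": "job_123",  # ID obtained from an earlier create_bulk_job call
    "users": [
        {"recipient": "user-001", "profile": {"email": "a@example.com"}},
        {"recipient": "user-002", "profile": {"email": "b@example.com"}},
    ],
}

# Both parameters are required, so a minimal client-side check might be:
assert payload["job_id"] and payload["users"]
```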
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. It states the tool adds users, implying a mutation, but doesn't disclose permissions needed, rate limits, idempotency, or what happens on failure (e.g., partial updates). For a mutation tool with zero annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it easy to parse quickly. No structural issues or redundancy are present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain return values, error conditions, or side effects (e.g., whether the bulk job status changes). Given the complexity of bulk operations and lack of structured data, more context is needed for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('job_id', 'users') documented in the schema. The description adds nothing beyond the schema and leaves open details such as what constitutes a valid 'user object'. A baseline score of 3 applies since the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Add users to an existing bulk job' clearly states the action (add) and target (users to bulk job), but it's vague about what a 'bulk job' entails and doesn't distinguish from sibling tools like 'create_bulk_job' or 'list_bulk_users'. It provides basic purpose but lacks specificity about the resource context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., needing an existing bulk job from 'create_bulk_job'), exclusions, or comparisons to similar tools like 'add_user_to_tenant'. Usage context is implied but not explicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_user_to_tenant (Grade: C)
Add a user to a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Tenant-scoped profile overrides | |
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It states 'Add a user to a tenant' but fails to explain critical aspects: whether this is a mutation (implied), what permissions are required, if it's idempotent, what happens on duplicate adds, or the response format. This leaves significant gaps in understanding the tool's effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, making it highly concise and front-loaded. It efficiently communicates the core action without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavior, error conditions, return values, and how it differs from siblings. For a tool that modifies system state, more context is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for 'user_id' and 'tenant_id', and 'profile' as 'Tenant-scoped profile overrides'. The description adds no additional parameter semantics beyond the schema, but since the schema is well-documented, a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add') and target ('a user to a tenant'), making the purpose immediately understandable. However, it does not differentiate from the sibling tool 'remove_user_from_tenant' or explain what 'adding' entails (e.g., granting access, assigning roles). This clarity is good but lacks sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'create_or_merge_user' or 'list_user_tenants'. It also omits prerequisites (e.g., user and tenant must exist) or exclusions (e.g., cannot add duplicate users). Without such context, usage is ambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cancel_message (Grade: B)
Cancel a message that is currently being delivered. Returns the message details with updated status.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to cancel | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the action ('cancel') and return value ('message details with updated status'), but doesn't address critical behavioral aspects: whether cancellation is reversible, what permissions are required, whether there are rate limits, what happens if the message is already delivered, or what specific status changes occur. For a mutation tool with zero annotation coverage, this leaves significant gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve distinct purposes: the first states the action and target, the second describes the return value. There's zero wasted language, and the most important information (what the tool does) is front-loaded. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficiently complete. While it states the basic action and return, it doesn't address error conditions, side effects, permissions, or what 'cancel' actually means operationally. The agent would need to guess about many behavioral aspects, making this description inadequate for safe tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with the single parameter 'message_id' well-documented in the schema. The description doesn't add any parameter-specific information beyond what the schema already provides (no format examples, no constraints on valid message IDs). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('cancel') and resource ('message that is currently being delivered'), making the purpose immediately understandable. It distinguishes from siblings like 'delete_message' or 'get_message' by focusing on in-progress messages. However, it doesn't explicitly differentiate from all possible message-related operations in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('message that is currently being delivered'), suggesting this tool is for interrupting active deliveries rather than deleting sent messages. However, it doesn't provide explicit guidance on when NOT to use it or name alternatives. The guidance is contextual but incomplete.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
courier_installation_guide (Grade: A)
Get the Courier SDK installation guide for a specific platform. For client-side SDKs (React, iOS, Android, Flutter, React Native), also generates a sample JWT.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | No | User ID for JWT generation (client-side SDKs only). Defaults to "example_user". | |
| platform | Yes | The platform to get the installation guide for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the tool retrieves installation guides and generates JWTs for client-side SDKs, which is useful behavioral context. However, it doesn't mention potential side effects, authentication requirements, rate limits, or response format details, leaving gaps for a tool that likely involves external resources.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the core functionality and conditional behavior. Every word earns its place, with no redundancy or unnecessary elaboration, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is moderately complete for a tool with 2 parameters and high schema coverage. It covers the main action and conditional JWT generation, but lacks details on output format, error handling, or dependencies, which could be important for an installation guide tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds marginal value by implying that 'user_id' is only relevant for client-side SDKs, but this is partially covered in the schema's description. Baseline 3 is appropriate since the schema does most of the work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Get', 'generates') and resources ('Courier SDK installation guide', 'sample JWT'). It distinguishes this tool from siblings by focusing on installation guides rather than user management, messaging, or other operations listed in the sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: to get installation guides for specific platforms. It implicitly distinguishes usage by specifying that for client-side SDKs, it also generates a sample JWT, but it doesn't explicitly state when not to use it or name alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_brand (Grade: C)
Create a new brand with name, colors, and email/inapp settings.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Optional brand ID; auto-generated if omitted | |
| name | Yes | Brand display name | |
| settings | No | Brand settings (colors, email, inapp) | |
| snippets | No | Brand snippets | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Create' implies a write/mutation operation, the description doesn't specify permissions needed, whether the operation is idempotent, error conditions, or what happens on success (e.g., returns the created brand object). For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create a new brand') and specifies key attributes without unnecessary words. Every part of the sentence contributes directly to understanding the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., the created brand object), error handling, or behavioral nuances like whether 'id' generation is guaranteed to be unique. Given the complexity of nested objects in the schema, more context would be helpful for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description mentions 'name, colors, and email/inapp settings', which aligns with the 'name' and 'settings' parameters in the schema but doesn't add meaningful semantics beyond what the schema provides. The baseline score of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create a new brand') and specifies the key attributes involved ('name, colors, and email/inapp settings'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'list_brands' or 'get_brand', but the verb 'Create' is sufficiently distinct from 'list' or 'get' operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., authentication requirements), when not to use it, or how it relates to sibling tools like 'list_brands' for viewing existing brands. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_bulk_job (Grade: A)
Create a new bulk job for sending messages to multiple recipients. Workflow: create_bulk_job → add_bulk_users → run_bulk_job.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Bulk message definition with event/template and content | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions the workflow, it doesn't disclose critical behavioral traits such as permissions required, whether the job is saved or transient, error handling, or what happens if the job isn't run. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded, with two sentences that efficiently convey the purpose and workflow. Every sentence earns its place, and there is no wasted verbiage or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with a nested object parameter) and no annotations or output schema, the description is moderately complete. It covers the purpose and workflow but lacks details on behavioral aspects like side effects, permissions, or return values. For a tool with these gaps, it's adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'message' documented as 'Bulk message definition with event/template and content.' The description adds no additional parameter semantics beyond what the schema provides. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Create a new bulk job for sending messages to multiple recipients.' It specifies the verb ('create') and resource ('bulk job'), and distinguishes it from siblings like 'run_bulk_job' by indicating it's the first step in a workflow. However, it doesn't fully differentiate from other creation tools like 'create_list' or 'create_brand' beyond the resource type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage context by outlining the workflow: 'create_bulk_job → add_bulk_users → run_bulk_job.' This clearly indicates when to use this tool (as the first step) and references sibling tools for subsequent steps. It doesn't explicitly state when not to use it or name alternatives, but the workflow guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
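The three-step workflow named in create_bulk_job's description can be sketched as a small driver. Only the tool names and the documented parameters ('message', 'users', 'job_id') come from the definitions above; `call_tool` and the response key are assumptions, since no output schema is provided.

```python
# Hypothetical driver for the documented workflow:
# create_bulk_job -> add_bulk_users -> run_bulk_job.
# call_tool stands in for whatever MCP client invocation is in use.
def run_bulk_send(call_tool, message, users):
    job = call_tool("create_bulk_job", {"message": message})
    job_id = job["job_id"]  # assumed response key; no output schema is documented
    call_tool("add_bulk_users", {"job_id": job_id, "users": users})
    call_tool("run_bulk_job", {"job_id": job_id})
    return job_id
```

The point of the sketch is the ordering constraint the description encodes: users can only be attached to a job that already exists, and nothing is sent until the job is explicitly run.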
create_list (Grade: C)
Create or update a list by list ID.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the list | |
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'create or update' but doesn't specify whether this is an upsert operation, what happens if the list ID doesn't exist, required permissions, or error conditions. This is inadequate for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded with the core action, though it could be more structured by explicitly separating creation and update scenarios. Overall, it's concise but under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavior (e.g., upsert logic), error handling, or return values. Given the complexity of a 'create or update' operation, this leaves significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('list_id' and 'name'). The description adds no additional meaning beyond what the schema provides, such as format constraints or usage examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the action ('create or update') and resource ('a list by list ID'), which is clear but vague about the distinction between creation and update. It doesn't differentiate from sibling tools like 'create_brand' or 'create_or_merge_user', leaving ambiguity about when to use this specific list tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites, such as whether the list ID must exist for updates, or when to choose this over tools like 'create_bulk_job' or 'update_audience'. This leaves the agent without context for decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_merge_user (Grade: B)
Create a new user profile or merge supplied values into an existing profile (POST). Existing fields not included are preserved.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Profile data to create or merge (e.g. { email: "...", phone_number: "..." }) | |
| user_id | Yes | The user ID | |
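The preserve-on-merge behavior the description claims ("Existing fields not included are preserved") can be illustrated with plain dictionaries. This is a sketch of the semantics only, not Courier's implementation; the field names are hypothetical.

```python
# Merge semantics sketch: supplied fields overwrite or are added,
# fields absent from the supplied profile are left untouched.
existing = {"email": "old@example.com", "phone_number": "+15550100"}
supplied = {"email": "new@example.com", "locale": "en-US"}

merged = {**existing, **supplied}
# phone_number survives because it was not included in the update.
```

This is exactly the behavior that distinguishes create_or_merge_user from replace_profile, which (by its name) would discard the unsupplied fields instead.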
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it's a POST operation (implying mutation), performs an upsert (create or merge), and preserves existing fields not included. However, it misses details like authentication requirements, error conditions, rate limits, or what happens on conflicts, leaving gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('create or merge') and includes essential behavioral detail ('Existing fields not included are preserved'). It avoids redundancy, though it could be slightly more structured for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is moderately complete. It covers the upsert behavior and field preservation, but lacks information on response format, error handling, permissions, or side effects. Given the complexity of user profile operations, more context would be beneficial.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value by hinting at the merge behavior and example profile data, but doesn't provide additional syntax, format, or constraints beyond what the schema specifies. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or merge') and resource ('user profile'), specifying it's a POST operation. It distinguishes from siblings like 'add_user_to_tenant' or 'replace_profile' by emphasizing the merge behavior, though it doesn't explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context through 'create or merge' and mentions preservation of existing fields, suggesting it's for upsert operations. However, it lacks explicit guidance on when to use this versus alternatives like 'create_or_replace_user_push_token' or 'replace_profile', and doesn't specify prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_replace_user_push_token (Grade: C)
Create or replace a push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token string | |
| device | No | Device metadata | |
| user_id | Yes | The user ID | |
| provider_key | Yes | Push provider | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('create or replace') but doesn't explain what 'replace' entails (e.g., overwriting existing tokens), potential side effects, authentication requirements, or error conditions. This leaves significant gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and resource, making it easy to parse quickly without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is inadequate. It doesn't cover behavioral aspects like idempotency, error handling, or response format. Given the complexity of managing user tokens and the lack of structured data, more context is needed to guide proper usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no additional meaning beyond the schema, such as explaining the relationship between 'token' and 'device' or clarifying the 'provider_key' enum values. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('push/device token for a user'), making the purpose unambiguous. It doesn't explicitly differentiate from sibling tools like 'get_user_push_token' or 'list_user_push_tokens', but the action is distinct enough to avoid confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_user_push_token' or 'list_user_push_tokens'. It lacks context about prerequisites, such as user existence or permissions, and doesn't mention any exclusions or specific scenarios for its application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_update_tenant (C)
Create or replace a tenant. Tenants represent organizations or groups that users belong to.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the tenant | |
| brand_id | No | Brand ID to associate with this tenant | |
| tenant_id | Yes | The tenant ID | |
| properties | No | Custom properties for the tenant | |
| user_profile | No | Default profile data for users in this tenant | |
| parent_tenant_id | No | Parent tenant ID for hierarchical tenants | |
| default_preferences | No | Default notification preferences for users in this tenant | |
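With seven parameters, this tool benefits from a worked payload. The sketch below is a hypothetical arguments object built only from the parameter table: the tenant IDs, region property, and profile data are placeholders, and the same required-field check pattern applies.

```python
def missing_required(args, required=("tenant_id", "name")):
    """Return any required create_or_update_tenant fields absent from args."""
    return [field for field in required if field not in args]

# Hypothetical payload based on the parameter table; IDs and properties are placeholders.
example_args = {
    "tenant_id": "acme-eu",
    "name": "ACME Europe",
    "parent_tenant_id": "acme",             # optional: hierarchical tenants
    "properties": {"region": "eu-west-1"},  # optional custom properties
}
```

Note that because the tool performs a replace rather than a merge, a cautious client would send the complete desired tenant state, not just the changed fields.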
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states that the tool will 'create or replace' a tenant, implying mutation, but doesn't disclose behavioral traits like whether it's idempotent, what permissions are required, what happens on replacement (e.g., data loss), or error conditions. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste. The first sentence states the action and resource, and the second provides helpful context about what tenants represent. It's appropriately sized and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral details (e.g., idempotency, side effects), usage guidance, and output expectations. The schema covers parameters well, but the description doesn't compensate for missing annotations or output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so all parameters are documented in the schema. The description adds no parameter-specific information beyond the general purpose. It doesn't explain how parameters like 'tenant_id' or 'properties' affect the operation. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('tenant'), and explains what tenants represent. It distinguishes the tool from siblings like 'delete_tenant' and 'get_tenant' by specifying its mutative nature, but it doesn't explicitly differentiate from other tenant-related tools such as 'list_tenants' beyond the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when to choose creation versus replacement, or how it relates to sibling tools like 'delete_tenant' or 'list_tenants'. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_audience (C)
Delete an audience by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID to delete | |
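Because the description gives no hint about reversibility, a cautious client might gate this single-parameter call behind an explicit confirmation. This is a sketch: the call shape follows the common MCP convention of a tool name plus an arguments object, and the confirmation guard is an assumed client-side practice, not part of Courier's API.

```python
def build_delete_audience_call(audience_id, confirmed=False):
    """Assemble a delete_audience tool call, refusing without explicit confirmation.

    The description does not say whether deletion is reversible, so this guard
    treats it as permanent.
    """
    if not confirmed:
        raise ValueError("refusing destructive delete_audience call without confirmation")
    return {"name": "delete_audience", "arguments": {"audience_id": audience_id}}
```

The same guard pattern applies to the other delete tools in this set (delete_profile, delete_tenant, delete_user_list_subscriptions), none of which disclose reversibility either.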
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the action is 'Delete,' implying a destructive mutation, but doesn't disclose critical behaviors: whether deletion is permanent, whether specific permissions are required, whether there are side effects (e.g., on related data), or whether a confirmation is returned. This is inadequate for a destructive tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly concise and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a destructive mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., irreversibility, auth needs), expected outcomes, or error handling, which are critical for safe and effective use in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' fully documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., format, validation rules, or examples), so it meets the baseline for high coverage without compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('an audience by its ID'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'delete_profile' or 'delete_tenant' beyond the resource name, which slightly limits its distinctiveness.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an existing audience ID), exclusions, or related tools like 'get_audience' for verification, leaving usage context unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_profile (C)
Delete a user profile permanently.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is 'permanent', which is a critical behavioral trait, but fails to mention other important aspects such as required permissions, what happens to associated data, or error conditions. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core action and key qualifier ('permanently') without any wasted words. It's appropriately sized and front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with no annotations and no output schema, the description is insufficient. It mentions permanence but omits critical context like permissions needed, side effects, return values, or error handling. Given the high-stakes nature of profile deletion, more comprehensive guidance is warranted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'user_id' documented as 'The user ID to delete'. The description doesn't add any additional semantic context beyond this, such as format examples or constraints. Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('user profile') with the qualifier 'permanently', which adds specificity. However, it doesn't explicitly differentiate from sibling tools like 'delete_tenant' or 'delete_audience', which also perform deletion operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user must exist), exclusions, or comparisons to related tools like 'replace_profile' or 'delete_tenant', leaving the agent with no contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_tenant (C)
Delete a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is 'Delete' but lacks critical details: whether this is irreversible, requires specific permissions, affects associated data, or has rate limits. This is inadequate for a destructive operation with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste—it directly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits (e.g., irreversibility, side effects), usage context, and expected outcomes, leaving significant gaps in understanding how to safely and effectively invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'tenant_id' fully documented in the schema. The description adds minimal value by mentioning 'by its ID', which aligns with the schema but doesn't provide additional context like ID format or examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('a tenant by its ID'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'delete_audience' or 'delete_profile' beyond the resource type, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'remove_user_from_tenant' or 'delete_audience', nor does it mention prerequisites, exclusions, or context for deletion. The description only states what it does, not when or why to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_user_list_subscriptions (C)
Delete all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is a deletion but does not specify whether this is reversible, requires admin permissions, affects user data permanently, or has side effects like notifications. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, clearly front-loading the core action. It efficiently communicates the essential purpose without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is inadequate. It lacks details on behavior, error handling, return values, or safety considerations, leaving the agent with insufficient context to use it correctly in complex scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'user_id' documented as 'The user ID'. The description adds no additional meaning beyond this, such as format examples or scope details. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('all list subscriptions for a user'), making the purpose unambiguous. However, it does not explicitly differentiate from sibling tools like 'unsubscribe_user_from_list' or 'delete_audience', which handle related but different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'unsubscribe_user_from_list' (for single subscriptions) or 'delete_audience' (for broader data removal). It lacks context about prerequisites, consequences, or typical scenarios for bulk deletion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_jwt_for_user (B)
Generate a JWT authentication token for a user. Used for client-side SDK auth (Inbox, Preferences, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| scopes | No | Permission scopes for the token | |
| user_id | Yes | The user ID to scope the token to | |
| expires_in | No | Token expiry duration (e.g. "1h", "2 days") | 1h |
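The `expires_in` default is the only documented default in this tool set, so it is worth showing how a client might apply it. The sketch below fills in the documented '1h' default before dispatch; the scope string shown is an assumed placeholder, since the listing does not document valid scope values.

```python
def with_defaults(args):
    """Apply the documented expires_in default ('1h') to generate_jwt_for_user args."""
    filled = dict(args)
    filled.setdefault("expires_in", "1h")  # default from the parameter table
    return filled

# Hypothetical call; the scope name is a placeholder, not a documented value.
example_args = with_defaults({
    "user_id": "user-123",
    "scopes": ["read:messages"],  # assumed scope string format
})
```

Because the server presumably applies the same default, this client-side fill is optional; it mainly makes the effective expiry explicit in logs and audits.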
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions the token is for 'authentication' and 'client-side SDK auth,' which implies security-sensitive behavior, but doesn't disclose critical traits like required permissions, rate limits, or whether this operation is idempotent. For a token generation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with zero waste. It front-loads the core purpose and follows with usage context, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (security-sensitive token generation), lack of annotations, and no output schema, the description is incomplete. It covers the basic purpose and usage but misses behavioral details like auth requirements, token format, or error handling. This is adequate for a minimal viable description but has clear gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters documented in the schema. The description adds no additional parameter semantics beyond what the schema provides (e.g., it doesn't explain scopes or expiry formats further). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate a JWT authentication token for a user.' It specifies the verb ('generate') and resource ('JWT authentication token'), and mentions the target ('for a user'). However, it doesn't explicitly differentiate from sibling tools, as none appear to be direct alternatives for token generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context: 'Used for client-side SDK auth (Inbox, Preferences, etc.).' This suggests when to use it (for SDK authentication) but doesn't explicitly state when not to use it or name alternatives. No prerequisites or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audience (C)
Get an audience by its ID, including its filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. The verb 'Get' implies a read-only operation, but the description does not disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or response format. This leaves significant gaps for a tool with no structured safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action and resource. It avoids redundancy and wastes no words, making it easy to parse quickly. Every part of the sentence contributes directly to understanding the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It does not explain what 'including its filter definition' entails in the return value, error conditions, or any side effects. For a read operation with minimal structured support, more behavioral context is needed to fully guide an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' documented in the schema. The description adds no additional meaning beyond implying the ID retrieves an audience with its filter definition, which is already covered by the tool's purpose. Baseline 3 is appropriate as the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('an audience by its ID'), specifying that the filter definition is included. It is implicitly distinct from siblings like 'list_audiences' (which returns multiple audiences) and 'delete_audience' (which removes one), though it does not name them. The lack of explicit sibling differentiation keeps it short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites (e.g., needing a valid audience ID), exclusions, or comparisons to siblings like 'list_audiences' for browsing or 'get_audience_members' for details. The description implies usage when an ID is known, but offers no explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audit_event (C)
Get a specific audit event by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audit_event_id | Yes | The audit event ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves a specific audit event but doesn't mention whether this is a read-only operation, if it requires specific permissions, what happens if the ID is invalid, or any rate limits. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's appropriately sized and front-loaded, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It doesn't explain what an audit event contains, the format of the return value, or error conditions. For a tool that likely returns structured data, more context is needed to guide the agent effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'audit_event_id' fully documented in the schema. The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline score of 3 for adequate coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'a specific audit event by its ID', making the purpose unambiguous. However, it doesn't differentiate from its sibling 'list_audit_events', which would require explicit comparison to achieve a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_audit_events' or other audit-related tools. It lacks any context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_brand (B)
Get a brand by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| brand_id | Yes | The brand ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states a read operation ('Get'), implying it's likely safe, but doesn't mention permissions, rate limits, error handling, or what happens if the ID is invalid. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words, making it easy to parse. It's front-loaded with the core action and resource, which is ideal for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, 100% schema coverage, no output schema), the description is adequate but minimal. It covers the basic purpose but lacks details on usage, behavioral traits, or return values, which could be helpful for an agent in a broader context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the parameter 'brand_id' fully documented in the schema. The description adds no additional meaning beyond implying the parameter is required, which the schema already states. This meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a brand') with the specific identifier ('by its ID'), making the purpose unambiguous. However, it doesn't differentiate from sibling tools like 'list_brands' or 'get_audience', which follow similar patterns, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'list_brands' for browsing or other 'get_' tools for different resources. The description is minimal and offers no context on prerequisites or exclusions, leaving usage decisions unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_bulk_job (Grade: C)
Get the status of a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID | |
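Because the description leaves the status vocabulary undocumented, an agent typically has to poll. A minimal polling sketch, assuming hypothetical 'pending'/'processing'/'completed'/'failed' state names and a generic call_tool callable — none of which the server confirms:

```python
import time

def poll_bulk_job(call_tool, job_id: str, interval: float = 0.0, max_attempts: int = 10) -> dict:
    """Poll get_bulk_job until an assumed terminal state or the attempt budget runs out."""
    terminal = {"completed", "failed"}  # assumed terminal state names
    for _ in range(max_attempts):
        result = call_tool("get_bulk_job", {"job_id": job_id})
        if result.get("status") in terminal:
            return result
        time.sleep(interval)
    raise TimeoutError(f"bulk job {job_id} still running after {max_attempts} polls")

# Fake transport for illustration: the job finishes on the third poll.
_responses = iter([{"status": "pending"}, {"status": "processing"}, {"status": "completed"}])
final = poll_bulk_job(lambda name, args: next(_responses), "job-123")
```

A better description would make the terminal set and polling guidance explicit instead of leaving them to guesswork like this.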
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves status but doesn't describe what the status includes (e.g., progress percentage, success/failure, error details), whether it's read-only (implied but not confirmed), or any rate limits or authentication requirements. This leaves significant gaps for an agent to understand how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded with the core action ('Get the status'), making it easy to parse. There is no wasted language, and it fits well within the context of a simple status-checking tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a job status tool with no annotations and no output schema, the description is insufficient. It doesn't explain what the status output includes (e.g., JSON structure, possible states like 'pending', 'completed'), error handling, or dependencies on other tools like 'create_bulk_job'. For a tool that likely returns dynamic data, more context is needed to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'job_id' documented as 'The bulk job ID'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or where to obtain the ID. Since the schema already provides adequate parameter documentation, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the status') and resource ('of a bulk job'), making the purpose unambiguous. It distinguishes from siblings like 'create_bulk_job' or 'run_bulk_job' by focusing on status retrieval rather than creation or execution. However, it doesn't specify what 'status' entails (e.g., progress, completion, errors), leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a job ID from 'create_bulk_job' or 'run_bulk_job'), nor does it differentiate from similar tools like 'list_bulk_users' or 'get_audit_event' that might provide related information. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list (Grade: C)
Get a list by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states 'Get a list by its ID', implying a read-only operation, but doesn't disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or what data is returned. This is a significant gap for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste—'Get a list by its ID.' It's appropriately sized and front-loaded, making it easy to parse. Every word serves a purpose, earning a high score for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It doesn't explain what 'get' returns (e.g., list metadata, contents, or subscribers), error conditions, or dependencies. For a tool with one parameter but no structured output info, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'list_id' documented as 'The list ID'. The description adds no meaning beyond this, as it only repeats the parameter concept without explaining format, source, or constraints. Baseline is 3 since the schema does the heavy lifting, but no extra value is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Get a list by its ID' clearly states the action (get) and resource (list), but it's vague about what 'get' entails—retrieving metadata, contents, or both. It distinguishes from siblings like 'list_lists' (which lists multiple lists) but doesn't clarify differences from 'get_list_subscribers' or 'get_user_list_subscriptions', which are related but distinct operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't specify if this should be used for retrieving list details after 'list_lists' or as a prerequisite for operations like 'send_message_to_list'. The description lacks context on prerequisites or exclusions, leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list_subscribers (Grade: C)
Get all subscribers of a list.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| list_id | Yes | The list ID | |
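The optional cursor parameter implies pagination even though the description never says so. A sketch of the resulting loop, with assumed response keys (items, paging.cursor) standing in for whatever the server actually returns:

```python
def fetch_all_subscribers(call_tool, list_id: str) -> list:
    """Walk get_list_subscribers pages until no cursor comes back."""
    subscribers, cursor = [], None
    while True:
        args = {"list_id": list_id}
        if cursor:
            args["cursor"] = cursor  # cursor is optional on the first call
        page = call_tool("get_list_subscribers", args)
        subscribers.extend(page.get("items", []))
        cursor = page.get("paging", {}).get("cursor")
        if not cursor:
            return subscribers

# Fake two-page transport for illustration.
_pages = iter([
    {"items": ["alice", "bob"], "paging": {"cursor": "next-1"}},
    {"items": ["carol"], "paging": {}},
])
all_subs = fetch_all_subscribers(lambda name, args: next(_pages), "list-42")
```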
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get all subscribers' but does not specify if this is a read-only operation, whether it supports pagination (though the schema includes a 'cursor' parameter), rate limits, authentication needs, or what happens if the list does not exist. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It is front-loaded and wastes no space, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain return values, error conditions, or behavioral traits like pagination handling. For a tool with two parameters and no structured output information, more context is needed to fully guide the agent, making it inadequate for comprehensive use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with clear descriptions for both parameters ('list_id' and 'cursor'), so the baseline score is 3. The description adds no additional semantic information beyond what the schema provides, such as format examples or usage context for the cursor, but it does not need to compensate for low coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('all subscribers of a list'), making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'list_audience_members' or 'get_user_list_subscriptions', which might have overlapping functionality, so it falls short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. For example, it does not specify if this is for retrieving all subscribers at once, how it compares to paginated or filtered queries in sibling tools, or any prerequisites like list existence. This lack of context leaves the agent without clear usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message (Grade: B)
Get the full details and status of a single message by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to retrieve | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description doesn't specify whether this requires authentication, rate limits, error conditions, or what 'full details and status' includes. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It's appropriately sized and front-loaded with the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate but incomplete. It explains the basic operation but lacks details about authentication requirements, error handling, or what 'full details and status' entails, which would be helpful for an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'message_id' clearly documented in the schema. The description adds no additional parameter semantics beyond what's already in the schema, so it meets the baseline score of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the full details and status') and resource ('a single message by its ID'), making the purpose specific and understandable. However, it doesn't explicitly distinguish this tool from sibling tools like 'get_message_content' or 'get_message_history', which appear to be related message retrieval operations with different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_message_content' or 'get_message_history', nor does it mention any prerequisites or exclusions. It simply states what the tool does without contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_content (Grade: C)
Get the rendered content (HTML, text, subject) of a previously sent message.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID | |
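The description names three rendered channels (HTML, text, subject) but no response shape. A sketch of how an agent might pull the subject out, assuming a hypothetical results array of per-channel blocks:

```python
# "results" and the per-block keys are assumptions; only message_id is
# specified by the schema.
def rendered_subject(call_tool, message_id: str):
    """Return the first rendered subject line found, or None."""
    content = call_tool("get_message_content", {"message_id": message_id})
    for block in content.get("results", []):
        if "subject" in block:
            return block["subject"]
    return None

# Fake response echoing the three channels the description names.
fake_content = {"results": [{"html": "<p>Hi</p>", "text": "Hi", "subject": "Welcome!"}]}
subject = rendered_subject(lambda name, args: fake_content, "msg-9")
```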
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool retrieves content but does not disclose behavioral traits like whether it requires authentication, rate limits, error handling, or the format of the returned content. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and appropriately sized, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain what the return values look like (e.g., structure of HTML/text/subject), error conditions, or other contextual details needed for effective tool invocation, leaving significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'message_id' documented. The description adds no additional meaning beyond the schema, such as format examples or constraints, so it meets the baseline score for high schema coverage without enhancing parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'rendered content (HTML, text, subject) of a previously sent message', making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'get_message' or 'get_message_history', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'get_message' or 'get_message_history'. It lacks context on prerequisites, exclusions, or specific scenarios for usage, leaving the agent without clear direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_history (Grade: A)
Get the event history for a message, showing each step in the delivery pipeline (enqueued, sent, delivered, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter by event type | |
| message_id | Yes | The message ID | |
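The optional type parameter suggests server-side filtering of pipeline events. A sketch that forwards the filter only when set, reusing the pipeline stages the description itself lists (enqueued, sent, delivered) and an assumed results response key:

```python
def delivery_steps(call_tool, message_id: str, event_type=None):
    """Fetch the event history for a message, optionally filtered by event type."""
    args = {"message_id": message_id}
    if event_type is not None:
        args["type"] = event_type  # optional filter from the schema
    history = call_tool("get_message_history", args)
    return [event["type"] for event in history.get("results", [])]

# Fake transport echoing the pipeline stages named in the description.
fake_history = {"results": [{"type": "enqueued"}, {"type": "sent"}, {"type": "delivered"}]}
steps = delivery_steps(lambda name, args: fake_history, "msg-1")
```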
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions retrieving event history but doesn't disclose behavioral traits like whether this requires specific permissions, if it's paginated, rate-limited, or what format the history returns. For a read operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the core purpose and provides clarifying examples. Every word earns its place with no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description adequately covers the purpose but lacks behavioral context and return value details. For a read tool with 2 parameters, it's minimally viable but leaves gaps in understanding the full operation, especially around output format and constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters (message_id and type). The description doesn't add meaning beyond what's in the schema, such as explaining what event types are available or how the filtering works. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('event history for a message'), specifying it shows 'each step in the delivery pipeline' with examples like 'enqueued, sent, delivered, etc.' This distinguishes it from sibling tools like get_message (which likely retrieves message content/metadata) and get_message_content (which retrieves message body).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing delivery pipeline details for a specific message, but doesn't explicitly state when to use this versus alternatives like get_message or get_audit_event. No exclusions or prerequisites are mentioned, leaving some ambiguity about appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_content (Grade: C)
Get the published content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states what the tool does without behavioral details. It doesn't disclose if this is a read-only operation, what permissions are needed, error handling, or response format, leaving significant gaps for a tool that likely accesses sensitive data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded with the core action and resource, making it efficient and easy to parse, which is ideal for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits, usage context, and expected outputs, which are critical given the tool likely interacts with notification data and has siblings like 'get_notification_draft_content' that could cause confusion.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description doesn't add any parameter-specific information beyond what's in the schema, which has 100% coverage. The schema already documents the single required parameter 'notification_id' as 'The notification template ID', so the baseline score of 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('published content blocks of a notification template'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_notification_draft_content' or 'get_message_content', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions, such as whether it requires specific permissions or differs from similar tools like 'get_notification_draft_content' for draft content.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_draft_content (Grade: A)
Get the draft (unpublished) content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID | |
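Together with get_notification_content, this tool enables a draft-versus-published comparison. A sketch with an assumed blocks response key — the actual content-block shape is undocumented, so only the two tool names and the notification_id parameter come from the definitions above:

```python
def unpublished_changes(call_tool, notification_id: str) -> bool:
    """True when the draft content blocks differ from the published ones."""
    published = call_tool("get_notification_content", {"notification_id": notification_id})
    draft = call_tool("get_notification_draft_content", {"notification_id": notification_id})
    return draft.get("blocks") != published.get("blocks")

# Fake responses for illustration: the draft adds a block, so a change is pending.
_fake = {
    "get_notification_content": {"blocks": [{"type": "text", "content": "Hello"}]},
    "get_notification_draft_content": {"blocks": [{"type": "text", "content": "Hello"},
                                                  {"type": "action", "content": "Shop now"}]},
}
pending = unpublished_changes(lambda name, args: _fake[name], "tmpl-1")
```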
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get' and 'draft (unpublished) content blocks,' indicating a read-only operation, but lacks details on permissions, rate limits, error handling, or response format. For a tool with no annotations, this leaves significant gaps in understanding its behavior beyond basic purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently conveys the tool's purpose without unnecessary words. It is front-loaded with the key action and resource, making it easy to understand at a glance. Every part of the sentence contributes directly to the tool's definition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate for basic understanding but lacks completeness. It does not cover behavioral aspects like permissions or response format, which are important for a read operation. Without annotations or output schema, more context would be beneficial for full usability.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'notification_id' documented as 'The notification template ID.' The description does not add any additional meaning beyond this, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately handles parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get') and resource ('draft (unpublished) content blocks of a notification template'), distinguishing it from sibling tools like 'get_notification_content' (which likely retrieves published content) and 'list_notifications' (which lists notifications rather than fetching content). The phrase 'draft (unpublished)' adds precision about the content's state.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by specifying 'draft (unpublished) content blocks,' suggesting it should be used when working with unpublished notification templates. However, it does not explicitly state when to use this tool versus alternatives like 'get_notification_content' or provide exclusions (e.g., not for published content). The guidance is implied but not comprehensive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_tenant (Grade: C)
Get a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states it's a read operation ('Get'), but doesn't disclose permissions, error handling, or response format. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded and efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read tool with no annotations and no output schema, the description is insufficient. It lacks details on return values, error cases, or behavioral context, leaving significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'tenant_id' parameter. The description implies the parameter is used to retrieve a tenant, but adds no semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a tenant'), specifying it's by ID. It's specific but doesn't differentiate from sibling tools like 'list_tenants' or 'create_or_update_tenant' beyond the ID focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_user_tenants'. The description implies usage when you have a specific tenant ID, but lacks explicit context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_translation (Grade: C)
Get a translation for a specific locale (e.g. "en_US", "fr_FR").
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Translation domain (only "default" is supported currently) | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) | |
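As a sketch of how a client might assemble this call, the helper below applies the documented default for `domain` and treats `locale` as required. The helper name and validation are illustrative, not part of the tool itself:

```python
def build_get_translation_args(locale, domain="default"):
    # locale is required per the parameter table, e.g. "en_US" or "fr_FR".
    if not locale:
        raise ValueError("locale is required, e.g. 'en_US' or 'fr_FR'")
    # Per the table, only the "default" translation domain is supported currently.
    if domain != "default":
        raise ValueError("only the 'default' translation domain is supported")
    return {"domain": domain, "locale": locale}

print(build_get_translation_args("fr_FR"))
# → {'domain': 'default', 'locale': 'fr_FR'}
```

Because the tool reports no behavior for unsupported domains, failing fast client-side is a conservative choice.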
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states what the tool does but doesn't describe important behaviors: whether this is a read-only operation, what format the translation returns, whether it's cached, what happens with invalid locales, or authentication requirements. The description is minimal and lacks behavioral context beyond the basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that communicates the core purpose without unnecessary words. It's appropriately sized for a simple retrieval tool and front-loads the essential information. Every word earns its place in this concise formulation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is insufficiently complete. It doesn't explain what format the translation returns, whether it's a single string or structured data, what happens with missing translations, or any error conditions. The description leaves too many open questions about how the tool behaves in practice.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already fully documents both parameters. The description adds minimal value beyond the schema: it repeats locale examples that match the schema's description, but doesn't explain the relationship between domain and locale or provide additional context about translation domains. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('translation') with specific scope ('for a specific locale'). It distinguishes itself from sibling tools like 'update_translation' by focusing on retrieval rather than modification. However, it doesn't explicitly differentiate from other get_* tools that might retrieve different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, when-not-to-use scenarios, or comparison with sibling tools like 'get_user_preferences' or 'get_message_content' that might also involve localized content. The example locales are helpful but don't constitute usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_list_subscriptions (Grade: C)
Get all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
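Since the description never explains how the `cursor` parameter drives pagination, here is a hedged sketch of the loop an agent would likely need. The `call_tool` dispatcher and the `results`/`cursor` response fields are assumptions, as the tool publishes no output schema:

```python
def fetch_all_subscriptions(call_tool, user_id):
    """Follow the pagination cursor until the server stops returning one."""
    items, cursor = [], None
    while True:
        args = {"user_id": user_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("get_user_list_subscriptions", args)
        # "results"/"cursor" are assumed response field names.
        items.extend(page.get("results", []))
        cursor = page.get("cursor")
        if not cursor:
            return items

def fake_call(tool_name, args):
    # Stand-in transport returning two pages of subscriptions.
    if args.get("cursor") == "page-2":
        return {"results": ["weekly-digest"], "cursor": None}
    return {"results": ["alerts"], "cursor": "page-2"}

print(fetch_all_subscriptions(fake_call, "user-123"))
# → ['alerts', 'weekly-digest']
```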
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying a read-only operation, but doesn't specify permissions, rate limits, pagination behavior (despite a 'cursor' parameter), or response format. This is inadequate for a tool with parameters and no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and two parameters (one optional for pagination), the description is incomplete. It doesn't address behavioral aspects like pagination, error handling, or return values, leaving significant gaps for the agent to operate effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter descriptions in the schema. The description adds nothing beyond implying that 'user_id' is used to fetch subscriptions; it doesn't explain parameter interactions or usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('all list subscriptions for a user'), making the purpose immediately understandable. However, it doesn't distinguish this tool from sibling tools like 'get_list_subscribers' or 'list_user_tenants', which also retrieve user-related data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user authentication), exclusions, or compare it to similar tools like 'get_list_subscribers' or 'list_user_tenants', leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_preferences (Grade: B)
Get a user's notification preferences (subscriptions, opt-outs, channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | No | Scope preferences to a specific tenant | |
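A minimal sketch of argument assembly, showing that `tenant_id` should simply be omitted when not scoping to a tenant. The helper name is illustrative:

```python
def build_get_user_preferences_args(user_id, tenant_id=None):
    # tenant_id is optional: include the key only when scoping
    # preferences to a specific tenant, otherwise leave it out.
    args = {"user_id": user_id}
    if tenant_id is not None:
        args["tenant_id"] = tenant_id
    return args

print(build_get_user_preferences_args("user-123"))
# → {'user_id': 'user-123'}
print(build_get_user_preferences_args("user-123", tenant_id="acme"))
# → {'user_id': 'user-123', 'tenant_id': 'acme'}
```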
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying it's a read operation, but doesn't specify if it requires authentication, rate limits, pagination, error handling, or what format the returned preferences take. For a read tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. Every word earns its place by specifying the resource and its subcomponents without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read tool with 2 parameters and 100% schema coverage, the description is minimally adequate. However, with no annotations and no output schema, it lacks details on authentication needs, return format, or error cases. It meets basic needs but leaves contextual gaps that could hinder an agent's effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (user_id and tenant_id). The description doesn't add any parameter-specific details beyond what's in the schema (e.g., it doesn't explain how tenant_id affects the output). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('user's notification preferences') with specific subcategories (subscriptions, opt-outs, channel preferences). It distinguishes from most siblings (e.g., get_user_profile_by_id, get_user_push_token) by focusing on preferences, but doesn't explicitly differentiate from update_user_preference_topic, which is a related but distinct operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like get_user_profile_by_id (which might include preferences) or update_user_preference_topic. The description implies usage for retrieving notification preferences but offers no context about prerequisites, error conditions, or when other tools might be more appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_profile_by_id (Grade: B)
Get a user profile by their ID. Returns profile data including email, phone, and custom properties.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to look up | |
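The description promises email, phone, and custom properties, but there is no output schema, so a defensive split like the following may help consumers. The `email`/`phone_number` field names are assumptions to verify against real responses:

```python
def split_profile(profile):
    """Separate the documented contact fields from custom properties."""
    # Assumed field names; the tool publishes no output schema.
    contact_keys = {"email", "phone_number"}
    contact = {k: v for k, v in profile.items() if k in contact_keys}
    custom = {k: v for k, v in profile.items() if k not in contact_keys}
    return contact, custom

contact, custom = split_profile(
    {"email": "ada@example.com", "phone_number": "+15550100", "plan": "pro"}
)
print(contact)  # → {'email': 'ada@example.com', 'phone_number': '+15550100'}
print(custom)   # → {'plan': 'pro'}
```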
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool returns profile data including email, phone, and custom properties, which adds some behavioral context beyond the basic 'get' action. However, it lacks critical details: it doesn't specify authentication requirements, error handling (e.g., for invalid IDs), rate limits, or whether the data is real-time or cached. For a read operation with no annotations, this leaves significant gaps in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core purpose and followed by a brief note on return data. Every word earns its place with zero redundancy or fluff. It efficiently communicates the essential information without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, no output schema, no annotations), the description is minimally adequate. It covers the basic purpose and return data, but lacks context on usage, behavioral traits, or error handling. Without annotations or an output schema, the description should do more to compensate, but it only partially meets the needs for a standalone tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'user_id' fully documented in the schema as 'The user ID to look up'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or sourcing guidance. Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('user profile') with a specific lookup method ('by their ID'). It distinguishes from siblings like 'get_user_preferences' or 'get_user_push_token' by focusing on profile data. However, it doesn't explicitly differentiate from potential profile-related siblings like 'replace_profile' or 'delete_profile' beyond the read vs. write distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid user ID), exclusions (e.g., not for bulk lookups), or direct alternatives among siblings like 'list_user_tenants' for related data. The description assumes the context is obvious without explicit usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_push_token (Grade: C)
Get a specific push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token identifier | |
| user_id | Yes | The user ID | |
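One way to act on the sibling distinction the review calls out below: route to `get_user_push_token` only when a specific token identifier is in hand, and to `list_user_push_tokens` otherwise. The routing helper is illustrative:

```python
def pick_push_token_tool(user_id, token=None):
    # With a specific token identifier, the single-token tool applies;
    # without one, only the list variant can enumerate a user's tokens.
    if token:
        return "get_user_push_token", {"user_id": user_id, "token": token}
    return "list_user_push_tokens", {"user_id": user_id}

tool, args = pick_push_token_tool("user-123", token="tok-9")
print(tool)  # → get_user_push_token
```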
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states it 'gets' a token, implying a read-only operation, but doesn't clarify if this requires specific permissions, what happens if the token doesn't exist (e.g., returns null or error), or any rate limits. The description is minimal and misses key behavioral details for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core action and resource, making it easy to parse quickly. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool that retrieves specific data. It doesn't explain what the output looks like (e.g., token details or error handling), behavioral constraints, or how it differs from siblings. For a read operation with two required parameters, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('user_id' and 'token') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides, such as explaining the relationship between user_id and token or format examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('a specific push/device token for a user'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'list_user_push_tokens' or 'create_or_replace_user_push_token', which would require mentioning it retrieves a single token by identifier rather than listing all tokens or modifying them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't mention that 'list_user_push_tokens' should be used to retrieve all tokens for a user, or that 'create_or_replace_user_push_token' is for creating/updating tokens. The description lacks context about prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_ad_hoc_automation (Grade: B)
Invoke an ad-hoc automation with inline steps (no template needed).
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | | |
| brand | No | | |
| profile | No | | |
| template | No | | |
| recipient | No | | |
| automation | Yes | The automation definition | |
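A hedged sketch of a payload builder for this tool; the inline step shape (`{"action": "delay"/"send", ...}`) mirrors Courier's documented automation steps but is an assumption here, since the tool only says that `automation` holds the definition:

```python
def build_ad_hoc_invocation(automation_steps, recipient=None, data=None):
    # Step dicts like {"action": "delay"} / {"action": "send"} are an
    # assumed shape; only "automation" is documented as required.
    if not automation_steps:
        raise ValueError("an ad-hoc automation needs at least one inline step")
    payload = {"automation": {"steps": automation_steps}}
    if recipient is not None:
        payload["recipient"] = recipient
    if data is not None:
        payload["data"] = data
    return payload

payload = build_ad_hoc_invocation(
    [
        {"action": "delay", "duration": "5 minutes"},
        {"action": "send", "template": "welcome"},
    ],
    recipient="user-123",
)
print(payload["automation"]["steps"][0]["action"])  # → delay
```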
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'invoke' but doesn't disclose behavioral traits like whether this is a read/write operation, permissions required, rate limits, error handling, or what 'invoke' entails (e.g., execution, side effects). The description is minimal and misses critical context for a tool that likely performs actions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that is front-loaded with the core purpose. There is no wasted wording, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, nested objects, no output schema, and no annotations), the description is incomplete. It lacks details on behavior, parameter usage, expected outcomes, and error conditions. For a tool that invokes automations with multiple inputs, this minimal description fails to provide sufficient context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is low (17%), with only 'steps' and 'cancelation_token' having descriptions. The description adds no parameter semantics beyond implying 'automation' is required for inline steps. It doesn't explain other parameters like 'data', 'brand', 'profile', 'template', or 'recipient', leaving most of the 6 parameters undocumented and unclear in purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('invoke') and resource ('ad-hoc automation'), specifying it uses 'inline steps (no template needed)'. This distinguishes it from sibling tools like 'invoke_automation_template', which likely requires a template. However, it doesn't fully differentiate from other automation-related tools beyond the template aspect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by stating 'no template needed', suggesting this tool is for ad-hoc automations without predefined templates. It indirectly contrasts with 'invoke_automation_template', but lacks explicit guidance on when to use this versus other automation or messaging tools, or any prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_automation_template (Grade: C)
Invoke an automation run from an existing automation template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Data to pass to the automation | |
| brand | No | Brand ID override | |
| profile | No | Profile data for the recipient | |
| template | No | Notification template override | |
| recipient | Yes | Recipient user ID | |
| template_id | Yes | The automation template ID | |
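A small builder illustrating the required/optional split from the table; restricting overrides to the four documented optional parameters catches typos early. The helper itself is illustrative:

```python
def build_template_invocation(template_id, recipient, **overrides):
    # Only the four optional parameters from the table are accepted;
    # anything else is treated as a likely typo.
    allowed = {"data", "brand", "profile", "template"}
    unknown = set(overrides) - allowed
    if unknown:
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    return {"template_id": template_id, "recipient": recipient, **overrides}

args = build_template_invocation("tmpl-1", "user-123", brand="b-42")
print(args)
# → {'template_id': 'tmpl-1', 'recipient': 'user-123', 'brand': 'b-42'}
```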
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'invoke an automation run' which implies execution/triggering behavior, but provides no information about permissions required, rate limits, whether this is a synchronous or asynchronous operation, what happens on failure, or what the expected output looks like.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for the tool's complexity and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool that executes automations with 6 parameters (including nested objects) and no annotations or output schema, the description is inadequate. It doesn't explain what an 'automation run' entails, what happens after invocation, error handling, or provide any context about the automation system this interacts with.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema itself. The description doesn't add any additional parameter semantics beyond what's already in the schema descriptions. The baseline of 3 is appropriate when the schema does the heavy lifting for parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('invoke an automation run') and the resource ('from an existing automation template'), providing a specific verb+resource combination. However, it doesn't differentiate itself from its sibling 'invoke_ad_hoc_automation', which appears to be a related alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided about when to use this tool versus alternatives like 'invoke_ad_hoc_automation' or other automation-related tools. The description simply states what the tool does without any context about appropriate usage scenarios or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audience_members (Grade: C)
List all members of an audience.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| audience_id | Yes | The audience ID | |
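Because pagination behavior is undocumented, a generator like this sketches how an agent might drain all pages; `call_tool` and the `items`/`cursor` response fields are assumptions:

```python
def iter_audience_members(call_tool, audience_id):
    # Yields members page by page; "items" and "cursor" are assumed
    # response fields, since the tool publishes no output schema.
    cursor = None
    while True:
        args = {"audience_id": audience_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("list_audience_members", args)
        yield from page.get("items", [])
        cursor = page.get("cursor")
        if not cursor:
            break

def fake_call(tool_name, args):
    # Stand-in transport returning two pages of members.
    if args.get("cursor") == "next":
        return {"items": [{"user_id": "u3"}], "cursor": None}
    return {"items": [{"user_id": "u1"}, {"user_id": "u2"}], "cursor": "next"}

members = list(iter_audience_members(fake_call, "aud-7"))
print([m["user_id"] for m in members])  # → ['u1', 'u2', 'u3']
```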
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It states the action ('List all members') but doesn't describe return format, pagination behavior (despite a 'cursor' parameter in the schema), rate limits, authentication needs, or error conditions. For a list operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without unnecessary words. It's front-loaded with the essential information ('List all members of an audience') and contains no redundant or verbose elements. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters (one required). It doesn't explain what 'members' entails (e.g., user objects, IDs), how pagination works with the cursor, or what the return structure looks like. For a list operation in a context with many sibling tools, more contextual detail would help the agent use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('audience_id' and 'cursor') documented in the schema. The description doesn't add any meaningful semantics beyond what the schema provides—it mentions 'audience' but doesn't clarify the ID format or pagination usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('members of an audience'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' or 'list_audiences', but the specificity of 'members' provides some distinction. The description avoids tautology by not just restating the tool name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_list_subscribers' or 'list_user_tenants' that might serve similar purposes, nor does it specify prerequisites or contexts for usage. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audiences (Grade: C)
List all audiences in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all audiences' entails (e.g., archived vs. active). For a tool with no annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('List all audiences in the workspace'). There's no wasted language or redundancy. It's appropriately sized for a simple list tool, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a simple but incomplete description, the tool definition is inadequate for reliable agent use. The description doesn't cover pagination behavior (implied by the cursor parameter), return format, or error conditions. For a list operation with pagination, this leaves the agent guessing about how to handle multiple pages of results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description, which applies here.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all audiences in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' (singular) or 'list_audience_members', but the scope is clear. This is a straightforward read operation with unambiguous intent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With siblings like 'get_audience' (for a specific audience) and 'list_audience_members' (for members within an audience), there's no indication of when this list-all operation is appropriate versus more targeted queries. The agent must infer usage from tool names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audit_events (Grade: B)
List audit events in the workspace. Useful for tracking API usage and changes.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the tool is 'useful for tracking API usage and changes,' which hints at read-only behavior but does not explicitly state it. Critical details such as pagination behavior (implied by the 'cursor' parameter), rate limits, authentication needs, and response format are missing; even for a mutation-free tool, these omissions leave the description inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences that are front-loaded and to the point. The first sentence states the purpose, and the second adds context without redundancy. However, it could be slightly more structured by explicitly mentioning the pagination aspect, which would enhance clarity without sacrificing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (a single optional parameter, no output schema), the description is minimally adequate. It covers the basic purpose and usage hint but lacks details on behavioral aspects like pagination, response format, or error handling. With no annotations and no output schema, it should do more to compensate, but the simplicity of the tool keeps it from being severely incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'cursor' parameter documented as 'Pagination cursor.' The description does not add any meaning beyond this, as it mentions no parameters. Given the high schema coverage, the baseline score of 3 is appropriate, as the schema handles the parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('audit events in the workspace'), making the purpose specific and understandable. It distinguishes from sibling tools like 'get_audit_event' (singular) by implying a collection operation. However, it doesn't explicitly differentiate from other list tools (e.g., 'list_audiences', 'list_messages'), which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context with 'Useful for tracking API usage and changes,' suggesting when this tool might be applied. However, it lacks explicit guidance on when to use this versus alternatives (e.g., 'get_audit_event' for a single event or other list tools), and does not mention prerequisites or exclusions, leaving gaps in decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_brands (Grade: B)
List all brands in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
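Over the Streamable HTTP transport listed above, invoking this tool is an ordinary MCP `tools/call` JSON-RPC request. A sketch of the payload; the endpoint URL and authentication are deployment-specific and omitted here:

```python
import json

# Minimal MCP tools/call payload for list_brands. On the first page the
# arguments object is empty; later pages pass back the returned cursor.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "list_brands",
        "arguments": {},  # e.g. {"cursor": "<cursor-from-previous-page>"}
    },
}
print(json.dumps(request, indent=2))
```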
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all brands' entails (e.g., archived vs. active). This leaves significant gaps for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero waste—it directly states the tool's purpose without fluff or redundancy. Every word earns its place, making it highly efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter, no output schema, no annotations), the description is minimally adequate but incomplete. It covers the basic purpose but lacks behavioral details (e.g., pagination, scope) and usage guidelines, which are needed for full agent understanding in a context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond implying it lists 'all brands' (which doesn't clarify the cursor's role). Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate with extra context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all brands in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_brand' (singular vs. plural) or 'list_audiences' (different resource type), but the scope is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'get_brand' (for a single brand) or other list tools (e.g., 'list_audiences'). The description implies it's for retrieving multiple brands, but lacks explicit context about prerequisites, filtering, or comparisons to siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_bulk_users (Grade: C)
List the users in a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| job_id | Yes | The bulk job ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states a read operation ('list'), implying no destructive effects, but doesn't disclose behavioral traits like pagination (hinted at by the 'cursor' parameter in the schema), rate limits, authentication needs, or return format. This leaves significant gaps for a tool with parameters and no output schema, making it minimally informative.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste, front-loading the core purpose. It's appropriately sized for a simple list tool, though it could benefit from slightly more detail without losing conciseness. Every word earns its place, but it's borderline under-specified rather than optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain return values, pagination behavior, or error conditions, leaving the agent to infer from the schema alone. For a tool with no structured output and minimal behavioral disclosure, this is inadequate—it should provide more context to guide effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with parameters 'job_id' and 'cursor' fully documented in the schema. The description adds no meaning beyond the schema—it doesn't explain parameter interactions, format details, or usage examples. Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate or enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'List the users in a bulk job' clearly states the action (list) and resource (users in a bulk job), but it's vague about scope—it doesn't specify if it lists all users, paginated results, or filtered subsets. It distinguishes from siblings like 'get_bulk_job' (which likely returns job metadata) but not explicitly from 'list_audience_members' or other list tools, lacking precise differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a bulk job ID), exclusions, or comparisons to siblings like 'list_user_tenants' or 'get_user_profile_by_id'. The description implies usage for bulk job contexts but offers no explicit when/when-not rules or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_lists (Grade: C)
Get all lists. Optionally filter by pattern (e.g. 'example.list.*').
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| pattern | No | Filter pattern (e.g. 'example.list.*') | |
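The example 'example.list.*' suggests glob-style matching on list IDs. A client-side emulation using Python's `fnmatch`; treating the server's matching semantics as glob is an assumption the description does not pin down:

```python
from fnmatch import fnmatch

def filter_lists(list_ids, pattern):
    # Emulates the presumed server-side 'pattern' filter as glob matching.
    return [lid for lid in list_ids if fnmatch(lid, pattern)]

ids = ["example.list.a", "example.list.b", "other.list.a"]
print(filter_lists(ids, "example.list.*"))  # ['example.list.a', 'example.list.b']
```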
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'Get all lists' and optional filtering, but lacks critical behavioral details such as pagination behavior (implied by the 'cursor' parameter in the schema), rate limits, authentication requirements, or whether it's read-only. This leaves significant gaps for an agent to understand how to use it effectively.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—two short sentences that directly state the tool's function and optional feature. It is front-loaded with the core purpose and wastes no words, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters. It fails to address key contextual elements like pagination behavior (implied by 'cursor'), response format, error handling, or usage constraints, leaving the agent with insufficient information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('cursor' for pagination, 'pattern' for filtering). The description adds minimal value by mentioning the 'pattern' parameter with an example, but doesn't provide additional context beyond what's in the schema, such as pattern syntax details or cursor usage scenarios.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'all lists', making the purpose specific and understandable. It distinguishes from sibling 'get_list' by indicating it retrieves multiple lists rather than a single one, though it doesn't explicitly contrast with other list-related tools like 'list_audiences' or 'list_notifications'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions optional filtering but doesn't specify scenarios where filtering is appropriate or compare it to other list-related tools like 'get_list' for single retrieval or 'list_audiences' for different resource types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_messages (Grade: C)
List messages you've previously sent. Filter by status, recipient, notification, provider, tags, or tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | Filter by metadata tags | |
| list | No | Filter by list ID | |
| tags | No | Comma-delimited list of tags | |
| event | No | Filter by event ID | |
| cursor | No | Pagination cursor for fetching the next page | |
| status | No | Filter by status (e.g. DELIVERED, UNDELIVERABLE) | |
| traceId | No | Filter by trace ID | |
| archived | No | Include archived messages | |
| provider | No | Filter by provider key (e.g. sendgrid, twilio) | |
| messageId | No | Filter by message ID | |
| recipient | No | Filter by recipient user ID | |
| tenant_id | No | Filter by tenant ID | |
| notification | No | Filter by notification ID | |
| enqueued_after | No | ISO 8601 timestamp; only return messages enqueued after this time | |
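With 14 optional filters, an agent typically assembles only the filters it needs and drops the rest. A sketch of building the arguments, assuming the parameter names above and an ISO 8601 string for `enqueued_after`:

```python
from datetime import datetime, timezone

def build_list_messages_args(status=None, provider=None, recipient=None,
                             enqueued_after=None, cursor=None):
    """Assemble list_messages arguments, dropping unset filters.

    Field names mirror the parameter table; how combined filters
    interact server-side is an assumption (presumably logical AND).
    """
    args = {
        "status": status,
        "provider": provider,
        "recipient": recipient,
        "enqueued_after": enqueued_after.isoformat() if enqueued_after else None,
        "cursor": cursor,
    }
    return {k: v for k, v in args.items() if v is not None}

args = build_list_messages_args(
    status="DELIVERED",
    provider="sendgrid",
    enqueued_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(args)
```

The description does not confirm how multiple filters combine, so treat the AND semantics above as a working assumption.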
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions filtering but doesn't describe critical behaviors: whether this is a read-only operation, if it supports pagination (though the 'cursor' parameter hints at it), rate limits, authentication requirements, or what the return format looks like. For a list operation with 14 parameters, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. It could be slightly improved by structuring filtering examples more clearly, but there's no wasted verbiage and it gets straight to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list operation with 14 parameters and no annotations or output schema, the description is incomplete. It doesn't explain the response format, pagination behavior, error conditions, or how multiple filters interact. Given the complexity and lack of structured metadata, more contextual information is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 14 parameters. The description adds minimal value by listing some filterable fields (status, recipient, etc.) but doesn't provide additional context beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List messages you've previously sent') and resource ('messages'), making the purpose immediately understandable. However, it doesn't differentiate this tool from sibling tools like 'get_message' or 'get_message_history', which also retrieve message-related data, so it doesn't achieve full sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions filtering capabilities but provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, limitations, or comparison with sibling tools like 'get_message' (for single messages) or 'list_notifications' (for notifications). This leaves the agent without context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notifications (Grade: C)
List notification templates. Optionally filter by cursor.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions pagination via cursor but doesn't disclose other behavioral traits such as rate limits, authentication needs, return format, or whether it's read-only. For a list tool with zero annotation coverage, this leaves significant gaps in understanding its operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It front-loads the core purpose and includes the optional parameter detail concisely, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It lacks details on return values, error handling, or other contextual aspects needed for a list operation. While it covers the basic action, it doesn't provide enough information for reliable tool invocation in a complex environment.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'cursor' parameter as a pagination cursor. The description adds minimal value by noting it's optional for filtering, but doesn't provide additional semantics beyond what the schema states, aligning with the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('notification templates'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'list_messages' or 'list_audiences', which also list resources, so it misses full sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions optional filtering by cursor, implying usage for pagination, but provides no guidance on when to use this tool versus alternatives like 'get_notification_content' or other list tools. There are no explicit when/when-not instructions or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tenants (Grade: B)
List all tenants in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states 'List all tenants' but doesn't disclose behavioral traits such as pagination behavior (implied by the 'limit' and 'cursor' parameters), authentication requirements, rate limits, or what 'all' means in practice (e.g., whether archived tenants are included). This leaves significant gaps for a tool with pagination parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's appropriately sized and front-loaded, clearly stating the tool's purpose without unnecessary elaboration, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (list operation with pagination), no annotations, and no output schema, the description is minimally adequate. It states what the tool does but lacks details on behavioral context, output format, or error handling, leaving the agent with incomplete information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('limit' and 'cursor') well-documented in the schema. The description adds no additional parameter semantics beyond what the schema provides, so it meets the baseline of 3 where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('all tenants in the workspace'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'get_tenant' (singular retrieval) or 'list_user_tenants' (user-specific listing), which would require explicit sibling differentiation for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or compare to siblings like 'get_tenant' for single tenant retrieval or 'list_user_tenants' for user-specific listings, leaving the agent without contextual usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_push_tokens (Grade: C)
List all push/device tokens for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool lists tokens but fails to mention whether this is a read-only operation, if it requires specific permissions, what the output format looks like (e.g., pagination, token details), or any rate limits. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete for a tool that likely returns sensitive data (push tokens). It doesn't cover behavioral aspects like security implications, response structure, or error handling, leaving the agent with insufficient context to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'user_id' parameter clearly documented. The description adds no additional meaning beyond what the schema provides, such as clarifying the token scope or user context. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('push/device tokens for a user'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_user_push_token' (singular vs. plural), which would require a more specific distinction to achieve a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_user_push_token' for retrieving a single token or other user-related tools. It lacks context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_tenants (Grade: C)
List all tenants a user belongs to.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states it's a list operation but doesn't mention pagination behavior (though the schema hints at it via limit/cursor), authentication requirements, rate limits, error conditions, or what the output looks like. For a tool with 3 parameters and no annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without any fluff. It's appropriately sized for a simple list operation and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters, no annotations, and no output schema, the description is insufficient. It doesn't explain the return format, pagination strategy, error handling, or relationship to sibling tools. The agent lacks critical context to use this tool effectively beyond basic parameter passing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description adds no additional parameter semantics beyond implying the user_id is for filtering tenants. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('tenants a user belongs to'), making the purpose immediately understandable. It doesn't specifically differentiate from sibling tools like 'list_tenants' (which appears to list all tenants rather than user-specific ones), but the user-specific focus is implied through the parameter requirement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_tenant'. It doesn't mention prerequisites, context for user_id selection, or any exclusions. The agent must infer usage from the parameter requirement alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_user_from_tenant (Grade: C)
Remove a user from a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. While 'Remove' implies a destructive mutation, the description doesn't specify whether this requires admin permissions, whether the action is reversible, what happens to user data, or if there are rate limits. This leaves significant behavioral gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's appropriately sized for a simple tool with two parameters and gets straight to the point without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is insufficiently complete. It doesn't address important contextual aspects like permissions required, consequences of removal, error conditions, or what the tool returns. The description should provide more behavioral context given the tool's complexity and lack of structured metadata.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters clearly documented in the schema. The description doesn't add any additional semantic context about the parameters beyond what's already in the schema (e.g., format requirements, relationship between user_id and tenant_id). The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove') and target ('a user from a tenant'), providing a specific verb+resource combination. However, it doesn't distinguish this tool from sibling tools like 'delete_user' or 'delete_tenant', which might handle similar user/tenant removal operations differently.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no explicit instructions about prerequisites (e.g., user must exist in tenant), when-not-to-use scenarios, or references to sibling tools like 'delete_user' or 'delete_tenant' that might handle related operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_profile (Grade: A)
Fully replace a user profile (PUT). All existing data is overwritten; include every field you want to keep.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | Yes | Complete profile data to replace with | |
| user_id | Yes | The user ID | |
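Because this is a full PUT (everything not included is lost), a safe partial update reads the current profile first, merges the change, and sends the complete result back. The `get_profile` and `replace_profile` callables here are hypothetical wrappers around the corresponding tools; only the overwrite semantics come from the description above.

```python
# Read-modify-write guard for a full-replace (PUT) profile endpoint:
# fetch the current profile, merge the delta, write back the whole thing.
def safe_profile_update(get_profile, replace_profile, user_id, changes):
    current = get_profile(user_id) or {}
    merged = {**current, **changes}  # keep every existing field
    replace_profile(user_id, merged)
    return merged

# In-memory stand-in for the profile store, for demonstration only.
store = {"user-1": {"email": "a@example.com", "locale": "en"}}
merged = safe_profile_update(
    lambda uid: store.get(uid),
    lambda uid, profile: store.__setitem__(uid, profile),
    "user-1",
    {"locale": "fr"},
)
print(merged)
```

Calling `replace_profile` directly with only `{"locale": "fr"}` would, per the description, drop the email field entirely.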
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: it's a destructive PUT operation that overwrites all existing data, and it requires including every field to retain. It lacks details on permissions, rate limits, or error handling, but covers the core mutation behavior adequately.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, consisting of two clear sentences with zero waste. The first sentence states the core action, and the second provides critical behavioral guidance, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is moderately complete. It covers the overwrite behavior and parameter expectations but lacks details on permissions, error responses, and the returned payload. Given the complexity, it should do more to compensate for missing structured data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value beyond the schema by implying the 'profile' parameter must be complete, but does not provide additional syntax or format details. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('replace') and resource ('user profile'), and distinguishes it from siblings like 'create_or_merge_user' or 'delete_profile' by emphasizing full overwrite rather than partial updates or deletions. It explicitly mentions 'all existing data is overwritten', which differentiates it from update operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: when fully replacing a user profile and including all fields to keep. However, it does not explicitly state when not to use it (e.g., vs. partial updates or creation) or name alternatives like 'create_or_merge_user', leaving some ambiguity in sibling differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_bulk_job (Grade: C)
Run a bulk job, triggering delivery to all added users.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID to run | |
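The prerequisite chain the review notes (a bulk job must be created and populated before it can be run) can be sketched as an ordered sequence of tool calls. Sibling tool names match those mentioned in the review; the argument and response shapes are assumptions.

```python
# Hypothetical create -> add users -> run workflow for bulk delivery.
# Only the tool names come from this page; payload shapes are assumed.
def run_bulk_delivery(call_tool, message, user_ids):
    job = call_tool("create_bulk_job", {"message": message})
    job_id = job["job_id"]
    call_tool("add_bulk_users", {"job_id": job_id, "users": user_ids})
    # Running the job is the destructive step: it triggers delivery
    # to every user added above.
    return call_tool("run_bulk_job", {"job_id": job_id})

# Stub transport that records the call order.
calls = []
def fake_call_tool(name, args):
    calls.append(name)
    return {"job_id": "job-1", "status": "running"}

run_bulk_delivery(fake_call_tool, {"title": "Hi"}, ["u1", "u2"])
print(calls)
```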
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'triggering delivery' which implies a write/mutation operation, but doesn't disclose critical behavioral traits such as permissions required, whether the job runs asynchronously, rate limits, error handling, or what 'delivery' entails. This is a significant gap for a tool that likely modifies system state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and effect, making it easy to parse. Every word earns its place without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (likely a mutation operation with delivery implications), lack of annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects, return values, error conditions, or integration context with siblings like 'add_bulk_users'. For a tool that triggers deliveries, more detail is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'job_id' documented in the schema. The description adds no additional meaning about the parameter beyond what the schema provides (e.g., format examples, source of job_id, or validation rules). Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Run') and resource ('bulk job'), and specifies the action's effect ('triggering delivery to all added users'). It distinguishes from sibling tools like 'create_bulk_job' or 'get_bulk_job' by focusing on execution rather than creation or retrieval. However, it doesn't explicitly differentiate from other execution-related tools like 'invoke_ad_hoc_automation'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a created bulk job with added users), exclusions (e.g., not for testing), or comparisons to siblings like 'send_message_to_list' for similar delivery purposes. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message (Grade: B)
Send a message to a user using inline title and body content (no template). Optionally specify routing channels.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include with the message | |
| title | Yes | Message title | |
| method | No | Routing method: deliver to all channels or stop after first success | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through (e.g. email, sms, push). Omit to use default routing. | |
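An argument payload built from the table above might look like the following. Only the parameter names and the `method` default come from the table; the wire format is otherwise an assumption.

```python
# Minimal payload builder for send_message, mirroring the parameter
# table: user_id, title, and body are required; channels, method, and
# data are optional (method defaults to "all" per the table).
def build_send_message_args(user_id, title, body,
                            channels=None, method="all", data=None):
    args = {"user_id": user_id, "title": title, "body": body,
            "method": method}
    if channels:  # omit to fall back to default routing
        args["channels"] = channels
    if data:
        args["data"] = data
    return args

args = build_send_message_args(
    "user-123", "Build finished", "Your deploy succeeded.",
    channels=["email", "push"],
)
print(sorted(args))
```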
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the tool sends messages and optionally specifies routing channels, but lacks critical details: it doesn't explicitly flag the call as a mutation (sending a message clearly is one), what permissions are needed, rate limits, error conditions, or what happens on success/failure. The description is insufficient for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and includes optional functionality. Every word earns its place with zero redundancy or fluff, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool (sending messages) with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns, error handling, side effects, or security requirements. For a 6-parameter tool that likely modifies system state, more contextual information is needed to use it safely and effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal value beyond the schema: it mentions 'inline title and body content' (implied by required parameters) and 'routing channels' (maps to the 'channels' parameter). No additional syntax, format, or behavioral context is provided for parameters beyond what the schema offers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and resource ('to a user'), specifying it uses 'inline title and body content (no template)'. It distinguishes from sibling tools like 'send_message_template' by mentioning 'no template', but doesn't explicitly differentiate from other messaging tools like 'send_message_to_list'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for sending direct messages with inline content rather than templates, and mentions optional routing channels. However, it doesn't provide explicit guidance on when to use this versus alternatives like 'send_message_template' or 'send_message_to_list', nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_template (Grade: C)
Send a message to a user using a pre-configured notification template. Optionally pass data and routing.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool sends a message but doesn't cover critical aspects like permission requirements, rate limits, error handling, or what happens on success/failure. This is inadequate for a tool that delivers notifications to users.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and mentions optional features without unnecessary elaboration. Every word serves a purpose, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of sending messages with templates and routing, no annotations, and no output schema, the description is insufficient. It lacks details on behavioral traits, error cases, response format, and differentiation from sibling tools, leaving significant gaps for an AI agent to understand how to use this tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so parameters are well-documented in the schema. The description adds minimal value by mentioning 'optionally pass data and routing', which aligns with the 'data' and 'method/channels' parameters but doesn't provide additional context beyond what the schema already explains.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('send') and resource ('message'), specifying it uses a 'pre-configured notification template' and mentions optional data and routing. However, it doesn't explicitly differentiate from sibling tools like 'send_message' or 'send_message_to_list_template', which appear related but have different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'send_message' or 'send_message_to_list_template'. It mentions optional features but doesn't specify scenarios or prerequisites for using this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list (Grade: C)
Send a message to all subscribers of a list using inline title and body content.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include | |
| title | Yes | Message title | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use default routing. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states the tool sends messages, implying a write/mutation operation, but doesn't disclose behavioral traits like rate limits, permissions required, whether it's asynchronous, or what happens on failure. The description adds minimal context beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with the core action, zero waste. Every word contributes to understanding the tool's purpose efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 6 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral context (e.g., side effects, error handling), doesn't explain optional parameters like 'method' or 'channels', and provides no guidance on usage relative to siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description mentions 'inline title and body content', which aligns with the 'title' and 'body' parameters but doesn't add meaning beyond what the schema provides. No extra syntax, format details, or usage examples are given.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list'), specifying the content source ('using inline title and body content'). It distinguishes from sibling tools like 'send_message' (generic) and 'send_message_to_list_template' (template-based), but doesn't explicitly mention these alternatives in the description text itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'send_message_to_list_template' or 'send_message'. The description implies it's for sending to list subscribers with inline content, but lacks context about prerequisites, timing, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list_template (Grade: C)
Send a message to all subscribers of a list using a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
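The four `send_*` variants form a 2x2 grid (recipient: single user vs. list; content: inline vs. template). A lookup table makes the selection the review repeatedly flags as confusing explicit. The tool names come from this page; the dispatch logic is illustrative, not part of the server.

```python
# Explicit selection among the four send_* tools by recipient kind and
# content kind, resolving the overlap the review describes.
SEND_TOOLS = {
    ("user", "inline"): "send_message",
    ("user", "template"): "send_message_template",
    ("list", "inline"): "send_message_to_list",
    ("list", "template"): "send_message_to_list_template",
}

def pick_send_tool(recipient_kind, content_kind):
    return SEND_TOOLS[(recipient_kind, content_kind)]

print(pick_send_tool("list", "template"))
```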
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions sending to 'all subscribers' but omits critical details like potential rate limits, authentication requirements, and what happens on failure. For a tool where a single call can message an entire list, this is a notable gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a tool that sends messages to lists using templates, with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error handling, and output expectations, leaving significant gaps for an AI agent to understand the tool fully.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not add meaning beyond what the input schema provides, as schema description coverage is 100%. It mentions 'list' and 'template' but doesn't elaborate on their semantics or usage. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately documents parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list using a notification template'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'send_message_to_list' or 'send_message_template', which appear related but have nuanced differences in functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'send_message_to_list' or 'send_message_template', nor does it mention prerequisites like needing an existing list or template. It lacks context for decision-making among similar tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_list (Grade: B)
Subscribe a user to a list. Creates the list if it doesn't exist.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to subscribe | |
| preferences | No | Optional notification preferences | |
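An argument sketch built from the table above follows. The shape of `preferences` is not documented on this page, so the nested structure shown is purely illustrative.

```python
# Payload builder for subscribe_user_to_list. Note the side effect the
# description discloses: the list is created if it doesn't already exist.
def build_subscribe_args(user_id, list_id, preferences=None):
    args = {"user_id": user_id, "list_id": list_id}
    if preferences is not None:
        args["preferences"] = preferences  # shape assumed, undocumented here
    return args

args = build_subscribe_args("user-7", "weekly-digest",
                            preferences={"notifications": {}})
print(sorted(args))
```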
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions the side effect of creating a list if missing, which is useful, but fails to disclose other behavioral traits like required permissions, whether the operation is idempotent, error handling, or rate limits. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste, front-loaded with the primary action and followed by a key behavioral note. Every sentence earns its place by adding value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a mutation tool. It covers the basic purpose and a side effect but lacks details on permissions, error cases, or return values. However, it is minimally adequate for the core functionality, aligning with a score of 3.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters (list_id, user_id, preferences). The description does not add meaning beyond what the schema provides, such as explaining the purpose of preferences or format details. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Subscribe a user to a list') and resource ('list'), with a specific additional behavior ('Creates the list if it doesn't exist'). However, it does not explicitly differentiate from sibling tools like 'subscribe_user_to_lists' or 'create_list', which would be needed for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'subscribe_user_to_lists' (for multiple lists) or 'create_list' (if list creation is the primary goal). It lacks explicit when/when-not instructions or prerequisites, leaving usage context implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_lists (Grade: C)
Subscribe a user to one or more lists. Creates lists that do not exist.
| Name | Required | Description | Default |
|---|---|---|---|
| lists | Yes | Array of lists to subscribe to | |
| user_id | Yes | The user ID | |
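Choosing between the single and batch subscribe tools, as the review suggests an agent must do, can be made explicit with a small dispatch. Tool names come from this page; the argument shapes (in particular the elements of `lists`) are assumptions.

```python
# Illustrative dispatch: one list -> subscribe_user_to_list; several
# lists -> subscribe_user_to_lists with a `lists` array.
def subscribe(call_tool, user_id, list_ids):
    if len(list_ids) == 1:
        return call_tool("subscribe_user_to_list",
                         {"user_id": user_id, "list_id": list_ids[0]})
    return call_tool("subscribe_user_to_lists",
                     {"user_id": user_id, "lists": list_ids})

# Stub transport that records which tool was chosen.
seen = []
subscribe(lambda name, args: seen.append(name), "u1", ["daily"])
subscribe(lambda name, args: seen.append(name), "u1", ["daily", "weekly"])
print(seen)
```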
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions that the tool 'Creates lists that do not exist,' which is a useful behavioral trait beyond basic subscription. However, it lacks critical details such as required permissions, whether the operation is idempotent, error handling for invalid inputs, or what happens to existing subscriptions. For a mutation tool with zero annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences that directly state the tool's actions. It is front-loaded with the primary purpose and adds a secondary behavior without any wasted words, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that this is a mutation tool with no annotations and no output schema, the description is incomplete. It lacks information on return values, error conditions, side effects (e.g., impact on user notifications), and how it interacts with sibling tools. The mention of list creation adds some value, but overall, it doesn't provide enough context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with clear documentation for 'user_id' and 'lists' parameters. The description adds no additional semantic context about parameters beyond what the schema provides, such as format examples or constraints. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Subscribe') and resource ('user to one or more lists'), and it adds a secondary action ('Creates lists that do not exist'). However, it doesn't explicitly differentiate from the sibling tool 'subscribe_user_to_list' (singular vs. plural), which could cause confusion about when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'subscribe_user_to_list' (singular) and 'add_user_to_tenant', there's no indication of when bulk subscription or list creation is preferred, nor any prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
track_inbound_event
Track an inbound event that can trigger automations. Requires event name, messageId (for deduplication), and properties.
| Name | Required | Description | Default |
|---|---|---|---|
| event | Yes | The event name (appears as trigger in Automation Trigger node) | |
| userId | No | User ID associated with the event | |
| messageId | Yes | Unique ID for deduplication (returns 409 if not unique) | |
| properties | Yes | Event properties payload | |
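A sketch of the arguments this table implies, assuming a generic MCP client; the event and property names are illustrative. Generating a fresh UUID for messageId is one way to avoid the 409 conflict the schema warns about on duplicates.

```python
import uuid

# Hypothetical arguments for track_inbound_event. Field values are examples only.
arguments = {
    "event": "order-shipped",        # appears as the trigger in an Automation Trigger node
    "userId": "user-123",            # optional
    "messageId": str(uuid.uuid4()),  # must be unique; duplicates return 409
    "properties": {"order_id": "ORD-42", "carrier": "UPS"},
}
```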
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions deduplication via messageId but doesn't explain the 409 conflict response mentioned in the schema. It doesn't disclose authentication requirements, rate limits, side effects, or what happens after tracking (e.g., how automations are triggered). For a tool that presumably creates/mutates event data, this is insufficient behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the purpose and lists required parameters. It's appropriately sized and front-loaded with the core functionality. No wasted words, though it could be slightly more structured with separate purpose and parameter sections.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters, no annotations, no output schema, and nested objects in properties, the description is incomplete. It doesn't explain what happens after tracking, how automations are triggered, error conditions beyond deduplication, or the structure of the properties object. Given the complexity and lack of structured metadata, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description mentions the three required parameters (event, messageId, properties) but adds no additional semantic context beyond what's in the schema. The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('track') and resource ('inbound event'), and specifies its purpose ('can trigger automations'). It doesn't explicitly differentiate from sibling tools, but since no other tools mention event tracking, this is adequate. The description goes beyond tautology by explaining the automation triggering capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions required parameters but doesn't indicate scenarios where this tool is appropriate versus other event-related or automation tools. With many sibling tools available, this lack of contextual guidance is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
unsubscribe_user_from_list
Unsubscribe a user from a list.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to unsubscribe | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Unsubscribe' implies a mutation (likely removing a user from a list), the description doesn't clarify whether this requires specific permissions, if it's reversible, what happens on success/failure, or any rate limits. For a mutation tool with zero annotation coverage, this is a significant gap in behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly efficient. Every word earns its place, achieving optimal conciseness without being under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's mutation nature (implied by 'Unsubscribe'), lack of annotations, and absence of an output schema, the description is incomplete. It doesn't address behavioral aspects like permissions, side effects, or response format. For a tool that modifies data, more context is needed to ensure safe and correct usage by an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with both parameters ('list_id' and 'user_id') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., it doesn't explain format constraints or examples). Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Unsubscribe') and the target ('a user from a list'), making the purpose immediately understandable. It uses a specific verb and identifies the resource involved. However, it doesn't explicitly differentiate from sibling tools like 'delete_user_list_subscriptions' or 'subscribe_user_to_list', which would require more specificity to earn a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when-not-to-use scenarios, or direct comparisons to related sibling tools like 'subscribe_user_to_list' or 'delete_user_list_subscriptions'. This leaves the agent without contextual usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_audience
Create or update an audience with a filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | Display name | |
| filter | No | Filter definition object (operator, rules) | |
| audience_id | Yes | The audience ID | |
| description | No | Description | |
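A sketch of update_audience arguments based on the table above. The filter shape (operator plus rules) follows the table's hint, but the field names inside each rule are assumptions.

```python
# Hypothetical update_audience arguments. Rule field names are illustrative.
arguments = {
    "audience_id": "high-value-customers",
    "name": "High Value Customers",
    "description": "Users with lifetime spend over $500",
    "filter": {
        "operator": "AND",
        "rules": [
            {"path": "lifetime_spend", "operator": "GT", "value": "500"},
        ],
    },
}
```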
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'Create or update' which implies mutation, but doesn't disclose behavioral traits like required permissions, whether it's idempotent, what happens on conflicts, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create or update an audience') and adds essential context ('with a filter definition'). There is zero waste, and every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool with no annotations, no output schema, and 4 parameters, the description is incomplete. It lacks details on behavior (e.g., what 'Create or update' entails operationally), error handling, or response expectations. For a tool that modifies data, more context is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters with descriptions. The description adds no additional meaning beyond implying 'filter definition' is a key component, but doesn't explain syntax or constraints. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('an audience'), specifying it involves a 'filter definition'. It is distinguished from sibling tools like 'delete_audience' and 'get_audience' by indicating mutation. However, it doesn't explicitly differentiate from 'create_list' or 'create_or_update_tenant', which are similar mutation operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., when to create vs. update), exclusions, or compare to siblings like 'create_list' for list management. Usage is implied by the name and purpose but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_translation
Create or update a translation for a specific locale.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Translation content (PO file format) | |
| domain | No | Translation domain | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) | |
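A sketch of update_translation arguments based on the table above. The body uses gettext PO format as the table notes; the msgid/msgstr pair below is illustrative only.

```python
# Hypothetical update_translation arguments. PO content is an example.
po_body = (
    'msgid "welcome.title"\n'
    'msgstr "Bienvenue"\n'
)

arguments = {
    "locale": "fr_FR",    # locale code, e.g. en_US, fr_FR
    "domain": "default",  # optional; defaults to "default"
    "body": po_body,      # translation content in PO file format
}
```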
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a mutation operation ('Create or update') but does not specify permissions, side effects, error handling, or response format. This leaves critical behavioral traits undocumented for a tool that modifies data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action without unnecessary words. It earns its place by succinctly conveying the tool's purpose, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error conditions, and return values, leaving gaps that could hinder an AI agent's ability to use the tool effectively in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema already documents parameters like 'locale' and 'body'. The description adds no additional meaning beyond what the schema provides, such as explaining the interaction between parameters or usage nuances, resulting in a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('translation for a specific locale'), making the purpose unambiguous. However, it does not differentiate from sibling tools like 'get_translation', which might retrieve translations, leaving room for improvement in sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, such as 'get_translation' for retrieval or other tools for related operations. The description lacks context on prerequisites, exclusions, or specific scenarios for application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_user_preference_topic
Update a user's preference for a specific subscription topic (opt in, opt out, or set channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| status | Yes | Preference status | |
| user_id | Yes | The user ID | |
| topic_id | Yes | The subscription topic ID | |
| custom_routing | No | Custom channel routing order | |
| has_custom_routing | No | Whether custom channel routing is set | |
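A sketch of update_user_preference_topic arguments based on the table above. The status value and channel names are assumptions modeled on common Courier preference statuses, not confirmed by the schema.

```python
# Hypothetical update_user_preference_topic arguments.
# "OPTED_IN" and the channel names are assumed values.
arguments = {
    "user_id": "user-123",
    "topic_id": "TOPIC-abc",
    "status": "OPTED_IN",
    "has_custom_routing": True,
    "custom_routing": ["push", "email"],  # preferred channel order
}
```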
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool updates preferences but doesn't cover permissions required, whether changes are reversible, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency about its behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Update a user's preference') and includes key details (topic specificity and options). There is zero waste or redundancy, making it appropriately sized and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with 5 parameters, no annotations, and no output schema), the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like idempotency. For a tool that modifies user preferences, more context on outcomes and constraints is needed to be fully helpful to an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters with descriptions. The description adds marginal value by mentioning 'opt in, opt out, or set channel preferences', which loosely relates to the 'status' enum and optional 'custom_routing' parameters, but doesn't provide additional syntax or format details beyond what the schema provides. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update' and the resource 'user's preference for a specific subscription topic', specifying the action and target. It is distinguished from sibling tools like 'get_user_preferences' (read vs. write) and 'subscribe_user_to_list' (subscription vs. preference), though it doesn't explicitly name alternatives. The purpose is specific but could better differentiate from similar mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user/topic existence), compare to siblings like 'update_translation' or 'replace_profile', or indicate scenarios for opting in/out versus setting channel preferences. Usage is implied from the action but lacks explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!