Courier
Server Details
Send notifications, manage templates, and configure integrations with Courier.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: trycourier/courier-mcp
- GitHub Stars: 1
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 3.3/5 across 108 of 108 tools scored. Lowest: 2.6/5.
While tools have detailed descriptions, the sheer volume of 108 tools with overlapping domains (e.g., multiple send_message variants, numerous CRUD operations) creates ambiguity. Agents may struggle to select the correct tool without careful reading.
Tool names follow a consistent verb_noun pattern (e.g., create_list, delete_brand, send_message_template). Exceptions like 'archive_request' or 'courier_installation_guide' are rare and still clear.
108 tools is excessive for an MCP server, far beyond the typical 3-15 range. This volume overwhelms agents and contradicts MCP's goal of simple, focused tool surfaces.
The tool set covers a comprehensive range of notification lifecycle operations: user management, lists, templates, providers, routing, and automation. Minor gaps like update operations for bulk jobs exist, but overall coverage is thorough.
Available Tools
108 tools
add_bulk_users (C)
Add users to an existing bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| users | Yes | Array of user objects to add | |
| job_id | Yes | The bulk job ID | |
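As a rough illustration of how the two parameters in the table might be passed, the sketch below builds a JSON-RPC 2.0 request for the MCP `tools/call` method. The helper name `build_tool_call`, the job ID, and the shape of each user object are assumptions made for the example; the listing does not document what a valid user object looks like.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical values: the job ID and the fields inside each user object
# are placeholders, not taken from the Courier schema.
request = build_tool_call(
    "add_bulk_users",
    {
        "job_id": "example-job-id",
        "users": [{"user_id": "example-user-1"}],
    },
)
print(json.dumps(request, indent=2))
```

The request only covers the call itself. Because the description does not state what happens on partial failure, a caller would probably want to inspect the result before assuming every user was added.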
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. It states the tool adds users, implying a mutation, but doesn't disclose permissions needed, rate limits, idempotency, or what happens on failure (e.g., partial updates). For a mutation tool with zero annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it easy to parse quickly. No structural issues or redundancy are present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain return values, error conditions, or side effects (e.g., whether the bulk job status changes). Given the complexity of bulk operations and lack of structured data, more context is needed for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('job_id', 'users') documented in the schema. The description adds no additional meaning beyond the schema's details (e.g., what constitutes a valid 'user object'). Baseline score of 3 applies since the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Add users to an existing bulk job' clearly states the action (add) and target (users to bulk job), but it's vague about what a 'bulk job' entails and doesn't distinguish from sibling tools like 'create_bulk_job' or 'list_bulk_users'. It provides basic purpose but lacks specificity about the resource context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., needing an existing bulk job from 'create_bulk_job'), exclusions, or comparisons to similar tools like 'add_user_to_tenant'. Usage context is implied but not explicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_subscribers_to_list (A)
Append subscribers to a list without removing existing subscribers.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| recipients | Yes | Recipients to set on the list | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate a write operation that is not idempotent. The description adds that existing subscribers are not removed, but does not disclose behavior for duplicate recipients, authorization needs, or error handling.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence that conveys the essential purpose without any superfluous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple append operation, the description covers the core behavior and a key constraint. However, it lacks details on prerequisites (list existence), outcomes, and edge cases, but remains sufficient for basic usage.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has full description coverage for both parameters. The description does not add additional meaning beyond naming the parameters, so it meets the baseline expectation.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (append subscribers) and the key nuance (without removing existing subscribers), distinguishing it from other list manipulation tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for adding subscribers without replacement, but does not provide explicit guidance on when to use this tool versus alternatives like subscribe_user_to_list or bulk_subscribe_to_list.
add_user_to_tenant (C, Idempotent)
Add a user to a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Tenant-scoped profile overrides | |
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
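The Required column above can be checked mechanically before a call is made. The following is a minimal sketch, assuming the table is transcribed by hand into a dictionary; the helper names `missing_required` and `unknown_arguments` are hypothetical and not part of the Courier MCP server.

```python
# Parameter table for add_user_to_tenant, transcribed from the listing:
# parameter name -> whether the listing marks it as required.
ADD_USER_TO_TENANT_PARAMS = {
    "profile": False,
    "user_id": True,
    "tenant_id": True,
}


def missing_required(arguments: dict, params: dict) -> list:
    """Return the required parameter names that are absent from `arguments`."""
    return sorted(
        name for name, required in params.items()
        if required and name not in arguments
    )


def unknown_arguments(arguments: dict, params: dict) -> list:
    """Return argument names that the parameter table does not list."""
    return sorted(name for name in arguments if name not in params)


# Placeholder IDs, for illustration only.
args = {"user_id": "example-user", "profile": {"title": "Admin"}}
print(missing_required(args, ADD_USER_TO_TENANT_PARAMS))   # ['tenant_id']
print(unknown_arguments(args, ADD_USER_TO_TENANT_PARAMS))  # []
```

A check like this only confirms that the required names are present. It says nothing about the prerequisites the review notes are undocumented, such as whether the user and tenant must already exist.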
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It states 'Add a user to a tenant' but fails to explain critical aspects: whether this is a mutation (implied), what permissions are required, if it's idempotent, what happens on duplicate adds, or the response format. This leaves significant gaps in understanding the tool's effects.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, making it highly concise and front-loaded. It efficiently communicates the core action without unnecessary elaboration.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavior, error conditions, return values, and how it differs from siblings. For a tool that modifies system state, more context is needed to ensure safe and correct usage.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for 'user_id' and 'tenant_id', and 'profile' as 'Tenant-scoped profile overrides'. The description adds no additional parameter semantics beyond the schema, but since the schema is well-documented, a baseline score of 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add') and target ('a user to a tenant'), making the purpose immediately understandable. However, it does not differentiate from the sibling tool 'remove_user_from_tenant' or explain what 'adding' entails (e.g., granting access, assigning roles). This clarity is good but lacks sibling distinction.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'create_or_merge_user' or 'list_user_tenants'. It also omits prerequisites (e.g., user and tenant must exist) or exclusions (e.g., cannot add duplicate users). Without such context, usage is ambiguous.
archive_notification (A, Destructive, Idempotent)
Archive a notification template by ID.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID to archive | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare idempotentHint=true and destructiveHint=true. The description adds no additional behavioral context, such as whether archiving is reversible or what side effects occur. It is consistent but not enhancing.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no extraneous information. It is appropriately sized for the tool's simplicity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema), the description is functionally complete in defining the action, but lacks contextual information about when to use it among many sibling tools.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the parameter description matches the tool description. The description adds no extra meaning beyond what the schema already provides, so baseline score applies.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'archive' and the specific resource 'notification template', distinguishing it from sibling tools like 'archive_routing_strategy'. It is precise and unambiguous.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. There is no mention of prerequisites, such as whether the template must be in a certain state, or when to use 'delete' instead of 'archive'.
archive_request (B, Destructive, Idempotent)
Archive a send request and all its associated messages by request ID.
| Name | Required | Description | Default |
|---|---|---|---|
| request_id | Yes | The request ID (requestId returned from /send) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations provide idempotentHint and destructiveHint, so the description adds little beyond stating 'Archive'. It does not clarify whether the action is reversible, the fate of associated messages, or any side effects.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence (12 words) that conveys the essential information without any unnecessary words. It is front-loaded and efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool with clear annotations and no output schema, the description is nearly complete. It could mention if the operation is reversible or what happens to the request status, but the current level is adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with a clear description for the only parameter. The tool description simply echoes 'by request ID', adding no extra meaning beyond the schema's documentation.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Archive'), the resource ('a send request and all its associated messages'), and the method ('by request ID'). This distinguishes it from sibling tools like delete_audience or other archive tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., delete or other archive operations). The description does not mention prerequisites or when archival is appropriate.
archive_routing_strategy (A, Destructive, Idempotent)
Archive a routing strategy. The strategy must not have associated notification templates; unlink all templates before archiving.
| Name | Required | Description | Default |
|---|---|---|---|
| routing_strategy_id | Yes | The routing strategy ID to archive | |
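The precondition in the description (no associated notification templates) suggests a two-phase flow: unlink first, then archive. The sketch below only models that ordering as a list of planned steps. The function names are hypothetical, and the listing does not say which tool performs the unlinking, so that step is left as free text.

```python
def can_archive_routing_strategy(linked_template_ids: list) -> bool:
    """The stated precondition: no notification templates are still linked."""
    return len(linked_template_ids) == 0


def plan_archive(routing_strategy_id: str, linked_template_ids: list) -> list:
    """Return the ordered steps an agent might take before archiving.

    Unlinking is described in prose only, because the source does not
    name the tool that would do it.
    """
    steps = [
        f"unlink template {template_id} from {routing_strategy_id}"
        for template_id in linked_template_ids
    ]
    steps.append(
        f"call archive_routing_strategy with "
        f"routing_strategy_id={routing_strategy_id}"
    )
    return steps


# Placeholder IDs, for illustration only.
for step in plan_archive("example-strategy", ["tmpl-1", "tmpl-2"]):
    print(step)
```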
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark destructiveHint=true, so the description carries lower burden. It adds valuable context beyond annotations by specifying the precondition about notification templates. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first states purpose, second adds the critical precondition. No extraneous information, and the most important details are front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple, single-parameter tool with annotations covering destructiveness, the description is complete. It explains what the tool does and the key constraint necessary for successful invocation. No output schema needed.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with a clear description of the single required parameter. The tool description adds no additional semantic information beyond what the schema provides.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Archive a routing strategy' with a specific verb and resource. Distinguishes from sibling archive tools like archive_notification and archive_request.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a clear precondition: strategy must not have associated notification templates. This tells the agent when it is appropriate to use and what to do before invoking (unlink templates). Does not explicitly name alternative tools, but the precondition serves as implicit guidance.
bulk_add_user_tenants (A, Idempotent)
Add a user to multiple tenants at once. A custom profile can be supplied per tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| tenants | Yes | Array of tenant associations | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states the basic write operation. Annotations indicate idempotentHint=true and readOnlyHint=false, which are consistent. No additional behavioral traits (e.g., behavior on existing associations) are disclosed beyond what annotations convey.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no redundant information. Every word contributes to understanding the tool's purpose and key capability.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, no output schema), the description adequately covers the core functionality. It does not specify return values, but without an output schema this is not a gap.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all parameters. The description's mention of custom profile matches the profile field. No additional semantic meaning is added beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add a user to multiple tenants at once') and includes the ability to supply a custom profile per tenant. It effectively distinguishes from siblings like 'add_user_to_tenant' which is for a single tenant.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies the tool is for bulk operations, contrasting with single-tenant alternative. However, it does not explicitly mention when not to use it or prerequisites like user/tenant existence.
bulk_add_user_tokens (A)
Add multiple push/device tokens for a user in one request. Overwrites matching existing tokens.
| Name | Required | Description | Default |
|---|---|---|---|
| tokens | Yes | Token records to upsert | |
| user_id | Yes | The user ID | |
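The phrase "Overwrites matching existing tokens" describes upsert behavior. The sketch below is a local model of that idea, not the server's implementation. It assumes records are matched on a `token` field and uses a made-up `provider` field as the payload; neither field name is confirmed by the listing.

```python
def upsert_tokens(existing: dict, incoming: list) -> dict:
    """Return a new mapping where incoming records replace matching ones.

    Assumption: two records "match" when their `token` values are equal.
    """
    merged = dict(existing)  # copy so the caller's mapping is not mutated
    for record in incoming:
        merged[record["token"]] = record
    return merged


# Hypothetical records, for illustration only.
stored = {"abc": {"token": "abc", "provider": "old"}}
batch = [
    {"token": "abc", "provider": "new"},   # matches: overwritten
    {"token": "def", "provider": "new"},   # no match: added
]
result = upsert_tokens(stored, batch)
print(sorted(result))             # ['abc', 'def']
print(result["abc"]["provider"])  # new
```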
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description indicates mutation ('Add', 'Overwrites') aligning with readOnlyHint=false. Lacks details on limits, error handling, or side effects beyond basic upsert behavior.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences, front-loaded with action, no redundant words. Efficient for quick understanding.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers key aspects but lacks constraints (e.g., max tokens) and error conditions. For a simple bulk tool with no output schema, it's partially complete but could add more context.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. Description adds 'in one request' context but 'overwrites' is already implied by schema's 'upsert' description. Adequate but not beyond baseline.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool adds multiple push/device tokens for a user, specifying the resource (push/device tokens) and action (add). It distinguishes from siblings like create_or_replace_user_push_token which handles single tokens.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies bulk use ('in one request') but does not explicitly contrast with single-token tools. No guidance on when to use this vs create_or_replace_user_push_token or patch_user_token.
bulk_subscribe_to_list (A, Idempotent)
Replace all subscribers on a list with the given recipients.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| recipients | Yes | Recipients to set on the list | |
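Since this tool and add_subscribers_to_list take the same two parameters, the difference lies entirely in what happens to existing subscribers. The sketch below models only that difference, using sets of placeholder user IDs. It is an assumption-laden toy: it ignores how the real tools treat duplicate recipients, which the listing does not disclose, and it says nothing about their idempotency.

```python
def append_subscribers(current: set, recipients: set) -> set:
    """Model of add_subscribers_to_list: existing subscribers are kept."""
    return current | recipients


def replace_subscribers(current: set, recipients: set) -> set:
    """Model of bulk_subscribe_to_list: only `recipients` remain afterwards."""
    return set(recipients)


# Placeholder user IDs, for illustration only.
current = {"user-a", "user-b"}
incoming = {"user-c"}
print(sorted(append_subscribers(current, incoming)))   # ['user-a', 'user-b', 'user-c']
print(sorted(replace_subscribers(current, incoming)))  # ['user-c']
```

Under this model, calling the replace variant with a partial recipient list silently drops everyone else, which is the main reason an agent needs to pick between the two deliberately.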
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that the operation replaces all subscribers, indicating destructive behavior beyond the idempotentHint annotation. It adds value by clarifying the effect on existing subscribers.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single clear sentence that is front-loaded and concise, containing no unnecessary words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple with two parameters and annotations present. The description adequately explains the core action, but could mention potential side effects or error scenarios for completeness.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for both parameters. The description does not add any additional meaning beyond what is already in the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'replace' and resource 'subscribers on a list', making the purpose clear. It distinguishes from sibling tools like add_subscribers_to_list and unsubscribe_user_from_list.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies that this tool is for replacing all subscribers, which gives clear context. However, it does not explicitly state when to use this vs alternatives like add_subscribers_to_list.
cancel_message (B, Destructive, Idempotent)
Cancel a message that is currently being delivered. Returns the message details with updated status.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to cancel | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the action ('cancel') and return value ('message details with updated status'), but doesn't address critical behavioral aspects: whether cancellation is reversible, what permissions are required, whether there are rate limits, what happens if the message is already delivered, or what specific status changes occur. For a mutation tool with zero annotation coverage, this leaves significant gaps.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve distinct purposes: the first states the action and target, the second describes the return value. There's zero wasted language, and the most important information (what the tool does) is front-loaded. Every word earns its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficiently complete. While it states the basic action and return, it doesn't address error conditions, side effects, permissions, or what 'cancel' actually means operationally. The agent would need to guess about many behavioral aspects, making this description inadequate for safe tool invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with the single parameter 'message_id' well-documented in the schema. The description doesn't add any parameter-specific information beyond what the schema already provides (no format examples, no constraints on valid message IDs). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't enhance parameter understanding.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('cancel') and resource ('message that is currently being delivered'), making the purpose immediately understandable. It distinguishes from siblings like 'delete_message' or 'get_message' by focusing on in-progress messages. However, it doesn't explicitly differentiate from all possible message-related operations in the sibling list.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('message that is currently being delivered'), suggesting this tool is for interrupting active deliveries rather than deleting sent messages. However, it doesn't provide explicit guidance on when NOT to use it, and it doesn't mention alternatives (for example a 'delete_message' tool, if one exists). The guidance is contextual but incomplete.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cancel_notification_submission · C · Destructive · Idempotent
Cancel a notification template submission.
| Name | Required | Description | Default |
|---|---|---|---|
| submission_id | Yes | The submission ID to cancel | |
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate destructiveHint=true and idempotentHint=true, but the description adds no extra context about side effects, permanence, or what happens to the submission. It relies solely on the term 'Cancel' which is generic.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no wasted words. However, it is too brief to be fully informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks information about return value, outcomes, or side effects. For a destructive operation, more context about what happens after cancellation would be useful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters have clear descriptions in the schema (submission_id and notification_id). The tool description adds no additional meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Cancel' and the resource 'notification template submission'. It differentiates from siblings like cancel_message by specifying submission, though it could be more explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives like archive_notification or cancel_message. No context on prerequisites or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
courier_installation_guide · A · Read-only
Get the Courier SDK installation guide for a specific platform. For client-side SDKs (React, iOS, Android, Flutter, React Native), also generates a sample JWT.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | No | User ID for JWT generation (client-side SDKs only). Defaults to "example_user". | |
| platform | Yes | The platform to get installation guide for | |
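As a rough illustration, a call to this tool could be wrapped in an MCP JSON-RPC `tools/call` request along the lines below. The envelope follows the MCP specification; the helper name `build_tool_call` and the platform value `"react"` are assumptions, since this listing does not show the accepted platform strings.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP JSON-RPC `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "react" is a guessed platform value (assumption); check the tool's
# input schema for the real set of accepted strings.
request = build_tool_call(
    "courier_installation_guide",
    {"platform": "react", "user_id": "example_user"},
)
print(json.dumps(request, indent=2))
```

`user_id` is optional per the table above, so it could be dropped from the arguments for server-side platforms.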
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the tool retrieves installation guides and generates JWTs for client-side SDKs, which is useful behavioral context. However, it doesn't mention potential side effects, authentication requirements, rate limits, or response format details, leaving gaps for a tool that likely involves external resources.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two well-structured sentences that efficiently convey the core functionality and conditional behavior. Every word earns its place, with no redundancy or unnecessary elaboration, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is moderately complete for a tool with 2 parameters and high schema coverage. It covers the main action and conditional JWT generation, but lacks details on output format, error handling, or dependencies, which could be important for an installation guide tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds marginal value by implying that 'user_id' is only relevant for client-side SDKs, but this is partially covered in the schema's description. Baseline 3 is appropriate since the schema does most of the work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Get', 'generates') and resources ('Courier SDK installation guide', 'sample JWT'). It distinguishes this tool from siblings by focusing on installation guides rather than user management, messaging, or other operations listed in the sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: to get installation guides for specific platforms. It implicitly distinguishes usage by specifying that for client-side SDKs, it also generates a sample JWT, but it doesn't explicitly state when not to use it or name alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_brand · C
Create a new brand with name, colors, and email/inapp settings.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Optional brand ID; auto-generated if omitted | |
| name | Yes | Brand display name | |
| settings | No | Brand settings (colors, email, inapp) | |
| snippets | No | Brand snippets | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Create' implies a write/mutation operation, the description doesn't specify permissions needed, whether the operation is idempotent, error conditions, or what happens on success (e.g., returns the created brand object). For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create a new brand') and specifies key attributes without unnecessary words. Every part of the sentence contributes directly to understanding the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., the created brand object), error handling, or behavioral nuances like whether 'id' generation is guaranteed to be unique. Given the complexity of nested objects in the schema, more context would be helpful for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description mentions 'name, colors, and email/inapp settings', which aligns with the 'name' and 'settings' parameters in the schema but doesn't add meaningful semantics beyond what the schema provides. The baseline score of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create a new brand') and specifies the key attributes involved ('name, colors, and email/inapp settings'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'list_brands' or 'get_brand', but the verb 'Create' is sufficiently distinct from 'list' or 'get' operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., authentication requirements), when not to use it, or how it relates to sibling tools like 'list_brands' for viewing existing brands. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_bulk_job · A
Create a new bulk job for sending messages to multiple recipients. Workflow: create_bulk_job → add_bulk_users → run_bulk_job.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Bulk message definition with event/template and content | |
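The three-step workflow named in the description might be sequenced as in the sketch below. Only the tool names and the `message` and `users` parameters come from this listing; the `job_id` argument, the shape of `message`, and the user objects are assumptions, and in practice the job ID would be read from the `create_bulk_job` response rather than supplied up front.

```python
def tool_call(name, arguments, request_id):
    """Wrap tool arguments in an MCP JSON-RPC `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def bulk_workflow(message, users, job_id):
    """Return the three requests in workflow order.

    `job_id` is a hypothetical argument name (assumption); a real client
    would take it from the create_bulk_job result before steps 2 and 3.
    """
    return [
        tool_call("create_bulk_job", {"message": message}, 1),
        tool_call("add_bulk_users", {"job_id": job_id, "users": users}, 2),
        tool_call("run_bulk_job", {"job_id": job_id}, 3),
    ]


# Illustrative payloads only; the field names inside are assumptions.
steps = bulk_workflow(
    message={"event": "WELCOME_EMAIL"},
    users=[{"user_id": "user_1"}, {"user_id": "user_2"}],
    job_id="job_placeholder",
)
for step in steps:
    print(step["id"], step["params"]["name"])
```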
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions the workflow, it doesn't disclose critical behavioral traits such as permissions required, whether the job is saved or transient, error handling, or what happens if the job isn't run. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded, with two sentences that efficiently convey the purpose and workflow. Every sentence earns its place, and there is no wasted verbiage or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with a nested object parameter) and no annotations or output schema, the description is moderately complete. It covers the purpose and workflow but lacks details on behavioral aspects like side effects, permissions, or return values. For a tool with these gaps, it's adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'message' documented as 'Bulk message definition with event/template and content.' The description adds no additional parameter semantics beyond what the schema provides. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Create a new bulk job for sending messages to multiple recipients.' It specifies the verb ('create') and resource ('bulk job'), and distinguishes it from siblings like 'run_bulk_job' by indicating it's the first step in a workflow. However, it doesn't fully differentiate from other creation tools like 'create_list' or 'create_brand' beyond the resource type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage context by outlining the workflow: 'create_bulk_job → add_bulk_users → run_bulk_job.' This clearly indicates when to use this tool (as the first step) and references sibling tools for subsequent steps. It doesn't explicitly state when not to use it or name alternatives, but the workflow guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_list · C · Idempotent
Create or update a list by list ID.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the list | |
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'create or update' but doesn't specify whether this is an upsert operation, what happens if the list ID doesn't exist, required permissions, or error conditions. This is inadequate for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded with the core action, though it could be more structured by explicitly separating creation and update scenarios. Overall, it's concise but under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavior (e.g., upsert logic), error handling, or return values. Given the complexity of a 'create or update' operation, this leaves significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('list_id' and 'name'). The description adds no additional meaning beyond what the schema provides, such as format constraints or usage examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the action ('create or update') and resource ('a list by list ID'), which is clear but vague about the distinction between creation and update. It doesn't differentiate from sibling tools like 'create_brand' or 'create_or_merge_user', leaving ambiguity about when to use this specific list tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites, such as whether the list ID must exist for updates, or when to choose this over tools like 'create_bulk_job' or 'update_audience'. This leaves the agent without context for decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_notification · B
Create a notification template with name, tags, brand, subscription, routing, and content.
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | Template state after creation (defaults to DRAFT) | |
| notification | Yes | Notification template payload | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate a write (readOnlyHint=false), non-idempotent operation. The description adds minimal behavioral context: for example, it does not say whether creating a template with an existing name fails, or what side effects occur.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is front-loaded with the core purpose. It lists fields efficiently but could be slightly more concise without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks return value information (no output schema). Does not explain the effect of the state parameter, though the schema covers it. Comprehensive enough for a simple creation tool but leaves gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters (state and notification). The description lists the fields of the notification object but does not clarify the types or constraints for brand, content, routing, subscription beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'Create a notification template' and lists the fields to be set (name, tags, brand, subscription, routing, content). This distinguishes it from sibling tools like get_notification, archive_notification, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs. alternatives (e.g., publish_notification, put_notification_content). No mention of prerequisites or when not to use it, despite many related tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_merge_user · B · Idempotent
Create a new user profile or merge supplied values into an existing profile (POST). Existing fields not included are preserved.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Profile data to create or merge (e.g. { email: "...", phone_number: "..." }) | |
| user_id | Yes | The user ID | |
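The stated merge behavior ("Existing fields not included are preserved") can be pictured with a small local model. This is only a sketch: `merge_profile` is a hypothetical helper, and treating the merge as shallow (top-level keys only) is an assumption, since the listing does not say how nested profile objects are combined.

```python
def merge_profile(existing, supplied):
    """Shallow merge: supplied keys win, keys absent from `supplied` are kept.

    Local illustration only (assumption); it does not call Courier.
    """
    merged = dict(existing)   # copy so the stored profile is not mutated
    merged.update(supplied)   # overwrite only the keys that were supplied
    return merged


existing = {"email": "old@example.com", "phone_number": "+15550100"}
supplied = {"email": "new@example.com"}

result = merge_profile(existing, supplied)
print(result)
```

Here `phone_number` survives because it was not part of the supplied values, which is the behavior the description promises for omitted fields.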
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it's a POST operation (implying mutation), performs an upsert (create or merge), and preserves existing fields not included. However, it misses details like authentication requirements, error conditions, rate limits, or what happens on conflicts, leaving gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two efficient sentences: the first front-loads the core purpose ('create or merge') and the second adds essential behavioral detail ('Existing fields not included are preserved'). It avoids redundancy, though it could be slightly more structured for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is moderately complete. It covers the upsert behavior and field preservation, but lacks information on response format, error handling, permissions, or side effects. Given the complexity of user profile operations, more context would be beneficial.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value by hinting at the merge behavior and example profile data, but doesn't provide additional syntax, format, or constraints beyond what the schema specifies. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or merge') and resource ('user profile'), specifying it's a POST operation. It distinguishes from siblings like 'add_user_to_tenant' or 'replace_profile' by emphasizing the merge behavior, though it doesn't explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context through 'create or merge' and mentions preservation of existing fields, suggesting it's for upsert operations. However, it lacks explicit guidance on when to use this versus alternatives like 'create_or_replace_user_push_token' or 'replace_profile', and doesn't specify prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_replace_user_push_token · C · Idempotent
Create or replace a push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token string | |
| device | No | Device metadata | |
| user_id | Yes | The user ID | |
| provider_key | Yes | Push provider | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('create or replace') but doesn't explain what 'replace' entails (e.g., overwriting existing tokens), potential side effects, authentication requirements, or error conditions. This leaves significant gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and resource, making it easy to parse quickly without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is inadequate. It doesn't cover behavioral aspects like idempotency, error handling, or response format. Given the complexity of managing user tokens and the lack of structured data, more context is needed to guide proper usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no additional meaning beyond the schema, such as explaining the relationship between 'token' and 'device' or clarifying the 'provider_key' enum values. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('push/device token for a user'), making the purpose unambiguous. It doesn't explicitly differentiate from sibling tools like 'get_user_push_token' or 'list_user_push_tokens', but the action is distinct enough to avoid confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_user_push_token' or 'list_user_push_tokens'. It lacks context about prerequisites, such as user existence or permissions, and doesn't mention any exclusions or specific scenarios for its application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_update_tenant · C · Idempotent
Create or replace a tenant. Tenants represent organizations or groups that users belong to.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the tenant | |
| brand_id | No | Brand ID to associate with this tenant | |
| tenant_id | Yes | The tenant ID | |
| properties | No | Custom properties for the tenant | |
| user_profile | No | Default profile data for users in this tenant | |
| parent_tenant_id | No | Parent tenant ID for hierarchical tenants | |
| default_preferences | No | Default notification preferences for users in this tenant | |
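A hierarchical pair of tenants could be expressed with arguments like those sketched below. The parameter names come from the table above; the helper `tenant_args` and the example IDs are hypothetical.

```python
def tenant_args(tenant_id, name, parent_tenant_id=None, brand_id=None):
    """Build create_or_update_tenant arguments, omitting unset optionals."""
    args = {"tenant_id": tenant_id, "name": name}
    if parent_tenant_id is not None:
        args["parent_tenant_id"] = parent_tenant_id
    if brand_id is not None:
        args["brand_id"] = brand_id
    return args


# Example IDs are made up for illustration.
parent = tenant_args("acme", "Acme Corp")
child = tenant_args("acme-eu", "Acme Europe", parent_tenant_id="acme")
print(parent)
print(child)
```

Because the description says "create or replace", it may be safest to resend every field you want to keep when updating; the listing does not confirm how omitted fields are treated.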
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states that the tool will 'create or replace' a tenant, implying mutation, but doesn't disclose behavioral traits like whether it's idempotent, what permissions are required, what happens on replacement (e.g., data loss), or error conditions. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste. The first sentence states the action and resource, and the second provides helpful context about what tenants represent. It's appropriately sized and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral details (e.g., idempotency, side effects), usage guidance, and output expectations. The schema covers parameters well, but the description doesn't compensate for missing annotations or output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so all parameters are documented in the schema. The description adds no parameter-specific information beyond the general purpose. It doesn't explain how parameters like 'tenant_id' or 'properties' affect the operation. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('tenant'), and explains what tenants represent. It distinguishes the tool from sibling tools like 'delete_tenant' and 'get_tenant' by specifying its mutative nature. However, it doesn't explicitly differentiate from other tenant-related tools like 'list_tenants' or 'get_tenant' beyond the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when to choose creation versus replacement, or how it relates to sibling tools like 'delete_tenant' or 'list_tenants'. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_provider (grade A)
Create a new provider (integration) configuration. Once routing strategies or notification templates reference this config, credential or settings mistakes can affect live sends—confirm provider key and settings against list_provider_catalog before saving. The provider field must be a known Courier provider key.
| Name | Required | Description | Default |
|---|---|---|---|
| alias | No | Short alias for referencing this provider | |
| title | No | Display name for this provider configuration | |
| provider | Yes | Provider key from the catalog (e.g. sendgrid, twilio, firebase-fcm) | |
| settings | No | Provider-specific settings (API keys, credentials, etc.) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a mutable, non-idempotent operation. The description adds significant behavioral context by warning that credential or settings mistakes can affect live sends once referenced by routing strategies or templates, which is beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with no waste: the first states the purpose, the second delivers the critical warning and the catalog check, and the third states the provider-key constraint. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a creation tool with a single required parameter and no output schema, the description covers the main risk and prerequisite. Lacks mention of return value but that is minor given no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds value by specifying that the 'provider' field must be a known Courier provider key and emphasizing the impact of 'settings' errors on live sends.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create a new provider (integration) configuration') with a specific verb and resource. It differentiates from sibling tools like list_provider_catalog, get_provider, update_provider, and delete_provider by focusing solely on creation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises confirming provider key and settings against list_provider_catalog before saving, and warns that mistakes can affect live sends. While it does not explicitly name alternatives like update_provider for editing, it provides clear context for safe usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
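The "check the catalog before saving" advice can be sketched as a small guard in front of the call. This is a minimal sketch, not the server's client code: `build_create_provider_call` is a hypothetical helper, `known_keys` is assumed to have been collected from a prior list_provider_catalog call (whose response shape is not shown on this page), and the `api_key` settings field is invented for illustration. The request envelope follows the MCP `tools/call` JSON-RPC shape.

```python
import json
from typing import Optional, Set


def build_create_provider_call(provider: str, known_keys: Set[str],
                               settings: Optional[dict] = None,
                               request_id: int = 1) -> dict:
    """Build an MCP tools/call request for create_provider, refusing unknown provider keys."""
    if provider not in known_keys:
        raise ValueError(f"unknown provider key: {provider!r}")
    arguments = {"provider": provider}
    if settings is not None:
        arguments["settings"] = settings
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "create_provider", "arguments": arguments},
    }


# Example keys taken from the parameter table; a real set would come from the catalog.
known = {"sendgrid", "twilio", "firebase-fcm"}
request = build_create_provider_call("sendgrid", known, settings={"api_key": "<redacted>"})
print(json.dumps(request, indent=2))
```

The guard only catches an unknown provider key. It cannot tell whether the settings are valid, so the warning about live sends still applies.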
create_routing_strategy (grade A)
Create a routing strategy defining how notifications are delivered across channels and providers.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Human-readable name for the routing strategy | |
| tags | No | Tags for categorization | |
| routing | Yes | Routing tree defining channel selection method and order | |
| channels | No | Per-channel delivery configuration | |
| providers | No | Per-provider delivery configuration | |
| description | No | Description of the routing strategy | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false and idempotentHint=false, consistent with a create operation. The description adds the context that the strategy defines delivery across channels and providers, but does not elaborate on side effects, permissions, or constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 13 words, no redundant information. Concise and front-loaded with the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks explanation of return value (no output schema) and does not clarify the routing structure or dependencies on existing providers. However, the schema details are rich, and the context of sibling tools is clear. Somewhat minimal for a creation tool with nested objects.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents each parameter. The description mentions 'channels and providers' which correspond to parameters, but adds no new meaning or constraints beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Create' and the resource 'routing strategy', and specifies its function (defining delivery across channels and providers). It effectively differentiates from sibling tools like archive, list, get, and replace.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., replace_routing_strategy for updates). The description only states what it does, not the context or prerequisites for creation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
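Since the review notes that the routing structure is not explained, a sketch of an arguments payload may help. The required and optional field names come from the parameter table; the `method`/`channels` shape of the routing tree is an assumption modeled on Courier's message routing object, and `build_routing_strategy_args` is a hypothetical helper. The tool's own schema remains the authority.

```python
from typing import Any, Dict

# Optional fields listed in the parameter table.
OPTIONAL_FIELDS = {"tags", "channels", "providers", "description"}


def build_routing_strategy_args(name: str, routing: Dict[str, Any],
                                **optional: Any) -> Dict[str, Any]:
    """Assemble arguments for create_routing_strategy, checking field names only."""
    if not name:
        raise ValueError("name is required")
    if not routing:
        raise ValueError("routing is required")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return {"name": name, "routing": routing, **optional}


# Assumed routing shape: try email first, then fall back to sms.
args = build_routing_strategy_args(
    "Email then SMS",
    {"method": "single", "channels": ["email", "sms"]},
    tags=["transactional"],
)
```

The helper validates names only. It does not check that the channels or providers referenced by the routing tree exist, which is the dependency the review flags as undocumented.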
delete_audience (grade C, Destructive, Idempotent)
Delete an audience by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states the action is 'Delete,' implying a destructive mutation, but doesn't disclose critical behaviors like whether deletion is permanent, requires specific permissions, has side effects (e.g., on related data), or returns confirmation. This is inadequate for a destructive tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly concise and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a destructive mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., irreversibility, auth needs), expected outcomes, or error handling, which are critical for safe and effective use in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' fully documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., format, validation rules, or examples), so it meets the baseline for high coverage without compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('an audience by its ID'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'delete_profile' or 'delete_tenant' beyond the resource name, which slightly limits its distinctiveness.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an existing audience ID), exclusions, or related tools like 'get_audience' for verification, leaving usage context unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_brand (grade A, Destructive, Idempotent)
Delete a brand by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| brand_id | Yes | The brand ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true, and the description adds minimal behavioral context beyond that. It does not clarify the idempotentHint=true annotation: if deleting a non-existent brand returns an error, that would contradict the stated idempotency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One short sentence, no unnecessary words, front-loaded with the key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimally complete for a simple delete operation, but lacks details on return value, error handling, or side effects. No output schema to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the single parameter 'brand_id' is already well-documented. The description simply repeats 'by its ID', adding no new meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'Delete' and the resource 'brand' with the method 'by its ID', distinguishing it from sibling tools like create_brand, update_brand, and list_brands.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. Usage is implied by the name and description, but context such as prerequisites or when not to use the tool is missing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
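The idempotency concern raised in this review can be made concrete with a toy model. The sketch below uses an in-memory dictionary, not the Courier API, and `delete_brand_idempotent` is a hypothetical name. It shows the property an idempotent delete is expected to have: repeating the call leaves the same state and does not raise.

```python
def delete_brand_idempotent(store: dict, brand_id: str) -> str:
    """Delete brand_id from an in-memory store; a repeat call is a no-op, not an error."""
    if brand_id in store:
        del store[brand_id]
        return "deleted"
    return "already absent"


# Toy data for illustration only.
brands = {"brand_1": {"name": "Example"}}
first = delete_brand_idempotent(brands, "brand_1")
second = delete_brand_idempotent(brands, "brand_1")
```

Whether the real tool behaves this way on a missing brand is exactly what the description leaves unstated.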
delete_list (grade B, Destructive, Idempotent)
Delete a list by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide destructiveHint=true and idempotentHint=true. The description's verb 'Delete' aligns with them, but it does not elaborate on side effects or on what happens to associated data (e.g., subscribers). Minimal additional value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is concise and to the point. Could be considered too minimal, but it efficiently conveys the core action without fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete with annotations and one parameter, the description is adequate but lacks context about implications, such as whether deletion is permanent or reversible, and it does not mention the related restore_list sibling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one parameter. Description does not add meaning beyond the schema's description 'The list ID'. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Delete' and the resource 'list', with the qualifier 'by its ID'. It distinguishes the tool from siblings like create_list, get_list, and restore_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., restore_list for undoing, or update_list for modifying). No prerequisites or contexts mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
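For readers unfamiliar with the annotations this review refers to, the sketch below shows how they might sit on a tool definition. The hint names (`readOnlyHint`, `destructiveHint`, `idempotentHint`) are MCP tool annotation fields and the hint values follow the review above. The rest of the definition is reconstructed from the parameter table with an assumed `string` type, and the label mapping and `annotation_labels` helper are hypothetical.

```python
# Hypothetical tool definition reconstructed from the listing, not copied from the server.
delete_list_tool = {
    "name": "delete_list",
    "description": "Delete a list by its ID.",
    "inputSchema": {
        "type": "object",
        "properties": {"list_id": {"type": "string", "description": "The list ID"}},
        "required": ["list_id"],
    },
    "annotations": {"destructiveHint": True, "idempotentHint": True},
}

# Map annotation hints to short labels (label names are assumptions).
HINT_LABELS = {
    "readOnlyHint": "ReadOnly",
    "destructiveHint": "Destructive",
    "idempotentHint": "Idempotent",
}


def annotation_labels(tool: dict) -> list:
    """Return labels for every hint that is explicitly set to True."""
    annotations = tool.get("annotations", {})
    return [label for hint, label in HINT_LABELS.items() if annotations.get(hint) is True]


labels = annotation_labels(delete_list_tool)
```

Hints of this kind are advisory. They tell an agent that a call is destructive, but not whether restore_list can undo it, which is why the review still asks the description to say so.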
delete_profile (grade C, Destructive, Idempotent)
Delete a user profile permanently.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states that the deletion is permanent, which is a critical behavioral trait, but fails to mention other important aspects such as required permissions, what happens to associated data, or error conditions. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core action and key qualifier ('permanently') without any wasted words. It's appropriately sized and front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with no annotations and no output schema, the description is insufficient. It mentions permanence but omits critical context like permissions needed, side effects, return values, or error handling. Given the high-stakes nature of profile deletion, more comprehensive guidance is warranted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'user_id' documented as 'The user ID to delete'. The description doesn't add any additional semantic context beyond this, such as format examples or constraints. Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('user profile') with the qualifier 'permanently', which adds specificity. However, it doesn't explicitly differentiate from sibling tools like 'delete_tenant' or 'delete_audience', which also perform deletion operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user must exist), exclusions, or comparisons to related tools like 'replace_profile' or 'delete_tenant', leaving the agent with no contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_provider (grade A, Destructive, Idempotent)
Delete a provider configuration. Returns 409 if the provider is still referenced by routing or notifications.
| Name | Required | Description | Default |
|---|---|---|---|
| provider_id | Yes | The provider configuration ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds the important behavior of returning 409 on conflict, which goes beyond annotations (destructiveHint=true). IdempotentHint is not contradicted or expanded, but the conflict detail is valuable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. Action-first structure with the key behavior stated immediately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete tool with one parameter and no output schema, the description covers purpose and a key behavior (409 conflict). Could mention irreversibility, but destructiveHint already covers that.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear parameter description. The description does not add additional meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Delete a provider configuration', which is a specific verb+resource pairing. Distinguishes the tool from siblings such as create_provider and update_provider.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Mentions the condition of returning 409 if referenced, guiding the agent when deletion is possible. Implicitly suggests checking for references first, but does not explicitly compare to alternatives like archiving.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
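The 409 behavior noted in this review suggests a simple branching rule for callers. The sketch below is a minimal illustration under one assumption: that the caller can obtain an HTTP-style status code from the tool result. How the MCP server actually surfaces the status is not specified on this page, and `DeleteOutcome` and `classify_delete_provider_status` are hypothetical names.

```python
from enum import Enum


class DeleteOutcome(Enum):
    DELETED = "deleted"
    STILL_REFERENCED = "still_referenced"
    FAILED = "failed"


def classify_delete_provider_status(status_code: int) -> DeleteOutcome:
    """Map a status code from delete_provider to a coarse outcome."""
    if 200 <= status_code < 300:
        return DeleteOutcome.DELETED
    if status_code == 409:
        # Per the tool description: the provider is still referenced
        # by routing or notifications, so those references must go first.
        return DeleteOutcome.STILL_REFERENCED
    return DeleteOutcome.FAILED


outcome = classify_delete_provider_status(409)
```

Treating 409 as its own outcome lets an agent go remove the references and retry, instead of treating the conflict as a generic failure.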
delete_tenant (grade C, Destructive, Idempotent)
Delete a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states the action is 'Delete' but lacks critical details: whether this is irreversible, requires specific permissions, affects associated data, or has rate limits. This is inadequate for a destructive operation with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste: it directly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits (e.g., irreversibility, side effects), usage context, and expected outcomes, leaving significant gaps in understanding how to safely and effectively invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'tenant_id' fully documented in the schema. The description adds minimal value by mentioning 'by its ID', which aligns with the schema but doesn't provide additional context like ID format or examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('a tenant by its ID'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'delete_audience' or 'delete_profile' beyond the resource type, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'remove_user_from_tenant' or 'delete_audience', nor does it mention prerequisites, exclusions, or context for deletion. The description only states what it does, not when or why to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_tenant_preference (grade B, Destructive, Idempotent)
Remove default notification preference for a topic from a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| topic_id | Yes | The subscription topic ID | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveness and idempotency. The description adds no additional behavioral context, such as behavior when the preference does not exist or side effects on related entities.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence with no unnecessary words. It efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete operation with two well-defined parameters and annotations covering key behavioral traits, the description is largely complete. It could mention idempotency implications, but that's already in annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description does not add further meaning beyond what the schema provides, but it is not misleading.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove'), the resource ('default notification preference'), and the scope ('for a topic from a tenant'). It effectively distinguishes from siblings like 'delete_tenant' or 'update_tenant_preference'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as user-level preference tools (e.g., 'update_user_preference_topic'). No exclusions or context about prerequisites are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_user_list_subscriptions (grade C, Destructive, Idempotent)
Delete all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is a deletion but does not specify whether this is reversible, requires admin permissions, affects user data permanently, or has side effects like notifications. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, clearly front-loading the core action. It efficiently communicates the essential purpose without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is inadequate. It lacks details on behavior, error handling, return values, or safety considerations, leaving the agent with insufficient context to use it correctly in complex scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'user_id' documented as 'The user ID'. The description adds no additional meaning beyond this, such as format examples or scope details. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('all list subscriptions for a user'), making the purpose unambiguous. However, it does not explicitly differentiate from sibling tools like 'unsubscribe_user_from_list' or 'delete_audience', which handle related but different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'unsubscribe_user_from_list' (for single subscriptions) or 'delete_audience' (for broader data removal). It lacks context about prerequisites, consequences, or typical scenarios for bulk deletion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
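The missing usage guidance could be expressed as a one-line rule: one list means the single-subscription tool, all lists means this one. The sketch below encodes that rule. The tool names come from the review above, while `pick_unsubscribe_tool` and the `list_id` argument name for unsubscribe_user_from_list are assumptions, since that tool's schema is not shown in this section.

```python
from typing import Optional


def pick_unsubscribe_tool(user_id: str, list_id: Optional[str] = None) -> dict:
    """Choose between removing one list subscription and removing all of them."""
    if list_id is None:
        # No specific list given: remove every list subscription for the user.
        return {
            "name": "delete_user_list_subscriptions",
            "arguments": {"user_id": user_id},
        }
    # A specific list was named: remove only that subscription.
    return {
        "name": "unsubscribe_user_from_list",
        "arguments": {"user_id": user_id, "list_id": list_id},
    }


bulk_call = pick_unsubscribe_tool("user_123")
single_call = pick_unsubscribe_tool("user_123", list_id="list_abc")
```

Defaulting to the bulk tool when no list is given is a design choice of this sketch. A more cautious caller might require an explicit flag before removing every subscription.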
delete_user_token (grade B, Destructive, Idempotent)
Delete a specific push token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token identifier to delete | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true and idempotentHint=true. The description does not add behavioral context such as whether tokens can be restored, if the operation requires specific permissions, or what happens after deletion. It adds no value beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is concise and to the point. However, it omits details such as when to use the tool and what it returns.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple nature of the tool (delete a token) and the presence of annotations and a complete schema, the description is adequate. It could also state that the operation is destructive and idempotent, though the annotations already cover this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with each parameter described. The description adds no additional meaning or constraints beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (Delete), the target (a specific push token), and the owner (a user). It is specific and distinguishes the tool from siblings such as get_user_push_token or bulk_add_user_tokens.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like patch_user_token (for updating) or get_user_push_token (for reading). No context on prerequisites or side effects is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_jwt_for_user (quality grade B)
Generate a JWT authentication token for a user. Used for client-side SDK auth (Inbox, Preferences, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| scopes | No | Permission scopes for the token | |
| user_id | Yes | The user ID to scope the token to | |
| expires_in | No | Token expiry duration (e.g. "1h", "2 days") | 1h |
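A minimal sketch of how a client might assemble the arguments for this tool, leaving out optional fields so that the server-side default (`1h` for `expires_in`) applies. The function name `jwt_arguments` is hypothetical, and treating `scopes` as a list of strings is an assumption, since the table does not state its type.

```python
from typing import List, Optional


def jwt_arguments(
    user_id: str,
    scopes: Optional[List[str]] = None,  # assumed to be a list of strings
    expires_in: Optional[str] = None,    # e.g. "1h" or "2 days", per the table
) -> dict:
    """Return the arguments dict, omitting optional fields that were not set."""
    arguments = {"user_id": user_id}
    if scopes is not None:
        arguments["scopes"] = scopes
    if expires_in is not None:
        arguments["expires_in"] = expires_in
    return arguments


# Only the required field: the server default expiry would apply.
print(jwt_arguments("user_123"))
# Explicit expiry, using one of the formats shown in the table.
print(jwt_arguments("user_123", expires_in="2 days"))
```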
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions the token is for 'authentication' and 'client-side SDK auth,' which implies security-sensitive behavior, but doesn't disclose critical traits like required permissions, rate limits, or whether this operation is idempotent. For a token generation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with zero waste. It front-loads the core purpose and follows with usage context, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (security-sensitive token generation), lack of annotations, and no output schema, the description is incomplete. It covers the basic purpose and usage but misses behavioral details like auth requirements, token format, or error handling. This is adequate for a minimal viable description but has clear gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters documented in the schema. The description adds no additional parameter semantics beyond what the schema provides (e.g., it doesn't explain scopes or expiry formats further). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate a JWT authentication token for a user.' It specifies the verb ('generate') and resource ('JWT authentication token'), and mentions the target ('for a user'). However, it doesn't explicitly differentiate from sibling tools, as none appear to be direct alternatives for token generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context: 'Used for client-side SDK auth (Inbox, Preferences, etc.).' This suggests when to use it (for SDK authentication) but doesn't explicitly state when not to use it or name alternatives. No prerequisites or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audience (quality grade C; Read-only)
Get an audience by its ID, including its filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states it 'gets' data, implying a read-only operation, but does not disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or response format. This leaves significant gaps for a tool with no structured safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action and resource. It avoids redundancy and wastes no words, making it easy to parse quickly. Every part of the sentence contributes directly to understanding the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It does not explain what 'including its filter definition' entails in the return value, error conditions, or any side effects. For a read operation with minimal structured support, more behavioral context is needed to fully guide an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' documented in the schema. The description adds no additional meaning beyond implying the ID retrieves an audience with its filter definition, which is already covered by the tool's purpose. Baseline 3 is appropriate as the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('an audience by its ID'), and specifies that the result includes the filter definition. It is implicitly distinguishable from siblings such as 'list_audiences' (which lists multiple audiences) and 'delete_audience' (which removes one), but it does not name them, which keeps it short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites (e.g., needing a valid audience ID), exclusions, or comparisons to siblings like 'list_audiences' for browsing or 'get_audience_members' for details. The description implies usage when an ID is known, but offers no explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audit_event (quality grade C; Read-only)
Get a specific audit event by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audit_event_id | Yes | The audit event ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves a specific audit event but doesn't mention whether this is a read-only operation, if it requires specific permissions, what happens if the ID is invalid, or any rate limits. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's appropriately sized and front-loaded, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It doesn't explain what an audit event contains, the format of the return value, or error conditions. For a tool that likely returns structured data, more context is needed to guide the agent effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'audit_event_id' fully documented in the schema. The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline score of 3 for adequate coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'a specific audit event by its ID', making the purpose unambiguous. However, it doesn't differentiate the tool from its sibling 'list_audit_events'; an explicit comparison would be needed for a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_audit_events' or other audit-related tools. It lacks any context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_brand (quality grade B; Read-only)
Get a brand by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| brand_id | Yes | The brand ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states a read operation ('Get'), implying it's likely safe, but doesn't mention permissions, rate limits, error handling, or what happens if the ID is invalid. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words, making it easy to parse. It's front-loaded with the core action and resource, which is ideal for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, 100% schema coverage, no output schema), the description is adequate but minimal. It covers the basic purpose but lacks details on usage, behavioral traits, or return values, which could be helpful for an agent in a broader context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the parameter 'brand_id' fully documented in the schema. The description adds no additional meaning beyond implying the parameter is required, which the schema already states. This meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a brand') with the specific identifier ('by its ID'), making the purpose unambiguous. However, it doesn't differentiate from sibling tools like 'list_brands' or 'get_audience', which follow similar patterns, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'list_brands' for browsing or other 'get_' tools for different resources. The description is minimal and offers no context on prerequisites or exclusions, leaving usage decisions unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_bulk_job (quality grade C; Read-only)
Get the status of a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID | |
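Since this tool reports job status, a client would typically call it repeatedly until the job finishes. The sketch below shows one way to do that, under stated assumptions: `fetch_status` is a hypothetical callable standing in for a real `get_bulk_job` call, and the status strings (`PROCESSING`, `COMPLETED`, `ERROR`) are invented for the demo, because the listing does not document the actual status values.

```python
import time
from typing import Callable, Iterable


def poll_bulk_job(
    fetch_status: Callable[[str], str],
    job_id: str,
    terminal_states: Iterable[str],
    max_attempts: int = 10,
    delay_seconds: float = 1.0,
) -> str:
    """Call fetch_status until a terminal state is seen or attempts run out.

    Returns the last status observed, which may be non-terminal on timeout.
    """
    terminal = set(terminal_states)
    status = ""
    for attempt in range(max_attempts):
        status = fetch_status(job_id)
        if status in terminal:
            return status
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)  # wait before the next check
    return status


# Fake status source for the demo; the state names are assumptions.
_fake_statuses = iter(["PROCESSING", "PROCESSING", "COMPLETED"])
result = poll_bulk_job(
    lambda _job_id: next(_fake_statuses),
    "job_123",
    terminal_states={"COMPLETED", "ERROR"},
    delay_seconds=0,
)
print(result)
```

Bounding the loop with `max_attempts` keeps an agent from polling indefinitely if the job never reaches a terminal state.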
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves status but doesn't describe what the status includes (e.g., progress percentage, success/failure, error details), whether it's read-only (implied but not confirmed), or any rate limits or authentication requirements. This leaves significant gaps for an agent to understand how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded with the core action ('Get the status'), making it easy to parse. There is no wasted language, and it fits well within the context of a simple status-checking tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a job status tool with no annotations and no output schema, the description is insufficient. It doesn't explain what the status output includes (e.g., JSON structure, possible states like 'pending', 'completed'), error handling, or dependencies on other tools like 'create_bulk_job'. For a tool that likely returns dynamic data, more context is needed to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'job_id' documented as 'The bulk job ID'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or where to obtain the ID. Since the schema already provides adequate parameter documentation, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the status') and resource ('of a bulk job'), making the purpose unambiguous. It distinguishes from siblings like 'create_bulk_job' or 'run_bulk_job' by focusing on status retrieval rather than creation or execution. However, it doesn't specify what 'status' entails (e.g., progress, completion, errors), leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a job ID from 'create_bulk_job' or 'run_bulk_job'), nor does it differentiate from similar tools like 'list_bulk_users' or 'get_audit_event' that might provide related information. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list (quality grade C; Read-only)
Get a list by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states 'Get a list by its ID', implying a read-only operation, but doesn't disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or what data is returned. This is a significant gap for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste: 'Get a list by its ID.' It is appropriately sized and front-loaded, making it easy to parse. Every word serves a purpose, earning a high score for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It doesn't explain what 'get' returns (e.g., list metadata, contents, or subscribers), error conditions, or dependencies. For a tool with one parameter but no structured output info, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'list_id' documented as 'The list ID'. The description adds no meaning beyond this, as it only repeats the parameter concept without explaining format, source, or constraints. Baseline is 3 since the schema does the heavy lifting, but no extra value is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Get a list by its ID' clearly states the action (get) and resource (list), but it is vague about what 'get' returns: metadata, contents, or both. It is distinguishable from siblings like 'list_lists' (which lists multiple lists) but doesn't clarify how it differs from 'get_list_subscribers' or 'get_user_list_subscriptions', which are related but distinct operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't specify if this should be used for retrieving list details after 'list_lists' or as a prerequisite for operations like 'send_message_to_list'. The description lacks context on prerequisites or exclusions, leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list_subscribers (quality grade C; Read-only)
Get all subscribers of a list.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| list_id | Yes | The list ID | |
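The optional `cursor` parameter suggests cursor-based pagination. The sketch below walks all pages under an assumed response shape: the `items` and `next_cursor` field names are hypothetical, since no output schema is published, and `call_tool` stands in for whatever MCP client function actually invokes the tool.

```python
from typing import Callable, Iterator, Optional


def iter_subscribers(
    call_tool: Callable[[str, dict], dict], list_id: str
) -> Iterator[dict]:
    """Yield subscribers page by page until no further cursor is returned."""
    cursor: Optional[str] = None
    while True:
        arguments = {"list_id": list_id}
        if cursor:
            arguments["cursor"] = cursor
        page = call_tool("get_list_subscribers", arguments)
        # "items" and "next_cursor" are assumed field names.
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:
            break


# Fake two-page backend for the demo.
_PAGES = {
    None: {"items": [{"user_id": "a"}, {"user_id": "b"}], "next_cursor": "page2"},
    "page2": {"items": [{"user_id": "c"}], "next_cursor": None},
}


def fake_call_tool(name: str, arguments: dict) -> dict:
    return _PAGES[arguments.get("cursor")]


subscribers = list(iter_subscribers(fake_call_tool, "list_123"))
print([s["user_id"] for s in subscribers])  # ['a', 'b', 'c']
```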
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get all subscribers' but does not specify if this is a read-only operation, whether it supports pagination (though the schema includes a 'cursor' parameter), rate limits, authentication needs, or what happens if the list does not exist. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It is front-loaded and wastes no space, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain return values, error conditions, or behavioral traits like pagination handling. For a tool with two parameters and no structured output information, more context is needed to fully guide the agent, making it inadequate for comprehensive use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with clear descriptions for both parameters ('list_id' and 'cursor'), so the baseline score is 3. The description adds no additional semantic information beyond what the schema provides, such as format examples or usage context for the cursor, but it does not need to compensate for low coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('all subscribers of a list'), making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'list_audience_members' or 'get_user_list_subscriptions', which might have overlapping functionality, so it falls short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. For example, it does not specify if this is for retrieving all subscribers at once, how it compares to paginated or filtered queries in sibling tools, or any prerequisites like list existence. This lack of context leaves the agent without clear usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message (quality grade B; Read-only)
Get the full details and status of a single message by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to retrieve | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description doesn't specify whether this requires authentication, rate limits, error conditions, or what 'full details and status' includes. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It's appropriately sized and front-loaded with the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate but incomplete. It explains the basic operation but lacks details about authentication requirements, error handling, or what 'full details and status' entails, which would be helpful for an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'message_id' clearly documented in the schema. The description adds no additional parameter semantics beyond what's already in the schema, so it meets the baseline score of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the full details and status') and resource ('a single message by its ID'), making the purpose specific and understandable. However, it doesn't explicitly distinguish this tool from sibling tools like 'get_message_content' or 'get_message_history', which appear to be related message retrieval operations with different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_message_content' or 'get_message_history', nor does it mention any prerequisites or exclusions. It simply states what the tool does without contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_content (quality grade C; Read-only)
Get the rendered content (HTML, text, subject) of a previously sent message.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool retrieves content but does not disclose behavioral traits like whether it requires authentication, rate limits, error handling, or the format of the returned content. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and appropriately sized, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain what the return values look like (e.g., structure of HTML/text/subject), error conditions, or other contextual details needed for effective tool invocation, leaving significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'message_id' documented. The description adds no additional meaning beyond the schema, such as format examples or constraints, so it meets the baseline score for high schema coverage without enhancing parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'rendered content (HTML, text, subject) of a previously sent message', making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'get_message' or 'get_message_history', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'get_message' or 'get_message_history'. It lacks context on prerequisites, exclusions, or specific scenarios for usage, leaving the agent without clear direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_history A Read-only
Get the event history for a message, showing each step in the delivery pipeline (enqueued, sent, delivered, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter by event type | |
| message_id | Yes | The message ID |
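As a rough illustration, a client could wrap these two parameters in a standard MCP `tools/call` JSON-RPC request. The sketch below only builds the payload and sends nothing. The helper name `build_history_call` and the sample values `msg_123` and `DELIVERED` are hypothetical; the listing does not say which event type values the server accepts.

```python
import json
from typing import Any, Dict, Optional


def build_history_call(message_id: str,
                       event_type: Optional[str] = None,
                       request_id: int = 1) -> Dict[str, Any]:
    """Build an MCP tools/call request for get_message_history (hypothetical helper)."""
    if not message_id:
        raise ValueError("message_id is required")
    arguments: Dict[str, Any] = {"message_id": message_id}
    # "type" is optional, so it is only included when the caller supplies it.
    if event_type is not None:
        arguments["type"] = event_type
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_message_history", "arguments": arguments},
    }


if __name__ == "__main__":
    # "msg_123" is a made-up message ID used only for illustration.
    print(json.dumps(build_history_call("msg_123"), indent=2))
```

Leaving `type` out of the arguments, rather than sending an empty string, keeps the request aligned with the table's "Required: No" marking.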
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions retrieving event history but doesn't disclose behavioral traits like whether this requires specific permissions, if it's paginated, rate-limited, or what format the history returns. For a read operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the core purpose and provides clarifying examples. Every word earns its place with no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description adequately covers the purpose but lacks behavioral context and return value details. For a read tool with 2 parameters, it's minimally viable but leaves gaps in understanding the full operation, especially around output format and constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters (message_id and type). The description doesn't add meaning beyond what's in the schema, such as explaining what event types are available or how the filtering works. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('event history for a message'), specifying it shows 'each step in the delivery pipeline' with examples like 'enqueued, sent, delivered, etc.' This distinguishes it from sibling tools like get_message (which likely retrieves message content/metadata) and get_message_content (which retrieves message body).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing delivery pipeline details for a specific message, but doesn't explicitly state when to use this versus alternatives like get_message or get_audit_event. No exclusions or prerequisites are mentioned, leaving some ambiguity about appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification A Read-only
Retrieve a notification template by ID. Optionally request draft, published, or a version such as v001.
| Name | Required | Description | Default |
|---|---|---|---|
| version | No | Version to retrieve: draft, published, or a string like v001 | |
| notification_id | Yes | The notification template ID |
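A caller might want to check the `version` selector before sending it. The following is a minimal sketch, assuming numbered versions always look like `v001` (a `v` followed by digits). The table only gives `v001` as an example, so that pattern is a guess, and `notification_args` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional

# Assumption: numbered versions are "v" followed by one or more digits (e.g. v001).
_NUMBERED_VERSION = re.compile(r"v\d+")
_NAMED_VERSIONS = ("draft", "published")


def notification_args(notification_id: str,
                      version: Optional[str] = None) -> Dict[str, str]:
    """Build the arguments object for get_notification (hypothetical helper)."""
    args = {"notification_id": notification_id}
    if version is not None:
        is_named = version in _NAMED_VERSIONS
        is_numbered = _NUMBERED_VERSION.fullmatch(version) is not None
        if not (is_named or is_numbered):
            raise ValueError(f"unexpected version selector: {version!r}")
        args["version"] = version
    return args
```

If the server turns out to accept other selectors, the check should be loosened rather than treated as authoritative.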
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description's addition of version retrieval is useful but minimal. No further behavioral traits (e.g., access requirements) are disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is adequate for a simple retrieval tool, but without an output schema, it could mention what the response contains. However, the version detail adds needed context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. The description adds little beyond rephrasing 'Optionally request draft, published, or a version such as v001,' which is already in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Retrieve') and the resource ('notification template by ID'), and the optional version parameter distinguishes it from similar tools like get_notification_content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage (retrieving a template by ID with optional version) but does not explicitly state when to use this tool versus siblings like get_notification_content or get_notification_draft_content.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_content C Read-only
Get the published content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states what the tool does without behavioral details. It doesn't disclose if this is a read-only operation, what permissions are needed, error handling, or response format, leaving significant gaps for a tool that likely accesses sensitive data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded with the core action and resource, making it efficient and easy to parse, which is ideal for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits, usage context, and expected outputs, which are critical given the tool likely interacts with notification data and has siblings like 'get_notification_draft_content' that could cause confusion.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description doesn't add any parameter-specific information beyond what's in the schema, which has 100% coverage. The schema already documents the single required parameter 'notification_id' as 'The notification template ID', so the baseline score of 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('published content blocks of a notification template'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_notification_draft_content' or 'get_message_content', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions, such as whether it requires specific permissions or differs from similar tools like 'get_notification_draft_content' for draft content.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_draft_content A Read-only
Get the draft (unpublished) content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get' and 'draft (unpublished) content blocks,' indicating a read-only operation, but lacks details on permissions, rate limits, error handling, or response format. For a tool with no annotations, this leaves significant gaps in understanding its behavior beyond basic purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently conveys the tool's purpose without unnecessary words. It is front-loaded with the key action and resource, making it easy to understand at a glance. Every part of the sentence contributes directly to the tool's definition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate for basic understanding but lacks completeness. It does not cover behavioral aspects like permissions or response format, which are important for a read operation. Without annotations or output schema, more context would be beneficial for full usability.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'notification_id' documented as 'The notification template ID.' The description does not add any additional meaning beyond this, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately handles parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get') and resource ('draft (unpublished) content blocks of a notification template'), distinguishing it from sibling tools like 'get_notification_content' (which likely retrieves published content) and 'list_notifications' (which lists notifications rather than fetching content). The phrase 'draft (unpublished)' adds precision about the content's state.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by specifying 'draft (unpublished) content blocks,' suggesting it should be used when working with unpublished notification templates. However, it does not explicitly state when to use this tool versus alternatives like 'get_notification_content' or provide exclusions (e.g., not for published content). The guidance is implied but not comprehensive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_provider A Read-only
Fetch a single provider configuration by ID.
| Name | Required | Description | Default |
|---|---|---|---|
| provider_id | Yes | The provider configuration ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, and the description adds no further behavioral context (e.g., response format, error conditions). The description is neutral and does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 7 words, zero fluff. Every word is necessary.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple tool with 1 param and no output schema. The description is adequate but could mention the return value (e.g., 'returns the provider configuration object').
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add additional meaning beyond the schema's description for 'provider_id'. Baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'fetch', the resource 'provider configuration', and the scope 'by ID'. It distinguishes the tool from siblings like 'list_providers' and mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., list_providers). No prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_routing_strategy A Read-only
Retrieve a routing strategy by ID. Returns the full entity including routing, channels, and providers.
| Name | Required | Description | Default |
|---|---|---|---|
| routing_strategy_id | Yes | The routing strategy ID (rs_ prefix) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true. Description adds that the return includes 'routing, channels, and providers', which is useful context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the main action, no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple retrieval tool with one parameter and readOnly annotation, the description sufficiently explains what the tool does and what it returns. No output schema, so the description covers return expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; the schema already describes the parameter well. The description adds no new semantic meaning about the parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (retrieve), the resource (routing strategy), and the scope (by ID, returns full entity). It distinguishes from sibling tools like list_routing_strategies which likely returns summaries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when an ID is available, but does not explicitly state when to prefer this tool over list_routing_strategies or other retrieval methods. It gives no when-not-to-use or alternative guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_tenant C Read-only
Get a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states it's a read operation ('Get'), but doesn't disclose permissions, error handling, or response format. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded and efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read tool with no annotations and no output schema, the description is insufficient. It lacks details on return values, error cases, or behavioral context, leaving significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'tenant_id' parameter. The description adds minimal value by implying the parameter is used to retrieve a tenant, but no additional semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a tenant'), specifying it's by ID. It's specific but doesn't differentiate from sibling tools like 'list_tenants' or 'create_or_update_tenant' beyond the ID focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_user_tenants'. The description implies usage when you have a specific tenant ID, but lacks explicit context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_tenant_template B Read-only
Get a tenant notification template association by template ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID | |
| template_id | Yes | The template ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, and the description simply states 'Get', which is consistent. However, no additional behavioral traits (e.g., return format, pagination, permissions) are disclosed beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that immediately conveys the action and resource, with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is adequate for a simple getter, but lacks information about the return value or structure. Given no output schema, the agent might need more context on what is returned.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema fully covers both parameters with descriptions (tenant_id and template_id). The description adds no extra meaning beyond the schema, which is expected given 100% coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'tenant notification template association' identified by template ID. It distinguishes from siblings like get_tenant_template_version and list_tenant_templates by focusing on a single association.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as get_tenant_template_version or list_tenant_templates. The description lacks context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_tenant_template_version A Read-only
Get a specific version of a tenant notification template (e.g. latest, published, or v1).
| Name | Required | Description | Default |
|---|---|---|---|
| version | Yes | Version identifier (latest, published, or v-prefixed) | |
| tenant_id | Yes | The tenant ID | |
| template_id | Yes | The template ID |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare 'readOnlyHint=true', so the read-only nature is covered. The description adds no further behavioral context beyond stating it is a 'Get' operation. No contradictions or additional behavioral traits disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that is front-loaded with the action and resource. Every word serves a purpose, with no redundancy or irrelevant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the purpose and provides examples, but lacks details about the return format or error conditions. For a simple retrieval tool with no output schema, it is mostly complete but could mention what data the response contains.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and descriptions are provided for all parameters. The description's example ('latest, published, or v1') reinforces the version parameter's allowed values but does not add new meaning beyond what the schema's description already says.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'specific version of a tenant notification template' with concrete examples like 'latest, published, or v1'. It distinguishes from siblings like 'get_tenant_template' and 'list_tenant_templates' by focusing on version retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for fetching a specific template version but does not explicitly state when to use this tool over alternatives like 'get_tenant_template' or list tools. No when-not-to-use or prerequisite information is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_translation C Read-only
Get a translation for a specific locale (e.g. "en_US", "fr_FR").
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Translation domain (only "default" is supported currently) | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) |
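Because `domain` defaults to `default` and no other value is currently supported, a client can fill it in automatically and reject anything else early. This is a small sketch under that reading of the table; `translation_args` is a hypothetical helper, and no locale format is enforced because the listing only gives `en_US` and `fr_FR` as examples.

```python
from typing import Dict

# Per the parameter table, only the "default" domain is supported at the moment.
SUPPORTED_DOMAINS = ("default",)


def translation_args(locale: str, domain: str = "default") -> Dict[str, str]:
    """Build the arguments object for get_translation (hypothetical helper)."""
    if not locale:
        raise ValueError("locale is required")
    if domain not in SUPPORTED_DOMAINS:
        raise ValueError(f"unsupported translation domain: {domain!r}")
    return {"domain": domain, "locale": locale}
```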
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states what the tool does but doesn't describe important behaviors: whether this is a read-only operation, what format the translation returns, whether it's cached, what happens with invalid locales, or authentication requirements. The description is minimal and lacks behavioral context beyond the basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that communicates the core purpose without unnecessary words. It's appropriately sized for a simple retrieval tool and front-loads the essential information. Every word earns its place in this concise formulation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is insufficiently complete. It doesn't explain what format the translation returns, whether it's a single string or structured data, what happens with missing translations, or any error conditions. The description leaves too many open questions about how the tool behaves in practice.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already fully documents both parameters. The description adds minimal value beyond the schema: it mentions locale examples that match the schema's description, but doesn't explain the relationship between domain and locale or provide additional context about translation domains. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('translation') with specific scope ('for a specific locale'). It distinguishes itself from sibling tools like 'update_translation' by focusing on retrieval rather than modification. However, it doesn't explicitly differentiate from other get_* tools that might retrieve different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, when-not-to-use scenarios, or comparison with sibling tools like 'get_user_preferences' or 'get_message_content' that might also involve localized content. The example locales are helpful but don't constitute usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_list_subscriptions (C, Read-only)
Get all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying a read-only operation, but doesn't specify permissions, rate limits, pagination behavior (despite a 'cursor' parameter), or response format. This is inadequate for a tool with parameters and no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and two parameters (one optional for pagination), the description is incomplete. It doesn't address behavioral aspects like pagination, error handling, or return values, leaving significant gaps for the agent to operate effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
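The gap around pagination can be made concrete with a minimal sketch of how a client might page through results using the `cursor` parameter. The `call_tool` callable, the `items` key, and the `paging.cursor` key are assumptions made for illustration; the listing does not document the response shape, so the real field names may differ.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical caller: takes a tool name and an arguments dict, returns a parsed response.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_user_list_subscriptions(
    call_tool: ToolCaller, user_id: str, max_pages: int = 100
) -> Iterator[Any]:
    """Yield subscriptions page by page until no cursor is returned."""
    cursor: Optional[str] = None
    for _ in range(max_pages):  # guard against a cursor that never terminates
        args: Dict[str, Any] = {"user_id": user_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("get_user_list_subscriptions", args)
        # ASSUMPTION: results live under "items", the next cursor under "paging.cursor".
        yield from page.get("items", [])
        cursor = (page.get("paging") or {}).get("cursor")
        if not cursor:
            break
```

A description that stated the actual response fields and the end-of-results signal would make this kind of guesswork unnecessary.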
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter descriptions in the schema. The description adds no additional meaning beyond implying the 'user_id' is used to fetch subscriptions, but doesn't explain parameter interactions or usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('all list subscriptions for a user'), making the purpose immediately understandable. However, it doesn't distinguish this tool from sibling tools like 'get_list_subscribers' or 'list_user_tenants', which also retrieve user-related data, so it lacks sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user authentication), exclusions, or compare it to similar tools like 'get_list_subscribers' or 'list_user_tenants', leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_preferences (B, Read-only)
Get a user's notification preferences (subscriptions, opt-outs, channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | No | Scope preferences to a specific tenant | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying it's a read operation, but doesn't specify if it requires authentication, rate limits, pagination, error handling, or what format the returned preferences take. For a read tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. Every word earns its place by specifying the resource and its subcomponents without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read tool with 2 parameters and 100% schema coverage, the description is minimally adequate. However, with no annotations and no output schema, it lacks details on authentication needs, return format, or error cases. It meets basic needs but leaves contextual gaps that could hinder an agent's effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (user_id and tenant_id). The description doesn't add any parameter-specific details beyond what's in the schema (e.g., it doesn't explain how tenant_id affects the output). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('user's notification preferences') with specific subcategories (subscriptions, opt-outs, channel preferences). It distinguishes itself from most siblings (e.g., get_user_profile_by_id, get_user_push_token) by focusing on preferences, but doesn't explicitly differentiate itself from update_user_preference_topic, which is a related but distinct operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like get_user_profile_by_id (which might include preferences) or update_user_preference_topic. The description implies usage for retrieving notification preferences but offers no context about prerequisites, error conditions, or when other tools might be more appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_preference_topic (A, Read-only)
Get a user's preference for a specific subscription topic.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| topic_id | Yes | The subscription topic ID | |
| tenant_id | No | Scope to a specific tenant | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true. Description adds 'Get' which aligns, but no additional behavioral context (e.g., whether tenant_id is optional, result format). Minimal value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no redundant words, front-loaded with action and resource. Highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple getter with read-only annotations, the description is mostly complete. However, it lacks info on return value format and optionality of tenant_id, which would be helpful for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with parameter descriptions like 'The user ID'. Description does not add any additional meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Get a user's preference for a specific subscription topic', which is a specific verb+resource combination. It effectively distinguishes from sibling tools like 'get_user_preferences' (plural) and 'update_user_preference_topic'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool vs alternatives (e.g., 'get_user_preferences' for all preferences). Usage is implied but not stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
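The "use X instead of Y when Z" guidance missing here can be sketched as a small routing helper. This is an illustration only: the function name `choose_preference_tool` is hypothetical, and the rule (prefer the narrower tool when a topic ID is known) is an assumption drawn from the two tools' parameter tables, not documented behavior.

```python
from typing import Any, Dict, Optional, Tuple


def choose_preference_tool(
    user_id: str, topic_id: Optional[str] = None, tenant_id: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Pick the single-topic tool when a topic ID is known, else fetch all preferences."""
    args: Dict[str, Any] = {"user_id": user_id}
    if tenant_id is not None:
        args["tenant_id"] = tenant_id  # optional in both parameter tables
    if topic_id is not None:
        args["topic_id"] = topic_id  # required by get_user_preference_topic
        return "get_user_preference_topic", args
    return "get_user_preferences", args
```

For example, `choose_preference_tool("u1", topic_id="t1")` selects `get_user_preference_topic`, while omitting `topic_id` falls back to `get_user_preferences`.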
get_user_profile_by_id (B, Read-only)
Get a user profile by their ID. Returns profile data including email, phone, and custom properties.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to look up | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool returns profile data including email, phone, and custom properties, which adds some behavioral context beyond the basic 'get' action. However, it lacks critical details: it doesn't specify authentication requirements, error handling (e.g., for invalid IDs), rate limits, or whether the data is real-time or cached. For a read operation with no annotations, this leaves significant gaps in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core purpose and followed by a brief note on return data. Every word earns its place with zero redundancy or fluff. It efficiently communicates the essential information without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, no output schema, no annotations), the description is minimally adequate. It covers the basic purpose and return data, but lacks context on usage, behavioral traits, or error handling. Without annotations or an output schema, the description should do more to compensate, but it only partially meets the needs for a standalone tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'user_id' fully documented in the schema as 'The user ID to look up'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or sourcing guidance. Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('user profile') with a specific lookup method ('by their ID'). It distinguishes itself from siblings like 'get_user_preferences' or 'get_user_push_token' by focusing on profile data. However, it doesn't explicitly differentiate itself from potential profile-related siblings like 'replace_profile' or 'delete_profile' beyond the read vs. write distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid user ID), exclusions (e.g., not for bulk lookups), or direct alternatives among siblings like 'list_user_tenants' for related data. The description assumes the context is obvious without explicit usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_push_token (C, Read-only)
Get a specific push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token identifier | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states it 'gets' a token, implying a read-only operation, but doesn't clarify if this requires specific permissions, what happens if the token doesn't exist (e.g., returns null or error), or any rate limits. The description is minimal and misses key behavioral details for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core action and resource, making it easy to parse quickly. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool that retrieves specific data. It doesn't explain what the output looks like (e.g., token details or error handling), behavioral constraints, or how it differs from siblings. For a read operation with two required parameters, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('user_id' and 'token') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides, such as explaining the relationship between user_id and token or format examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('a specific push/device token for a user'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'list_user_push_tokens' or 'create_or_replace_user_push_token', which would require mentioning it retrieves a single token by identifier rather than listing all tokens or modifying them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't mention that 'list_user_push_tokens' should be used to retrieve all tokens for a user, or that 'create_or_replace_user_push_token' is for creating/updating tokens. The description lacks context about prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_ad_hoc_automation (B)
Invoke an ad-hoc automation with inline steps (no template needed).
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | | |
| brand | No | | |
| profile | No | | |
| template | No | | |
| recipient | No | | |
| automation | Yes | The automation definition | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'invoke' but doesn't disclose behavioral traits like whether this is a read/write operation, permissions required, rate limits, error handling, or what 'invoke' entails (e.g., execution, side effects). The description is minimal and misses critical context for a tool that likely performs actions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that is front-loaded with the core purpose. There is no wasted wording, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, nested objects, no output schema, and no annotations), the description is incomplete. It lacks details on behavior, parameter usage, expected outcomes, and error conditions. For a tool that invokes automations with multiple inputs, this minimal description fails to provide sufficient context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is low (17%), with only 'steps' and 'cancelation_token' having descriptions. The description adds no parameter semantics beyond implying 'automation' is required for inline steps. It doesn't explain other parameters like 'data', 'brand', 'profile', 'template', or 'recipient', leaving most of the 6 parameters undocumented and unclear in purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
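The 17% figure is consistent with one of six top-level parameters carrying a description. A minimal sketch of that arithmetic follows; the scoring tool's actual method is not documented here, so `description_coverage` is a hypothetical helper and only one plausible reading. The evaluation also mentions 'steps' and 'cancelation_token', which presumably are nested fields, so the real calculation may count properties differently.

```python
from typing import Dict, Optional

# Top-level parameters of invoke_ad_hoc_automation as listed in its parameter table;
# None marks a parameter with no description.
AD_HOC_PARAMS: Dict[str, Optional[str]] = {
    "data": None,
    "brand": None,
    "profile": None,
    "template": None,
    "recipient": None,
    "automation": "The automation definition",
}


def description_coverage(params: Dict[str, Optional[str]]) -> float:
    """Return the fraction of parameters that carry a non-empty description."""
    if not params:
        return 0.0
    described = sum(1 for text in params.values() if text and text.strip())
    return described / len(params)


print(f"{description_coverage(AD_HOC_PARAMS):.0%}")  # prints "17%" (1 of 6)
```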
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('invoke') and resource ('ad-hoc automation'), specifying it uses 'inline steps (no template needed)'. This distinguishes it from sibling tools like 'invoke_automation_template', which likely requires a template. However, it doesn't fully differentiate from other automation-related tools beyond the template aspect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by stating 'no template needed', suggesting this tool is for ad-hoc automations without predefined templates. It indirectly contrasts with 'invoke_automation_template', but lacks explicit guidance on when to use this versus other automation or messaging tools, or any prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_automation_template (C)
Invoke an automation run from an existing automation template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Data to pass to the automation | |
| brand | No | Brand ID override | |
| profile | No | Profile data for the recipient | |
| template | No | Notification template override | |
| recipient | Yes | Recipient user ID | |
| template_id | Yes | The automation template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'invoke an automation run', which implies execution/triggering behavior, but provides no information about permissions required, rate limits, whether this is a synchronous or asynchronous operation, what happens on failure, or what the expected output looks like.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for the tool's complexity and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool that executes automations with 6 parameters (including nested objects) and no annotations or output schema, the description is inadequate. It doesn't explain what an 'automation run' entails, what happens after invocation, error handling, or provide any context about the automation system this interacts with.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema itself. The description doesn't add any additional parameter semantics beyond what's already in the schema descriptions. The baseline of 3 is appropriate when the schema does the heavy lifting for parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('invoke an automation run') and the resource ('from an existing automation template'), providing a specific verb+resource combination. However, it doesn't differentiate itself from the sibling tool 'invoke_ad_hoc_automation', which appears to be a related alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided about when to use this tool versus alternatives like 'invoke_ad_hoc_automation' or other automation-related tools. The description simply states what the tool does without any context about appropriate usage scenarios or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
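One way to picture the missing guidance is a routing helper that chooses between the two automation tools. This is a sketch under stated assumptions: `choose_automation_tool` is a hypothetical name, and the rule (use the template tool when a template ID exists, otherwise pass an inline definition to the ad-hoc tool) is inferred from the two descriptions and parameter tables rather than confirmed by the listing.

```python
from typing import Any, Dict, Optional, Tuple


def choose_automation_tool(
    recipient: Optional[str] = None,
    template_id: Optional[str] = None,
    automation: Optional[Dict[str, Any]] = None,
    data: Optional[Dict[str, Any]] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Route to the template tool when a template ID exists, else to the ad-hoc tool."""
    if (template_id is None) == (automation is None):
        raise ValueError("pass exactly one of template_id or automation")
    args: Dict[str, Any]
    if template_id is not None:
        # The parameter table marks both template_id and recipient as required.
        if recipient is None:
            raise ValueError("invoke_automation_template requires recipient")
        args = {"template_id": template_id, "recipient": recipient}
        tool = "invoke_automation_template"
    else:
        # Only automation is required for the ad-hoc tool; recipient is optional.
        args = {"automation": automation}
        if recipient is not None:
            args["recipient"] = recipient
        tool = "invoke_ad_hoc_automation"
    if data is not None:
        args["data"] = data
    return tool, args
```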
invoke_journey (C)
Invoke a journey run from a journey template. Triggers the automation workflow for the specified user.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Data payload passed to the journey for conditions and template variables | |
| profile | No | Profile data for the user (email, phone, custom fields) | |
| user_id | No | Recipient user ID. Can also be resolved from profile or data. | |
| template_id | Yes | The journey template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate a non-idempotent write operation (readOnlyHint=false, idempotentHint=false). The description adds only that it triggers automation for a user, but lacks details on side effects, permissions, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, direct, and free of unnecessary words. It efficiently conveys core functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 parameters, nested objects, no output schema), the description is too sparse. It omits return value information and does not explain the role of nested data parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all parameters. The description adds minimal value beyond the schema, only hinting at user specification without clarifying parameter mapping.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (invoke) and resource (journey run from a journey template). However, it does not differentiate from sibling tools like invoke_ad_hoc_automation or invoke_automation_template, which share similar verbs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audience_members (C, Read-only)
List all members of an audience.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| audience_id | Yes | The audience ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('List all members') but doesn't describe return format, pagination behavior (despite a 'cursor' parameter in the schema), rate limits, authentication needs, or error conditions. For a list operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without unnecessary words. It's front-loaded with the essential information ('List all members of an audience') and contains no redundant or verbose elements. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters (one required). It doesn't explain what 'members' entails (e.g., user objects, IDs), how pagination works with the cursor, or what the return structure looks like. For a list operation in a context with many sibling tools, more contextual detail would help the agent use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('audience_id' and 'cursor') documented in the schema. The description doesn't add any meaningful semantics beyond what the schema provides—it mentions 'audience' but doesn't clarify the ID format or pagination usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('members of an audience'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' or 'list_audiences', but the specificity of 'members' provides some distinction. The description avoids tautology by not just restating the tool name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_list_subscribers' or 'list_user_tenants' that might serve similar purposes, nor does it specify prerequisites or contexts for usage. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audiences (C, Read-only)
List all audiences in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all audiences' entails (e.g., archived vs. active). For a tool with no annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('List all audiences in the workspace'). There's no wasted language or redundancy. It's appropriately sized for a simple list tool, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a simple but incomplete description, the tool definition is inadequate for reliable agent use. The description doesn't cover pagination behavior (implied by the cursor parameter), return format, or error conditions. For a list operation with pagination, this leaves the agent guessing about how to handle multiple pages of results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
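The pagination gap noted above can be made concrete with a short sketch. It assumes a hypothetical `call_tool(name, arguments)` helper and an assumed response shape with an `items` list and a `paging` object carrying `cursor` and `more`. Neither is documented in the tool description, so treat both as placeholders to check against a real response.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_pages(call_tool: ToolCaller, tool_name: str, max_pages: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every item from a cursor-paginated list tool.

    Assumed (not confirmed by the tool description): each response looks like
    {"items": [...], "paging": {"cursor": "<opaque>", "more": bool}}.
    """
    cursor: Optional[str] = None
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        arguments = {"cursor": cursor} if cursor else {}
        response = call_tool(tool_name, arguments)
        yield from response.get("items", [])
        paging = response.get("paging") or {}
        cursor = paging.get("cursor")
        if not paging.get("more") or not cursor:
            break


# In-memory stand-in for the real server, used only to exercise the loop.
def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    pages = {
        None: {"items": [{"id": "a1"}, {"id": "a2"}], "paging": {"cursor": "p2", "more": True}},
        "p2": {"items": [{"id": "a3"}], "paging": {"cursor": None, "more": False}},
    }
    return pages[arguments.get("cursor")]


audience_ids = [item["id"] for item in iter_pages(fake_call_tool, "list_audiences")]
print(audience_ids)
```

The first call omits `cursor` entirely; each later call passes back whatever cursor the previous page returned, and the loop stops when the response signals no further pages.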
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description, which applies here.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all audiences in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' (singular) or 'list_audience_members', but the scope is clear. This is a straightforward read operation with unambiguous intent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With siblings like 'get_audience' (for a specific audience) and 'list_audience_members' (for members within an audience), there's no indication of when this list-all operation is appropriate versus more targeted queries. The agent must infer usage from tool names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audit_events (B, Read-only)
List audit events in the workspace. Useful for tracking API usage and changes.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the tool is 'useful for tracking API usage and changes,' which hints at read-only behavior but does not explicitly state it. Critical details such as pagination behavior (implied by the 'cursor' parameter), rate limits, authentication needs, and response format are missing, so the disclosure is inadequate even for a tool that does not mutate data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences that are front-loaded and to the point. The first sentence states the purpose, and the second adds context without redundancy. However, it could be slightly more structured by explicitly mentioning the pagination aspect, which would enhance clarity without sacrificing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single optional parameter, no output schema), the description is minimally adequate. It covers the basic purpose and a usage hint but lacks details on behavioral aspects like pagination, response format, or error handling. With no annotations and no output schema, it should do more to compensate, but the simplicity of the tool keeps it from being severely incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'cursor' parameter documented as 'Pagination cursor.' The description does not add any meaning beyond this, as it mentions no parameters. Given the high schema coverage, the baseline score of 3 is appropriate, as the schema handles the parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('audit events in the workspace'), making the purpose specific and understandable. It distinguishes itself from sibling tools like 'get_audit_event' (singular) by implying a collection operation. However, it doesn't explicitly differentiate itself from other list tools (e.g., 'list_audiences', 'list_messages'), which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context with 'Useful for tracking API usage and changes,' suggesting when this tool might be applied. However, it lacks explicit guidance on when to use this versus alternatives (e.g., 'get_audit_event' for a single event or other list tools), and does not mention prerequisites or exclusions, leaving gaps in decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_automations (A, Read-only)
List automation templates in the workspace. Optionally filter by version.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| version | No | Filter by version state | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description adds limited behavioral context by mentioning version filtering but does not describe pagination, performance, or other notable behaviors. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
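As an illustration of the annotation mentioned above, here is a sketch of how a tool definition carrying `readOnlyHint` might look when represented as a Python dictionary. The field names follow the MCP tool definition shape (`name`, `description`, `inputSchema`, `annotations`); the parameter types are assumed, since the table above lists only names and descriptions, and `is_read_only` is a hypothetical helper.

```python
from typing import Any, Dict

# Sketch of a tool definition; the "string" types are assumptions.
list_automations_tool: Dict[str, Any] = {
    "name": "list_automations",
    "description": "List automation templates in the workspace. Optionally filter by version.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "cursor": {"type": "string", "description": "Pagination cursor"},
            "version": {"type": "string", "description": "Filter by version state"},
        },
        "required": [],
    },
    "annotations": {"readOnlyHint": True},
}


def is_read_only(tool: Dict[str, Any]) -> bool:
    """Return True only when the tool explicitly declares readOnlyHint=true."""
    annotations = tool.get("annotations") or {}
    return annotations.get("readOnlyHint") is True


print(is_read_only(list_automations_tool))
print(is_read_only({"name": "unannotated_tool"}))
```

The annotation is a hint rather than a guarantee, which is why the description is still expected to cover behaviors such as pagination.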
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short, front-loaded sentences with no wasted words. The first sentence captures the core purpose, the second adds a useful detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with good annotations and full schema coverage, the description is complete enough. It could optionally mention pagination via cursor, but the schema covers that. No output schema reduces needed detail.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are already documented. The description only restates the version filter without adding new semantic detail beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (list), the resource (automation templates), and an optional filter (by version). It distinguishes itself from sibling list tools like list_audiences by specifying 'automation templates'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like invoke_automation_template or other list tools. No when-not-to-use or explicit context is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_brands (B, Read-only)
List all brands in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all brands' entails (e.g., archived vs. active). This leaves significant gaps for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero waste—it directly states the tool's purpose without fluff or redundancy. Every word earns its place, making it highly efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter, no output schema, no annotations), the description is minimally adequate but incomplete. It covers the basic purpose but lacks behavioral details (e.g., pagination, scope) and usage guidelines, which are needed for full agent understanding in a context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond implying it lists 'all brands' (which doesn't clarify the cursor's role). Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate with extra context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all brands in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_brand' (singular vs. plural) or 'list_audiences' (different resource type), but the scope is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'get_brand' (for a single brand) or other list tools (e.g., 'list_audiences'). The description implies it's for retrieving multiple brands, but lacks explicit context about prerequisites, filtering, or comparisons to siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_bulk_users (C, Read-only)
List the users in a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| job_id | Yes | The bulk job ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states a read operation ('list'), implying no destructive effects, but doesn't disclose behavioral traits like pagination (hinted by the 'cursor' parameter in schema), rate limits, authentication needs, or return format. This leaves significant gaps for a tool with parameters and no output schema, making it minimally informative.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste, front-loading the core purpose. It's appropriately sized for a simple list tool, though it could benefit from slightly more detail without losing conciseness. Every word earns its place, but it's borderline under-specified rather than optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain return values, pagination behavior, or error conditions, leaving the agent to infer them from the schema alone. For a tool with no structured output and minimal behavioral disclosure, this is inadequate; the description should provide more context to guide effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with parameters 'job_id' and 'cursor' fully documented in the schema. The description adds no meaning beyond the schema—it doesn't explain parameter interactions, format details, or usage examples. Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate or enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'List the users in a bulk job' clearly states the action (list) and resource (users in a bulk job), but it is vague about scope: it doesn't specify whether it lists all users, paginated results, or filtered subsets. It distinguishes itself from siblings like 'get_bulk_job' (which likely returns job metadata) but not explicitly from 'list_audience_members' or other list tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a bulk job ID), exclusions, or comparisons to siblings like 'list_user_tenants' or 'get_user_profile_by_id'. The description implies usage for bulk job contexts but offers no explicit when/when-not rules or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_journeys (A, Read-only)
List journey templates in the workspace. Optionally filter by version (published or draft).
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| version | No | Filter by version state. Defaults to published. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description adds limited behavioral context beyond the read action and optional filtering. It does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences, front-loading the primary purpose with no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (read-only, two optional parameters), the description covers the main purpose and version filter. However, it does not mention pagination behavior, which is relevant.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description mentions version filtering but adds little beyond what the schema provides: it says nothing about the default version or about how to use the cursor.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
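To show how the `version` filter and `cursor` might be passed together, the sketch below builds a JSON-RPC `tools/call` request, the method MCP clients use to invoke a tool. The `published` and `draft` values come from the tool description above; `build_list_journeys_request` and the client-side validation are illustrative assumptions, not part of the server.

```python
import json
from typing import Any, Dict, Optional

ALLOWED_VERSIONS = {"published", "draft"}  # values named in the tool description


def build_list_journeys_request(
    request_id: int,
    version: Optional[str] = None,
    cursor: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call payload for list_journeys."""
    if version is not None and version not in ALLOWED_VERSIONS:
        raise ValueError(f"version must be one of {sorted(ALLOWED_VERSIONS)}")
    arguments: Dict[str, Any] = {}
    if version is not None:
        arguments["version"] = version  # omitted: the schema says it defaults to published
    if cursor is not None:
        arguments["cursor"] = cursor
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "list_journeys", "arguments": arguments},
    }


request = build_list_journeys_request(1, version="draft")
print(json.dumps(request, indent=2))
```

Leaving `version` out of the arguments relies on the documented default rather than restating it, which keeps the request minimal.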
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List journey templates in the workspace' using a specific verb and resource. It is distinct from sibling list tools, though it does not explicitly differentiate itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates optional filtering by version, which provides some usage context. However, it gives no guidance on when not to use this tool, names no alternatives, and omits pagination details.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_lists (C, Read-only)
Get all lists. Optionally filter by pattern (e.g. 'example.list.*').
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| pattern | No | Filter pattern (e.g. 'example.list.*') | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'Get all lists' and optional filtering, but lacks critical behavioral details such as pagination behavior (implied by the 'cursor' parameter in the schema), rate limits, authentication requirements, or whether it's read-only. This leaves significant gaps for an agent to understand how to use it effectively.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—two short sentences that directly state the tool's function and optional feature. It is front-loaded with the core purpose and wastes no words, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters. It fails to address key contextual elements like pagination behavior (implied by 'cursor'), response format, error handling, or usage constraints, leaving the agent with insufficient information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('cursor' for pagination, 'pattern' for filtering). The description adds minimal value by mentioning the 'pattern' parameter with an example, but doesn't provide additional context beyond what's in the schema, such as pattern syntax details or cursor usage scenarios.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
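Because neither the description nor the schema spells out the pattern syntax, the sketch below only approximates it on the client side with Python's `fnmatch`, treating `*` as a glob wildcard. That is an assumption: the server's matching rules may differ (for example, in whether `*` spans more than one dot-separated segment), and the list IDs are invented for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def filter_list_ids(list_ids: Iterable[str], pattern: str) -> List[str]:
    """Approximate the 'pattern' filter locally using glob-style matching.

    Assumption: '*' behaves like a shell glob. This is NOT confirmed server behavior.
    """
    return [list_id for list_id in list_ids if fnmatchcase(list_id, pattern)]


# Hypothetical list IDs.
ids = ["example.list.weekly", "example.list.daily", "other.list.weekly"]
matched = filter_list_ids(ids, "example.list.*")
print(matched)
```

A sketch like this is mainly useful for sanity-checking a pattern before sending it; the authoritative result is whatever the tool returns.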
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'all lists', making the purpose specific and understandable. It distinguishes itself from sibling 'get_list' by indicating it retrieves multiple lists rather than a single one, though it doesn't explicitly contrast itself with other list-related tools like 'list_audiences' or 'list_notifications'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions optional filtering but doesn't specify scenarios where filtering is appropriate or compare it to other list-related tools like 'get_list' for single retrieval or 'list_audiences' for different resource types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_messages (C, Read-only)
List messages you've previously sent. Filter by status, recipient, notification, provider, tags, or tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | Filter by metadata tags | |
| list | No | Filter by list ID | |
| tags | No | Comma-delimited list of tags | |
| event | No | Filter by event ID | |
| cursor | No | Pagination cursor for fetching the next page | |
| status | No | Filter by status (e.g. DELIVERED, UNDELIVERABLE) | |
| traceId | No | Filter by trace ID | |
| archived | No | Include archived messages | |
| provider | No | Filter by provider key (e.g. sendgrid, twilio) | |
| messageId | No | Filter by message ID | |
| recipient | No | Filter by recipient user ID | |
| tenant_id | No | Filter by tenant ID | |
| notification | No | Filter by notification ID | |
| enqueued_after | No | ISO 8601 timestamp; only return messages enqueued after this time | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions filtering but doesn't describe critical behaviors: whether this is a read-only operation, if it supports pagination (though 'cursor' parameter hints at it), rate limits, authentication requirements, or what the return format looks like. For a list operation with 14 parameters, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two efficient sentences that front-load the core purpose. It could be slightly improved by structuring the filtering examples more clearly, but there's no wasted verbiage and it gets straight to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list operation with 14 parameters and no annotations or output schema, the description is incomplete. It doesn't explain the response format, pagination behavior, error conditions, or how multiple filters interact. Given the complexity and lack of structured metadata, more contextual information is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 14 parameters. The description adds minimal value by listing some filterable fields (status, recipient, etc.) but doesn't provide additional context beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
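A small sketch of assembling `list_messages` arguments may help here. It uses a handful of parameter names from the table above and sends only the filters that were actually supplied. How the server combines multiple filters is not documented, so the sketch makes no claim about that; `build_list_messages_args` is a hypothetical helper, and the timestamp check is a client-side convenience rather than a documented requirement.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def build_list_messages_args(
    status: Optional[str] = None,
    recipient: Optional[str] = None,
    provider: Optional[str] = None,
    tenant_id: Optional[str] = None,
    enqueued_after: Optional[str] = None,
    cursor: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect only the filters that were actually supplied."""
    if enqueued_after is not None:
        # Raises ValueError when the timestamp cannot be parsed.
        # The "Z" suffix is rewritten because older Pythons reject it.
        datetime.fromisoformat(enqueued_after.replace("Z", "+00:00"))
    candidates = {
        "status": status,
        "recipient": recipient,
        "provider": provider,
        "tenant_id": tenant_id,
        "enqueued_after": enqueued_after,
        "cursor": cursor,
    }
    return {key: value for key, value in candidates.items() if value is not None}


args = build_list_messages_args(
    status="DELIVERED",
    provider="sendgrid",
    enqueued_after="2024-01-01T00:00:00Z",
)
print(args)
```

Dropping unset filters keeps the request free of null values, whose handling the schema does not describe.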
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List messages you've previously sent') and resource ('messages'), making the purpose immediately understandable. However, it doesn't differentiate this tool from sibling tools like 'get_message' or 'get_message_history', which also retrieve message-related data, so it doesn't achieve full sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions filtering capabilities but provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, limitations, or comparison with sibling tools like 'get_message' (for single messages) or 'list_notifications' (for notifications). This leaves the agent without context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notification_checks (C, Read-only)
List checks for a notification submission.
| Name | Required | Description | Default |
|---|---|---|---|
| submission_id | Yes | The submission ID for the checks resource | |
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, indicating safe read operation. The description adds no additional behavioral traits (e.g., pagination, order, filtering). With annotations covering read-only, the description could have contributed more context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no wasted words, and front-loaded with purpose. Could benefit from slightly more detail but remains efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with 2 parameters and no output schema, the description covers the basic purpose. However, it lacks explanation of what 'checks' are or relationship to notification submissions, which would aid understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters have descriptions). The tool description adds no extra meaning beyond what the schema provides. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
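One plausible reading of "schema description coverage" is the share of input parameters that carry a non-empty description. The sketch below computes that figure for a schema shaped like the table above; the exact metric used by the review is not stated, so the function and the `"string"` types are assumptions for illustration.

```python
from typing import Any, Dict


def schema_description_coverage(input_schema: Dict[str, Any]) -> float:
    """Fraction of properties that carry a non-empty description."""
    properties = input_schema.get("properties") or {}
    if not properties:
        return 1.0  # assumption: nothing to document counts as fully covered
    described = sum(
        1
        for spec in properties.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Schema mirroring the parameter table; the types are assumed.
checks_schema = {
    "type": "object",
    "properties": {
        "submission_id": {"type": "string", "description": "The submission ID for the checks resource"},
        "notification_id": {"type": "string", "description": "The notification template ID"},
    },
    "required": ["submission_id", "notification_id"],
}

coverage = schema_description_coverage(checks_schema)
print(f"{coverage:.0%}")
```

Full coverage by this measure says only that every parameter has some description, not that the descriptions explain intent, which is the gap this dimension is meant to catch.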
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action and resource ('List checks for a notification submission'), using verb+resource format. However, it does not differentiate from sibling tools like update_notification_checks, which is the only closely related tool. A bit more specificity could improve clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives no guidance on when to use this tool versus alternatives, and it does not say at what point checks become available for a submission. It lacks the context an AI agent needs to decide whether the tool applies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notifications (C, Read-only)
List notification templates. Optionally filter by cursor.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions pagination via cursor but doesn't disclose other behavioral traits such as rate limits, authentication needs, return format, or whether it's read-only. For a list tool with zero annotation coverage, this leaves significant gaps in understanding its operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences with zero waste. It front-loads the core purpose and states the optional parameter concisely, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It lacks details on return values, error handling, or other contextual aspects needed for a list operation. While it covers the basic action, it doesn't provide enough information for reliable tool invocation in a complex environment.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'cursor' parameter as a pagination cursor. The description adds minimal value by noting it's optional for filtering, but doesn't provide additional semantics beyond what the schema states, aligning with the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('notification templates'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'list_messages' or 'list_audiences', which also list resources, so it misses full sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions optional filtering by cursor, implying usage for pagination, but provides no guidance on when to use this tool versus alternatives like 'get_notification_content' or other list tools. There are no explicit when/when-not instructions or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notification_versions (B, Read-only)
List version history for a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max versions per page (default 10, max 10) | |
| cursor | No | Pagination cursor from a previous response | |
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already set readOnlyHint=true, so the read-only nature is clear. The description adds no further behavioral context (e.g., pagination, ordering, or result structure). Adequate but minimal beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no waste. Efficient and direct, though it could include a structured format. Still, it is concise and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema and no description of return values (e.g., list of versions, pagination details). Lacks context on what the response contains, ordering, or limit behavior. Incomplete for a list operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage for all 3 parameters. The tool description adds no extra meaning; the schema already explains each parameter's role. Baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list' and resource 'version history for a notification template'. It distinguishes this from sibling tools like list_notifications (which lists notifications) and get_notification (which gets a single notification).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. Does not mention when to use list_notification_versions vs. get_notification or list_notifications. Lacks context on prerequisites or scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_provider_catalog (A, Read-only)
List available provider types from the catalog with their configuration schemas.
| Name | Required | Description | Default |
|---|---|---|---|
| keys | No | Comma-separated provider keys to filter by | |
| name | No | Substring match on provider name | |
| channel | No | Filter by channel type (email, sms, push, etc.) | |
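The three filters above are all optional, and `keys` is a single comma-separated string rather than an array. A minimal sketch of assembling the arguments follows; the helper name `catalog_args` and the provider keys in the usage note are hypothetical and only illustrate the shape.

```python
from typing import Dict, Iterable, Optional


def catalog_args(
    keys: Optional[Iterable[str]] = None,
    name: Optional[str] = None,
    channel: Optional[str] = None,
) -> Dict[str, str]:
    """Build the arguments object for list_provider_catalog.

    Hypothetical helper: only filters that were actually supplied are
    included, and `keys` is joined into one comma-separated string, as the
    parameter table describes.
    """
    args: Dict[str, str] = {}
    if keys:
        args["keys"] = ",".join(keys)
    if name:
        args["name"] = name
    if channel:
        args["channel"] = channel
    return args
```

For example, `catalog_args(keys=["sendgrid", "twilio"], channel="email")` produces `{"keys": "sendgrid,twilio", "channel": "email"}`. Whether those particular keys exist in the catalog is an assumption.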
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. Description adds that it returns configuration schemas but no further behavioral details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no redundant words; front-loaded with action and resource.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, and description does not elaborate on return shape or pagination; adequate but could be more complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, so description adds no extra meaning beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'list' and resource 'provider types from catalog', clearly distinguishing from siblings like 'list_providers' and 'get_provider'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies listing catalog types vs. instances, but no explicit guidance on when to use this tool over siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_providers (A, Read-only)
List configured provider integrations for the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. Description adds 'configured' context but does not disclose pagination behavior or other traits beyond the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 7 words, efficiently conveying purpose without unnecessary detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple list tool with one optional parameter. No output schema but return format is implied. Could mention pagination explicitly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with parameter 'cursor' already described. Tool description adds no additional parameter meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'List' and resource 'configured provider integrations', distinguishing it from sibling tools like 'create_provider' and 'list_provider_catalog'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives such as 'get_provider' or 'list_provider_catalog'. Lacks explicit context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_routing_strategies (A, Read-only)
List routing strategies in the workspace. Returns metadata only; use get for full details.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page (default 20, max 100) | |
| cursor | No | Pagination cursor | |
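The `limit` and `cursor` parameters suggest ordinary cursor pagination. The listing does not document the response shape, so the sketch below assumes each page is a dict with a `results` list and a `cursor` field that is empty on the last page. Both field names, and the `call_tool` callable, are assumptions rather than the server's documented contract.

```python
from typing import Any, Callable, Dict, Iterator

# Assumed page shape: {"results": [...], "cursor": "<next cursor>" or None}
Page = Dict[str, Any]


def iter_items(
    call_tool: Callable[[str, Dict[str, Any]], Page],
    tool: str,
    limit: int = 20,
    max_limit: int = 100,
) -> Iterator[Any]:
    """Yield every item from a cursor-paginated list tool.

    `call_tool` is a hypothetical stand-in for whatever MCP client function
    invokes a tool by name and returns its parsed result.
    """
    # Clamp to the documented maximum page size.
    args: Dict[str, Any] = {"limit": min(limit, max_limit)}
    while True:
        page = call_tool(tool, dict(args))
        yield from page.get("results", [])
        cursor = page.get("cursor")
        if not cursor:
            return  # no further pages
        args["cursor"] = cursor
```

Passing the tool name as a parameter lets the same loop serve the other `limit`/`cursor` tools on this page, provided they share the assumed response shape.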
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. Description adds that it returns metadata only, which is useful beyond annotations. No mention of auth or rate limits, but acceptable for a list operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words, front-loaded with purpose and immediate guidance. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple two-parameter schema and read-only annotation, the description covers all necessary information: scope, what is returned, and pointer to more details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. Description adds no extra parameter information beyond what's in the schema, meeting the baseline without improvement.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it lists routing strategies in the workspace, with a specific verb and resource. Distinguishes from get_routing_strategy by noting it returns only metadata.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context: listing metadata only, and explicitly directs to use get for full details. No when-not-to-use statement, but adequate for differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_routing_strategy_notifications (A, Read-only)
List notification templates associated with a routing strategy. Useful for checking linked templates before archiving.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page (default 20, max 100) | |
| cursor | No | Pagination cursor | |
| routing_strategy_id | Yes | The routing strategy ID (rs_ prefix) | |
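The "before archiving" use case can be sketched as a pre-check that asks for a single linked template. The `call_tool` callable and the `results` field in the response are assumptions, since the listing documents neither a client API nor an output schema.

```python
from typing import Any, Callable, Dict


def has_linked_templates(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    routing_strategy_id: str,
) -> bool:
    """Return True if at least one notification template uses the strategy.

    `call_tool` is a hypothetical MCP client function; the `results` key is
    an assumed response field.
    """
    if not routing_strategy_id.startswith("rs_"):
        raise ValueError("routing strategy IDs are expected to start with 'rs_'")
    # One result is enough to know the strategy is still in use.
    page = call_tool(
        "list_routing_strategy_notifications",
        {"routing_strategy_id": routing_strategy_id, "limit": 1},
    )
    return bool(page.get("results"))
```

An agent could run this check first and archive only when it returns `False`.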
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint; the description adds minimal behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Short, direct, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Sufficient for a straightforward list operation with well-documented schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers parameter descriptions fully; the description adds no additional parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Describes listing notification templates for a routing strategy, which clearly distinguishes it from sibling tools like list_routing_strategies.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a use case (before archiving) but no explicit exclusions or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tenants (B, Read-only)
List all tenants in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states 'List all tenants' but doesn't disclose behavioral traits such as pagination behavior (implied by 'limit' and 'cursor' parameters), authentication requirements, rate limits, or what 'all' means in practice (e.g., includes archived tenants?). This leaves significant gaps for a tool with pagination parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's appropriately sized and front-loaded, clearly stating the tool's purpose without unnecessary elaboration, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (list operation with pagination), no annotations, and no output schema, the description is minimally adequate. It states what the tool does but lacks details on behavioral context, output format, or error handling, leaving the agent with incomplete information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('limit' and 'cursor') well-documented in the schema. The description adds no additional parameter semantics beyond what the schema provides, so it meets the baseline of 3 where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('all tenants in the workspace'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'get_tenant' (singular retrieval) or 'list_user_tenants' (user-specific listing), which would require explicit sibling differentiation for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or compare to siblings like 'get_tenant' for single tenant retrieval or 'list_user_tenants' for user-specific listings, leaving the agent without contextual usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tenant_templates (B, Read-only)
List notification templates configured for a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page (default 20, max 100) | |
| cursor | No | Pagination cursor | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, consistent with the description. The description adds no further behavioral traits (e.g., pagination behavior, performance characteristics).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no redundant information. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description could mention what the response contains (e.g., list of template objects or IDs). However, for a simple list tool with well-defined parameters, the description is minimally adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage for all 3 parameters. The description does not add additional meaning beyond what the schema already provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly identifies the action (list), resource (notification templates), and scope (for a tenant). That scope implicitly separates it from sibling tools like get_tenant_template and publish_tenant_template, though the description never names them or states the difference.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like get_tenant_template or list_messages. No exclusion criteria or context for when this tool is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tenant_users (B, Read-only)
List users associated with a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page (default 20, max 100) | |
| cursor | No | Pagination cursor | |
| tenant_id | Yes | The tenant ID | |
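For orientation, this is roughly what an invocation looks like on the wire. MCP tool calls are JSON-RPC 2.0 requests with method `tools/call` and a `params` object carrying `name` and `arguments`. The helper names, the example tenant ID, and the lower bound of 1 on `limit` are assumptions; only the upper bound of 100 comes from the table above.

```python
from typing import Any, Dict, Optional


def list_tenant_users_args(
    tenant_id: str, limit: Optional[int] = None, cursor: Optional[str] = None
) -> Dict[str, Any]:
    """Validate and build arguments for list_tenant_users (hypothetical helper)."""
    if not tenant_id:
        raise ValueError("tenant_id is required")
    args: Dict[str, Any] = {"tenant_id": tenant_id}
    if limit is not None:
        # Max of 100 is documented; the minimum of 1 is an assumption.
        if not 1 <= limit <= 100:
            raise ValueError("limit must be between 1 and 100")
        args["limit"] = limit
    if cursor is not None:
        args["cursor"] = cursor
    return args


def tools_call_request(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap tool arguments in an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "tenant_abc" is a made-up ID for illustration.
request = tools_call_request(1, "list_tenant_users", list_tenant_users_args("tenant_abc", limit=50))
```

Omitting `limit` leaves the server default (20, per the table) in effect.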
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description's verb 'List' is consistent with the readOnlyHint annotation, but the description adds no further behavioral details such as pagination behavior, rate limits, or the scope of users returned. The annotation already conveys safety, so the description adds minimal value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no redundant information. It is perfectly sized for its simplicity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with good annotations and full schema coverage, the description is minimally adequate. However, it could mention pagination or provide a hint about the output format. Given the context, a score of 3 is fair.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for all three parameters (tenant_id, limit, cursor). The description adds no additional meaning beyond the schema, so a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'List users' and the resource 'associated with a tenant'. It directly conveys the tool's function, though it does not explicitly differentiate from sibling tools like list_user_tenants or list_audience_members.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as get_tenant for retrieving tenant details, add_user_to_tenant for adding users, or list_tenants for listing all tenants.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_push_tokens (C, Read-only)
List all push/device tokens for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool lists tokens but fails to mention whether this is a read-only operation, if it requires specific permissions, what the output format looks like (e.g., pagination, token details), or any rate limits. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete for a tool that likely returns sensitive data (push tokens). It doesn't cover behavioral aspects like security implications, response structure, or error handling, leaving the agent with insufficient context to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'user_id' parameter clearly documented. The description adds no additional meaning beyond what the schema provides, such as clarifying the token scope or user context. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('push/device tokens for a user'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_user_push_token' (singular vs. plural), which would require a more specific distinction to achieve a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_user_push_token' for retrieving a single token or other user-related tools. It lacks context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_tenants (C, Read-only)
List all tenants a user belongs to.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states it's a list operation but doesn't mention pagination behavior (though schema hints at it via limit/cursor), authentication requirements, rate limits, error conditions, or what the output looks like. For a tool with 3 parameters and no annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without any fluff. It's appropriately sized for a simple list operation and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters, no annotations, and no output schema, the description is insufficient. It doesn't explain the return format, pagination strategy, error handling, or relationship to sibling tools. The agent lacks critical context to use this tool effectively beyond basic parameter passing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description adds no additional parameter semantics beyond implying the user_id is for filtering tenants. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('tenants a user belongs to'), making the purpose immediately understandable. It doesn't specifically differentiate from sibling tools like 'list_tenants' (which appears to list all tenants rather than user-specific ones), but the user-specific focus is implied through the parameter requirement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_tenant'. It doesn't mention prerequisites, context for user_id selection, or any exclusions. The agent must infer usage from the parameter requirement alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
patch_profile (A)
Partially update a user profile via JSON Patch (RFC 6902). Use add/replace/remove operations on specific profile paths.
| Name | Required | Description | Default |
|---|---|---|---|
| patch | Yes | Array of JSON Patch operations to apply to the profile | |
| user_id | Yes | The user ID | |
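To make the `patch` parameter concrete: each RFC 6902 operation is an object with `op`, `path` (a JSON Pointer), and, for `add` and `replace`, a `value`. The sketch below applies the three operations named in the description to a local dict so their effect is visible. It illustrates the standard's semantics for object members only and is not the server's implementation; the profile fields and `user_123` are made up.

```python
import copy
from typing import Any, Dict, List


def _tokens(pointer: str) -> List[str]:
    """Split a JSON Pointer (RFC 6901) into unescaped reference tokens."""
    if not pointer.startswith("/"):
        raise ValueError(f"unsupported pointer: {pointer!r}")
    # Per RFC 6901, unescape "~1" to "/" first, then "~0" to "~".
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def apply_patch(doc: Dict[str, Any], ops: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply add/replace/remove operations to nested dicts and return a new dict.

    Works on a deep copy, so a failing operation leaves `doc` untouched.
    Arrays and the other RFC 6902 operations are deliberately not handled.
    """
    result = copy.deepcopy(doc)
    for op in ops:
        *parents, key = _tokens(op["path"])
        target = result
        for token in parents:
            target = target[token]  # KeyError if a parent is missing
        kind = op["op"]
        if kind == "add":
            target[key] = op["value"]  # creates or overwrites the member
        elif kind == "replace":
            if key not in target:
                raise KeyError(key)  # replace requires an existing member
            target[key] = op["value"]
        elif kind == "remove":
            del target[key]  # KeyError if the member is missing
        else:
            raise ValueError(f"unsupported op: {kind!r}")
    return result


# Made-up profile and user ID, for illustration only.
profile = {"email": "old@example.com", "phone_number": "+15550100"}
patch = [
    {"op": "replace", "path": "/email", "value": "new@example.com"},
    {"op": "add", "path": "/locale", "value": "en-US"},
    {"op": "remove", "path": "/phone_number"},
]
arguments = {"user_id": "user_123", "patch": patch}
patched = apply_patch(profile, patch)
```

Here `patched` is `{"email": "new@example.com", "locale": "en-US"}`, and `arguments` is the object that would be passed to patch_profile. Whether the server resolves paths relative to the profile root in exactly this way is an assumption.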
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false and idempotentHint=false, so the agent knows it is a mutating, non-idempotent operation. The description adds that it follows the JSON Patch standard (RFC 6902), which implies atomic application and defined error handling, but gives no further behavioral details (e.g., auth, side effects).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no redundant information. Front-loaded with action verb and resource, followed by protocol detail. Every word adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With low complexity (2 required params, no output schema), the description is largely complete. It could mention what happens on invalid patch or missing user, but 4 is appropriate given the clear intent and annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already describes both parameters (user_id, patch) with types and descriptions. The description mentions operations and paths, providing context but not adding new meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it partially updates a user profile using JSON Patch (RFC 6902), specifying verb 'partially update' and resource 'user profile'. It distinguishes from sibling tools like replace_profile (full replacement) and delete_profile (deletion), making purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to use add/replace/remove operations on specific profile paths, implying partial updates. It does not compare the tool to alternatives like replace_profile or state when not to use it, but the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
patch_user_token (grade A)
Apply a JSON Patch (RFC 6902) to a specific push token.
| Name | Required | Description | Default |
|---|---|---|---|
| patch | Yes | Array of JSON Patch operations | |
| token | Yes | The token identifier | |
| user_id | Yes | The user ID | |
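For orientation, here is a minimal sketch of how the three parameters above might be wrapped in an MCP `tools/call` JSON-RPC request. The helper name, the token value, and the `/status` path are assumptions made for illustration; the listing does not document which token fields can be patched.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Hypothetical values: the token and the patched field are placeholders.
request = tools_call_request(1, "patch_user_token", {
    "user_id": "user_123",
    "token": "example-device-token",
    "patch": [{"op": "replace", "path": "/status", "value": "revoked"}],
})
```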
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=false and idempotentHint=false, but the description adds no further behavioral context (e.g., atomicity, partial failure, version tracking). For a mutation tool, more disclosure is expected.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that is direct and front-loaded, with no superfluous information. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 3 required parameters and no output schema. While the description covers the core action, it does not mention return values or side effects (e.g., whether the token is updated in place). Adequate but not comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all parameters (patch, token, user_id). The description adds no additional meaning beyond 'specific push token', so a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (apply a JSON Patch), the standard (RFC 6902), and the target (specific push token). This precisely defines the tool's purpose and distinguishes it from sibling tools like 'delete_user_token' or 'create_or_replace_user_push_token'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for applying a patch to a push token but does not say when to use this tool rather than alternatives such as 'create_or_replace_user_push_token' or 'bulk_add_user_tokens'. It gives no explicit guidance on context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
publish_notification (grade A)
Publish a notification template. Optionally publish a specific historical version instead of the current draft.
| Name | Required | Description | Default |
|---|---|---|---|
| version | No | Historical version to publish (e.g. v001); omit to publish current draft | |
| notification_id | Yes | The notification template ID to publish | |
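The optional `version` parameter can be sketched as follows: leave the key out entirely to publish the current draft, and include it only when publishing a historical version. The helper name and the `nt_example` ID are hypothetical.

```python
from typing import Optional


def publish_notification_args(notification_id: str,
                              version: Optional[str] = None) -> dict:
    """Build publish_notification arguments, omitting `version` for the draft."""
    args = {"notification_id": notification_id}
    if version is not None:
        args["version"] = version  # e.g. "v001", per the parameter table
    return args


draft_args = publish_notification_args("nt_example")
rollback_args = publish_notification_args("nt_example", version="v001")
```

Omitting the key rather than sending a null value follows the table's wording ("omit to publish current draft"); whether the server also accepts an explicit null is not stated here.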
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutation (readOnlyHint=false) and non-idempotency. Description adds context about publishing a template and an optional historical version, confirming it's a write operation. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, no fluff. Front-loaded with the primary action. Every word adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with two parameters and no output schema, the description covers the core functionality and optional versioning. Lacks mention of return values or error conditions, but adequate for the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with parameter descriptions. The tool description adds minimal extra meaning ('Optionally publish a specific historical version') beyond what the schema already provides, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'Publish' and the resource 'notification template'. It distinguishes from sibling 'publish_tenant_template' by specifying the resource type. It also mentions the optional historical version feature, adding precision.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'publish_tenant_template' or 'send_message'. The publishing context is implied, but the description gives no when-not-to-use guidance or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
publish_tenant_template (grade C)
Publish a version of a tenant notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| version | No | Version to publish (e.g. v1, latest); defaults to latest if omitted | |
| tenant_id | Yes | The tenant ID | |
| template_id | Yes | The template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations indicate a write operation (readOnlyHint: false), and the description implies mutation. However, it does not disclose side effects like whether it overwrites existing published versions, requires the version to exist, or any constraints. More context is needed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no wasted words. It is front-loaded with the key action. However, it omits necessary details, so it leans more toward under-specification than true conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and the presence of sibling tools, the description is insufficient. It does not clarify the return value, confirmation, or how publishing affects the template lifecycle. More completeness is needed for a mutation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds nothing beyond the schema's parameter names and descriptions. The 'version' parameter and its default of 'latest' are important but not highlighted in the description. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the verb 'Publish' and the resource 'version of a tenant notification template', clearly indicating the action and object. However, it does not differentiate this tool from siblings like 'replace_tenant_template' or 'publish_notification'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as 'replace_tenant_template' or 'list_tenant_templates'. There is no mention of context, prerequisites, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
put_notification_content (grade A, idempotent)
Replace the elemental content of a V2 notification template. Overwrites all elements.
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | Template state after update | |
| version | No | Content version string | |
| elements | Yes | Array of elemental content nodes | |
| notification_id | Yes | The notification template ID (nt_ prefix) | |
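Because this tool overwrites all elements, adding one element means resending the full list. The sketch below shows that read-modify-write pattern; the element shapes, the helper name, and the `nt_example` ID are assumptions, and the current elements would in practice come from a prior read such as get_notification_content.

```python
def content_args_with_appended(notification_id: str,
                               current_elements: list[dict],
                               new_element: dict) -> dict:
    """Build put_notification_content arguments that keep existing elements."""
    return {
        "notification_id": notification_id,
        # Send every element to keep, plus the new one; anything left out is lost.
        "elements": [*current_elements, new_element],
    }


# Hypothetical element shapes for illustration only.
current = [
    {"type": "meta", "title": "Welcome"},
    {"type": "text", "content": "Hello"},
]
args = content_args_with_appended(
    "nt_example",
    current,
    {"type": "action", "content": "Open", "href": "https://example.com"},
)
```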
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate idempotentHint=true and readOnlyHint=false, which are consistent with the description's mutation claim. The description adds context that it replaces 'V2 notification template' content and 'overwrites all elements', which is informative. No contradictions. It could mention side effects (e.g., on published state) but overall sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, front-loaded with the purpose. Every word earns its place; no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and moderate complexity, the description adequately explains the operation. It doesn't detail return values or errors, but with sibling tools like 'get_notification_content', it is reasonably complete for selection and invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented. The description reinforces that 'elements' are overwritten and mentions the template type, but does not add significant meaning beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it replaces the elemental content of a V2 notification template and overwrites all elements. It uses specific verbs ('replace', 'overwrites') and resource ('elemental content of a V2 notification template'), and distinguishes from siblings like 'put_notification_element' which likely modifies individual elements.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for full replacement of elemental content, contrasting with partial updates. However, it does not explicitly state when to use this tool versus alternatives like 'put_notification_element' or 'get_notification_content', nor does it mention any exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
put_notification_element (grade A, idempotent)
Update a single element within a V2 notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| if | No | Conditional expression for element visibility | |
| ref | No | Reference identifier | |
| data | No | Element data payload | |
| loop | No | Loop expression for repeating elements | |
| type | Yes | Element type (e.g. text, action, image, divider, meta) | |
| state | No | Template state after update | |
| channels | No | Channels this element applies to | |
| element_id | Yes | The element ID to update | |
| notification_id | Yes | The notification template ID (nt_ prefix) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare idempotentHint=true and readOnlyHint=false, so the description adds no additional behavioral context beyond stating the update action. No mention of side effects, required permissions, or error states.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence with no unnecessary words. It is front-loaded with the action and resource, making it easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (9 parameters, nested objects, no output schema), the description is too brief. It does not explain what the function returns, error conditions, or how to use the element-specific parameters. The lack of output schema amplifies the need for more detail.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the baseline is 3. The description does not add any meaning beyond what the schema already provides, such as clarifying the purpose of specific parameters like 'if' or 'loop'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Update'), the resource ('a single element within a V2 notification template'), and distinguishes it from sibling tools like put_notification_content and put_notification_locale, which handle different parts of a notification template.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for updating an element within a template, but it does not explicitly state when to use this tool versus other update tools (e.g., put_notification_content) or provide alternatives, leaving the agent to infer the context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
put_notification_locale (grade A, idempotent)
Set locale-specific content overrides for a V2 notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | Template state after update | |
| elements | Yes | Array of element overrides with id and locale-specific content | |
| locale_id | Yes | Locale identifier (e.g. es, fr, pt-BR) | |
| notification_id | Yes | The notification template ID (nt_ prefix) | |
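A possible way to assemble the `elements` overrides from a table of translations is sketched below. The table only says each override carries an id and locale-specific content, so the `content` key, the element IDs, and the helper name are assumptions.

```python
def locale_args(notification_id: str, locale_id: str,
                translations: dict[str, str]) -> dict:
    """Build put_notification_locale arguments from element_id -> text pairs."""
    return {
        "notification_id": notification_id,
        "locale_id": locale_id,
        # One override per element; the "content" key name is an assumption.
        "elements": [
            {"id": element_id, "content": text}
            for element_id, text in translations.items()
        ],
    }


spanish = locale_args("nt_example", "es", {
    "greeting": "Hola",
    "cta": "Abrir",
})
```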
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate idempotentHint=true but the description does not mention idempotency or side effects like overwriting vs merging. It correctly states it's a write operation (set overrides). No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is concise and front-loaded. Every word earns its place with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple mutation tool with 4 parameters and no output schema, the description is nearly complete. It could mention that the operation is idempotent or what the response looks like, but neither is strictly necessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are already documented. The description adds no additional meaning or context for parameters like state or elements.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states a specific verb ('Set') and resource ('locale-specific content overrides for a V2 notification template'). It distinguishes from sibling tools like put_notification_content which handle general content, not locale-specific overrides.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like put_notification_content or put_notification_element. No context about prerequisites or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_all_user_tenants (grade A, destructive, idempotent)
Remove a user from all tenants.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true, indicating mutation. The description adds context by specifying 'from all tenants', which implies broad impact. However, it does not disclose side effects like loss of access. With annotations present, this is adequate but not enhanced.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, front-loaded with action and scope, with no unnecessary words. It earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 1-parameter tool with annotations and no output schema, the description is sufficient. It explains the operation and the broad effect. Could mention return value or confirmation, but not required.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of the single parameter 'user_id' with a description. The description adds no additional meaning beyond the schema, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Remove a user from all tenants' clearly states the action (remove), the resource (user), and the scope (all tenants), distinguishing it from siblings like 'remove_user_from_tenant' which operates on a single tenant.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool vs alternatives, such as 'remove_user_from_tenant'. It does not mention prerequisites or context for its use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_user_from_tenant (grade C, destructive, idempotent)
Remove a user from a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
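Since this tool and remove_all_user_tenants differ only in scope, a caller could pick between them based on whether a tenant ID is known. The sketch below illustrates that choice; the helper name and IDs are hypothetical, and the tool names and parameters are taken from this listing.

```python
from typing import Optional


def removal_call(user_id: str,
                 tenant_id: Optional[str] = None) -> tuple[str, dict]:
    """Return (tool_name, arguments) for the narrowest removal that fits."""
    if tenant_id is not None:
        # Single-tenant removal: leaves the user's other memberships intact.
        return "remove_user_from_tenant", {
            "user_id": user_id,
            "tenant_id": tenant_id,
        }
    # No tenant given: the broad tool removes every membership.
    return "remove_all_user_tenants", {"user_id": user_id}


single = removal_call("user_123", "tenant_abc")
everything = removal_call("user_123")
```

Requiring the caller to leave out `tenant_id` deliberately before the broad tool is selected is one way to reduce accidental use of the destructive, all-tenants operation.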
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. While 'Remove' implies a destructive mutation, the description doesn't specify whether this requires admin permissions, whether the action is reversible, what happens to user data, or if there are rate limits. This leaves significant behavioral gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's appropriately sized for a simple tool with two parameters and gets straight to the point without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is insufficiently complete. It doesn't address important contextual aspects like permissions required, consequences of removal, error conditions, or what the tool returns. The description should provide more behavioral context given the tool's complexity and lack of structured metadata.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters clearly documented in the schema. The description doesn't add any additional semantic context about the parameters beyond what's already in the schema (e.g., format requirements, relationship between user_id and tenant_id). The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove') and target ('a user from a tenant'), providing a specific verb+resource combination. However, it doesn't distinguish this tool from sibling tools like 'delete_user' or 'delete_tenant', which might handle similar user/tenant removal operations differently.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no explicit instructions about prerequisites (e.g., user must exist in tenant), when-not-to-use scenarios, or references to sibling tools like 'delete_user' or 'delete_tenant' that might handle related operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_notification (grade A, idempotent)
Replace a notification template entirely (full document PUT).
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | Template state after update (defaults to DRAFT) | |
| notification | Yes | Full notification template payload | |
| notification_id | Yes | The notification template ID to replace | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate write and idempotent operations. Description adds 'replace entirely', which aligns but does not go beyond annotations. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action, no unnecessary words. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 parameters, nested object, and no output schema, the description is adequate. It could optionally mention that the full object must be provided or note default state, but the schema covers required fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds no additional meaning to parameters beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Replace a notification template entirely (full document PUT)' using specific verb and resource, distinguishing it from siblings like create_notification or archive_notification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., update_notification or put_notification_content). The description does not say when not to use it or provide context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_profile (grade A, destructive, idempotent)
Fully replace a user profile (PUT). All existing data is overwritten; include every field you want to keep.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | Yes | Complete profile data to replace with | |
| user_id | Yes | The user ID | |
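The instruction to "include every field you want to keep" can be illustrated with a small merge step performed before the call. This is a sketch under assumptions: the existing profile would come from a prior read, the field names are hypothetical, and the merge is shallow, so nested objects in `changes` replace their counterparts wholesale.

```python
def full_profile_for_replace(existing: dict, changes: dict) -> dict:
    """Merge changes over the existing profile so no kept field is dropped."""
    return {**existing, **changes}  # shallow merge; later keys win


existing_profile = {"email": "old@example.com", "locale": "en-US"}
replace_args = {
    "user_id": "user_123",
    "profile": full_profile_for_replace(
        existing_profile, {"email": "new@example.com"}
    ),
}
```

Sending only `{"email": "new@example.com"}` as the profile would, per the description above, overwrite the stored data and drop `locale`.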
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: it's a destructive PUT operation that overwrites all existing data, and it requires including every field to retain. It lacks details on permissions, rate limits, or error handling, but covers the core mutation behavior adequately.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, consisting of two clear sentences with zero waste. The first sentence states the core action, and the second provides critical behavioral guidance, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is moderately complete. It covers the overwrite behavior and parameter expectations but lacks details on permissions, error responses, and return values. Given the complexity, it should do more to compensate for missing structured data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value beyond the schema by implying the 'profile' parameter must be complete, but does not provide additional syntax or format details. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('replace') and resource ('user profile'), and distinguishes it from siblings like 'create_or_merge_user' or 'delete_profile' by emphasizing full overwrite rather than partial updates or deletions. It explicitly mentions 'all existing data is overwritten', which differentiates it from update operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: when fully replacing a user profile and including all fields to keep. However, it does not explicitly state when not to use it (e.g., vs. partial updates or creation) or name alternatives like 'create_or_merge_user', leaving some ambiguity in sibling differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_routing_strategy (A, Idempotent)
Replace a routing strategy. Full document replacement; missing optional fields are cleared.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Human-readable name | |
| tags | No | Tags. Omit to clear. | |
| routing | Yes | Routing tree | |
| channels | No | Per-channel delivery configuration. Omit to clear. | |
| providers | No | Per-provider delivery configuration. Omit to clear. | |
| description | No | Description. Omit to clear. | |
| routing_strategy_id | Yes | The routing strategy ID | |
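Since omitted optional fields are cleared, one way to avoid accidental loss is to start from the current strategy and carry its optional fields over unless they are deliberately removed. The sketch below assumes the current strategy is available as a dict; the helper name and the routing tree shape are placeholders, not taken from the tool schema:

```python
OPTIONAL_FIELDS = ("tags", "channels", "providers", "description")


def build_replace_routing_strategy_args(strategy_id, current, updates):
    """Carry over optional fields so a full replacement does not clear them.

    Setting a field to None in `updates` drops it from the arguments,
    which (per the tool description) clears it on the server.
    """
    merged = {**current, **updates}
    args = {
        "routing_strategy_id": strategy_id,
        "name": merged["name"],
        "routing": merged["routing"],
    }
    for field in OPTIONAL_FIELDS:
        # Only include fields that have a value; leaving one out clears it.
        if merged.get(field) is not None:
            args[field] = merged[field]
    return args


current = {
    "name": "Default",
    "routing": {"method": "single", "channels": ["email"]},  # placeholder shape
    "tags": ["prod"],
    "description": "old text",
}
args = build_replace_routing_strategy_args("rs_1", current, {"description": None})
```

Here `tags` survives the replacement while `description` is intentionally left out.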
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide idempotentHint (true) but no destructive hint. The description explicitly adds that missing optional fields are cleared, which is a key behavioral trait beyond what annotations convey. It does not, however, mention return behavior or permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first states purpose, second clarifies key behavioral detail. Every word adds value; no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters including a nested object and no output schema, the description is short but covers the replacement essence. It is adequate but could mention success response (e.g., returns updated strategy) or error conditions for missing IDs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, so each parameter is already documented. The description adds overarching semantics: omitting optional fields clears them. This adds meaning beyond individual field descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Replace a routing strategy' with a specific verb and resource, and adds 'Full document replacement; missing optional fields are cleared.' This distinguishes it from create_routing_strategy (which creates new) and get/archive siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The tool name and description imply it should be used to update an existing routing strategy, as there is no separate 'update' tool among siblings. However, no explicit when-to-use or when-not-to-use guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_tenant_template (A, Idempotent)
Create or replace a tenant notification template (draft unless published is true).
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Optional title merged into template content when provided | |
| content | No | Elemental content object (e.g. elements and version per Courier Elemental schema) | |
| routing | No | Message routing configuration | |
| channels | No | Channel-specific delivery configuration | |
| providers | No | Provider-specific routing configuration | |
| published | No | When true, publish immediately after save | |
| tenant_id | Yes | The tenant ID | |
| template_id | Yes | The template ID | |
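The draft-versus-published behavior hinges on the `published` flag. A small sketch of building both variants; the helper name is hypothetical, and the content object is only an illustrative guess at an Elemental document, so check it against the Courier Elemental schema before use:

```python
def build_replace_tenant_template_args(tenant_id, template_id, content, publish=False):
    """Build replace_tenant_template arguments; saves a draft unless publish=True."""
    args = {"tenant_id": tenant_id, "template_id": template_id, "content": content}
    if publish:
        # Only set the flag when publishing; omitting it leaves the save as a draft.
        args["published"] = True
    return args


# Illustrative content only; the exact Elemental shape is an assumption.
content = {"version": "2022-01-01", "elements": [{"type": "text", "content": "Hello"}]}

draft_args = build_replace_tenant_template_args("tenant_1", "welcome", content)
live_args = build_replace_tenant_template_args("tenant_1", "welcome", content, publish=True)
```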
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds the draft/publish nuance beyond annotations (readOnlyHint=false, idempotentHint=true). It explains that the template is created as a draft unless published is true. However, it does not disclose other behavioral details like permissions, error handling, or the effect of multiple calls (though idempotency is annotated).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence of 12 words. It conveys the core purpose and the key condition without any filler or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 8 parameters (with nested objects not detailed in schema), the description is reasonably complete. It covers the essential behavior (create/replace, draft/publish) and is supplemented by annotations (idempotency) and schema (100% coverage). Lacks return value details but no output schema exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with individual parameter descriptions. The description adds context that the template is tenant-scoped and can be draft/published, but this does not significantly enhance understanding beyond the schema's parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates or replaces a tenant notification template, with a specific condition ('draft unless published is true'). It distinguishes from sibling tools like 'get_tenant_template' and 'publish_tenant_template' by combining creation/replacement and draft/publish behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus similar tools such as 'replace_notification' or 'publish_tenant_template'. It implicitly suggests using the 'published' parameter to toggle publishing, but lacks explicit usage context or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
restore_list (A, Idempotent)
Restore a previously deleted list.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations mark this as an idempotent mutation. The description is consistent with them but adds no context beyond "restore": it does not say what happens if the list doesn't exist, or whether there is a time limit on recoverability.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single concise sentence with no wasted words. Front-loaded and efficiently communicates the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description is largely adequate. However, it lacks details on the exact effects of restoration (e.g., list visibility, subscriber state).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter (list_id) with schema coverage 100%. The description adds no extra meaning beyond 'The list ID', so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Restore a previously deleted list' clearly states the action (restore) and the resource (list), distinguishing it from sibling tools like delete_list and create_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use restore_list vs alternatives. Implied usage is for restoring deleted lists, but no prerequisites or conditions (e.g., list must be recoverable) are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_bulk_job (C)
Run a bulk job, triggering delivery to all added users.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID to run | |
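For reference, an MCP client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for `run_bulk_job`; the helper name, request id, and job ID value are made up for illustration, and an MCP client library would normally construct this envelope for you:

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Wrap tool arguments in a JSON-RPC 2.0 tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "job_abc" is a placeholder; a real ID would come from an earlier bulk job call.
request = tools_call_request(1, "run_bulk_job", {"job_id": "job_abc"})
payload = json.dumps(request)
```

Because running the job triggers delivery, it is worth confirming the job ID and its added users before sending this request.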
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'triggering delivery' which implies a write/mutation operation, but doesn't disclose critical behavioral traits such as permissions required, whether the job runs asynchronously, rate limits, error handling, or what 'delivery' entails. This is a significant gap for a tool that likely modifies system state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and effect, making it easy to parse. Every word earns its place without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (likely a mutation operation with delivery implications), lack of annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects, return values, error conditions, or integration context with siblings like 'add_bulk_users'. For a tool that triggers deliveries, more detail is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'job_id' documented in the schema. The description adds no additional meaning about the parameter beyond what the schema provides (e.g., format examples, source of job_id, or validation rules). Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Run') and resource ('bulk job'), and specifies the action's effect ('triggering delivery to all added users'). It distinguishes from sibling tools like 'create_bulk_job' or 'get_bulk_job' by focusing on execution rather than creation or retrieval. However, it doesn't explicitly differentiate from other execution-related tools like 'invoke_ad_hoc_automation'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a created bulk job with added users), exclusions (e.g., not for testing), or comparisons to siblings like 'send_message_to_list' for similar delivery purposes. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message (B)
Send a message to a user using inline title and body content (no template). Optionally specify routing channels.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include with the message | |
| title | Yes | Message title | |
| method | No | Routing method: deliver to all channels or stop after first success | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through (e.g. email, sms, push). Omit to use default routing. | |
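A minimal sketch of assembling `send_message` arguments from the table above. The helper name is hypothetical; `method` is deliberately left out so the documented default (`all`) applies, since the accepted value for the "stop after first success" mode is not shown here:

```python
def build_send_message_args(user_id, title, body, channels=None, data=None):
    """Build send_message arguments; optional fields are included only when set."""
    if not title or not body:
        raise ValueError("title and body are both required")
    args = {"user_id": user_id, "title": title, "body": body}
    if channels:
        # Omitting channels falls back to default routing, per the table.
        args["channels"] = list(channels)
    if data:
        args["data"] = dict(data)
    return args


args = build_send_message_args(
    "user_123", "Order shipped", "Your order is on its way.", channels=["email"]
)
```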
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the tool sends messages and optionally specifies routing channels, but lacks critical details: it doesn't disclose whether this is a mutating operation (likely yes), what permissions are needed, rate limits, error conditions, or what happens on success/failure. The description is insufficient for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and includes optional functionality. Every word earns its place with zero redundancy or fluff, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool (sending messages) with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns, error handling, side effects, or security requirements. For a 6-parameter tool that likely modifies system state, more contextual information is needed to use it safely and effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal value beyond the schema: it mentions 'inline title and body content' (implied by required parameters) and 'routing channels' (maps to the 'channels' parameter). No additional syntax, format, or behavioral context is provided for parameters beyond what the schema offers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and resource ('to a user'), specifying it uses 'inline title and body content (no template)'. It distinguishes from sibling tools like 'send_message_template' by mentioning 'no template', but doesn't explicitly differentiate from other messaging tools like 'send_message_to_list'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for sending direct messages with inline content rather than templates, and mentions optional routing channels. However, it doesn't provide explicit guidance on when to use this versus alternatives like 'send_message_template' or 'send_message_to_list', nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_template (C)
Send a message to a user using a pre-configured notification template. Optionally pass data and routing.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool sends a message but doesn't cover critical aspects like whether this is a read-only or destructive operation, permission requirements, rate limits, error handling, or what happens on success/failure. This is inadequate for a tool that likely involves notifications or user interactions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and mentions optional features without unnecessary elaboration. Every word serves a purpose, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of sending messages with templates and routing, no annotations, and no output schema, the description is insufficient. It lacks details on behavioral traits, error cases, response format, and differentiation from sibling tools, leaving significant gaps for an AI agent to understand how to use this tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so parameters are well-documented in the schema. The description adds minimal value by mentioning 'optionally pass data and routing', which aligns with the 'data' and 'method/channels' parameters but doesn't provide additional context beyond what the schema already explains.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('send') and resource ('message'), specifying it uses a 'pre-configured notification template' and mentions optional data and routing. However, it doesn't explicitly differentiate from sibling tools like 'send_message' or 'send_message_to_list_template', which appear related but have different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'send_message' or 'send_message_to_list_template'. It mentions optional features but doesn't specify scenarios or prerequisites for using this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list (C)
Send a message to all subscribers of a list using inline title and body content.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include | |
| title | Yes | Message title | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use default routing. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states the tool sends messages, implying a write/mutation operation, but doesn't disclose behavioral traits like rate limits, permissions required, whether it's asynchronous, or what happens on failure. The description adds minimal context beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with the core action, zero waste. Every word contributes to understanding the tool's purpose efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 6 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral context (e.g., side effects, error handling), doesn't explain optional parameters like 'method' or 'channels', and provides no guidance on usage relative to siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description mentions 'inline title and body content', which aligns with the 'title' and 'body' parameters but doesn't add meaning beyond what the schema provides. No extra syntax, format details, or usage examples are given.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list'), specifying the content source ('using inline title and body content'). It distinguishes from sibling tools like 'send_message' (generic) and 'send_message_to_list_template' (template-based), but doesn't explicitly mention these alternatives in the description text itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'send_message_to_list_template' or 'send_message'. The description implies it's for sending to list subscribers with inline content, but lacks context about prerequisites, timing, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list_template (C)
Send a message to all subscribers of a list using a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
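The four send tools differ along two axes: the recipient (a single user or a list) and the content source (inline title and body, or a template). One way an agent or wrapper might encode that choice is sketched below; the function name and the `"user"`/`"list"` labels are illustrative, not part of the server:

```python
SEND_TOOLS = {
    ("user", False): "send_message",
    ("user", True): "send_message_template",
    ("list", False): "send_message_to_list",
    ("list", True): "send_message_to_list_template",
}


def pick_send_tool(recipient_kind, uses_template):
    """Map (recipient kind, template or inline content) to a send tool name."""
    try:
        return SEND_TOOLS[(recipient_kind, bool(uses_template))]
    except KeyError:
        raise ValueError(f"unknown recipient kind: {recipient_kind!r}") from None


tool = pick_send_tool("list", uses_template=True)
```

The user-targeted tools take `user_id`, the list-targeted tools take `list_id`, and the template variants take `template` in place of `title` and `body`.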
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions sending to 'all subscribers' but omits critical details like whether this is a bulk operation, potential rate limits, authentication requirements, or what happens on failure. For a tool that likely involves significant impact, this is a notable gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a tool that sends messages to lists using templates, with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error handling, and output expectations, leaving significant gaps for an AI agent to understand the tool fully.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not add meaning beyond what the input schema provides, as schema description coverage is 100%. It mentions 'list' and 'template' but doesn't elaborate on their semantics or usage. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately documents parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list using a notification template'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'send_message_to_list' or 'send_message_template', which appear related but have nuanced differences in functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'send_message_to_list' or 'send_message_template', nor does it mention prerequisites like needing an existing list or template. It lacks context for decision-making among similar tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_list (B, Idempotent)
Subscribe a user to a list. Creates the list if it doesn't exist.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to subscribe | |
| preferences | No | Optional notification preferences | |
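The "creates the list if it doesn't exist" behavior, together with the idempotent annotation, can be pictured with a toy in-memory model. This is purely an illustration of the described semantics, not Courier's implementation; the function name and the dict-of-sets storage are assumptions:

```python
def model_subscribe(lists, list_id, user_id):
    """Toy model: subscribe a user, creating the list on first use."""
    # setdefault creates an empty subscriber set when the list is missing.
    lists.setdefault(list_id, set()).add(user_id)
    return lists


lists = {}
model_subscribe(lists, "beta-testers", "user_123")
model_subscribe(lists, "beta-testers", "user_123")  # repeat call, same end state
```

A side effect worth noting: under these semantics a typo in `list_id` would silently create a new list rather than fail.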
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions the side effect of creating a list if missing, which is useful, but fails to disclose other behavioral traits like required permissions, whether the operation is idempotent, error handling, or rate limits. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste, front-loaded with the primary action and followed by a key behavioral note. Every sentence earns its place by adding value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a mutation tool. It covers the basic purpose and a side effect but lacks details on permissions, error cases, or return values. However, it is minimally adequate for the core functionality, aligning with a score of 3.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters (list_id, user_id, preferences). The description does not add meaning beyond what the schema provides, such as explaining the purpose of preferences or format details. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Subscribe a user to a list') and resource ('list'), with a specific additional behavior ('Creates the list if it doesn't exist'). However, it does not explicitly differentiate from sibling tools like 'subscribe_user_to_lists' or 'create_list', which would be needed for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'subscribe_user_to_lists' (for multiple lists) or 'create_list' (if list creation is the primary goal). It lacks explicit when/when-not instructions or prerequisites, leaving usage context implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_lists (grade C; idempotent)
Subscribe a user to one or more lists. Creates lists that do not exist.
| Name | Required | Description | Default |
|---|---|---|---|
| lists | Yes | Array of lists to subscribe to | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions that the tool 'Creates lists that do not exist,' which is a useful behavioral trait beyond basic subscription. However, it lacks critical details such as required permissions, whether the operation is idempotent, error handling for invalid inputs, or what happens to existing subscriptions. For a mutation tool with zero annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences that directly state the tool's actions. It is front-loaded with the primary purpose and adds a secondary behavior without any wasted words, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that this is a mutation tool with no annotations and no output schema, the description is incomplete. It lacks information on return values, error conditions, side effects (e.g., impact on user notifications), and how it interacts with sibling tools. The mention of list creation adds some value, but overall, it doesn't provide enough context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with clear documentation for 'user_id' and 'lists' parameters. The description adds no additional semantic context about parameters beyond what the schema provides, such as format examples or constraints. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Subscribe') and resource ('user to one or more lists'), and it adds a secondary action ('Creates lists that do not exist'). However, it doesn't explicitly differentiate from the sibling tool 'subscribe_user_to_list' (singular vs. plural), which could cause confusion about when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'subscribe_user_to_list' (singular) and 'add_user_to_tenant', there's no indication of when bulk subscription or list creation is preferred, nor any prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
track_inbound_event (grade C; idempotent)
Track an inbound event that can trigger automations. Requires event name, messageId (for deduplication), and properties.
| Name | Required | Description | Default |
|---|---|---|---|
| event | Yes | The event name (appears as trigger in Automation Trigger node) | |
| userId | No | User ID associated with the event | |
| messageId | Yes | Unique ID for deduplication (returns 409 if not unique) | |
| properties | Yes | Event properties payload | |
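The table above says `messageId` is used for deduplication and that a non-unique value returns 409. One way to make retries safe is to derive the ID from the event content, so a retried call reuses the same ID. The sketch below takes that approach. The hashing scheme, the helper names, and the sample event are assumptions for illustration, not something Courier prescribes.

```python
import hashlib
import json


def make_message_id(event, user_id, properties):
    """Derive a stable ID from the event content so a retry reuses the same ID."""
    canonical = json.dumps(
        {"event": event, "userId": user_id, "properties": properties},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_track_arguments(event, properties, user_id=None):
    """Assemble arguments for track_inbound_event; userId is optional."""
    args = {
        "event": event,
        "messageId": make_message_id(event, user_id, properties),
        "properties": properties,
    }
    if user_id is not None:
        args["userId"] = user_id
    return args


# Hypothetical event, used only for illustration.
first = build_track_arguments("order.shipped", {"order_id": "A-1"}, user_id="user-123")
retry = build_track_arguments("order.shipped", {"order_id": "A-1"}, user_id="user-123")
```

A content hash also collapses two genuinely separate events that carry identical data. If that can happen, include a timestamp or sequence number in `properties`, or generate a random ID once and persist it across retries.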
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions deduplication via messageId but doesn't explain the 409 conflict response mentioned in the schema. It doesn't disclose authentication requirements, rate limits, side effects, or what happens after tracking (e.g., how automations are triggered). For a tool that presumably creates/mutates event data, this is insufficient behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two efficient sentences: the first states the purpose and the second lists the required parameters. It's appropriately sized and front-loaded with the core functionality. No wasted words, though it could be slightly more structured with separate purpose and parameter sections.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters, no annotations, no output schema, and nested objects in properties, the description is incomplete. It doesn't explain what happens after tracking, how automations are triggered, error conditions beyond deduplication, or the structure of the properties object. Given the complexity and lack of structured metadata, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description mentions the three required parameters (event, messageId, properties) but adds no additional semantic context beyond what's in the schema. The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('track') and resource ('inbound event'), and specifies its purpose ('can trigger automations'). It doesn't explicitly differentiate from sibling tools, but since no other tools mention event tracking, this is adequate. The description goes beyond tautology by explaining the automation triggering capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions required parameters but doesn't indicate scenarios where this tool is appropriate versus other event-related or automation tools. With many sibling tools available, this lack of contextual guidance is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
unsubscribe_user_from_list (grade C; destructive, idempotent)
Unsubscribe a user from a list.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to unsubscribe | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Unsubscribe' implies a mutation (likely removing a user from a list), the description doesn't clarify whether this requires specific permissions, if it's reversible, what happens on success/failure, or any rate limits. For a mutation tool with zero annotation coverage, this is a significant gap in behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly efficient. Every word earns its place, achieving optimal conciseness without being under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's mutation nature (implied by 'Unsubscribe'), lack of annotations, and absence of an output schema, the description is incomplete. It doesn't address behavioral aspects like permissions, side effects, or response format. For a tool that modifies data, more context is needed to ensure safe and correct usage by an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with both parameters ('list_id' and 'user_id') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., it doesn't explain format constraints or examples). Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Unsubscribe') and the target ('a user from a list'), making the purpose immediately understandable. It uses a specific verb and identifies the resource involved. However, it doesn't explicitly differentiate from sibling tools like 'delete_user_list_subscriptions' or 'subscribe_user_to_list', which would require more specificity to earn a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when-not-to-use scenarios, or direct comparisons to related sibling tools like 'subscribe_user_to_list' or 'delete_user_list_subscriptions'. This leaves the agent without contextual usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_audience (grade C; idempotent)
Create or update an audience with a filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | Display name | |
| filter | No | Filter definition object (operator, rules) | |
| audience_id | Yes | The audience ID | |
| description | No | Description | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'Create or update' which implies mutation, but doesn't disclose behavioral traits like required permissions, whether it's idempotent, what happens on conflicts, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create or update an audience') and adds essential context ('with a filter definition'). There is zero waste, and every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool with no annotations, no output schema, and 4 parameters, the description is incomplete. It lacks details on behavior (e.g., what 'Create or update' entails operationally), error handling, or response expectations. For a tool that modifies data, more context is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters with descriptions. The description adds no additional meaning beyond implying 'filter definition' is a key component, but doesn't explain syntax or constraints. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('an audience'), and specifies that it involves a 'filter definition'. By indicating mutation, it is distinguishable from sibling tools like 'delete_audience' and 'get_audience'. However, it doesn't explicitly differentiate itself from 'create_list' or 'create_or_update_tenant', which are similar mutation operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., when to create vs. update), exclusions, or compare to siblings like 'create_list' for list management. Usage is implied by the name and purpose but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_brand (grade B; idempotent)
Replace an existing brand with new values.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Brand display name | |
| brand_id | Yes | The brand ID to update | |
| settings | No | Brand settings (colors, email, inapp) | |
| snippets | No | Brand snippets | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate mutation (readOnlyHint=false) and idempotency (idempotentHint=true). The description adds 'replace with new values' but does not clarify whether the call performs a partial update or a full replacement, leaving ambiguity beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently communicates the tool's purpose with no wasted words. It is front-loaded and appropriately sized for its clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of the tool (nested objects, no output schema), the description lacks crucial details such as return value, behavior when optional fields are omitted, and preconditions. It is insufficient for full understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so each parameter is already described. The tool description adds no additional meaning beyond the schema, resulting in a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('replace') and the resource ('an existing brand') with 'new values', distinguishing it from create_brand and delete_brand siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool vs. alternatives like create_brand or delete_brand. The description lacks usage context entirely.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_notification_checks (grade B; idempotent)
Update check statuses for a notification submission.
| Name | Required | Description | Default |
|---|---|---|---|
| checks | Yes | Checks to update | |
| submission_id | Yes | The submission ID for the checks resource | |
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds no behavioral context beyond what annotations already provide. Annotations indicate idempotentHint=true and readOnlyHint=false, which the description merely echoes as 'update'. It fails to disclose additional behaviors like whether updates are partial or full replacements, or any side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single 7-word sentence, very concise. However, it could be more informative without sacrificing conciseness, for example by indicating the batch nature of the checks parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the schema coverage and annotations, the description is minimally complete for a basic understanding. However, without an output schema, additional context about the result (e.g., return value or side effects) would improve completeness. The description does not mention that the checks array updates multiple checks at once.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema covers 100% of parameters with descriptions, so the baseline is 3. The description does not add any additional meaning beyond the schema; it simply restates the purpose without elaborating on parameter usage or constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Update check statuses for a notification submission' uses a specific verb ('update') and resource ('check statuses'), clearly distinguishing it from sibling tools like list_notification_checks which lists checks, and cancel_notification_submission which cancels submissions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance is provided. The description does not specify when to use this tool over alternatives, such as when to update checks individually versus batch, or prerequisites like needing the submission to be in a certain state.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_provider (grade A; idempotent)
Replace an existing provider configuration. Full replacement — retrieve current config with get_provider first; omitted optional fields are cleared. Changing API keys or settings affects live delivery if this integration is in use.
| Name | Required | Description | Default |
|---|---|---|---|
| alias | No | Short alias | |
| title | No | Display name | |
| provider | Yes | Provider key (must match existing; changing provider type is not supported) | |
| settings | No | Provider-specific settings | |
| provider_id | Yes | The provider configuration ID | |
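Because the description says this is a full replacement and that omitted optional fields are cleared, a read-modify-write pattern is a natural fit: start from the config returned by `get_provider`, overlay only the fields that change, and send the whole result. The sketch below shows that merge step. The shape of the `get_provider` result, the `sendgrid` provider key, and the helper name are assumptions for illustration.

```python
# Fields accepted by update_provider, per the parameter table.
ALLOWED_FIELDS = ("provider_id", "provider", "alias", "title", "settings")


def merge_provider_update(current, changes):
    """Overlay changes on the current config so no optional field is dropped."""
    unknown = set(changes) - set(ALLOWED_FIELDS)
    if unknown:
        raise KeyError(f"unknown fields: {sorted(unknown)}")
    # The table notes that changing the provider type is not supported.
    if changes.get("provider", current["provider"]) != current["provider"]:
        raise ValueError("changing provider type is not supported")
    merged = {key: current[key] for key in ALLOWED_FIELDS if key in current}
    merged.update(changes)
    return merged


# Hypothetical result of a prior get_provider call.
current = {
    "provider_id": "prov_1",
    "provider": "sendgrid",
    "alias": "primary",
    "title": "SendGrid",
    "settings": {"api_key": "old-key"},
}
payload = merge_provider_update(current, {"title": "SendGrid (EU)"})
```

The merge is shallow: passing `settings` in `changes` replaces the whole settings object, so callers changing a single setting should copy the existing settings first.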
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that it's a full replacement (clears omitted optional fields) and that changes affect live delivery. Annotations (readOnlyHint=false, idempotentHint=true) are consistent, and description adds context beyond them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
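Three succinct sentences: the first states the purpose, the second explains the full-replacement behavior and advises retrieving the current config first, and the third warns about the impact on live delivery. No unnecessary words, front-loaded with key information.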
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers core aspects: purpose, usage, and side effects. Minor gap: no mention of return value (e.g., updated config or success status). But given no output schema and reasonable complexity, it's mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage, but description adds value by explaining that omitted optional fields are cleared due to full replacement. This clarifies parameter behavior beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action ('replace'), the resource ('provider configuration'), and emphasizes it's a full replacement. Distinguishes from siblings like create_provider, get_provider, delete_provider with explicit context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides guidance to retrieve current config first (get_provider) and warns about live delivery impact. Could be more explicit about when to use vs alternatives for partial updates, but overall clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_tenant_preference (grade A; idempotent)
Create or replace default notification preference for a topic on a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| status | Yes | Subscription status for the topic | |
| topic_id | Yes | The subscription topic ID | |
| tenant_id | Yes | The tenant ID | |
| custom_routing | No | Default channels when has_custom_routing is enabled | |
| has_custom_routing | No | When true, use custom_routing instead of template defaults | |
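The table implies that `custom_routing` only takes effect when `has_custom_routing` is true. A small helper can keep the two fields in step, as sketched below. The status value `OPTED_IN` and the channel names `email` and `push` are illustrative guesses, since this page does not list the allowed values, and the helper name is hypothetical.

```python
def build_tenant_preference(tenant_id, topic_id, status, custom_routing=None):
    """Build update_tenant_preference arguments with consistent routing fields."""
    args = {"tenant_id": tenant_id, "topic_id": topic_id, "status": status}
    if custom_routing:
        # Set both fields together so the channel list is never silently ignored.
        args["has_custom_routing"] = True
        args["custom_routing"] = list(custom_routing)
    return args


# Hypothetical IDs, status, and channel names.
defaults = build_tenant_preference("tenant-1", "topic-1", "OPTED_IN")
routed = build_tenant_preference(
    "tenant-1", "topic-1", "OPTED_IN", custom_routing=["email", "push"]
)
```

Leaving both fields out, as in `defaults`, falls back to the template defaults according to the `has_custom_routing` description.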
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states 'create or replace,' which aligns with the idempotentHint annotation (calling multiple times is safe). It adds context beyond annotations by specifying the operation's effect (replace existing default). However, it does not disclose details like whether existing custom routing is overwritten or if validation of tenant/topic occurs. Overall, good transparency given annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence of 12 words with no filler. It is front-loaded and every word serves a purpose. This is appropriately concise for a tool with a clear action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 5 parameters, no output schema, and moderate complexity (enum, boolean, array), the description covers the core action. It omits return value (likely a preference object or success status) and the relationship between has_custom_routing and custom_routing. Considering annotations and schema richness, it is mostly complete but could briefly state output type.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already describes all parameters. The description adds no additional meaning beyond grouping them under 'default notification preference.' It does not explain interactions (e.g., has_custom_routing controls custom_routing usage). Baseline 3 is appropriate since schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description precisely states 'Create or replace default notification preference for a topic on a tenant.' It uses a specific verb ('create or replace') and identifies the resource (default notification preference) and scope (topic on a tenant). This clearly distinguishes it from sibling tools like update_user_preference_topic (per-user) and delete_tenant_preference (deletion).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for setting tenant-level preferences but provides no explicit guidance on when to use this tool versus alternatives like update_user_preference_topic or get_user_preferences. It lacks when-not-to-use conditions or prerequisites, leaving the agent to infer context from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_translation (grade C; idempotent)
Create or update a translation for a specific locale.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Translation content (PO file format) | |
| domain | No | Translation domain | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) | |
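Since `body` is expected in PO file format, the sketch below shows one way to assemble a minimal PO document from a dictionary of source strings and translations. The header fields, the sample strings, and the helper names are assumptions for illustration, and this page does not say whether Courier requires a PO header at all.

```python
def po_escape(text):
    """Escape backslashes, double quotes, and newlines for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def build_po_body(locale, messages):
    """Render a minimal PO document: a header entry, then one entry per message."""
    lines = [
        'msgid ""',
        'msgstr ""',
        f'"Language: {locale}\\n"',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    for msgid, msgstr in messages.items():
        lines.append(f'msgid "{po_escape(msgid)}"')
        lines.append(f'msgstr "{po_escape(msgstr)}"')
        lines.append("")
    return "\n".join(lines)


# Hypothetical translation strings.
body = build_po_body("fr_FR", {"Hello": "Bonjour", 'Say "hi"': 'Dites "salut"'})
arguments = {"locale": "fr_FR", "domain": "default", "body": body}
```

The `domain` value here simply repeats the default from the table; it can be omitted when the default domain is intended.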
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a mutation operation ('Create or update') but does not specify permissions, side effects, error handling, or response format. This leaves critical behavioral traits undocumented for a tool that modifies data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action without unnecessary words. It earns its place by succinctly conveying the tool's purpose, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error conditions, and return values, leaving gaps that could hinder an AI agent's ability to use the tool effectively in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema already documents parameters like 'locale' and 'body'. The description adds no additional meaning beyond what the schema provides, such as explaining the interaction between parameters or usage nuances, resulting in a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('translation for a specific locale'), making the purpose unambiguous. However, it does not differentiate from sibling tools like 'get_translation', which might retrieve translations, leaving room for improvement in sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, such as 'get_translation' for retrieval or other tools for related operations. The description lacks context on prerequisites, exclusions, or specific scenarios for application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
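To make the gaps noted above concrete, here is a sketch of what a fuller definition for this tool might look like, written as the Python dict an MCP server would serialize. The description wording and the annotation values are assumptions for illustration, not the server's actual definition; the `readOnlyHint` and `idempotentHint` keys are standard MCP tool annotations, and the parameter descriptions are copied from the table.

```python
import json

# Illustrative only: the description text below is a suggested rewrite,
# not something the Courier server is known to publish.
tool_definition = {
    "name": "update_translation",
    "description": (
        "Create or update a translation for a specific locale. "
        "Use get_translation to read the current content; "
        "use this tool only to write new or changed content."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "body": {
                "type": "string",
                "description": "Translation content (PO file format)",
            },
            "domain": {
                "type": "string",
                "description": "Translation domain",
                "default": "default",
            },
            "locale": {
                "type": "string",
                "description": "Locale code (e.g. en_US, fr_FR)",
            },
        },
        "required": ["body", "locale"],
    },
    # Assumed values: the tool mutates data, and the listing marks it idempotent.
    "annotations": {"readOnlyHint": False, "idempotentHint": True},
}

print(json.dumps(tool_definition, indent=2))
```

The second and third sentences of the description follow the "use X instead of Y when Z" pattern the usage-guidance dimension asks for.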
update_user_preference_topic · C · Idempotent
Update a user's preference for a specific subscription topic (opt in, opt out, or set channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| status | Yes | Preference status | |
| user_id | Yes | The user ID | |
| topic_id | Yes | The subscription topic ID | |
| custom_routing | No | Custom channel routing order | |
| has_custom_routing | No | Whether custom channel routing is set | |
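The table does not say how `custom_routing` and `has_custom_routing` relate, so the sketch below assumes they should be sent together. The helper `build_preference_arguments`, the status strings `OPTED_IN` and `OPTED_OUT`, the channel names, and the IDs are all assumptions for illustration; check the tool's input schema for the actual enum values.

```python
def build_preference_arguments(user_id, topic_id, status, custom_routing=None):
    """Assemble the `arguments` object for update_user_preference_topic.

    Hypothetical helper: it validates the three required fields and keeps
    the two routing fields consistent with each other.
    """
    if not user_id or not topic_id or not status:
        raise ValueError("user_id, topic_id and status are required")

    arguments = {"user_id": user_id, "topic_id": topic_id, "status": status}

    # Assumption: when a routing order is supplied, has_custom_routing
    # should be set alongside it; otherwise both fields are omitted.
    if custom_routing:
        arguments["has_custom_routing"] = True
        arguments["custom_routing"] = list(custom_routing)
    return arguments


# Opt a user in and prefer email over push (values are placeholders).
opt_in = build_preference_arguments(
    "user_123", "topic_abc", "OPTED_IN", custom_routing=["email", "push"]
)

# Opt the same user out without touching routing.
opt_out = build_preference_arguments("user_123", "topic_abc", "OPTED_OUT")

print(opt_in)
print(opt_out)
```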
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool updates preferences but doesn't cover permissions required, whether changes are reversible, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency about its behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Update a user's preference') and includes key details (topic specificity and options). There is zero waste or redundancy, making it appropriately sized and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with 5 parameters, no annotations, and no output schema), the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like idempotency. For a tool that modifies user preferences, more context on outcomes and constraints is needed to be fully helpful to an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters with descriptions. The description adds marginal value by mentioning 'opt in, opt out, or set channel preferences', which loosely relates to the 'status' enum and optional 'custom_routing' parameters, but doesn't provide additional syntax or format details beyond what the schema provides. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update' and the resource 'user's preference for a specific subscription topic', specifying the action and target. It distinguishes from sibling tools like 'get_user_preferences' (read vs. write) and 'subscribe_user_to_list' (subscription vs. preference), though it doesn't explicitly name alternatives. The purpose is specific but could better differentiate from similar mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user/topic existence), compare to siblings like 'update_translation' or 'replace_profile', or indicate scenarios for opting in/out versus setting channel preferences. Usage is implied from the action but lacks explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
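The claim file can be produced with a short script. This is a minimal sketch: the function name `write_glama_claim` and the idea of writing into a local web root are assumptions, and the example writes into a temporary directory so it can be run safely. Only the `$schema` URL and the `maintainers` structure come from the listing.

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(web_root, email):
    """Write <web_root>/.well-known/glama.json and return its path.

    Hypothetical helper; `web_root` is whatever directory your server
    serves as the root of its domain.
    """
    claim = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target = Path(web_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(claim, indent=2) + "\n", encoding="utf-8")
    return target


# Demonstration against a throwaway directory instead of a real web root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_claim(tmp, "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    print(path.relative_to(tmp).as_posix())
    print(written["maintainers"])
```

Replace the placeholder address with the email associated with your Glama account before publishing the file.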
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.