Courier
Server Details
Send notifications, manage templates, and configure integrations with Courier.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: trycourier/courier-mcp
- GitHub Stars: 1
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average score of 3/5 across 57 of the 59 tools scored.
Most tools have distinct purposes with clear boundaries, such as separate operations for users, messages, lists, and automations. However, some overlap exists between similar tools like send_message and send_message_template, or create_or_merge_user and replace_profile, which could cause minor confusion in selection.
Tool names follow a highly consistent verb_noun pattern throughout, such as create_brand, get_message, list_audiences, and update_translation. There are no deviations in naming conventions, making the set predictable and easy to parse.
With 59 tools, the count is excessive for the domain of a notification and messaging platform. This large number suggests over-fragmentation, such as having separate tools for similar operations (e.g., multiple send and list variations), which can overwhelm agents and reduce usability.
The tool set provides comprehensive coverage for the notification and user management domain, including CRUD operations for users, messages, lists, tenants, and automations, as well as specialized functions like JWT generation and audit tracking. No obvious gaps are present, enabling full lifecycle management.
Available Tools
59 tools

add_bulk_users (Grade: C)
Add users to an existing bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| users | Yes | Array of user objects to add | |
| job_id | Yes | The bulk job ID | |
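As a sketch, a call using the two required parameters above might look like the following. The job ID and the fields inside each user object are hypothetical placeholders; the actual user-object shape is not documented in the schema.

```python
# Hypothetical add_bulk_users payload. Only "job_id" and "users" come from
# the documented schema; the per-user field names are illustrative.
payload = {
    "job_id": "job_123",  # ID obtained from an earlier create_bulk_job call
    "users": [
        {"recipient": "user-001", "profile": {"email": "a@example.com"}},
        {"recipient": "user-002", "profile": {"email": "b@example.com"}},
    ],
}

# Both parameters are required, so a minimal client-side check might be:
assert payload["job_id"] and payload["users"]
```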
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but offers minimal behavioral insight. It states the tool adds users, implying a mutation, but doesn't disclose permissions needed, rate limits, idempotency, or what happens on failure (e.g., partial updates). For a mutation tool with zero annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it easy to parse quickly. No structural issues or redundancy are present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain return values, error conditions, or side effects (e.g., whether the bulk job status changes). Given the complexity of bulk operations and lack of structured data, more context is needed for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('job_id', 'users') documented in the schema. The description adds nothing beyond the schema and leaves open details such as what constitutes a valid 'user object'. A baseline score of 3 applies since the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Add users to an existing bulk job' clearly states the action (add) and target (users to bulk job), but it's vague about what a 'bulk job' entails and doesn't distinguish from sibling tools like 'create_bulk_job' or 'list_bulk_users'. It provides basic purpose but lacks specificity about the resource context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites (e.g., needing an existing bulk job from 'create_bulk_job'), exclusions, or comparisons to similar tools like 'add_user_to_tenant'. Usage context is implied but not explicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
add_user_to_tenant (Grade: C)
Add a user to a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Tenant-scoped profile overrides | |
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It states 'Add a user to a tenant' but fails to explain critical aspects: whether this is a mutation (implied), what permissions are required, if it's idempotent, what happens on duplicate adds, or the response format. This leaves significant gaps in understanding the tool's effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, making it highly concise and front-loaded. It efficiently communicates the core action without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavior, error conditions, return values, and how it differs from siblings. For a tool that modifies system state, more context is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for 'user_id' and 'tenant_id', and 'profile' as 'Tenant-scoped profile overrides'. The description adds no additional parameter semantics beyond the schema, but since the schema is well-documented, a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Add') and target ('a user to a tenant'), making the purpose immediately understandable. However, it does not differentiate from the sibling tool 'remove_user_from_tenant' or explain what 'adding' entails (e.g., granting access, assigning roles). This clarity is good but lacks sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'create_or_merge_user' or 'list_user_tenants'. It also omits prerequisites (e.g., user and tenant must exist) or exclusions (e.g., cannot add duplicate users). Without such context, usage is ambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cancel_message (Grade: B)
Cancel a message that is currently being delivered. Returns the message details with updated status.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to cancel | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the action ('cancel') and return value ('message details with updated status'), but doesn't address critical behavioral aspects: whether cancellation is reversible, what permissions are required, whether there are rate limits, what happens if the message is already delivered, or what specific status changes occur. For a mutation tool with zero annotation coverage, this leaves significant gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve distinct purposes: the first states the action and target, the second describes the return value. There's zero wasted language, and the most important information (what the tool does) is front-loaded. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is insufficiently complete. While it states the basic action and return, it doesn't address error conditions, side effects, permissions, or what 'cancel' actually means operationally. The agent would need to guess about many behavioral aspects, making this description inadequate for safe tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with the single parameter 'message_id' well-documented in the schema. The description doesn't add any parameter-specific information beyond what the schema already provides (no format examples, no constraints on valid message IDs). With high schema coverage, the baseline score of 3 is appropriate as the description doesn't enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('cancel') and resource ('message that is currently being delivered'), making the purpose immediately understandable. It distinguishes from siblings like 'delete_message' or 'get_message' by focusing on in-progress messages. However, it doesn't explicitly differentiate from all possible message-related operations in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('message that is currently being delivered'), suggesting this tool is for interrupting active deliveries rather than deleting sent messages. However, it doesn't provide explicit guidance on when NOT to use it or name alternatives. The guidance is contextual but incomplete.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
courier_installation_guide (Grade: A)
Get the Courier SDK installation guide for a specific platform. For client-side SDKs (React, iOS, Android, Flutter, React Native), also generates a sample JWT.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | No | User ID for JWT generation (client-side SDKs only). Defaults to "example_user". | |
| platform | Yes | The platform to get the installation guide for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the tool retrieves installation guides and generates JWTs for client-side SDKs, which is useful behavioral context. However, it doesn't mention potential side effects, authentication requirements, rate limits, or response format details, leaving gaps for a tool that likely involves external resources.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the core functionality and conditional behavior. Every word earns its place, with no redundancy or unnecessary elaboration, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is moderately complete for a tool with 2 parameters and high schema coverage. It covers the main action and conditional JWT generation, but lacks details on output format, error handling, or dependencies, which could be important for an installation guide tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds marginal value by implying that 'user_id' is only relevant for client-side SDKs, but this is partially covered in the schema's description. Baseline 3 is appropriate since the schema does most of the work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Get', 'generates') and resources ('Courier SDK installation guide', 'sample JWT'). It distinguishes this tool from siblings by focusing on installation guides rather than user management, messaging, or other operations listed in the sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: to get installation guides for specific platforms. It implicitly distinguishes usage by specifying that for client-side SDKs, it also generates a sample JWT, but it doesn't explicitly state when not to use it or name alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_brand (Grade: C)
Create a new brand with name, colors, and email/inapp settings.
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | Optional brand ID; auto-generated if omitted | |
| name | Yes | Brand display name | |
| settings | No | Brand settings (colors, email, inapp) | |
| snippets | No | Brand snippets | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Create' implies a write/mutation operation, the description doesn't specify permissions needed, whether the operation is idempotent, error conditions, or what happens on success (e.g., returns the created brand object). For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create a new brand') and specifies key attributes without unnecessary words. Every part of the sentence contributes directly to understanding the tool's purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., the created brand object), error handling, or behavioral nuances like whether 'id' generation is guaranteed to be unique. Given the complexity of nested objects in the schema, more context would be helpful for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description mentions 'name, colors, and email/inapp settings', which aligns with the 'name' and 'settings' parameters in the schema but doesn't add meaningful semantics beyond what the schema provides. The baseline score of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Create a new brand') and specifies the key attributes involved ('name, colors, and email/inapp settings'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'list_brands' or 'get_brand', but the verb 'Create' is sufficiently distinct from 'list' or 'get' operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., authentication requirements), when not to use it, or how it relates to sibling tools like 'list_brands' for viewing existing brands. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_bulk_job (Grade: A)
Create a new bulk job for sending messages to multiple recipients. Workflow: create_bulk_job → add_bulk_users → run_bulk_job.
| Name | Required | Description | Default |
|---|---|---|---|
| message | Yes | Bulk message definition with event/template and content | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions the workflow, it doesn't disclose critical behavioral traits such as permissions required, whether the job is saved or transient, error handling, or what happens if the job isn't run. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise and front-loaded, with two sentences that efficiently convey the purpose and workflow. Every sentence earns its place, and there is no wasted verbiage or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (a mutation with a nested object parameter) and no annotations or output schema, the description is moderately complete. It covers the purpose and workflow but lacks details on behavioral aspects like side effects, permissions, or return values. For a tool with these gaps, it's adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'message' documented as 'Bulk message definition with event/template and content.' The description adds no additional parameter semantics beyond what the schema provides. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Create a new bulk job for sending messages to multiple recipients.' It specifies the verb ('create') and resource ('bulk job'), and distinguishes it from siblings like 'run_bulk_job' by indicating it's the first step in a workflow. However, it doesn't fully differentiate from other creation tools like 'create_list' or 'create_brand' beyond the resource type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage context by outlining the workflow: 'create_bulk_job → add_bulk_users → run_bulk_job.' This clearly indicates when to use this tool (as the first step) and references sibling tools for subsequent steps. It doesn't explicitly state when not to use it or name alternatives, but the workflow guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
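The three-step workflow named in create_bulk_job's description can be sketched as a small driver. Only the tool names and the documented parameters ('message', 'users', 'job_id') come from the definitions above; `call_tool` and the response key are assumptions, since no output schema is provided.

```python
# Hypothetical driver for the documented workflow:
# create_bulk_job -> add_bulk_users -> run_bulk_job.
# call_tool stands in for whatever MCP client invocation is in use.
def run_bulk_send(call_tool, message, users):
    job = call_tool("create_bulk_job", {"message": message})
    job_id = job["job_id"]  # assumed response key; no output schema is documented
    call_tool("add_bulk_users", {"job_id": job_id, "users": users})
    call_tool("run_bulk_job", {"job_id": job_id})
    return job_id
```

The point of the sketch is the ordering constraint the description encodes: users can only be attached to a job that already exists, and nothing is sent until the job is explicitly run.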
create_list (Grade: C)
Create or update a list by list ID.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the list | |
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'create or update' but doesn't specify whether this is an upsert operation, what happens if the list ID doesn't exist, required permissions, or error conditions. This is inadequate for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It's front-loaded with the core action, though it could be more structured by explicitly separating creation and update scenarios. Overall, it's concise but under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavior (e.g., upsert logic), error handling, or return values. Given the complexity of a 'create or update' operation, this leaves significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('list_id' and 'name'). The description adds no additional meaning beyond what the schema provides, such as format constraints or usage examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the action ('create or update') and resource ('a list by list ID'), which is clear but vague about the distinction between creation and update. It doesn't differentiate from sibling tools like 'create_brand' or 'create_or_merge_user', leaving ambiguity about when to use this specific list tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites, such as whether the list ID must exist for updates, or when to choose this over tools like 'create_bulk_job' or 'update_audience'. This leaves the agent without context for decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_merge_user (Grade: B)
Create a new user profile or merge supplied values into an existing profile (POST). Existing fields not included are preserved.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | No | Profile data to create or merge (e.g. { email: "...", phone_number: "..." }) | |
| user_id | Yes | The user ID | |
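The preserve-on-merge behavior the description claims ("Existing fields not included are preserved") can be illustrated with plain dictionaries. This is a sketch of the semantics only, not Courier's implementation; the field names are hypothetical.

```python
# Merge semantics sketch: supplied fields overwrite or are added,
# fields absent from the supplied profile are left untouched.
existing = {"email": "old@example.com", "phone_number": "+15550100"}
supplied = {"email": "new@example.com", "locale": "en-US"}

merged = {**existing, **supplied}
# phone_number survives because it was not included in the update.
```

This is exactly the behavior that distinguishes create_or_merge_user from replace_profile, which (by its name) would discard the unsupplied fields instead.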
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it's a POST operation (implying mutation), performs an upsert (create or merge), and preserves existing fields not included. However, it misses details like authentication requirements, error conditions, rate limits, or what happens on conflicts, leaving gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('create or merge') and includes essential behavioral detail ('Existing fields not included are preserved'). It avoids redundancy, though it could be slightly more structured for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is moderately complete. It covers the upsert behavior and field preservation, but lacks information on response format, error handling, permissions, or side effects. Given the complexity of user profile operations, more context would be beneficial.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value by hinting at the merge behavior and example profile data, but doesn't provide additional syntax, format, or constraints beyond what the schema specifies. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or merge') and resource ('user profile'), specifying it's a POST operation. It distinguishes from siblings like 'add_user_to_tenant' or 'replace_profile' by emphasizing the merge behavior, though it doesn't explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context through 'create or merge' and mentions preservation of existing fields, suggesting it's for upsert operations. However, it lacks explicit guidance on when to use this versus alternatives like 'create_or_replace_user_push_token' or 'replace_profile', and doesn't specify prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_replace_user_push_token (Grade: C)
Create or replace a push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token string | |
| device | No | Device metadata | |
| user_id | Yes | The user ID | |
| provider_key | Yes | Push provider | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('create or replace') but doesn't explain what 'replace' entails (e.g., overwriting existing tokens), potential side effects, authentication requirements, or error conditions. This leaves significant gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and resource, making it easy to parse quickly without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with no annotations and no output schema, the description is inadequate. It doesn't cover behavioral aspects like idempotency, error handling, or response format. Given the complexity of managing user tokens and the lack of structured data, more context is needed to guide proper usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description adds no additional meaning beyond the schema, such as explaining the relationship between 'token' and 'device' or clarifying the 'provider_key' enum values. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('push/device token for a user'), making the purpose unambiguous. It doesn't explicitly differentiate from sibling tools like 'get_user_push_token' or 'list_user_push_tokens', but the action is distinct enough to avoid confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_user_push_token' or 'list_user_push_tokens'. It lacks context about prerequisites, such as user existence or permissions, and doesn't mention any exclusions or specific scenarios for its application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
create_or_update_tenant (C)
Create or replace a tenant. Tenants represent organizations or groups that users belong to.
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Display name for the tenant | |
| brand_id | No | Brand ID to associate with this tenant | |
| tenant_id | Yes | The tenant ID | |
| properties | No | Custom properties for the tenant | |
| user_profile | No | Default profile data for users in this tenant | |
| parent_tenant_id | No | Parent tenant ID for hierarchical tenants | |
| default_preferences | No | Default notification preferences for users in this tenant | |
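With seven parameters, this tool benefits from a worked payload. The sketch below is a hypothetical arguments object built only from the parameter table: the tenant IDs, region property, and profile data are placeholders, and the same required-field check pattern applies.

```python
def missing_required(args, required=("tenant_id", "name")):
    """Return any required create_or_update_tenant fields absent from args."""
    return [field for field in required if field not in args]

# Hypothetical payload based on the parameter table; IDs and properties are placeholders.
example_args = {
    "tenant_id": "acme-eu",
    "name": "ACME Europe",
    "parent_tenant_id": "acme",             # optional: hierarchical tenants
    "properties": {"region": "eu-west-1"},  # optional custom properties
}
```

Note that because the tool performs a replace rather than a merge, a cautious client would send the complete desired tenant state, not just the changed fields.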
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states that the tool will 'create or replace' a tenant, implying mutation, but doesn't disclose behavioral traits like whether it's idempotent, what permissions are required, what happens on replacement (e.g., data loss), or error conditions. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste. The first sentence states the action and resource, and the second provides helpful context about what tenants represent. It's appropriately sized and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 7 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral details (e.g., idempotency, side effects), usage guidance, and output expectations. The schema covers parameters well, but the description doesn't compensate for missing annotations or output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so all parameters are documented in the schema. The description adds no parameter-specific information beyond the general purpose. It doesn't explain how parameters like 'tenant_id' or 'properties' affect the operation. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('create or replace') and resource ('tenant'), and explains what tenants represent. It distinguishes the tool from siblings like 'delete_tenant' and 'get_tenant' by specifying its mutative nature, but it doesn't explicitly differentiate from other tenant-related tools such as 'list_tenants' beyond the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when to choose creation versus replacement, or how it relates to sibling tools like 'delete_tenant' or 'list_tenants'. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_audience (C)
Delete an audience by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID to delete | |
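Because the description gives no hint about reversibility, a cautious client might gate this single-parameter call behind an explicit confirmation. This is a sketch: the call shape follows the common MCP convention of a tool name plus an arguments object, and the confirmation guard is an assumed client-side practice, not part of Courier's API.

```python
def build_delete_audience_call(audience_id, confirmed=False):
    """Assemble a delete_audience tool call, refusing without explicit confirmation.

    The description does not say whether deletion is reversible, so this guard
    treats it as permanent.
    """
    if not confirmed:
        raise ValueError("refusing destructive delete_audience call without confirmation")
    return {"name": "delete_audience", "arguments": {"audience_id": audience_id}}
```

The same guard pattern applies to the other delete tools in this set (delete_profile, delete_tenant, delete_user_list_subscriptions), none of which disclose reversibility either.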
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the action is 'Delete,' implying a destructive mutation, but doesn't disclose critical behaviors: whether deletion is permanent, whether specific permissions are required, whether there are side effects (e.g., on related data), or whether a confirmation is returned. This is inadequate for a destructive tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly concise and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a destructive mutation tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits (e.g., irreversibility, auth needs), expected outcomes, or error handling, which are critical for safe and effective use in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' fully documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., format, validation rules, or examples), so it meets the baseline for high coverage without compensation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('an audience by its ID'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'delete_profile' or 'delete_tenant' beyond the resource name, which slightly limits its distinctiveness.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an existing audience ID), exclusions, or related tools like 'get_audience' for verification, leaving usage context unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_profile (C)
Delete a user profile permanently.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is 'permanent', which is a critical behavioral trait, but fails to mention other important aspects such as required permissions, what happens to associated data, or error conditions. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core action and key qualifier ('permanently') without any wasted words. It's appropriately sized and front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive operation with no annotations and no output schema, the description is insufficient. It mentions permanence but omits critical context like permissions needed, side effects, return values, or error handling. Given the high-stakes nature of profile deletion, more comprehensive guidance is warranted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'user_id' documented as 'The user ID to delete'. The description doesn't add any additional semantic context beyond this, such as format examples or constraints. Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and resource ('user profile') with the qualifier 'permanently', which adds specificity. However, it doesn't explicitly differentiate from sibling tools like 'delete_tenant' or 'delete_audience', which also perform deletion operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user must exist), exclusions, or comparisons to related tools like 'replace_profile' or 'delete_tenant', leaving the agent with no contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_tenant (C)
Delete a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID to delete | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is 'Delete' but lacks critical details: whether this is irreversible, requires specific permissions, affects associated data, or has rate limits. This is inadequate for a destructive operation with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste—it directly states the tool's purpose without unnecessary words. It's appropriately sized and front-loaded, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits (e.g., irreversibility, side effects), usage context, and expected outcomes, leaving significant gaps in understanding how to safely and effectively invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'tenant_id' fully documented in the schema. The description adds minimal value by mentioning 'by its ID', which aligns with the schema but doesn't provide additional context like ID format or examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('a tenant by its ID'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'delete_audience' or 'delete_profile' beyond the resource type, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'remove_user_from_tenant' or 'delete_audience', nor does it mention prerequisites, exclusions, or context for deletion. The description only states what it does, not when or why to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
delete_user_list_subscriptions (C)
Delete all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action is a deletion but does not specify whether this is reversible, requires admin permissions, affects user data permanently, or has side effects like notifications. This leaves significant gaps for a destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with no wasted words, clearly front-loading the core action. It efficiently communicates the essential purpose without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive tool with no annotations and no output schema, the description is inadequate. It lacks details on behavior, error handling, return values, or safety considerations, leaving the agent with insufficient context to use it correctly in complex scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'user_id' documented as 'The user ID'. The description adds no additional meaning beyond this, such as format examples or scope details. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete') and target resource ('all list subscriptions for a user'), making the purpose unambiguous. However, it does not explicitly differentiate from sibling tools like 'unsubscribe_user_from_list' or 'delete_audience', which handle related but different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'unsubscribe_user_from_list' (for single subscriptions) or 'delete_audience' (for broader data removal). It lacks context about prerequisites, consequences, or typical scenarios for bulk deletion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_jwt_for_user (B)
Generate a JWT authentication token for a user. Used for client-side SDK auth (Inbox, Preferences, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| scopes | No | Permission scopes for the token | |
| user_id | Yes | The user ID to scope the token to | |
| expires_in | No | Token expiry duration (e.g. "1h", "2 days") | 1h |
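The `expires_in` default is the only documented default in this tool set, so it is worth showing how a client might apply it. The sketch below fills in the documented '1h' default before dispatch; the scope string shown is an assumed placeholder, since the listing does not document valid scope values.

```python
def with_defaults(args):
    """Apply the documented expires_in default ('1h') to generate_jwt_for_user args."""
    filled = dict(args)
    filled.setdefault("expires_in", "1h")  # default from the parameter table
    return filled

# Hypothetical call; the scope name is a placeholder, not a documented value.
example_args = with_defaults({
    "user_id": "user-123",
    "scopes": ["read:messages"],  # assumed scope string format
})
```

Because the server presumably applies the same default, this client-side fill is optional; it mainly makes the effective expiry explicit in logs and audits.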
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions the token is for 'authentication' and 'client-side SDK auth,' which implies security-sensitive behavior, but doesn't disclose critical traits like required permissions, rate limits, or whether this operation is idempotent. For a token generation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with zero waste. It front-loads the core purpose and follows with usage context, making it efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (security-sensitive token generation), lack of annotations, and no output schema, the description is incomplete. It covers the basic purpose and usage but misses behavioral details like auth requirements, token format, or error handling. This is adequate for a minimal viable description but has clear gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters documented in the schema. The description adds no additional parameter semantics beyond what the schema provides (e.g., it doesn't explain scopes or expiry formats further). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate a JWT authentication token for a user.' It specifies the verb ('generate') and resource ('JWT authentication token'), and mentions the target ('for a user'). However, it doesn't explicitly differentiate from sibling tools, as none appear to be direct alternatives for token generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context: 'Used for client-side SDK auth (Inbox, Preferences, etc.).' This suggests when to use it (for SDK authentication) but doesn't explicitly state when not to use it or name alternatives. No prerequisites or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audience (C)
Get an audience by its ID, including its filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| audience_id | Yes | The audience ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. The verb 'Get' implies a read-only operation, but the description does not disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or response format. This leaves significant gaps for a tool with no structured safety hints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action and resource. It avoids redundancy and wastes no words, making it easy to parse quickly. Every part of the sentence contributes directly to understanding the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It does not explain what 'including its filter definition' entails in the return value, error conditions, or any side effects. For a read operation with minimal structured support, more behavioral context is needed to fully guide an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'audience_id' documented in the schema. The description adds no additional meaning beyond implying the ID retrieves an audience with its filter definition, which is already covered by the tool's purpose. Baseline 3 is appropriate as the schema handles parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('an audience by its ID'), specifying that the filter definition is included. It is implicitly distinct from siblings like 'list_audiences' (which returns multiple audiences) and 'delete_audience' (which removes one), though it does not name them. The lack of explicit sibling differentiation keeps it short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites (e.g., needing a valid audience ID), exclusions, or comparisons to siblings like 'list_audiences' for browsing or 'get_audience_members' for details. The description implies usage when an ID is known, but offers no explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_audit_event (C)
Get a specific audit event by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| audit_event_id | Yes | The audit event ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves a specific audit event but doesn't mention whether this is a read-only operation, if it requires specific permissions, what happens if the ID is invalid, or any rate limits. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any fluff or redundancy. It's appropriately sized and front-loaded, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It doesn't explain what an audit event contains, the format of the return value, or error conditions. For a tool that likely returns structured data, more context is needed to guide the agent effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'audit_event_id' fully documented in the schema. The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline score of 3 for adequate coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'a specific audit event by its ID', making the purpose unambiguous. However, it doesn't differentiate from its sibling 'list_audit_events', which would require explicit comparison to achieve a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_audit_events' or other audit-related tools. It lacks any context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_brand (B)
Get a brand by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| brand_id | Yes | The brand ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states a read operation ('Get'), implying it's likely safe, but doesn't mention permissions, rate limits, error handling, or what happens if the ID is invalid. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words, making it easy to parse. It's front-loaded with the core action and resource, which is ideal for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, 100% schema coverage, no output schema), the description is adequate but minimal. It covers the basic purpose but lacks details on usage, behavioral traits, or return values, which could be helpful for an agent in a broader context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the parameter 'brand_id' fully documented in the schema. The description adds no additional meaning beyond implying the parameter is required, which the schema already states. This meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a brand') with the specific identifier ('by its ID'), making the purpose unambiguous. However, it doesn't differentiate from sibling tools like 'list_brands' or 'get_audience', which follow similar patterns, so it doesn't reach the highest score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'list_brands' for browsing or other 'get_' tools for different resources. The description is minimal and offers no context on prerequisites or exclusions, leaving usage decisions unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_bulk_job (Grade: C)
Get the status of a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID | |
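Because the description leaves the status vocabulary undocumented, an agent typically has to poll. A minimal polling sketch, assuming hypothetical 'pending'/'processing'/'completed'/'failed' state names and a generic call_tool callable — none of which the server confirms:

```python
import time

def poll_bulk_job(call_tool, job_id: str, interval: float = 0.0, max_attempts: int = 10) -> dict:
    """Poll get_bulk_job until an assumed terminal state or the attempt budget runs out."""
    terminal = {"completed", "failed"}  # assumed terminal state names
    for _ in range(max_attempts):
        result = call_tool("get_bulk_job", {"job_id": job_id})
        if result.get("status") in terminal:
            return result
        time.sleep(interval)
    raise TimeoutError(f"bulk job {job_id} still running after {max_attempts} polls")

# Fake transport for illustration: the job finishes on the third poll.
_responses = iter([{"status": "pending"}, {"status": "processing"}, {"status": "completed"}])
final = poll_bulk_job(lambda name, args: next(_responses), "job-123")
```

A better description would make the terminal set and polling guidance explicit instead of leaving them to guesswork like this.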
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves status but doesn't describe what the status includes (e.g., progress percentage, success/failure, error details), whether it's read-only (implied but not confirmed), or any rate limits or authentication requirements. This leaves significant gaps for an agent to understand how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded with the core action ('Get the status'), making it easy to parse. There is no wasted language, and it fits well within the context of a simple status-checking tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a job status tool with no annotations and no output schema, the description is insufficient. It doesn't explain what the status output includes (e.g., JSON structure, possible states like 'pending', 'completed'), error handling, or dependencies on other tools like 'create_bulk_job'. For a tool that likely returns dynamic data, more context is needed to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with 'job_id' documented as 'The bulk job ID'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or where to obtain the ID. Since the schema already provides adequate parameter documentation, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the status') and resource ('of a bulk job'), making the purpose unambiguous. It distinguishes from siblings like 'create_bulk_job' or 'run_bulk_job' by focusing on status retrieval rather than creation or execution. However, it doesn't specify what 'status' entails (e.g., progress, completion, errors), leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a job ID from 'create_bulk_job' or 'run_bulk_job'), nor does it differentiate from similar tools like 'list_bulk_users' or 'get_audit_event' that might provide related information. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list (Grade: C)
Get a list by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states 'Get a list by its ID', implying a read-only operation, but doesn't disclose behavioral traits such as error handling (e.g., if the ID is invalid), authentication needs, rate limits, or what data is returned. This is a significant gap for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste—'Get a list by its ID.' It's appropriately sized and front-loaded, making it easy to parse. Every word serves a purpose, earning a high score for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It doesn't explain what 'get' returns (e.g., list metadata, contents, or subscribers), error conditions, or dependencies. For a tool with one parameter but no structured output info, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'list_id' documented as 'The list ID'. The description adds no meaning beyond this, as it only repeats the parameter concept without explaining format, source, or constraints. Baseline is 3 since the schema does the heavy lifting, but no extra value is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Get a list by its ID' clearly states the action (get) and resource (list), but it's vague about what 'get' entails—retrieving metadata, contents, or both. It distinguishes from siblings like 'list_lists' (which lists multiple lists) but doesn't clarify differences from 'get_list_subscribers' or 'get_user_list_subscriptions', which are related but distinct operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't specify if this should be used for retrieving list details after 'list_lists' or as a prerequisite for operations like 'send_message_to_list'. The description lacks context on prerequisites or exclusions, leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_list_subscribers (Grade: C)
Get all subscribers of a list.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| list_id | Yes | The list ID | |
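The optional cursor parameter implies pagination even though the description never says so. A sketch of the resulting loop, with assumed response keys (items, paging.cursor) standing in for whatever the server actually returns:

```python
def fetch_all_subscribers(call_tool, list_id: str) -> list:
    """Walk get_list_subscribers pages until no cursor comes back."""
    subscribers, cursor = [], None
    while True:
        args = {"list_id": list_id}
        if cursor:
            args["cursor"] = cursor  # cursor is optional on the first call
        page = call_tool("get_list_subscribers", args)
        subscribers.extend(page.get("items", []))
        cursor = page.get("paging", {}).get("cursor")
        if not cursor:
            return subscribers

# Fake two-page transport for illustration.
_pages = iter([
    {"items": ["alice", "bob"], "paging": {"cursor": "next-1"}},
    {"items": ["carol"], "paging": {}},
])
all_subs = fetch_all_subscribers(lambda name, args: next(_pages), "list-42")
```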
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get all subscribers' but does not specify if this is a read-only operation, whether it supports pagination (though the schema includes a 'cursor' parameter), rate limits, authentication needs, or what happens if the list does not exist. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It is front-loaded and wastes no space, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain return values, error conditions, or behavioral traits like pagination handling. For a tool with two parameters and no structured output information, more context is needed to fully guide the agent, making it inadequate for comprehensive use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with clear descriptions for both parameters ('list_id' and 'cursor'), so the baseline score is 3. The description adds no additional semantic information beyond what the schema provides, such as format examples or usage context for the cursor, but it does not need to compensate for low coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('all subscribers of a list'), making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'list_audience_members' or 'get_user_list_subscriptions', which might have overlapping functionality, so it falls short of a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. For example, it does not specify if this is for retrieving all subscribers at once, how it compares to paginated or filtered queries in sibling tools, or any prerequisites like list existence. This lack of context leaves the agent without clear usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message (Grade: B)
Get the full details and status of a single message by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID to retrieve | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Get' implies a read-only operation, the description doesn't specify whether this requires authentication, rate limits, error conditions, or what 'full details and status' includes. For a tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It's appropriately sized and front-loaded with the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate but incomplete. It explains the basic operation but lacks details about authentication requirements, error handling, or what 'full details and status' entails, which would be helpful for an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'message_id' clearly documented in the schema. The description adds no additional parameter semantics beyond what's already in the schema, so it meets the baseline score of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get the full details and status') and resource ('a single message by its ID'), making the purpose specific and understandable. However, it doesn't explicitly distinguish this tool from sibling tools like 'get_message_content' or 'get_message_history', which appear to be related message retrieval operations with different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_message_content' or 'get_message_history', nor does it mention any prerequisites or exclusions. It simply states what the tool does without contextual usage information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_content (Grade: C)
Get the rendered content (HTML, text, subject) of a previously sent message.
| Name | Required | Description | Default |
|---|---|---|---|
| message_id | Yes | The message ID | |
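The description names three rendered channels (HTML, text, subject) but no response shape. A sketch of how an agent might pull the subject out, assuming a hypothetical results array of per-channel blocks:

```python
# "results" and the per-block keys are assumptions; only message_id is
# specified by the schema.
def rendered_subject(call_tool, message_id: str):
    """Return the first rendered subject line found, or None."""
    content = call_tool("get_message_content", {"message_id": message_id})
    for block in content.get("results", []):
        if "subject" in block:
            return block["subject"]
    return None

# Fake response echoing the three channels the description names.
fake_content = {"results": [{"html": "<p>Hi</p>", "text": "Hi", "subject": "Welcome!"}]}
subject = rendered_subject(lambda name, args: fake_content, "msg-9")
```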
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool retrieves content but does not disclose behavioral traits like whether it requires authentication, rate limits, error handling, or the format of the returned content. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and appropriately sized, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete. It does not explain what the return values look like (e.g., structure of HTML/text/subject), error conditions, or other contextual details needed for effective tool invocation, leaving significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'message_id' documented. The description adds no additional meaning beyond the schema, such as format examples or constraints, so it meets the baseline score for high schema coverage without enhancing parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and the resource 'rendered content (HTML, text, subject) of a previously sent message', making the purpose specific and understandable. However, it does not explicitly differentiate from sibling tools like 'get_message' or 'get_message_history', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'get_message' or 'get_message_history'. It lacks context on prerequisites, exclusions, or specific scenarios for usage, leaving the agent without clear direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_message_history (Grade: A)
Get the event history for a message, showing each step in the delivery pipeline (enqueued, sent, delivered, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter by event type | |
| message_id | Yes | The message ID | |
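The optional type parameter suggests server-side filtering of pipeline events. A sketch that forwards the filter only when set, reusing the pipeline stages the description itself lists (enqueued, sent, delivered) and an assumed results response key:

```python
def delivery_steps(call_tool, message_id: str, event_type=None):
    """Fetch the event history for a message, optionally filtered by event type."""
    args = {"message_id": message_id}
    if event_type is not None:
        args["type"] = event_type  # optional filter from the schema
    history = call_tool("get_message_history", args)
    return [event["type"] for event in history.get("results", [])]

# Fake transport echoing the pipeline stages named in the description.
fake_history = {"results": [{"type": "enqueued"}, {"type": "sent"}, {"type": "delivered"}]}
steps = delivery_steps(lambda name, args: fake_history, "msg-1")
```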
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions retrieving event history but doesn't disclose behavioral traits like whether this requires specific permissions, if it's paginated, rate-limited, or what format the history returns. For a read operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the core purpose and provides clarifying examples. Every word earns its place with no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description adequately covers the purpose but lacks behavioral context and return value details. For a read tool with 2 parameters, it's minimally viable but leaves gaps in understanding the full operation, especially around output format and constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters (message_id and type). The description doesn't add meaning beyond what's in the schema, such as explaining what event types are available or how the filtering works. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('event history for a message'), specifying it shows 'each step in the delivery pipeline' with examples like 'enqueued, sent, delivered, etc.' This distinguishes it from sibling tools like get_message (which likely retrieves message content/metadata) and get_message_content (which retrieves message body).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing delivery pipeline details for a specific message, but doesn't explicitly state when to use this versus alternatives like get_message or get_audit_event. No exclusions or prerequisites are mentioned, leaving some ambiguity about appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_content (Grade: C)
Get the published content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but only states what the tool does without behavioral details. It doesn't disclose if this is a read-only operation, what permissions are needed, error handling, or response format, leaving significant gaps for a tool that likely accesses sensitive data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded with the core action and resource, making it efficient and easy to parse, which is ideal for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral traits, usage context, and expected outputs, which are critical given the tool likely interacts with notification data and has siblings like 'get_notification_draft_content' that could cause confusion.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description doesn't add any parameter-specific information beyond what's in the schema, which has 100% coverage. The schema already documents the single required parameter 'notification_id' as 'The notification template ID', so the baseline score of 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('published content blocks of a notification template'), making the purpose specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_notification_draft_content' or 'get_message_content', which reduces it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context, or exclusions, such as whether it requires specific permissions or differs from similar tools like 'get_notification_draft_content' for draft content.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_notification_draft_content (Grade: A)
Get the draft (unpublished) content blocks of a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| notification_id | Yes | The notification template ID | |
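Together with get_notification_content, this tool enables a draft-versus-published comparison. A sketch with an assumed blocks response key — the actual content-block shape is undocumented, so only the two tool names and the notification_id parameter come from the definitions above:

```python
def unpublished_changes(call_tool, notification_id: str) -> bool:
    """True when the draft content blocks differ from the published ones."""
    published = call_tool("get_notification_content", {"notification_id": notification_id})
    draft = call_tool("get_notification_draft_content", {"notification_id": notification_id})
    return draft.get("blocks") != published.get("blocks")

# Fake responses for illustration: the draft adds a block, so a change is pending.
_fake = {
    "get_notification_content": {"blocks": [{"type": "text", "content": "Hello"}]},
    "get_notification_draft_content": {"blocks": [{"type": "text", "content": "Hello"},
                                                  {"type": "action", "content": "Shop now"}]},
}
pending = unpublished_changes(lambda name, args: _fake[name], "tmpl-1")
```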
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'Get' and 'draft (unpublished) content blocks,' indicating a read-only operation, but lacks details on permissions, rate limits, error handling, or response format. For a tool with no annotations, this leaves significant gaps in understanding its behavior beyond basic purpose.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence that efficiently conveys the tool's purpose without unnecessary words. It is front-loaded with the key action and resource, making it easy to understand at a glance. Every part of the sentence contributes directly to the tool's definition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is adequate for basic understanding but lacks completeness. It does not cover behavioral aspects like permissions or response format, which are important for a read operation. Without annotations or output schema, more context would be beneficial for full usability.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the single parameter 'notification_id' documented as 'The notification template ID.' The description does not add any additional meaning beyond this, such as format examples or constraints. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately handles parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Get') and resource ('draft (unpublished) content blocks of a notification template'), distinguishing it from sibling tools like 'get_notification_content' (which likely retrieves published content) and 'list_notifications' (which lists notifications rather than fetching content). The phrase 'draft (unpublished)' adds precision about the content's state.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by specifying 'draft (unpublished) content blocks,' suggesting it should be used when working with unpublished notification templates. However, it does not explicitly state when to use this tool versus alternatives like 'get_notification_content' or provide exclusions (e.g., not for published content). The guidance is implied but not comprehensive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_tenant (Grade: C)
Get a tenant by its ID.
| Name | Required | Description | Default |
|---|---|---|---|
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states it's a read operation ('Get'), but doesn't disclose permissions, error handling, or response format. This is inadequate for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, clear sentence with no wasted words. It's front-loaded and efficiently conveys the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read tool with no annotations and no output schema, the description is insufficient. It lacks details on return values, error cases, or behavioral context, leaving significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'tenant_id' parameter. The description implies the parameter is used to retrieve a tenant, but adds no semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('a tenant'), specifying it's by ID. It's specific but doesn't differentiate from sibling tools like 'list_tenants' or 'create_or_update_tenant' beyond the ID focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_user_tenants'. The description implies usage when you have a specific tenant ID, but lacks explicit context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_translation (Grade: C)
Get a translation for a specific locale (e.g. "en_US", "fr_FR").
| Name | Required | Description | Default |
|---|---|---|---|
| domain | No | Translation domain (only "default" is supported currently) | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) | |
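As a sketch of how a client might assemble this call, the helper below applies the documented default for `domain` and treats `locale` as required. The helper name and validation are illustrative, not part of the tool itself:

```python
def build_get_translation_args(locale, domain="default"):
    # locale is required per the parameter table, e.g. "en_US" or "fr_FR".
    if not locale:
        raise ValueError("locale is required, e.g. 'en_US' or 'fr_FR'")
    # Per the table, only the "default" translation domain is supported currently.
    if domain != "default":
        raise ValueError("only the 'default' translation domain is supported")
    return {"domain": domain, "locale": locale}

print(build_get_translation_args("fr_FR"))
# → {'domain': 'default', 'locale': 'fr_FR'}
```

Because the tool reports no behavior for unsupported domains, failing fast client-side is a conservative choice.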
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It states what the tool does but doesn't describe important behaviors: whether this is a read-only operation, what format the translation returns, whether it's cached, what happens with invalid locales, or authentication requirements. The description is minimal and lacks behavioral context beyond the basic function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that communicates the core purpose without unnecessary words. It's appropriately sized for a simple retrieval tool and front-loads the essential information. Every word earns its place in this concise formulation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description is insufficiently complete. It doesn't explain what format the translation returns, whether it's a single string or structured data, what happens with missing translations, or any error conditions. The description leaves too many open questions about how the tool behaves in practice.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already fully documents both parameters. The description adds minimal value beyond the schema: it repeats locale examples that match the schema's description, but doesn't explain the relationship between domain and locale or provide additional context about translation domains. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('translation') with specific scope ('for a specific locale'). It distinguishes itself from sibling tools like 'update_translation' by focusing on retrieval rather than modification. However, it doesn't explicitly differentiate from other get_* tools that might retrieve different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, when-not-to-use scenarios, or comparison with sibling tools like 'get_user_preferences' or 'get_message_content' that might also involve localized content. The example locales are helpful but don't constitute usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_list_subscriptions (Grade: C)
Get all list subscriptions for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
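Since the description never explains how the `cursor` parameter drives pagination, here is a hedged sketch of the loop an agent would likely need. The `call_tool` dispatcher and the `results`/`cursor` response fields are assumptions, as the tool publishes no output schema:

```python
def fetch_all_subscriptions(call_tool, user_id):
    """Follow the pagination cursor until the server stops returning one."""
    items, cursor = [], None
    while True:
        args = {"user_id": user_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("get_user_list_subscriptions", args)
        # "results"/"cursor" are assumed response field names.
        items.extend(page.get("results", []))
        cursor = page.get("cursor")
        if not cursor:
            return items

def fake_call(tool_name, args):
    # Stand-in transport returning two pages of subscriptions.
    if args.get("cursor") == "page-2":
        return {"results": ["weekly-digest"], "cursor": None}
    return {"results": ["alerts"], "cursor": "page-2"}

print(fetch_all_subscriptions(fake_call, "user-123"))
# → ['alerts', 'weekly-digest']
```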
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying a read-only operation, but doesn't specify permissions, rate limits, pagination behavior (despite a 'cursor' parameter), or response format. This is inadequate for a tool with parameters and no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and two parameters (one optional for pagination), the description is incomplete. It doesn't address behavioral aspects like pagination, error handling, or return values, leaving significant gaps for the agent to operate effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter descriptions in the schema. The description adds nothing beyond implying that 'user_id' is used to fetch subscriptions; it doesn't explain parameter interactions or usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('all list subscriptions for a user'), making the purpose immediately understandable. However, it doesn't distinguish this tool from sibling tools like 'get_list_subscribers' or 'list_user_tenants', which also retrieve user-related data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user authentication), exclusions, or compare it to similar tools like 'get_list_subscribers' or 'list_user_tenants', leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_preferences (Grade: B)
Get a user's notification preferences (subscriptions, opt-outs, channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | No | Scope preferences to a specific tenant | |
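A minimal sketch of argument assembly, showing that `tenant_id` should simply be omitted when not scoping to a tenant. The helper name is illustrative:

```python
def build_get_user_preferences_args(user_id, tenant_id=None):
    # tenant_id is optional: include the key only when scoping
    # preferences to a specific tenant, otherwise leave it out.
    args = {"user_id": user_id}
    if tenant_id is not None:
        args["tenant_id"] = tenant_id
    return args

print(build_get_user_preferences_args("user-123"))
# → {'user_id': 'user-123'}
print(build_get_user_preferences_args("user-123", tenant_id="acme"))
# → {'user_id': 'user-123', 'tenant_id': 'acme'}
```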
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool retrieves data ('Get'), implying it's a read operation, but doesn't specify if it requires authentication, rate limits, pagination, error handling, or what format the returned preferences take. For a read tool with zero annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. Every word earns its place by specifying the resource and its subcomponents without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read tool with 2 parameters and 100% schema coverage, the description is minimally adequate. However, with no annotations and no output schema, it lacks details on authentication needs, return format, or error cases. It meets basic needs but leaves contextual gaps that could hinder an agent's effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (user_id and tenant_id). The description doesn't add any parameter-specific details beyond what's in the schema (e.g., it doesn't explain how tenant_id affects the output). Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Get') and resource ('user's notification preferences') with specific subcategories (subscriptions, opt-outs, channel preferences). It distinguishes from most siblings (e.g., get_user_profile_by_id, get_user_push_token) by focusing on preferences, but doesn't explicitly differentiate from update_user_preference_topic, which is a related but distinct operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like get_user_profile_by_id (which might include preferences) or update_user_preference_topic. The description implies usage for retrieving notification preferences but offers no context about prerequisites, error conditions, or when other tools might be more appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_profile_by_id (Grade: B)
Get a user profile by their ID. Returns profile data including email, phone, and custom properties.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID to look up | |
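The description promises email, phone, and custom properties, but there is no output schema, so a defensive split like the following may help consumers. The `email`/`phone_number` field names are assumptions to verify against real responses:

```python
def split_profile(profile):
    """Separate the documented contact fields from custom properties."""
    # Assumed field names; the tool publishes no output schema.
    contact_keys = {"email", "phone_number"}
    contact = {k: v for k, v in profile.items() if k in contact_keys}
    custom = {k: v for k, v in profile.items() if k not in contact_keys}
    return contact, custom

contact, custom = split_profile(
    {"email": "ada@example.com", "phone_number": "+15550100", "plan": "pro"}
)
print(contact)  # → {'email': 'ada@example.com', 'phone_number': '+15550100'}
print(custom)   # → {'plan': 'pro'}
```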
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool returns profile data including email, phone, and custom properties, which adds some behavioral context beyond the basic 'get' action. However, it lacks critical details: it doesn't specify authentication requirements, error handling (e.g., for invalid IDs), rate limits, or whether the data is real-time or cached. For a read operation with no annotations, this leaves significant gaps in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core purpose and followed by a brief note on return data. Every word earns its place with zero redundancy or fluff. It efficiently communicates the essential information without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, no output schema, no annotations), the description is minimally adequate. It covers the basic purpose and return data, but lacks context on usage, behavioral traits, or error handling. Without annotations or an output schema, the description should do more to compensate, but it only partially meets the needs for a standalone tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'user_id' fully documented in the schema as 'The user ID to look up'. The description adds no additional meaning beyond this, such as format examples (e.g., UUID) or sourcing guidance. Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('user profile') with a specific lookup method ('by their ID'). It distinguishes from siblings like 'get_user_preferences' or 'get_user_push_token' by focusing on profile data. However, it doesn't explicitly differentiate from potential profile-related siblings like 'replace_profile' or 'delete_profile' beyond the read vs. write distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a valid user ID), exclusions (e.g., not for bulk lookups), or direct alternatives among siblings like 'list_user_tenants' for related data. The description assumes the context is obvious without explicit usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_user_push_token (Grade: C)
Get a specific push/device token for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | The token identifier | |
| user_id | Yes | The user ID | |
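One way to act on the sibling distinction the review calls out below: route to `get_user_push_token` only when a specific token identifier is in hand, and to `list_user_push_tokens` otherwise. The routing helper is illustrative:

```python
def pick_push_token_tool(user_id, token=None):
    # With a specific token identifier, the single-token tool applies;
    # without one, only the list variant can enumerate a user's tokens.
    if token:
        return "get_user_push_token", {"user_id": user_id, "token": token}
    return "list_user_push_tokens", {"user_id": user_id}

tool, args = pick_push_token_tool("user-123", token="tok-9")
print(tool)  # → get_user_push_token
```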
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states it 'gets' a token, implying a read-only operation, but doesn't clarify if this requires specific permissions, what happens if the token doesn't exist (e.g., returns null or error), or any rate limits. The description is minimal and misses key behavioral details for a tool with no annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core action and resource, making it easy to parse quickly. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool that retrieves specific data. It doesn't explain what the output looks like (e.g., token details or error handling), behavioral constraints, or how it differs from siblings. For a read operation with two required parameters, more context is needed to be fully helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('user_id' and 'token') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides, such as explaining the relationship between user_id and token or format examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('a specific push/device token for a user'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'list_user_push_tokens' or 'create_or_replace_user_push_token', which would require mentioning it retrieves a single token by identifier rather than listing all tokens or modifying them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. For example, it doesn't mention that 'list_user_push_tokens' should be used to retrieve all tokens for a user, or that 'create_or_replace_user_push_token' is for creating/updating tokens. The description lacks context about prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_ad_hoc_automation (Grade: B)
Invoke an ad-hoc automation with inline steps (no template needed).
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | | |
| brand | No | | |
| profile | No | | |
| template | No | | |
| recipient | No | | |
| automation | Yes | The automation definition | |
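A hedged sketch of a payload builder for this tool; the inline step shape (`{"action": "delay"/"send", ...}`) mirrors Courier's documented automation steps but is an assumption here, since the tool only says that `automation` holds the definition:

```python
def build_ad_hoc_invocation(automation_steps, recipient=None, data=None):
    # Step dicts like {"action": "delay"} / {"action": "send"} are an
    # assumed shape; only "automation" is documented as required.
    if not automation_steps:
        raise ValueError("an ad-hoc automation needs at least one inline step")
    payload = {"automation": {"steps": automation_steps}}
    if recipient is not None:
        payload["recipient"] = recipient
    if data is not None:
        payload["data"] = data
    return payload

payload = build_ad_hoc_invocation(
    [
        {"action": "delay", "duration": "5 minutes"},
        {"action": "send", "template": "welcome"},
    ],
    recipient="user-123",
)
print(payload["automation"]["steps"][0]["action"])  # → delay
```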
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'invoke' but doesn't disclose behavioral traits like whether this is a read/write operation, permissions required, rate limits, error handling, or what 'invoke' entails (e.g., execution, side effects). The description is minimal and misses critical context for a tool that likely performs actions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that is front-loaded with the core purpose. There is no wasted wording, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, nested objects, no output schema, and no annotations), the description is incomplete. It lacks details on behavior, parameter usage, expected outcomes, and error conditions. For a tool that invokes automations with multiple inputs, this minimal description fails to provide sufficient context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is low (17%), with only 'steps' and 'cancelation_token' having descriptions. The description adds no parameter semantics beyond implying 'automation' is required for inline steps. It doesn't explain other parameters like 'data', 'brand', 'profile', 'template', or 'recipient', leaving most of the 6 parameters undocumented and unclear in purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('invoke') and resource ('ad-hoc automation'), specifying it uses 'inline steps (no template needed)'. This distinguishes it from sibling tools like 'invoke_automation_template', which likely requires a template. However, it doesn't fully differentiate from other automation-related tools beyond the template aspect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by stating 'no template needed', suggesting this tool is for ad-hoc automations without predefined templates. It indirectly contrasts with 'invoke_automation_template', but lacks explicit guidance on when to use this versus other automation or messaging tools, or any prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
invoke_automation_template (Grade: C)
Invoke an automation run from an existing automation template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Data to pass to the automation | |
| brand | No | Brand ID override | |
| profile | No | Profile data for the recipient | |
| template | No | Notification template override | |
| recipient | Yes | Recipient user ID | |
| template_id | Yes | The automation template ID | |
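A small builder illustrating the required/optional split from the table; restricting overrides to the four documented optional parameters catches typos early. The helper itself is illustrative:

```python
def build_template_invocation(template_id, recipient, **overrides):
    # Only the four optional parameters from the table are accepted;
    # anything else is treated as a likely typo.
    allowed = {"data", "brand", "profile", "template"}
    unknown = set(overrides) - allowed
    if unknown:
        raise ValueError(f"unsupported overrides: {sorted(unknown)}")
    return {"template_id": template_id, "recipient": recipient, **overrides}

args = build_template_invocation("tmpl-1", "user-123", brand="b-42")
print(args)
# → {'template_id': 'tmpl-1', 'recipient': 'user-123', 'brand': 'b-42'}
```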
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions 'invoke an automation run' which implies execution/triggering behavior, but provides no information about permissions required, rate limits, whether this is a synchronous or asynchronous operation, what happens on failure, or what the expected output looks like.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for the tool's complexity and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool that executes automations with 6 parameters (including nested objects) and no annotations or output schema, the description is inadequate. It doesn't explain what an 'automation run' entails, what happens after invocation, error handling, or provide any context about the automation system this interacts with.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all parameters are documented in the schema itself. The description doesn't add any additional parameter semantics beyond what's already in the schema descriptions. The baseline of 3 is appropriate when the schema does the heavy lifting for parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('invoke an automation run') and the resource ('from an existing automation template'), providing a specific verb+resource combination. However, it doesn't differentiate itself from its sibling 'invoke_ad_hoc_automation', which appears to be a related alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided about when to use this tool versus alternatives like 'invoke_ad_hoc_automation' or other automation-related tools. The description simply states what the tool does without any context about appropriate usage scenarios or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audience_members (Grade: C)
List all members of an audience.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| audience_id | Yes | The audience ID | |
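Because pagination behavior is undocumented, a generator like this sketches how an agent might drain all pages; `call_tool` and the `items`/`cursor` response fields are assumptions:

```python
def iter_audience_members(call_tool, audience_id):
    # Yields members page by page; "items" and "cursor" are assumed
    # response fields, since the tool publishes no output schema.
    cursor = None
    while True:
        args = {"audience_id": audience_id}
        if cursor:
            args["cursor"] = cursor
        page = call_tool("list_audience_members", args)
        yield from page.get("items", [])
        cursor = page.get("cursor")
        if not cursor:
            break

def fake_call(tool_name, args):
    # Stand-in transport returning two pages of members.
    if args.get("cursor") == "next":
        return {"items": [{"user_id": "u3"}], "cursor": None}
    return {"items": [{"user_id": "u1"}, {"user_id": "u2"}], "cursor": "next"}

members = list(iter_audience_members(fake_call, "aud-7"))
print([m["user_id"] for m in members])  # → ['u1', 'u2', 'u3']
```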
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It states the action ('List all members') but doesn't describe return format, pagination behavior (despite a 'cursor' parameter in the schema), rate limits, authentication needs, or error conditions. For a list operation with no annotation coverage, this leaves significant gaps in understanding how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without unnecessary words. It's front-loaded with the essential information ('List all members of an audience') and contains no redundant or verbose elements. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters (one required). It doesn't explain what 'members' entails (e.g., user objects, IDs), how pagination works with the cursor, or what the return structure looks like. For a list operation in a context with many sibling tools, more contextual detail would help the agent use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('audience_id' and 'cursor') documented in the schema. The description doesn't add any meaningful semantics beyond what the schema provides—it mentions 'audience' but doesn't clarify the ID format or pagination usage. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('members of an audience'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' or 'list_audiences', but the specificity of 'members' provides some distinction. The description avoids tautology by not just restating the tool name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_list_subscribers' or 'list_user_tenants' that might serve similar purposes, nor does it specify prerequisites or contexts for usage. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audiences (Grade: C)
List all audiences in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all audiences' entails (e.g., archived vs. active). For a tool with no annotation coverage, this leaves significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('List all audiences in the workspace'). There's no wasted language or redundancy. It's appropriately sized for a simple list tool, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and a simple but incomplete description, the tool definition is inadequate for reliable agent use. The description doesn't cover pagination behavior (implied by the cursor parameter), return format, or error conditions. For a list operation with pagination, this leaves the agent guessing about how to handle multiple pages of results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description, which applies here.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all audiences in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_audience' (singular) or 'list_audience_members', but the scope is clear. This is a straightforward read operation with unambiguous intent.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With siblings like 'get_audience' (for a specific audience) and 'list_audience_members' (for members within an audience), there's no indication of when this list-all operation is appropriate versus more targeted queries. The agent must infer usage from tool names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_audit_events (Grade: B)
List audit events in the workspace. Useful for tracking API usage and changes.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the tool is 'useful for tracking API usage and changes,' which hints at read-only behavior but does not explicitly state it. Critical details such as pagination behavior (implied by the 'cursor' parameter), rate limits, authentication needs, and response format are missing; even for a mutation-free tool, these omissions leave the description inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences that are front-loaded and to the point. The first sentence states the purpose, and the second adds context without redundancy. However, it could be slightly more structured by explicitly mentioning the pagination aspect, which would enhance clarity without sacrificing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (a single optional parameter, no output schema), the description is minimally adequate. It covers the basic purpose and usage hint but lacks details on behavioral aspects like pagination, response format, or error handling. With no annotations and no output schema, it should do more to compensate, but the simplicity of the tool keeps it from being severely incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'cursor' parameter documented as 'Pagination cursor.' The description does not add any meaning beyond this, as it mentions no parameters. Given the high schema coverage, the baseline score of 3 is appropriate, as the schema handles the parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('audit events in the workspace'), making the purpose specific and understandable. It distinguishes from sibling tools like 'get_audit_event' (singular) by implying a collection operation. However, it doesn't explicitly differentiate from other list tools (e.g., 'list_audiences', 'list_messages'), which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context with 'Useful for tracking API usage and changes,' suggesting when this tool might be applied. However, it lacks explicit guidance on when to use this versus alternatives (e.g., 'get_audit_event' for a single event or other list tools), and does not mention prerequisites or exclusions, leaving gaps in decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_brands (Grade: B)
List all brands in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
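Over the Streamable HTTP transport listed above, invoking this tool is an ordinary MCP `tools/call` JSON-RPC request. A sketch of the payload; the endpoint URL and authentication are deployment-specific and omitted here:

```python
import json

# Minimal MCP tools/call payload for list_brands. On the first page the
# arguments object is empty; later pages pass back the returned cursor.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "list_brands",
        "arguments": {},  # e.g. {"cursor": "<cursor-from-previous-page>"}
    },
}
print(json.dumps(request, indent=2))
```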
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states it's a list operation, implying read-only behavior, but doesn't mention pagination (despite a 'cursor' parameter in the schema), rate limits, authentication requirements, or what 'all brands' entails (e.g., archived vs. active). This leaves significant gaps for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero waste—it directly states the tool's purpose without fluff or redundancy. Every word earns its place, making it highly efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 optional parameter, no output schema, no annotations), the description is minimally adequate but incomplete. It covers the basic purpose but lacks behavioral details (e.g., pagination, scope) and usage guidelines, which are needed for full agent understanding in a context with many sibling tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'cursor' documented as 'Pagination cursor' in the schema. The description adds no additional parameter information beyond implying it lists 'all brands' (which doesn't clarify the cursor's role). Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate with extra context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('all brands in the workspace'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'get_brand' (singular vs. plural) or 'list_audiences' (different resource type), but the scope is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'get_brand' (for a single brand) or other list tools (e.g., 'list_audiences'). The description implies it's for retrieving multiple brands, but lacks explicit context about prerequisites, filtering, or comparisons to siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_bulk_users (Grade: C)
List the users in a bulk job.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| job_id | Yes | The bulk job ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states a read operation ('list'), implying no destructive effects, but doesn't disclose behavioral traits like pagination (hinted at by the 'cursor' parameter in the schema), rate limits, authentication needs, or return format. This leaves significant gaps for a tool with parameters and no output schema, making it minimally informative.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste, front-loading the core purpose. It's appropriately sized for a simple list tool, though it could benefit from slightly more detail without losing conciseness. Every word earns its place, but it's borderline under-specified rather than optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain return values, pagination behavior, or error conditions, leaving the agent to infer from the schema alone. For a tool with no structured output and minimal behavioral disclosure, this is inadequate—it should provide more context to guide effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with parameters 'job_id' and 'cursor' fully documented in the schema. The description adds no meaning beyond the schema—it doesn't explain parameter interactions, format details, or usage examples. Baseline is 3 since the schema does the heavy lifting, but the description doesn't compensate or enhance understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'List the users in a bulk job' clearly states the action (list) and resource (users in a bulk job), but it's vague about scope—it doesn't specify if it lists all users, paginated results, or filtered subsets. It distinguishes from siblings like 'get_bulk_job' (which likely returns job metadata) but not explicitly from 'list_audience_members' or other list tools, lacking precise differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a bulk job ID), exclusions, or comparisons to siblings like 'list_user_tenants' or 'get_user_profile_by_id'. The description implies usage for bulk job contexts but offers no explicit when/when-not rules or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_lists (Grade: C)
Get all lists. Optionally filter by pattern (e.g. 'example.list.*').
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
| pattern | No | Filter pattern (e.g. 'example.list.*') | |
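The example 'example.list.*' suggests glob-style matching on list IDs. A client-side emulation using Python's `fnmatch`; treating the server's matching semantics as glob is an assumption the description does not pin down:

```python
from fnmatch import fnmatch

def filter_lists(list_ids, pattern):
    # Emulates the presumed server-side 'pattern' filter as glob matching.
    return [lid for lid in list_ids if fnmatch(lid, pattern)]

ids = ["example.list.a", "example.list.b", "other.list.a"]
print(filter_lists(ids, "example.list.*"))  # ['example.list.a', 'example.list.b']
```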
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'Get all lists' and optional filtering, but lacks critical behavioral details such as pagination behavior (implied by the 'cursor' parameter in the schema), rate limits, authentication requirements, or whether it's read-only. This leaves significant gaps for an agent to understand how to use it effectively.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—two short sentences that directly state the tool's function and optional feature. It is front-loaded with the core purpose and wastes no words, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a tool with two parameters. It fails to address key contextual elements like pagination behavior (implied by 'cursor'), response format, error handling, or usage constraints, leaving the agent with insufficient information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('cursor' for pagination, 'pattern' for filtering). The description adds minimal value by mentioning the 'pattern' parameter with an example, but doesn't provide additional context beyond what's in the schema, such as pattern syntax details or cursor usage scenarios.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and resource 'all lists', making the purpose specific and understandable. It distinguishes from sibling 'get_list' by indicating it retrieves multiple lists rather than a single one, though it doesn't explicitly contrast with other list-related tools like 'list_audiences' or 'list_notifications'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions optional filtering but doesn't specify scenarios where filtering is appropriate or compare it to other list-related tools like 'get_list' for single retrieval or 'list_audiences' for different resource types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_messages (Grade: C)
List messages you've previously sent. Filter by status, recipient, notification, provider, tags, or tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| tag | No | Filter by metadata tags | |
| list | No | Filter by list ID | |
| tags | No | Comma-delimited list of tags | |
| event | No | Filter by event ID | |
| cursor | No | Pagination cursor for fetching the next page | |
| status | No | Filter by status (e.g. DELIVERED, UNDELIVERABLE) | |
| traceId | No | Filter by trace ID | |
| archived | No | Include archived messages | |
| provider | No | Filter by provider key (e.g. sendgrid, twilio) | |
| messageId | No | Filter by message ID | |
| recipient | No | Filter by recipient user ID | |
| tenant_id | No | Filter by tenant ID | |
| notification | No | Filter by notification ID | |
| enqueued_after | No | ISO 8601 timestamp; only return messages enqueued after this time | |
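With 14 optional filters, an agent typically assembles only the filters it needs and drops the rest. A sketch of building the arguments, assuming the parameter names above and an ISO 8601 string for `enqueued_after`:

```python
from datetime import datetime, timezone

def build_list_messages_args(status=None, provider=None, recipient=None,
                             enqueued_after=None, cursor=None):
    """Assemble list_messages arguments, dropping unset filters.

    Field names mirror the parameter table; how combined filters
    interact server-side is an assumption (presumably logical AND).
    """
    args = {
        "status": status,
        "provider": provider,
        "recipient": recipient,
        "enqueued_after": enqueued_after.isoformat() if enqueued_after else None,
        "cursor": cursor,
    }
    return {k: v for k, v in args.items() if v is not None}

args = build_list_messages_args(
    status="DELIVERED",
    provider="sendgrid",
    enqueued_after=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
print(args)
```

The description does not confirm how multiple filters combine, so treat the AND semantics above as a working assumption.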
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions filtering but doesn't describe critical behaviors: whether this is a read-only operation, if it supports pagination (though the 'cursor' parameter hints at it), rate limits, authentication requirements, or what the return format looks like. For a list operation with 14 parameters, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose. It could be slightly improved by structuring filtering examples more clearly, but there's no wasted verbiage and it gets straight to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a list operation with 14 parameters and no annotations or output schema, the description is incomplete. It doesn't explain the response format, pagination behavior, error conditions, or how multiple filters interact. Given the complexity and lack of structured metadata, more contextual information is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all 14 parameters. The description adds minimal value by listing some filterable fields (status, recipient, etc.) but doesn't provide additional context beyond what's in the schema. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List messages you've previously sent') and resource ('messages'), making the purpose immediately understandable. However, it doesn't differentiate this tool from sibling tools like 'get_message' or 'get_message_history', which also retrieve message-related data, so it doesn't achieve full sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions filtering capabilities but provides no guidance on when to use this tool versus alternatives. There's no mention of prerequisites, limitations, or comparison with sibling tools like 'get_message' (for single messages) or 'list_notifications' (for notifications). This leaves the agent without context for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_notifications (Grade: C)
List notification templates. Optionally filter by cursor.
| Name | Required | Description | Default |
|---|---|---|---|
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions pagination via cursor but doesn't disclose other behavioral traits such as rate limits, authentication needs, return format, or whether it's read-only. For a list tool with zero annotation coverage, this leaves significant gaps in understanding its operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It front-loads the core purpose and includes the optional parameter detail concisely, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete. It lacks details on return values, error handling, or other contextual aspects needed for a list operation. While it covers the basic action, it doesn't provide enough information for reliable tool invocation in a complex environment.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the 'cursor' parameter as a pagination cursor. The description adds minimal value by noting it's optional for filtering, but doesn't provide additional semantics beyond what the schema states, aligning with the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('notification templates'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'list_messages' or 'list_audiences', which also list resources, so it misses full sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions optional filtering by cursor, implying usage for pagination, but provides no guidance on when to use this tool versus alternatives like 'get_notification_content' or other list tools. There are no explicit when/when-not instructions or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tenants (Grade: B)
List all tenants in the workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states 'List all tenants' but doesn't disclose behavioral traits such as pagination behavior (implied by the 'limit' and 'cursor' parameters), authentication requirements, rate limits, or what 'all' means in practice (e.g., whether archived tenants are included). This leaves significant gaps for a tool with pagination parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's appropriately sized and front-loaded, clearly stating the tool's purpose without unnecessary elaboration, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (list operation with pagination), no annotations, and no output schema, the description is minimally adequate. It states what the tool does but lacks details on behavioral context, output format, or error handling, leaving the agent with incomplete information for reliable invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('limit' and 'cursor') well-documented in the schema. The description adds no additional parameter semantics beyond what the schema provides, so it meets the baseline of 3 where the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('all tenants in the workspace'), providing a specific verb+resource combination. However, it doesn't differentiate from sibling tools like 'get_tenant' (singular retrieval) or 'list_user_tenants' (user-specific listing), which would require explicit sibling differentiation for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, exclusions, or compare to siblings like 'get_tenant' for single tenant retrieval or 'list_user_tenants' for user-specific listings, leaving the agent without contextual usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_push_tokens (Grade: C)
List all push/device tokens for a user.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool lists tokens but fails to mention whether this is a read-only operation, if it requires specific permissions, what the output format looks like (e.g., pagination, token details), or any rate limits. This leaves significant gaps in understanding the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and output schema, the description is incomplete for a tool that likely returns sensitive data (push tokens). It doesn't cover behavioral aspects like security implications, response structure, or error handling, leaving the agent with insufficient context to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'user_id' parameter clearly documented. The description adds no additional meaning beyond what the schema provides, such as clarifying the token scope or user context. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List') and resource ('push/device tokens for a user'), making the purpose immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'get_user_push_token' (singular vs. plural), which would require a more specific distinction to achieve a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_user_push_token' for retrieving a single token or other user-related tools. It lacks context about prerequisites, timing, or exclusions, leaving the agent without usage direction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_user_tenants (Grade: C)
List all tenants a user belongs to.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results per page | |
| cursor | No | Pagination cursor | |
| user_id | Yes | The user ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states it's a list operation but doesn't mention pagination behavior (though the schema hints at it via limit/cursor), authentication requirements, rate limits, error conditions, or what the output looks like. For a tool with 3 parameters and no annotation coverage, this is inadequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the core purpose without any fluff. It's appropriately sized for a simple list operation and front-loads the essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters, no annotations, and no output schema, the description is insufficient. It doesn't explain the return format, pagination strategy, error handling, or relationship to sibling tools. The agent lacks critical context to use this tool effectively beyond basic parameter passing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description adds no additional parameter semantics beyond implying the user_id is for filtering tenants. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('List') and resource ('tenants a user belongs to'), making the purpose immediately understandable. It doesn't specifically differentiate from sibling tools like 'list_tenants' (which appears to list all tenants rather than user-specific ones), but the user-specific focus is implied through the parameter requirement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'list_tenants' or 'get_tenant'. It doesn't mention prerequisites, context for user_id selection, or any exclusions. The agent must infer usage from the parameter requirement alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
remove_user_from_tenant (Grade: C)
Remove a user from a tenant.
| Name | Required | Description | Default |
|---|---|---|---|
| user_id | Yes | The user ID | |
| tenant_id | Yes | The tenant ID | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. While 'Remove' implies a destructive mutation, the description doesn't specify whether this requires admin permissions, whether the action is reversible, what happens to user data, or if there are rate limits. This leaves significant behavioral gaps for a mutation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's appropriately sized for a simple tool with two parameters and gets straight to the point without unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is insufficiently complete. It doesn't address important contextual aspects like permissions required, consequences of removal, error conditions, or what the tool returns. The description should provide more behavioral context given the tool's complexity and lack of structured metadata.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters clearly documented in the schema. The description doesn't add any additional semantic context about the parameters beyond what's already in the schema (e.g., format requirements, relationship between user_id and tenant_id). The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Remove') and target ('a user from a tenant'), providing a specific verb+resource combination. However, it doesn't distinguish this tool from sibling tools like 'delete_user' or 'delete_tenant', which might handle similar user/tenant removal operations differently.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no explicit instructions about prerequisites (e.g., user must exist in tenant), when-not-to-use scenarios, or references to sibling tools like 'delete_user' or 'delete_tenant' that might handle related operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
replace_profile (Grade: A)
Fully replace a user profile (PUT). All existing data is overwritten; include every field you want to keep.
| Name | Required | Description | Default |
|---|---|---|---|
| profile | Yes | Complete profile data to replace with | |
| user_id | Yes | The user ID | |
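Because this is a full PUT (everything not included is lost), a safe partial update reads the current profile first, merges the change, and sends the complete result back. The `get_profile` and `replace_profile` callables here are hypothetical wrappers around the corresponding tools; only the overwrite semantics come from the description above.

```python
# Read-modify-write guard for a full-replace (PUT) profile endpoint:
# fetch the current profile, merge the delta, write back the whole thing.
def safe_profile_update(get_profile, replace_profile, user_id, changes):
    current = get_profile(user_id) or {}
    merged = {**current, **changes}  # keep every existing field
    replace_profile(user_id, merged)
    return merged

# In-memory stand-in for the profile store, for demonstration only.
store = {"user-1": {"email": "a@example.com", "locale": "en"}}
merged = safe_profile_update(
    lambda uid: store.get(uid),
    lambda uid, profile: store.__setitem__(uid, profile),
    "user-1",
    {"locale": "fr"},
)
print(merged)
```

Calling `replace_profile` directly with only `{"locale": "fr"}` would, per the description, drop the email field entirely.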
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: it's a destructive PUT operation that overwrites all existing data, and it requires including every field to retain. It lacks details on permissions, rate limits, or error handling, but covers the core mutation behavior adequately.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise and front-loaded, consisting of two clear sentences with zero waste. The first sentence states the core action, and the second provides critical behavioral guidance, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a destructive mutation tool with no annotations and no output schema, the description is moderately complete. It covers the overwrite behavior and parameter expectations but lacks details on permissions, error responses, and the returned payload. Given the complexity, it should do more to compensate for missing structured data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('user_id' and 'profile'). The description adds minimal value beyond the schema by implying the 'profile' parameter must be complete, but does not provide additional syntax or format details. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('replace') and resource ('user profile'), and distinguishes it from siblings like 'create_or_merge_user' or 'delete_profile' by emphasizing full overwrite rather than partial updates or deletions. It explicitly mentions 'all existing data is overwritten', which differentiates it from update operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool: when fully replacing a user profile and including all fields to keep. However, it does not explicitly state when not to use it (e.g., vs. partial updates or creation) or name alternatives like 'create_or_merge_user', leaving some ambiguity in sibling differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
run_bulk_job (Grade: C)
Run a bulk job, triggering delivery to all added users.
| Name | Required | Description | Default |
|---|---|---|---|
| job_id | Yes | The bulk job ID to run | |
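The prerequisite chain the review notes (a bulk job must be created and populated before it can be run) can be sketched as an ordered sequence of tool calls. Sibling tool names match those mentioned in the review; the argument and response shapes are assumptions.

```python
# Hypothetical create -> add users -> run workflow for bulk delivery.
# Only the tool names come from this page; payload shapes are assumed.
def run_bulk_delivery(call_tool, message, user_ids):
    job = call_tool("create_bulk_job", {"message": message})
    job_id = job["job_id"]
    call_tool("add_bulk_users", {"job_id": job_id, "users": user_ids})
    # Running the job is the destructive step: it triggers delivery
    # to every user added above.
    return call_tool("run_bulk_job", {"job_id": job_id})

# Stub transport that records the call order.
calls = []
def fake_call_tool(name, args):
    calls.append(name)
    return {"job_id": "job-1", "status": "running"}

run_bulk_delivery(fake_call_tool, {"title": "Hi"}, ["u1", "u2"])
print(calls)
```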
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions 'triggering delivery' which implies a write/mutation operation, but doesn't disclose critical behavioral traits such as permissions required, whether the job runs asynchronously, rate limits, error handling, or what 'delivery' entails. This is a significant gap for a tool that likely modifies system state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core action and effect, making it easy to parse. Every word earns its place without redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (likely a mutation operation with delivery implications), lack of annotations, and no output schema, the description is incomplete. It doesn't cover behavioral aspects, return values, error conditions, or integration context with siblings like 'add_bulk_users'. For a tool that triggers deliveries, more detail is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the single parameter 'job_id' documented in the schema. The description adds no additional meaning about the parameter beyond what the schema provides (e.g., format examples, source of job_id, or validation rules). Baseline score of 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Run') and resource ('bulk job'), and specifies the action's effect ('triggering delivery to all added users'). It distinguishes from sibling tools like 'create_bulk_job' or 'get_bulk_job' by focusing on execution rather than creation or retrieval. However, it doesn't explicitly differentiate from other execution-related tools like 'invoke_ad_hoc_automation'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing a created bulk job with added users), exclusions (e.g., not for testing), or comparisons to siblings like 'send_message_to_list' for similar delivery purposes. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message (Grade: B)
Send a message to a user using inline title and body content (no template). Optionally specify routing channels.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include with the message | |
| title | Yes | Message title | |
| method | No | Routing method: deliver to all channels or stop after first success | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through (e.g. email, sms, push). Omit to use default routing. | |
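An argument payload built from the table above might look like the following. Only the parameter names and the `method` default come from the table; the wire format is otherwise an assumption.

```python
# Minimal payload builder for send_message, mirroring the parameter
# table: user_id, title, and body are required; channels, method, and
# data are optional (method defaults to "all" per the table).
def build_send_message_args(user_id, title, body,
                            channels=None, method="all", data=None):
    args = {"user_id": user_id, "title": title, "body": body,
            "method": method}
    if channels:  # omit to fall back to default routing
        args["channels"] = channels
    if data:
        args["data"] = data
    return args

args = build_send_message_args(
    "user-123", "Build finished", "Your deploy succeeded.",
    channels=["email", "push"],
)
print(sorted(args))
```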
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the tool sends messages and optionally specifies routing channels, but lacks critical details: it doesn't explicitly flag the call as a mutation (sending a message clearly is one), what permissions are needed, rate limits, error conditions, or what happens on success/failure. The description is insufficient for a mutation tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and includes optional functionality. Every word earns its place with zero redundancy or fluff, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool (sending messages) with no annotations and no output schema, the description is incomplete. It doesn't explain what the tool returns, error handling, side effects, or security requirements. For a 6-parameter tool that likely modifies system state, more contextual information is needed to use it safely and effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description adds minimal value beyond the schema: it mentions 'inline title and body content' (implied by required parameters) and 'routing channels' (maps to the 'channels' parameter). No additional syntax, format, or behavioral context is provided for parameters beyond what the schema offers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and resource ('to a user'), specifying it uses 'inline title and body content (no template)'. It distinguishes from sibling tools like 'send_message_template' by mentioning 'no template', but doesn't explicitly differentiate from other messaging tools like 'send_message_to_list'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for sending direct messages with inline content rather than templates, and mentions optional routing channels. However, it doesn't provide explicit guidance on when to use this versus alternatives like 'send_message_template' or 'send_message_to_list', nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_template (Grade: C)
Send a message to a user using a pre-configured notification template. Optionally pass data and routing.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| user_id | Yes | The recipient user ID | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool sends a message but doesn't cover critical aspects like permission requirements, rate limits, error handling, or what happens on success/failure. This is inadequate for a tool that delivers notifications to users.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and mentions optional features without unnecessary elaboration. Every word serves a purpose, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of sending messages with templates and routing, no annotations, and no output schema, the description is insufficient. It lacks details on behavioral traits, error cases, response format, and differentiation from sibling tools, leaving significant gaps for an AI agent to understand how to use this tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so parameters are well-documented in the schema. The description adds minimal value by mentioning 'optionally pass data and routing', which aligns with the 'data' and 'method/channels' parameters but doesn't provide additional context beyond what the schema already explains.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('send') and resource ('message'), specifying it uses a 'pre-configured notification template' and mentions optional data and routing. However, it doesn't explicitly differentiate from sibling tools like 'send_message' or 'send_message_to_list_template', which appear related but have different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'send_message' or 'send_message_to_list_template'. It mentions optional features but doesn't specify scenarios or prerequisites for using this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list (Grade: C)
Send a message to all subscribers of a list using inline title and body content.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Message body | |
| data | No | Key-value data to include | |
| title | Yes | Message title | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use default routing. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It states the tool sends messages, implying a write/mutation operation, but doesn't disclose behavioral traits like rate limits, permissions required, whether it's asynchronous, or what happens on failure. The description adds minimal context beyond the basic action.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with the core action, zero waste. Every word contributes to understanding the tool's purpose efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a mutation tool with 6 parameters, no annotations, and no output schema, the description is incomplete. It lacks behavioral context (e.g., side effects, error handling), doesn't explain optional parameters like 'method' or 'channels', and provides no guidance on usage relative to siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters. The description mentions 'inline title and body content', which aligns with the 'title' and 'body' parameters but doesn't add meaning beyond what the schema provides. No extra syntax, format details, or usage examples are given.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list'), specifying the content source ('using inline title and body content'). It distinguishes from sibling tools like 'send_message' (generic) and 'send_message_to_list_template' (template-based), but doesn't explicitly mention these alternatives in the description text itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like 'send_message_to_list_template' or 'send_message'. The description implies it's for sending to list subscribers with inline content, but lacks context about prerequisites, timing, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
send_message_to_list_template (Grade: C)
Send a message to all subscribers of a list using a notification template.
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Key-value data for template variables | |
| method | No | Routing method | all |
| list_id | Yes | The list ID to send to | |
| channels | No | Channel names to route through. Omit to use template routing config. | |
| template | Yes | Template ID or notification slug | |
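The four `send_*` variants form a 2x2 grid (recipient: single user vs. list; content: inline vs. template). A lookup table makes the selection the review repeatedly flags as confusing explicit. The tool names come from this page; the dispatch logic is illustrative, not part of the server.

```python
# Explicit selection among the four send_* tools by recipient kind and
# content kind, resolving the overlap the review describes.
SEND_TOOLS = {
    ("user", "inline"): "send_message",
    ("user", "template"): "send_message_template",
    ("list", "inline"): "send_message_to_list",
    ("list", "template"): "send_message_to_list_template",
}

def pick_send_tool(recipient_kind, content_kind):
    return SEND_TOOLS[(recipient_kind, content_kind)]

print(pick_send_tool("list", "template"))
```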
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions sending to 'all subscribers' but omits critical details like potential rate limits, authentication requirements, and what happens on failure. For a tool where a single call can message an entire list, this is a notable gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a tool that sends messages to lists using templates, with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error handling, and output expectations, leaving significant gaps for an AI agent to understand the tool fully.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not add meaning beyond what the input schema provides, as schema description coverage is 100%. It mentions 'list' and 'template' but doesn't elaborate on their semantics or usage. With high schema coverage, the baseline score of 3 is appropriate, as the schema adequately documents parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Send a message') and target ('to all subscribers of a list using a notification template'), making the purpose evident. However, it doesn't explicitly differentiate from sibling tools like 'send_message_to_list' or 'send_message_template', which appear related but have nuanced differences in functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'send_message_to_list' or 'send_message_template', nor does it mention prerequisites like needing an existing list or template. It lacks context for decision-making among similar tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_list (Grade: B)
Subscribe a user to a list. Creates the list if it doesn't exist.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to subscribe | |
| preferences | No | Optional notification preferences | |
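An argument sketch built from the table above follows. The shape of `preferences` is not documented on this page, so the nested structure shown is purely illustrative.

```python
# Payload builder for subscribe_user_to_list. Note the side effect the
# description discloses: the list is created if it doesn't already exist.
def build_subscribe_args(user_id, list_id, preferences=None):
    args = {"user_id": user_id, "list_id": list_id}
    if preferences is not None:
        args["preferences"] = preferences  # shape assumed, undocumented here
    return args

args = build_subscribe_args("user-7", "weekly-digest",
                            preferences={"notifications": {}})
print(sorted(args))
```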
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions the side effect of creating a list if missing, which is useful, but fails to disclose other behavioral traits like required permissions, whether the operation is idempotent, error handling, or rate limits. For a mutation tool with zero annotation coverage, this is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste, front-loaded with the primary action and followed by a key behavioral note. Every sentence earns its place by adding value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a mutation tool. It covers the basic purpose and a side effect but lacks details on permissions, error cases, or return values. However, it is minimally adequate for the core functionality, aligning with a score of 3.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters (list_id, user_id, preferences). The description does not add meaning beyond what the schema provides, such as explaining the purpose of preferences or format details. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Subscribe a user to a list') and resource ('list'), with a specific additional behavior ('Creates the list if it doesn't exist'). However, it does not explicitly differentiate from sibling tools like 'subscribe_user_to_lists' or 'create_list', which would be needed for a score of 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'subscribe_user_to_lists' (for multiple lists) or 'create_list' (if list creation is the primary goal). It lacks explicit when/when-not instructions or prerequisites, leaving usage context implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribe_user_to_lists (Grade: C)
Subscribe a user to one or more lists. Creates lists that do not exist.
| Name | Required | Description | Default |
|---|---|---|---|
| lists | Yes | Array of lists to subscribe to | |
| user_id | Yes | The user ID | |
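Choosing between the single and batch subscribe tools, as the review suggests an agent must do, can be made explicit with a small dispatch. Tool names come from this page; the argument shapes (in particular the elements of `lists`) are assumptions.

```python
# Illustrative dispatch: one list -> subscribe_user_to_list; several
# lists -> subscribe_user_to_lists with a `lists` array.
def subscribe(call_tool, user_id, list_ids):
    if len(list_ids) == 1:
        return call_tool("subscribe_user_to_list",
                         {"user_id": user_id, "list_id": list_ids[0]})
    return call_tool("subscribe_user_to_lists",
                     {"user_id": user_id, "lists": list_ids})

# Stub transport that records which tool was chosen.
seen = []
subscribe(lambda name, args: seen.append(name), "u1", ["daily"])
subscribe(lambda name, args: seen.append(name), "u1", ["daily", "weekly"])
print(seen)
```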
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions that the tool 'Creates lists that do not exist,' which is a useful behavioral trait beyond basic subscription. However, it lacks critical details such as required permissions, whether the operation is idempotent, error handling for invalid inputs, or what happens to existing subscriptions. For a mutation tool with zero annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two sentences that directly state the tool's actions. It is front-loaded with the primary purpose and adds a secondary behavior without any wasted words, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that this is a mutation tool with no annotations and no output schema, the description is incomplete. It lacks information on return values, error conditions, side effects (e.g., impact on user notifications), and how it interacts with sibling tools. The mention of list creation adds some value, but overall, it doesn't provide enough context for safe and effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with clear documentation for 'user_id' and 'lists' parameters. The description adds no additional semantic context about parameters beyond what the schema provides, such as format examples or constraints. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Subscribe') and resource ('user to one or more lists'), and it adds a secondary action ('Creates lists that do not exist'). However, it doesn't explicitly differentiate from the sibling tool 'subscribe_user_to_list' (singular vs. plural), which could cause confusion about when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'subscribe_user_to_list' (singular) and 'add_user_to_tenant', there's no indication of when bulk subscription or list creation is preferred, nor any prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
track_inbound_event
Track an inbound event that can trigger automations. Requires event name, messageId (for deduplication), and properties.
| Name | Required | Description | Default |
|---|---|---|---|
| event | Yes | The event name (appears as trigger in Automation Trigger node) | |
| userId | No | User ID associated with the event | |
| messageId | Yes | Unique ID for deduplication (returns 409 if not unique) | |
| properties | Yes | Event properties payload | |
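A sketch of the arguments this table implies, assuming a generic MCP client; the event and property names are illustrative. Generating a fresh UUID for messageId is one way to avoid the 409 conflict the schema warns about on duplicates.

```python
import uuid

# Hypothetical arguments for track_inbound_event. Field values are examples only.
arguments = {
    "event": "order-shipped",        # appears as the trigger in an Automation Trigger node
    "userId": "user-123",            # optional
    "messageId": str(uuid.uuid4()),  # must be unique; duplicates return 409
    "properties": {"order_id": "ORD-42", "carrier": "UPS"},
}
```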
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions deduplication via messageId but doesn't explain the 409 conflict response mentioned in the schema. It doesn't disclose authentication requirements, rate limits, side effects, or what happens after tracking (e.g., how automations are triggered). For a tool that presumably creates/mutates event data, this is insufficient behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the purpose and lists required parameters. It's appropriately sized and front-loaded with the core functionality. No wasted words, though it could be slightly more structured with separate purpose and parameter sections.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 4 parameters, no annotations, no output schema, and nested objects in properties, the description is incomplete. It doesn't explain what happens after tracking, how automations are triggered, error conditions beyond deduplication, or the structure of the properties object. Given the complexity and lack of structured metadata, more context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description mentions the three required parameters (event, messageId, properties) but adds no additional semantic context beyond what's in the schema. The baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('track') and resource ('inbound event'), and specifies its purpose ('can trigger automations'). It doesn't explicitly differentiate from sibling tools, but since no other tools mention event tracking, this is adequate. The description goes beyond tautology by explaining the automation triggering capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions required parameters but doesn't indicate scenarios where this tool is appropriate versus other event-related or automation tools. With many sibling tools available, this lack of contextual guidance is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
unsubscribe_user_from_list
Unsubscribe a user from a list.
| Name | Required | Description | Default |
|---|---|---|---|
| list_id | Yes | The list ID | |
| user_id | Yes | The user ID to unsubscribe | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Unsubscribe' implies a mutation (likely removing a user from a list), the description doesn't clarify whether this requires specific permissions, if it's reversible, what happens on success/failure, or any rate limits. For a mutation tool with zero annotation coverage, this is a significant gap in behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, direct sentence with zero wasted words. It's front-loaded with the core action and resource, making it highly efficient. Every word earns its place, achieving optimal conciseness without being under-specified.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's mutation nature (implied by 'Unsubscribe'), lack of annotations, and absence of an output schema, the description is incomplete. It doesn't address behavioral aspects like permissions, side effects, or response format. For a tool that modifies data, more context is needed to ensure safe and correct usage by an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with both parameters ('list_id' and 'user_id') clearly documented in the schema. The description adds no additional meaning beyond what the schema provides (e.g., it doesn't explain format constraints or examples). Given the high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Unsubscribe') and the target ('a user from a list'), making the purpose immediately understandable. It uses a specific verb and identifies the resource involved. However, it doesn't explicitly differentiate from sibling tools like 'delete_user_list_subscriptions' or 'subscribe_user_to_list', which would require more specificity to earn a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, when-not-to-use scenarios, or direct comparisons to related sibling tools like 'subscribe_user_to_list' or 'delete_user_list_subscriptions'. This leaves the agent without contextual usage instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_audience
Create or update an audience with a filter definition.
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | Display name | |
| filter | No | Filter definition object (operator, rules) | |
| audience_id | Yes | The audience ID | |
| description | No | Description | |
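A sketch of update_audience arguments based on the table above. The filter shape (operator plus rules) follows the table's hint, but the field names inside each rule are assumptions.

```python
# Hypothetical update_audience arguments. Rule field names are illustrative.
arguments = {
    "audience_id": "high-value-customers",
    "name": "High Value Customers",
    "description": "Users with lifetime spend over $500",
    "filter": {
        "operator": "AND",
        "rules": [
            {"path": "lifetime_spend", "operator": "GT", "value": "500"},
        ],
    },
}
```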
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions 'Create or update' which implies mutation, but doesn't disclose behavioral traits like required permissions, whether it's idempotent, what happens on conflicts, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Create or update an audience') and adds essential context ('with a filter definition'). There is zero waste, and every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a mutation tool with no annotations, no output schema, and 4 parameters, the description is incomplete. It lacks details on behavior (e.g., what 'Create or update' entails operationally), error handling, or response expectations. For a tool that modifies data, more context is needed to ensure safe and correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all 4 parameters with descriptions. The description adds no additional meaning beyond implying 'filter definition' is a key component, but doesn't explain syntax or constraints. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('an audience'), specifying it involves a 'filter definition'. It is distinguished from sibling tools like 'delete_audience' and 'get_audience' by indicating mutation. However, it doesn't explicitly differentiate from 'create_list' or 'create_or_update_tenant', which are similar mutation operations on different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., when to create vs. update), exclusions, or compare to siblings like 'create_list' for list management. Usage is implied by the name and purpose but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_translation
Create or update a translation for a specific locale.
| Name | Required | Description | Default |
|---|---|---|---|
| body | Yes | Translation content (PO file format) | |
| domain | No | Translation domain | default |
| locale | Yes | Locale code (e.g. en_US, fr_FR) | |
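A sketch of update_translation arguments based on the table above. The body uses gettext PO format as the table notes; the msgid/msgstr pair below is illustrative only.

```python
# Hypothetical update_translation arguments. PO content is an example.
po_body = (
    'msgid "welcome.title"\n'
    'msgstr "Bienvenue"\n'
)

arguments = {
    "locale": "fr_FR",    # locale code, e.g. en_US, fr_FR
    "domain": "default",  # optional; defaults to "default"
    "body": po_body,      # translation content in PO file format
}
```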
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It implies a mutation operation ('Create or update') but does not specify permissions, side effects, error handling, or response format. This leaves critical behavioral traits undocumented for a tool that modifies data.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action without unnecessary words. It earns its place by succinctly conveying the tool's purpose, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a mutation tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral traits, error conditions, and return values, leaving gaps that could hinder an AI agent's ability to use the tool effectively in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the input schema already documents parameters like 'locale' and 'body'. The description adds no additional meaning beyond what the schema provides, such as explaining the interaction between parameters or usage nuances, resulting in a baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Create or update') and resource ('translation for a specific locale'), making the purpose unambiguous. However, it does not differentiate from sibling tools like 'get_translation', which might retrieve translations, leaving room for improvement in sibling distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, such as 'get_translation' for retrieval or other tools for related operations. The description lacks context on prerequisites, exclusions, or specific scenarios for application.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_user_preference_topic
Update a user's preference for a specific subscription topic (opt in, opt out, or set channel preferences).
| Name | Required | Description | Default |
|---|---|---|---|
| status | Yes | Preference status | |
| user_id | Yes | The user ID | |
| topic_id | Yes | The subscription topic ID | |
| custom_routing | No | Custom channel routing order | |
| has_custom_routing | No | Whether custom channel routing is set | |
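A sketch of update_user_preference_topic arguments based on the table above. The status value and channel names are assumptions modeled on common Courier preference statuses, not confirmed by the schema.

```python
# Hypothetical update_user_preference_topic arguments.
# "OPTED_IN" and the channel names are assumed values.
arguments = {
    "user_id": "user-123",
    "topic_id": "TOPIC-abc",
    "status": "OPTED_IN",
    "has_custom_routing": True,
    "custom_routing": ["push", "email"],  # preferred channel order
}
```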
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool updates preferences but doesn't cover permissions required, whether changes are reversible, rate limits, or response format. For a mutation tool with zero annotation coverage, this is a significant gap in transparency about its behavior and constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core action ('Update a user's preference') and includes key details (topic specificity and options). There is zero waste or redundancy, making it appropriately sized and well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (mutation with 5 parameters, no annotations, and no output schema), the description is incomplete. It doesn't explain return values, error conditions, or behavioral traits like idempotency. For a tool that modifies user preferences, more context on outcomes and constraints is needed to be fully helpful to an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents all parameters with descriptions. The description adds marginal value by mentioning 'opt in, opt out, or set channel preferences', which loosely relates to the 'status' enum and optional 'custom_routing' parameters, but doesn't provide additional syntax or format details beyond what the schema provides. A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Update' and the resource 'user's preference for a specific subscription topic', specifying the action and target. It is distinguished from sibling tools like 'get_user_preferences' (read vs. write) and 'subscribe_user_to_list' (subscription vs. preference), though it doesn't explicitly name alternatives. The purpose is specific but could better differentiate from similar mutation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., user/topic existence), compare to siblings like 'update_translation' or 'replace_profile', or indicate scenarios for opting in/out versus setting channel preferences. Usage is implied from the action but lacks explicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!