ExanorOS

Server Details

AI-powered virtual assistant that connects to Gmail, Outlook, Slack, Salesforce, HubSpot, Google Calendar, DocuSign, QuickBooks, and 20+ other services. Manage your entire professional life through a single Claude conversation — email, calendar, CRM, documents, travel, expenses, and more.

Status: Healthy
Last Tested: 2026-05-17 17:32
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

C2.8/5.0

Tool DescriptionsC

Average 2.8/5 across 91 of 106 tools scored. Lowest: 1.5/5.

Server CoherenceA

Disambiguation4/5

Most tools have distinct purposes with clear descriptions, but some overlap exists (e.g., multiple inbox-related tools, catch_me_up vs morning_briefing) and a few pairs like lookup_contact vs lookup_outlook_contact are similar but platform-specific. The descriptions generally help disambiguate.

Naming Consistency4/5

The majority of tools follow a consistent verb_noun snake_case pattern (e.g., create_calendar_event, send_email). However, several exceptions exist like 'email_action', 'catch_me_up', 'morning_briefing', 'get_ooo', and 'troubleshoot', which break the pattern slightly.

Tool Count2/5

With 106 tools, the server is extremely heavy. While the broad scope of services (Google, Microsoft, CRM, accounting, messaging, etc.) justifies many tools, the number is well beyond typical MCP servers and likely causes cognitive load for agents selecting the right tool.

Completeness5/5

The tool surface is remarkably complete for a personal assistant, covering creation, retrieval, update, deletion, and specialized operations across numerous domains (email, calendar, contacts, messaging, CRM, finance, travel, etc.). It includes both platforms and integrates multiple services seamlessly.

Available Tools

107 tools

append_to_docCInspect

Append content to a Google Doc.

ParametersJSON Schema

Name	Required	Description	Default
`fileId`	Yes
`content`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should fully disclose behavioral traits. It only states the action, missing details like whether content is appended to the end, any formatting allowed, and authentication needs. This is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (one sentence), which is good for structure, but it sacrifices necessary detail, making it less useful than a slightly longer but informative description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (two required parameters, no output schema), the description fails to provide context such as expected behavior on existing content, error conditions, or return values. It is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% coverage, meaning no parameter descriptions. The tool description adds no additional meaning to the parameter names 'fileId' and 'content', leaving the agent to infer their semantics entirely from the names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (append) and the resource (Google Doc), making the purpose understandable. However, it does not differentiate from siblings like append_to_excel or append_to_sheet, but the different resource type provides implicit distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as create_doc or update_calendar_event. There is no mention of prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

append_to_excelCInspect

Add rows to an Excel file in OneDrive.

ParametersJSON Schema

Name	Required	Description	Default
`fileId`	Yes
`values`	Yes
`sheetName`	No

Tool Definition Quality

C2.2/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description fails to disclose any behavioral traits such as whether the tool overwrites or appends, permission requirements, rate limits, or error handling. This is insufficient for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (one sentence) with no waste, but it sacrifices essential information for brevity. It is front-loaded but incomplete.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations and output schema, and the presence of three parameters, the description is grossly incomplete. It does not cover usage of parameters, expected input format, success indicators, or error scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description should compensate but does not. It offers no explanation of required parameters like 'fileId' (source) or 'values' (formatting), nor optional 'sheetName'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('add rows') and resource ('Excel file in OneDrive'), distinguishing it from sibling tools like 'append_to_sheet' which likely targets Google Sheets. However, it lacks specifics like appending to the end of the sheet.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as 'create_excel' or 'append_to_sheet'. The description provides no context about prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

append_to_sheetCInspect

Add rows to a Google Sheet.

ParametersJSON Schema

Name	Required	Description	Default
`fileId`	Yes
`values`	Yes
`sheetName`	No

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden but only says 'Add rows'. It does not disclose what happens if the sheet doesn't exist, whether headers are required, or any side effects. This is insufficient for a tool with write semantics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise. However, it is too brief, sacrificing necessary detail for brevity. It could be slightly longer to include parameter context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of parameter descriptions and output schema, the description does not provide enough information for an agent to use the tool correctly. It omits details about the structure of 'values' and how rows are appended.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has three parameters with 0% description coverage, and the description adds no information about them. The agent is left to guess the meaning of 'values' (array of arrays) and 'sheetName'. This is a critical gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Add rows') and the target resource ('a Google Sheet'). This verb+resource structure is specific and distinguishes from sibling tools like append_to_doc and append_to_excel.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool versus alternatives. The description only states what the tool does, omitting any context about prerequisites, scenarios, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cancel_calendlyCInspect

Cancel a Calendly booking by UUID.

ParametersJSON Schema

Name	Required	Description	Default
`uuid`	Yes
`reason`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It fails to mention side effects (e.g., notifications, irreversibility) or any confirmation requirements. For a destructive action, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence, but it sacrifices needed detail for brevity, making it somewhat under-specified.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description lacks completeness for a cancellation operation. It does not explain what happens upon cancellation (e.g., return value, confirmation).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description only addresses the 'uuid' parameter ('by UUID') but does not explain the optional 'reason' parameter. With 0% schema description coverage, the description should compensate but does not.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Cancel'), the resource ('Calendly booking'), and the method ('by UUID'). It distinguishes from sibling tools like get_calendly or get_calendly_links, which are read-only.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., rescheduling or modifying bookings). No prerequisites or conditions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

catch_me_upAInspect

Synthesized catch-up across inbox, calendar, Slack. Use for "what did I miss" or "what needs my attention".

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It mentions 'synthesized' but does not disclose how data is aggregated, prioritized, or any limitations (e.g., time range, data freshness). Missing details on required permissions or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: one sentence defining purpose plus a usage example. No redundant words or repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has no output schema and no annotations; description is minimal. Does not describe what the synthesized output looks like (e.g., list, summary, action items), which is important for an aggregation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters defined, baseline 4 per rules. Description adds context that tool aggregates multiple sources but does not need to explain parameters since none exist.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it synthesizes catch-up across inbox, calendar, and Slack, distinguishing it from individual sibling tools like get_inbox or get_calendar. Verb 'synthesized' and resource 'catch-up' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides usage phrases 'what did I miss' and 'what needs my attention' to indicate when to use. Does not explicitly exclude scenarios, but context implies it for broad overview rather than specific data retrieval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_availabilityAInspect

Check your calendar availability for a date. ALWAYS call before creating events.

ParametersJSON Schema

Name	Required	Description	Default
`date`	Yes
`duration`	No

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It indicates a read-only check but lacks details on what 'availability' means (e.g., returns free/busy, specific time slots). No contradictions, but minimal behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (two sentences) with no redundant information. The first sentence states the core purpose, the second provides critical usage guidance. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (simple tool with 2 parameters, no output schema) and many sibling calendar tools, the description is adequate but incomplete: it doesn't explain the output format or how 'duration' works, which limits its utility.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description must compensate. It mentions 'for a date' but does not describe the 'date' or 'duration' parameters, their formats, or how they affect behavior. Adds little meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Check your calendar availability for a date.' It uses a specific verb ('check') and resource ('calendar availability'), and the strong directive 'ALWAYS call before creating events' differentiates it from event creation siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance with 'ALWAYS call before creating events,' indicating when to use this tool. However, it does not mention when not to use it or list alternatives, leaving some room for ambiguity among calendar-related siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_freebusyCInspect

Check free/busy for other people on a date.

ParametersJSON Schema

Name	Required	Description	Default
`date`	Yes
`emails`	Yes

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must disclose behavior. Only states 'Check free/busy', omitting whether it's read-only, what return format is, or any side effects. Insufficient for a query tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short at one sentence. While it is concise, it sacrifices necessary detail. Front-loading is fine, but more information could be added without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and low parameter documentation, the description is incomplete. It lacks guidance on input format and expected output, which is insufficient for effective agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description adds no meaning to parameters. 'emails' and 'date' are vaguely hinted but no format, constraints, or examples provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks free/busy status for other people on a date using a specific verb+resource structure. It distinguishes from sibling tools like 'check_availability' by specifying 'for other people', though it could be more explicit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool over alternatives. Sibling 'check_availability' exists without differentiation or context for when to choose one.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_signature_statusCInspect

Check if a DocuSign document has been signed.

ParametersJSON Schema

Name	Required	Description	Default
`envelopeId`	Yes

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries the full burden. It only states the check action but doesn't disclose read-only nature, error handling, or whether it returns a boolean or status string.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no filler words; concise and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description is incomplete. It does not explain the return value, possible statuses, or error cases for a simple status check.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, envelopeId, is not described beyond its type. The description does not clarify what envelopeId refers to, and the schema provides no description either (0% coverage).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'check' and the resource 'DocuSign document signed status', distinguishing it from sibling tools like send_for_signature and void_envelope. However, it doesn't specify the exact return format.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like get_docusign or resend_signature_request. No prerequisites or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_accountBInspect

Create a new ExanorOS account and send welcome email.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`plan`	No
`email`	Yes

Tool Definition Quality

B3.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description carries the full burden. It discloses the side effect of sending a welcome email, which is valuable beyond the schema. However, it lacks details on idempotency, error conditions, or required permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence that efficiently conveys the core purpose and a notable side effect. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations, output schema, and parameter explanations, the description is too brief. An agent would lack critical context about parameter semantics, expected outcomes, and error handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description does not explain the meaning of any parameter (e.g., name format, email validation, plan options). The schema itself provides basic structure, but no added value from the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create a new ExanorOS account') and includes an additional side effect ('send welcome email'). It distinguishes itself from sibling tools like 'create_contact' or 'create_calendar_event' by specifying the resource as an account.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'hubspot_create_contact' or other create tools. There is no context on prerequisites or scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_calendar_eventCInspect

Create a Google Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`instruction`	Yes

Tool Definition Quality

C2/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full responsibility for behavioral disclosure. It only states 'Create', implying mutation, but omits crucial details such as conflict handling, permissions, or side effects. The minimal description fails to inform the agent about the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is concise and front-loaded, but it is too terse to be useful. It states the purpose but omits necessary details, so it does not earn its place fully. A slightly longer description with parameter guidance would be more effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema (1 free-text param) and lack of output schema, the description is severely incomplete. It does not explain how to use the instruction parameter, nor does it differentiate from numerous calendar-related sibling tools, resulting in a poor contextual picture.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The sole parameter 'instruction' is an untyped string with no description in the schema or tool description. Schema coverage is 0%. The agent has no guidance on what format or content is expected, making the parameter effectively opaque.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a Google Calendar event, specifying the platform (Google Calendar) and action. However, it does not differentiate from sibling tools like create_recurring_event or check_availability, lacking specificity on the type of event created.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description offers no context on when to use this tool versus alternatives (e.g., create_recurring_event, check_availability) or any prerequisites, leaving the agent without decision criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_contactCInspect

Create a new Google contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`email`	Yes
`phone`	No
`company`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It only states the core action without revealing whether duplicates are handled, permissions required, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief at one sentence, which is acceptable for a simple tool, but it sacrifices necessary detail entirely.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description fails to explain return values, error handling, or behavior like duplicate checking, making it insufficient for an agent to use reliably.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description adds no explanation for the four parameters (name, email, phone, company). The agent is left to guess their format or required constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create') and resource ('new Google contact'), distinguishing it from sibling tools like 'lookup_contact' or 'update_contact'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'hubspot_create_contact' or 'create_outlook_contact'. There is no mention of prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_docCInspect

Create a new Google Doc.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`content`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden but only states 'create.' It does not disclose side effects, return values, authentication needs, or whether content is added immediately. This is insufficient for an agent to understand tool behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence, but it is too sparse to provide necessary information. Conciseness should not come at the cost of completeness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should explain what the tool returns (e.g., doc ID or URL). It does not, and it also omits any behavior context. For a creation tool, this is critically incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning to the parameters 'name' and 'content.' It does not explain their roles or constraints, leaving the agent without crucial context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it creates a new Google Doc, using a specific verb and resource. It implicitly differentiates from sibling tools like create_sheet and create_excel, though it does not explicitly contrast with append_to_doc for editing existing docs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like append_to_doc or when not to use it. No prerequisites or limitations are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_excelCInspect

Create a new Excel file in OneDrive.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should reveal behavioral traits (e.g., does it overwrite existing files, required permissions, success/failure indicators). It only states 'create' without any such details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence and front-loaded, but it lacks essential detail, making it too brief to be fully useful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool, the description omits output behavior and additional context (e.g., default save location in OneDrive). It is incomplete for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter 'name' has no description in the schema, and the description adds no meaning (e.g., expected format, constraints). Schema coverage is 0%, so description must compensate but fails.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'create' and the specific resource 'new Excel file in OneDrive', distinguishing it from sibling tools like 'create_doc' (Word) and 'create_sheet' (other format).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., append_to_excel, create_sheet) or any prerequisites like OneDrive access.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_outlook_contactCInspect

Create a new Outlook contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`email`	Yes
`phone`	No
`company`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries the full burden. It only states 'create', giving no insight into side effects (e.g., duplicates, overwrites, permissions, or rate limits). An agent cannot gauge whether this action is safe or may cause unintended changes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but lacks substance. Every sentence should earn its place; this one is too minimal to be effective. It could be expanded with key details without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations, output schema, and parameter descriptions, the description provides insufficient information for an AI agent to reliably use this tool. It fails to cover basic usage context, especially with multiple sibling contact tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for its parameters (name, email, phone, company). The description adds no meaning beyond the field names, so an agent has no understanding of constraints, format, or purpose of each field.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb and resource: 'Create a new Outlook contact.' This clearly indicates the tool's purpose, but it does not differentiate from sibling tools like 'create_contact' or 'update_contact'. Without additional context, an agent could confuse it with other contact creation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives such as 'create_contact' or 'lookup_outlook_contact'. There are no exclusions or context cues, leaving the agent to infer usage without any decision framework.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_outlook_eventCInspect

Create a new Outlook Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`end`	Yes
`start`	Yes
`summary`	Yes
`attendees`	No
`teams_meeting`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description must carry the burden of behavioral disclosure. It states 'Create', implying mutation, but fails to mention side effects like sending invitations, automatic Teams meeting creation when 'teams_meeting' is true, or any rate limits or permission requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single 6-word sentence, which is under-specified for a tool with 5 parameters and multiple behavioral implications. Conciseness should not sacrifice necessary detail; this is an omission rather than efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, no output schema, no annotations) and the number of sibling tools, the description is severely incomplete. It lacks information about return values, side effects, parameter constraints, and relationships with other calendar tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 5 parameters with 0% description coverage, meaning no property descriptions exist. The description adds no information about parameter format (e.g., date-time format for 'start'/'end'), semantics (e.g., attendees as email addresses), or relationships (e.g., 'teams_meeting' requiring a valid account). This is a critical gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create') and resource ('Outlook Calendar event'), making the tool's purpose unambiguous. Sibling tools like 'create_calendar_event' and 'create_recurring_event' are differentiated by the explicit 'Outlook' in the name, but the description does not further distinguish them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives such as 'create_recurring_event' for recurring events or 'update_outlook_event' for modifications. There are no prerequisites, contextual hints, or 'when not to use' statements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_recurring_eventCInspect

Create a recurring Google Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`end`	No
`start`	No
`summary`	No
`attendees`	No
`recurrence`	No
`instruction`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description only states the action without disclosing behavioral traits such as what happens if recurrence is missing, required permissions, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, concise but lacking essential details. It is appropriately sized but fails to provide value beyond the tool name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 6 parameters, no output schema, and no annotations, the description is severely incomplete. It does not cover recurrence format, required fields, or error conditions, making it insufficient for reliable agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain any parameters. The agent has no understanding of how to use fields like 'recurrence', 'start', 'end', etc.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create' and the resource 'recurring Google Calendar event', which distinguishes it from the sibling 'create_calendar_event' (likely for single events). However, it could specify the recurrence pattern.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'create_calendar_event'. The agent must infer context from the tool name and description.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_sheetCInspect

Create a new Google Sheet.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`values`	No
`sheetName`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only says 'Create new Google Sheet' without detailing behavioral traits like idempotency, side effects (e.g., overwriting existing sheets), or rate limits. It adds no value beyond the obvious.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise, but it is too minimal. It front-loads the core action but omits essential details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters, no output schema, and no annotations, the description is incomplete. It fails to explain the return value, the structure of the created sheet, or any constraints like naming conventions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain any parameters (name, values, sheetName). The agent has no clue what 'values' or 'sheetName' mean, making the tool hard to use correctly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create' and the resource 'Google Sheet', which distinguishes it from sibling tools like create_doc or create_excel. However, it does not specify whether it creates a new spreadsheet or a new sheet within an existing spreadsheet.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool vs alternatives like create_excel. There is no mention of prerequisites, such as authentication or workspace context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_calendar_eventCInspect

Delete a Google Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`eventId`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden but only states the action. Lacks details on permanence, recurring events, or confirmation requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One concise sentence with no waste, but overly brief given the need to explain parameters and behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Simple tool but missing behavioral context and parameter guidance. No output schema, so description should provide more completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single required parameter eventId has no description in schema or description. 0% schema coverage and description adds no meaning about parameter format or origin.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'delete' and resource 'Google Calendar event', distinguishing it from siblings like delete_outlook_event.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, no prerequisites or exclusions provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_outlook_eventCInspect

Delete an Outlook Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`eventId`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and the description does not disclose behavioral traits beyond the action. It does not mention whether deletion is permanent, reversible, or any side effects on associated data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but lacks essential details. It is too brief to fully inform the agent without additional context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description does not cover return values, error handling, or side effects. For a delete operation, more context on success/failure would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter 'eventId' is a string with no description in the schema. The description does not explain how to obtain or format the eventId, leaving ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (Delete) and the resource (Outlook Calendar event). It is specific and distinguishes from sibling tools like 'delete_calendar_event' and 'create_outlook_event'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as 'delete_calendar_event'. No prerequisites or context about required permissions or conditions for deletion.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dismiss_reminderCInspect

Dismiss a reminder by ID.

ParametersJSON Schema

Name	Required	Description	Default
`id`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must disclose behavior. It only says 'dismiss,' implying mutation, but lacks details on idempotency, undo, notifications, or error handling for invalid IDs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Remarkably concise at 6 words, but sacrifices necessary context; balanced score as it is front-loaded but incomplete.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Low complexity but no output schema, no annotations, and no success/failure behavior described, leaving critical gaps for a mutation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description adds no meaning beyond 'by ID' for the id parameter, offering no extra context on format or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (dismiss) and resource (reminder) with the method (by ID), distinguishing it from sibling tools like get_reminders and set_reminder.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives, no prerequisites or conditions, e.g., need to have a reminder ID from get_reminders.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

email_actionBInspect

Mark Gmail email as read/unread, archive, trash, or add label.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`labelName`	No
`messageId`	Yes

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must bear full burden. It discloses that actions like trash are destructive but lacks details on permanence, required permissions, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with all essential information; no superfluous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks description of return values, error handling, prerequisites (e.g., authentication), or behavioral nuances, making it incomplete for a 3-parameter mutation tool with no annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and description lists actions but does not clarify messageId format or when labelName is required; partial compensation for missing schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the exact actions (mark read/unread, archive, trash, add label) on Gmail emails, clearly distinguishing it from siblings like forward_email, reply_email, or send_email.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus similar tools like 'trash' or 'archive' actions possibly available via other tools; siblings are not differentiated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

find_knowledgeCInspect

Find a knowledge file by topic name.

ParametersJSON Schema

Name	Required	Description	Default
`topic`	Yes

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry full behavioral disclosure. 'Find' implies read-only, but there is no mention of side effects, permissions, or what happens if no file is found. The description is too brief to establish transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence of 6 words, extremely concise and front-loaded. It wastes no words, but could include more information without sacrificing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and no annotations, the description fails to explain return values, success/failure behavior, or edge cases. It is severely incomplete for a tool that performs a search operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage for the only parameter 'topic'. The description adds 'by topic name', which clarifies the parameter's role, but lacks details like exact match vs. fuzzy search, case sensitivity, or format expectations. The added value is minimal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Find' and the resource 'knowledge file', and specifies the search criterion 'by topic name'. It distinguishes itself from sibling tools like 'get_knowledge_index' (which lists all) and 'save_to_knowledge' (which creates). However, it could be more precise about what constitutes a 'knowledge file'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (e.g., searching via email or drive). It does not mention prerequisites, fallback options, or when not to use it. This is a significant gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

find_restaurantsCInspect

Search restaurants using Google Places. Returns ratings, reviews, features, and pre-filled OpenTable/Resy booking links.

ParametersJSON Schema

Name	Required	Description
`date`	No
`time`	No
`limit`	No
`price`	No	1=cheap 2=moderate 3=expensive 4=very expensive
`covers`	No
`cuisine`	No
`location`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It mentions using Google Places but does not disclose read-only nature, rate limits, permissions, or potential side effects. Minimal behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence that is clear and to the point, covering the tool's purpose and output. However, it could be expanded slightly without losing conciseness to include more context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Description lacks details about parameter usage, output format, and behavior. For a tool with 7 params and no output schema or annotations, this is insufficient for an agent to use effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 7 parameters with only 'price' having a description. Description does not add meaning to other parameters (e.g., location, cuisine, date). Schema coverage is only 14%, and description fails to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool searches restaurants via Google Places and returns ratings, reviews, features, and booking links. It is distinct from any sibling tools, none of which are restaurant-specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, nor any conditions or exclusions. The description does not mention prerequisites or scenarios where this tool is preferable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forward_emailCInspect

Forward a Gmail email.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`note`	No
`messageId`	Yes

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full burden. It fails to disclose behaviors like whether attachments are preserved, how the note is included, or threading implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, but it is under-specified; conciseness should not come at the cost of essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, no schema descriptions, no output schema, many siblings), the description is severely incomplete, lacking parameter details and usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, and the tool description adds no meaning to parameters. It does not explain what 'to', 'note', or 'messageId' represent or how they are used.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'forward' and resource 'Gmail email', but does not differentiate this from sibling tools like forward_outlook_email or reply_email.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as reply_email or forward_outlook_email, and no prerequisites or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forward_outlook_emailCInspect

Forward an Outlook email.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`note`	No
`messageId`	Yes

Tool Definition Quality

C2.5/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must provide behavioral detail. It only says 'Forward an Outlook email' without disclosing whether attachments are included, how threading works, or what happens to the original email.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (one sentence) but lacks necessary detail. It is not structured to front-load key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description is insufficient for a tool with 3 parameters. Missing guidance on how to obtain messageId and how the note is used.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description adds no meaning to parameters 'to', 'note', 'messageId'. It does not explain their roles or format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Forward an Outlook email.' clearly states the action (forward) and the resource (Outlook email), which is specific and distinguishes it from siblings like 'send_outlook_email' or 'reply_email'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'forward_email' or 'send_outlook_email'. No description of prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calendarCInspect

Get upcoming Google Calendar events.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description must fully disclose behavior. It only mentions 'upcoming' but does not specify how far ahead, how many events, or that it is a read-only operation. This is insufficient for an agent to understand side effects or constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence. It is front-loaded with the key action. However, it could be slightly more informative without sacrificing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple, parameterless tool, the description is adequate but not complete. It lacks details like default time range or event limit, which would help an agent understand the scope of results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has no parameters and 100% coverage (vacuously). The description adds no parameter details, but with zero parameters, baseline is 3. It does not provide additional semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves upcoming Google Calendar events, specifying both verb (get) and resource (upcoming Google Calendar events). However, it does not differentiate from sibling tools like search_calendar or check_availability, which also deal with calendar data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. Siblings such as search_calendar, check_availability, and create_calendar_event exist, but the description offers no context on choosing this tool over them.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calendlyAInspect

Get Calendly bookings and links.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description correctly implies a read operation but does not specify return structure or side effects. It is adequate for a simple retrieval but could mention that no parameters are needed or that it returns a list.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, clear phrase with no unnecessary words. Every word is meaningful and the structure is appropriate for the tool's simplicity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with zero parameters and no output schema, the description sufficiently explains the tool's purpose. However, noting the return type (e.g., list of bookings) would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters, so baseline is 4. The description adds no parameter info, but none is needed. The schema is fully covered.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the resource ('Calendly bookings and links') and action ('Get'). However, it does not differentiate from the sibling tool 'get_calendly_links', which may cause ambiguity about whether this tool returns only links or both. A clearer distinction would improve the score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'get_calendly_links' or 'cancel_calendly'. An agent given multiple sibling tools would lack context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_calendly_linksBInspect

Get your Calendly scheduling links to share with others.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations and no description of side effects, permissions, or behavior (e.g., does it return all links or only active ones?), the tool lacks transparency beyond the basic purpose.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is concise and front-loaded, but could be slightly more informative without losing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity (no parameters, no output schema), the description is minimally adequate but lacks details about what the links look like or how they are presented.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has no parameters, so no additional explanation is needed. The description correctly focuses on the tool's output, though it doesn't detail the return format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves Calendly scheduling links for sharing, which distinguishes it from sibling tools like get_calendly (which likely gets general info) and cancel_calendly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, nor any prerequisites (e.g., needing a Calendly account connected). The agent must infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_connectionsBInspect

Check which services are connected.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries full burden. It does not disclose any behavioral traits such as whether it requires authentication, whether it causes side effects, or what the return format is (e.g., list of service names). This is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence. It is front-loaded with the action and object. While it could be slightly more detailed, it avoids unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and no parameters, the description should provide more context on what 'connected' means and what the response looks like (e.g., list of service names, status indicators). The current description is too minimal for an agent to fully understand the tool's behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so the schema already covers everything. The description adds no parameter information, but none is needed. Baseline score of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Check which services are connected' clearly indicates the tool lists all connected services. It uses a verb and object, and is distinguishable from siblings which target specific services. However, it could be slightly more explicit that it returns a list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus sibling tools like get_calendar or get_slack. The description does not mention that this is for a global overview while siblings are for individual service status.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_docusignCInspect

Get recent DocuSign envelopes and signature status.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`status`	No

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden of behavioral disclosure. The description implies read-only behavior but does not specify authentication needs, rate limits, what constitutes 'recent', or any side effects. The lack of detail leaves agents underinformed about the tool's boundaries.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that immediately conveys the core function. No words are wasted, but the brevity slightly undermines completeness. It is appropriately front-loaded but could benefit from additional context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 optional params, no output schema, no annotations), the description is minimally complete for basic retrieval. However, it lacks information on default behavior for 'recent', expected return format, and how parameters affect results. For a tool with many siblings, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 2 parameters ('limit', 'status') with 0% schema description coverage, and the tool description does not explain their meaning or usage. While parameter names are somewhat self-explanatory, the description fails to add value beyond the schema, leaving agents without guidance on how to use them effectively.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'DocuSign envelopes and signature status', making the tool's purpose immediately understandable. However, it doesn't explicitly distinguish from sibling tools like 'check_signature_status' which might have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives such as 'check_signature_status', 'send_for_signature', or other 'get_*' tools. There is no mention of prerequisites, filtering capabilities, or context for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_expensesCInspect

Get expense summary and totals by period.

ParametersJSON Schema

Name	Required	Description	Default
`period`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided. The description does not disclose behaviors such as default period when omitted, data freshness, or limitations. It only states the basic function.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (single sentence) but lacks structure. It serves its purpose without waste but could be more informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and only one parameter, the description is incomplete. It does not specify the return format or what happens when no period is provided (period is optional).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage. The description mentions 'by period' but does not explain the enum values ('week', 'month', 'year', 'all') or their meanings. The schema itself provides the enum, but the description adds no additional semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get expense summary and totals by period.' It uses a specific verb ('get') and resource ('expense summary and totals'), and distinguishes from the sibling tool 'log_expense' which creates expenses.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., log_expense). The description does not specify context or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_followupsAInspect

Find emails you sent that have not received a reply. Use when someone asks what needs following up.

ParametersJSON Schema

Name	Required	Description	Default
`days`	No
`limit`	No

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden. It explains the tool finds unreplied sent emails but does not disclose how 'no reply' is determined (e.g., time frame via 'days' parameter) or any potential side effects. It is adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no extraneous words, front-loading the purpose. It is highly concise and well-structured for quick understanding.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotation, the description covers the core purpose and usage but misses parameter details. It is acceptable but not fully complete for a tool with two optional parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has two parameters ('days' and 'limit') with no descriptions. The description does not mention or explain them, leaving the agent to guess their meaning. With 0% schema description coverage, this is a significant gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds sent emails without replies, distinguishing it from sibling tools like 'get_sent' or 'get_inbox'. It also specifies the use case: 'when someone asks what needs following up.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use when someone asks what needs following up,' providing a concrete usage scenario. However, it does not mention when not to use it or suggest alternatives, which would be helpful for full guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_hubspotBInspect

Get HubSpot CRM contacts and deals.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description carries full burden. It only states it 'gets' data, with no disclosure of side effects, authentication needs, rate limits, or whether it returns all contacts/deals.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence with no extraneous words. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter tool with no output schema or annotations, the description is minimally complete. It tells what it does but lacks context about scope (e.g., all records? any filters?) or return format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters, so schema coverage is 100%. Baseline for 0 parameters is 4. The description adds no additional parameter meaning, but none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves HubSpot CRM contacts and deals, using a specific verb and resource. However, it does not distinguish itself from sibling tools like hubspot_create_contact or get_salesforce.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool vs alternatives (e.g., get_salesforce or hubspot-specific tools). There is no discussion of context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_inboxCInspect

Get unread Gmail emails.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

C2.7/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits, but it only repeats the name. It does not explain what 'unread' means, how many emails are returned, or any limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise, but it lacks substance. It could be expanded to include useful details without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description is too bare. It fails to specify the scope (e.g., Gmail only), return format, or any filters, leaving the agent with minimal context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so schema coverage is 100%. The description adds no extra meaning, but the baseline for zero parameters is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get unread Gmail emails,' specifying the action (get) and resource (unread Gmail emails). However, it does not differentiate from sibling tools like get_unified_inbox or get_outlook_inbox, which also retrieve emails.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. For example, when to choose get_inbox over get_unified_inbox or get_outlook_inbox is not mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_knowledge_indexAInspect

Get the ExanorOS Knowledge Index — master list of all tracking files.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility. It discloses the tool returns a 'master list' but lacks detail on pagination, performance, or side effects. For a simple read operation, this is minimally adequate (score 3) but not richly transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It is perfectly concise and front-loaded with the core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity (no parameters, no output schema), the description covers the return value reasonably. A minor omission is lack of clarification on what 'tracking files' encompasses, but the overall completeness is high for a list-retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters, so schema description coverage is 100% by default. The description does not need to add parameter details. Baseline score of 4 applies as there is no need for compensation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get') and the resource ('ExanorOS Knowledge Index') and characterizes the result as 'master list of all tracking files.' This distinguishes it from sibling tools like find_knowledge (search) and save_to_knowledge (write), meeting the specific verb+resource+scope criterion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With sibling tools like find_knowledge and save_to_knowledge, explicit context for selection is missing, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_newsAInspect

Get top news headlines with AI summary. Use for morning news or what is happening in the world.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`topics`	No	Comma separated: business,technology,health,science,sports,general

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must convey behavioral traits. It states it fetches and summarizes news, but lacks details on how 'top' is determined, the freshness of headlines, or potential limitations. It is adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences are highly concise, front-loaded with the primary action and value. Every word contributes meaning, with no redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With two parameters, no output schema, and no annotations, the description is too sparse. It does not explain what 'AI summary' entails, how results are structured, or the meaning of 'top' headlines. The agent lacks sufficient context to use the tool confidently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has two parameters (limit and topics) with only topics having a description. The description does not mention either parameter, thus adding no value beyond the schema. Schema description coverage is 50%, and the description fails to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves top news headlines with AI summary, using a specific verb ('get') and resource ('news headlines'). It also provides a usage context ('morning news or what is happening in the world'). Among sibling tools, no other news tool exists, so it is well-distinguished.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use for morning news or what is happening in the world,' which provides clear context for when to use it. However, it does not mention when not to use it or any alternatives, which is acceptable given the lack of sibling news tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_oooAInspect

Get current Gmail out of office auto-reply status.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states the action without disclosing what the output looks like (e.g., boolean or message structure). It lacks details on the return value, which is essential for a getter with no output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no unnecessary words, efficiently conveying the tool's purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with no parameters, but the description does not specify the return format. Given the absence of an output schema, this omission leaves the agent without full context on what to expect, making it only minimally complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so the description cannot add meaning beyond the schema. Baseline score of 4 is appropriate as no parameter documentation is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets the current Gmail out-of-office auto-reply status, using a specific verb and resource. It implicitly distinguishes from the sibling tool 'set_ooo' which is for setting the OOO status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The context of getting an OOO status is clear, and the sibling name 'set_ooo' implies the distinction. However, the description does not explicitly state when to use this vs. setting, but for a simple read operation, this is adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_outlook_inboxBInspect

Get unread Outlook emails.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description implies a read-only operation but does not disclose behaviors such as authentication requirements, handling of no unread emails, or return format. With no annotations, the description carries the full burden but provides minimal insight.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with no extraneous words. It is appropriately concise for a tool with no parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, no output schema), the description is minimally adequate. However, it lacks details about the output format, pagination, or expected behavior, which could be problematic for an agent using the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters, so the description's mention of 'unread' adds context beyond the schema. According to the rule for zero parameters, the baseline is 4, and the description meets this without needing further detail.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves unread Outlook emails, specifying a verb and resource. However, it does not differentiate from sibling tools like 'get_inbox' or 'get_unified_inbox', which may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There is no mention of context, prerequisites, or conditions for use, leaving the agent to infer applicability.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_outlook_sentCInspect

Get sent Outlook emails.

ParametersJSON Schema

Name	Required	Description	Default
`q`	No
`limit`	No

Tool Definition Quality

C2.3/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and the description does not disclose any behavioral traits (e.g., read-only nature, return format, rate limits). The tool is a data retrieval operation, but the description fails to add transparency beyond the basic action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but under-specified. It could include more detail without becoming verbose, so it meets a minimally adequate level.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and a simple tool, the description should clarify what is returned (e.g., list of email objects). It fails to provide completeness, especially with many sibling tools needing differentiation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the tool description does not explain the purpose or format of parameters 'q' (query string) and 'limit' (number). The description adds no value beyond the bare schema types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves sent Outlook emails. However, with siblings like 'get_sent' and 'get_outlook_inbox', it does not differentiate its scope (Outlook-specific vs general), leaving ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as 'get_sent' or 'search_outlook'. The description lacks context for appropriate usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_preferencesAInspect

Get user preferences including automation level, tone, VIP contacts.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must compensate. It correctly implies a read-only operation, but does not disclose any potential side effects, permissions, or the scope of preferences (e.g., user-specific). Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, 10 words, no filler. Front-loaded with the action and resource, highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with no parameters and low complexity, the description covers the purpose. However, without an output schema, it would benefit from clarifying return format (e.g., 'Returns all user preferences as key-value pairs'). Slight gap for a getter.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With zero parameters and 100% schema coverage, the description adds no parameter info, but baseline for 0 params is 4. The examples add context but are not necessary for parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb 'Get' and resource 'user preferences', and lists example fields (automation level, tone, VIP contacts), making it unambiguous and distinct from sibling tools like update_preferences.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like update_preferences or other get_ tools. Usage is implied but not clarified, leaving the agent to infer context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_quickbooksCInspect

Get QuickBooks financial summary.

ParametersJSON Schema

Name	Required	Description	Default
`period`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, and the description does not disclose behavioral traits such as read-only status, authentication needs, response contents, or side effects. It fails to convey essential safety or context information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short but under-specified. It lacks critical details, making it insufficient rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description should clarify what a 'financial summary' includes, any default behavior for the optional parameter, and how to interpret the results. These are entirely missing, leaving the agent with an incomplete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero description coverage, and the description does not explain the 'period' parameter's purpose, its default value, or the impact of selecting month, quarter, or year. The agent receives no additional meaning beyond the enum list.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the resource ('QuickBooks financial summary') and action ('Get'), but does not differentiate it from sibling tools like get_expenses or get_salesforce, which also deal with financial data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description offers no context about suitable scenarios or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_remindersBInspect

Get all pending reminders.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It only states 'Get all pending reminders' without mentioning read-only nature, permissions, or side effects. The description is insufficient for an agent to understand the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at one sentence. It is front-loaded and efficient, though it lacks any structural elements like sections.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no params, no output schema), the description is minimally adequate. However, it does not explain what constitutes a 'pending' reminder, which could be ambiguous. With many sibling tools, more context would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, so the baseline score is 4. The description adds no parameter info, but none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (get) and resource (all pending reminders). It is distinct from sibling tools like dismiss_reminder and set_reminder, but does not explicitly differentiate itself.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as get_followups or get_outlook_inbox. The description provides no context for appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_salesforceCInspect

Get Salesforce CRM data — contacts, opportunities, accounts, tasks.

ParametersJSON Schema

Name	Required	Description	Default
`q`	No
`type`	No
`limit`	No

Tool Definition Quality

C2.3/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It fails to disclose behavioral traits such as read-only nature, authentication requirements, rate limits, or any side effects. The description is too sparse to inform safe usage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that briefly summarizes the tool's purpose. It is concise and front-loaded, but it sacrifices essential details. It earns its place without fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no output schema, and no annotations, the description is severely incomplete. It does not explain query behavior, return format, scrolling, or any constraints. It fails to provide sufficient context for reliable use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning property descriptions are empty. The description does not explain the meaning or usage of parameters like 'q', 'type', or 'limit'. It only lists possible values for 'type'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it retrieves Salesforce CRM data and lists specific object types (contacts, opportunities, accounts, tasks). This distinguishes it from other get_* tools for different systems. However, it could mention the query functionality implied by the 'q' parameter.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description does not indicate when to use this tool versus alternatives like sf_create_contact for write operations or other get_* tools for different CRMs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_sentCInspect

Get sent Gmail emails.

ParametersJSON Schema

Name	Required	Description	Default
`q`	No
`limit`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations and no output schema, the description must disclose behavior but only states it fetches sent emails. It does not mention pagination, ordering, authentication requirements, or any side effects, leaving the agent uninformed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (four words), but this brevity comes at the cost of completeness. It is front-loaded but lacks necessary detail, making it minimally acceptable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the two optional parameters and lack of output schema, the description fails to provide enough context for an agent to effectively use the tool. Missing return format, pagination behavior, and usage examples result in inadequate completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain the two parameters 'q' and 'limit'. The agent has no semantic understanding of what values to provide for filtering or result limiting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (get) and the resource (sent Gmail emails), but does not differentiate from sibling tools like get_outlook_sent or get_inbox, which serve similar purposes for different email providers or folders.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as search_email or get_inbox. The description lacks context about filtering, scope, or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_slackCInspect

Get recent Slack messages.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must convey behavioral traits. It only states it retrieves messages without disclosing details such as read-only nature, authentication requirements, rate limits, number of messages returned, or sorting. The description is insufficient for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that is front-loaded with the key action. While it could benefit from additional context, it avoids unnecessary words and earns its place by stating the core function concisely.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has no parameters, no output schema, and no annotations, the description should provide broader context about scope (e.g., which messages, from which channels) and behavior (e.g., read-only). It fails to clarify if this retrieves all recent messages or only from public channels, making it incomplete for reliable agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters, and schema description coverage is 100%. The description adds no parameter information beyond the empty schema, which is acceptable. Baseline is 3 due to high schema coverage, and the description does not degrade or enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Get recent Slack messages' clearly identifies the action and resource, but 'recent' is vague and it doesn't specify whether this covers channels, DMs, or both. Sibling tools like 'get_slack_dms' and 'search_slack' provide differentiation, but the description itself lacks specificity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives like 'get_slack_dms' or 'search_slack'. The description does not indicate appropriate contexts or exclusions, leaving the agent to infer usage from the tool name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_slack_dmsCInspect

Get Slack direct messages.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, and the description fails to disclose behavioral traits such as authentication requirements, read-only nature, or scope of messages returned. This leaves critical gaps for safe invocation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short but lacks informative content. It is not wasteful but sacrifices necessary detail for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has no parameters or output schema, a simple description might suffice, but it fails to address common questions like whether it returns a list or a single DM, or how it handles multiple conversations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (no parameters), so the description is not required to add param details. It adds no extra meaning beyond the empty schema, meeting the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Get Slack direct messages' clearly states the action and resource. However, it does not distinguish this tool from sibling 'get_slack', which may also retrieve messages, so it lacks specificity to differentiate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'get_slack' or 'search_slack'. The description offers no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_stripeAInspect

Get Stripe revenue dashboard — MRR, ARR, active subscribers, recent payments, plan breakdown.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A3.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description must carry the burden. It does not disclose whether the operation is read-only, requires authentication, has rate limits, or what happens if no data is available. The description only states what the tool returns.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence that front-loads the purpose and lists key data points concisely. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters or output schema, the description provides a good list of returned metrics. However, it does not specify any limitations (e.g., time period, filtering) or format, which could be useful for a complete picture.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has no parameters, so schema coverage is 100%. Description adds value by specifying the returned data points (MRR, ARR, etc.), which is useful context. Baseline for 0 params is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the tool retrieves Stripe revenue dashboard data including MRR, ARR, active subscribers, recent payments, and plan breakdown. It uses a specific verb+resource structure and is distinct from sibling tools like get_quickbooks or get_expenses.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., get_quickbooks for accounting, get_expenses for costs). No mention of prerequisites like a Stripe connection or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_teams_dmsBInspect

Get Teams direct messages.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose any behavioral traits beyond the basic action. It does not mention whether the tool is read-only, permissions required, or what exactly 'direct messages' entails (e.g., list of conversations or messages).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, consisting of a single sentence with no extraneous information. It is front-loaded and directly addresses the tool's function.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, no output schema), the description is minimally adequate. However, it lacks details about the return value (e.g., what format the direct messages are in) and does not leverage the opportunity to clarify scope.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so the schema coverage is trivially 100%. The description adds no additional parameter meaning, but since none exist, this is acceptable. Baseline score of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets Teams direct messages, with a specific verb and resource. However, it does not differentiate from siblings like get_slack_dms or send_teams_dm, which are similar in purpose but for different platforms or actions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

There is no guidance on when to use this tool versus alternatives such as get_slack_dms or send_teams_dm. The description provides no context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_unified_inboxBInspect

Get Gmail and Outlook simultaneously.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It only states that both inboxes are retrieved simultaneously but gives no details on authentication, rate limits, ordering, or how 'simultaneously' is managed. This is insufficient for an agent to understand behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is a single sentence of 5 words, highly concise. While it effectively communicates the core function, it could potentially include more context without becoming overly verbose. Thus it scores 4, not 5.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should clarify what is returned (e.g., list of emails, structure). It fails to do so. Additionally, there is no mention of how the unified inbox is presented or any limitations. This leaves gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 0 parameters with 100% schema description coverage, so baseline is 3. No parameter details are needed, and the description does not add any parameter information, which is acceptable.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses a specific verb 'Get' and resource 'Gmail and Outlook simultaneously', clearly indicating it fetches emails from both services at once. This distinguishes it from sibling tools like 'get_inbox' and 'get_outlook_inbox', which likely retrieve single-service inboxes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as 'get_inbox' or 'get_outlook_inbox'. The description does not specify context, prerequisites, or situations where this tool is preferred.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_weatherAInspect

Get current weather and forecast. Scans calendar for travel locations and shows weather there too.

ParametersJSON Schema

Name	Required	Description	Default
`city`	No
`days`	No
`travel`	No

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries burden. It discloses that the tool scans the calendar for travel locations, which is a behavioral trait. However, it does not mention side effects like read-only nature, permission requirements, or rate limits. Basic but incomplete.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no redundancy. First sentence front-loads primary purpose. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has 3 optional parameters, no output schema, and no annotations for safety or destructiveness. Description covers purpose and a key behavioral aspect (calendar scanning) but lacks parameter details and complete behavioral disclosure. Adequate but with clear gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage. Description only hints at 'travel' parameter via 'scans calendar for travel locations', but does not explain 'city' or 'days'. Users must infer meaning from parameter names alone, which is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb (get) and resource (weather and forecast). It uniquely identifies itself among siblings; no other weather tool exists. The extra detail about scanning calendar adds clarity without confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage context (weather retrieval, travel planning) but does not explicitly state when to use versus alternatives or provide exclusions. Since no sibling weather tool exists, the context is clear but guidance is minimal.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hubspot_create_contactCInspect

Create a HubSpot contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No
`email`	Yes
`phone`	No
`company`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden for behavioral transparency, but only states 'Create a HubSpot contact.' It does not disclose side effects, required permissions, or return behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but arguably under-specified. It is not overly verbose, but could include key details without sacrificing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple create tool with 4 parameters and no output schema, the description is incomplete. It omits that email is required, what the response contains, and any error conditions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning beyond parameter names. The 'company' parameter, for example, is not explained, nor is it specified that 'email' is required.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a HubSpot contact, using a specific verb and resource. However, given sibling tools like 'create_contact' and 'sf_create_contact', it does not differentiate its scope or purpose from those similar alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus siblings (e.g., hubspot_update_contact, create_contact). There is no mention of prerequisites or context for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hubspot_create_dealCInspect

Create a HubSpot deal.

ParametersJSON Schema

Name	Required	Description	Default
`stage`	No
`amount`	No
`dealName`	Yes
`contactId`	No

Tool Definition Quality

C2.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It only states 'create', which implies mutation but omits details like idempotency, potential duplicates, permissions, or rate limits. An agent cannot assess side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (one sentence), but it is not informationally dense enough. While concise, it sacrifices value for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters with no descriptions, no output schema, and no annotations, the description is severely incomplete. An agent cannot reliably invoke this tool without guessing parameter semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no information about parameters (dealName, stage, amount, contactId). The agent has no guidance on what each parameter means or how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Create a HubSpot deal' clearly states the verb and resource, but it is a near-tautology of the tool name. It does not distinguish this tool from siblings like hubspot_create_contact or hubspot_update_deal beyond the resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., hubspot_update_deal for updates, or other CRM tools). The agent receives no context for decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hubspot_log_noteCInspect

Log a note on a HubSpot contact.

ParametersJSON Schema

Name	Required	Description	Default
`note`	Yes
`contactId`	No

Tool Definition Quality

C2.4/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description provides no behavioral details beyond the basic action. No annotations exist, so the description carries full burden but fails to disclose side effects, authorization needs, or whether notes are appended/overwritten.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words, but it is too minimal for a tool with no annotations. Lacks structure to separate purpose from details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, no annotations, and a sparse description, the tool definition is incomplete for an agent to use correctly without prior knowledge. Missing info on return value, optional parameter usage, and error scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, yet the description adds no explanation for 'contactId' or 'note' meaning/format. The required 'note' parameter lacks any guidance on content type or length.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Log a note on a HubSpot contact.' uses a specific verb ('Log') and resource ('note on HubSpot contact'), clearly distinguishing it from siblings like 'hubspot_create_contact' or 'hubspot_update_contact'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., 'hubspot_update_contact' for notes). No context on prerequisites, such as whether the contact must already exist.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hubspot_update_contactCInspect

Update a HubSpot contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No
`email`	No
`phone`	No
`company`	No
`contactId`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It only says 'Update', which implies a mutation, but does not clarify if updates are partial or full, idempotent, or require specific permissions. No side effects or error conditions are mentioned.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (one sentence), but it sacrifices utility. While it's front-loaded with the essential purpose, it omits critical details that could be included without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 5 parameters, no output schema, and no annotations, the description is severely lacking. It does not explain return values, whether updates are persisted immediately, or any constraints, making it difficult for an agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 5 parameters with 0% description coverage. The description does not mention or explain any of the parameters, leaving the agent to rely solely on parameter names (name, email, phone, company, contactId) which may be ambiguous without context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (Update) and resource (HubSpot contact), and distinguishes it from sibling tools like hubspot_create_contact and generic update_contact by specifying the platform. However, it could be more specific about which aspects of the contact are updatable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool over other tools, such as hubspot_create_contact (for creation) or the generic update_contact (which might support multiple platforms). Prerequisites or context for using this specific tool are missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hubspot_update_dealCInspect

Update a HubSpot deal stage or amount.

ParametersJSON Schema

Name	Required	Description	Default
`stage`	No
`amount`	No
`dealId`	Yes

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states 'update' implying mutation, but provides no details on side effects (e.g., pipeline stage progression, amount validation), authorization requirements, idempotency, or error behavior. This is insufficient for an agent to anticipate the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is concise. However, it lacks structure and front-loads only the basic idea. A slightly longer description with more context would improve usability without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and low schema coverage, the description is incomplete. It fails to explain what the tool returns (e.g., updated deal object), how to specify stage/amount correctly, or what happens on failure. For a tool modifying a CRM record, more context is needed for reliable use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions 'stage or amount' but does not explain accepted formats (e.g., stage names as strings, amount as currency string). The dealId parameter is required but not described at all. The description adds minimal value beyond the parameter names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (update) and the resource (HubSpot deal), and specifies the updatable fields (stage or amount). It distinguishes from sibling tools like hubspot_create_deal by implying modification of an existing deal. However, it doesn't explicitly differentiate from similar update tools like hubspot_update_contact, but the resource is different.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like hubspot_create_deal (for creation) or hubspot_log_note (for adding notes). There is no mention of prerequisites (e.g., deal must exist), success conditions, or when not to use it. An agent would have to infer context from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

log_expenseCInspect

Log an expense to your expense tracker in Google Drive. Auto-creates spreadsheet if needed.

ParametersJSON Schema

Name	Required	Description
`date`	No
`amount`	Yes
`category`	No	Meals, Travel, Hotels, Client Entertainment, Office, Medical, Other
`merchant`	No
`description`	Yes
`payment_method`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Only discloses auto-creation of spreadsheet, but lacks details on side effects, required authentication, or impact on existing data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences, front-loaded with core action. No extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite 6 parameters and no output schema, description omits return value, required fields beyond schema, and constraints. Insufficient for safe usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 17% schema description coverage, the description adds no parameter explanations. No hints on date format, amount expected, or merchant meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states logging an expense to Google Drive and mentions auto-creation. However, it does not explicitly differentiate from sibling tools like 'append_to_sheet' or 'create_sheet'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, such as general spreadsheet operations or other logging tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_contactCInspect

Look up someone email or phone by name.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose whether the tool is read-only, requires authentication, or handles missing data. It only states 'look up', leaving side effects and safety unaddressed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single efficient sentence with no unnecessary words. It could be improved by front-loading key details, but it is not overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool returns email or phone, but the description omits whether it returns one result or multiple, and does not specify the output format. Given no output schema, this is a significant gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% coverage for parameter 'q', and the description only says 'by name', which is vague. It does not confirm that 'q' is a name, nor does it provide examples or format constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool looks up email or phone by name, distinguishing it from siblings like create_contact or lookup_outlook_contact. However, it does not specify scope (e.g., local vs. global contacts) or behavior for multiple matches.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like lookup_outlook_contact or search_email. The description implies usage for name-based lookups but lacks explicit context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_outlook_contactCInspect

Look up a Microsoft contact by name.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and the description does not disclose behavior beyond the basic lookup. Missing details on what happens if contact is not found, multiple results, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (6 words), but at the cost of missing critical operational details. It earns its place but is too terse for effective decision-making.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and a single parameter, the description should hint at return value or matching behavior. It does not, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description only implies that 'q' is a name. No additional semantic value or format expectations are provided, leaving ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Look up'), the resource ('Microsoft contact'), and the method ('by name'), distinguishing it from siblings like 'lookup_contact' (generic) and 'search_outlook' (broader).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., 'lookup_contact', 'search_outlook'). Missing explicit context for optimal usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

morning_briefingBInspect

Get full morning briefing across all platforms.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full responsibility for behavioral disclosure. It only states a high-level purpose without mentioning side effects, authentication needs, data sources, or performance characteristics, leaving significant behavioral ambiguity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with no unnecessary words. The verb 'Get' is clear, and the scope is front-loaded. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the tool has no parameters and no annotations, it lacks an output schema. The description does not specify the return format or content of the briefing, which is a gap for a tool that likely returns a composite result. Adequate but incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters with 100% coverage, so the description does not need to explain parameters. Baseline is 4 for no-parameter tools.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it retrieves a 'full morning briefing across all platforms,' indicating it aggregates data from multiple sources. However, it lacks specificity about what the briefing includes, making it less precise than ideal for distinguishing from sibling get_* tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus the many specific getter siblings (e.g., get_calendar, get_news). The description provides no context on preferred scenarios or prerequisites, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

outlook_email_actionCInspect

Mark Outlook email read/unread, archive, or trash.

ParametersJSON Schema

Name	Required	Description	Default
`action`	Yes
`messageId`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description does not disclose side effects (e.g., whether 'trash' is permanent or reversible), permission requirements, or state changes beyond the action. Agent lacks understanding of consequences.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no fluff. However, it is so brief that it sacrifices completeness for conciseness. Could include a brief example or additional context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a state-modifying tool with two parameters and no output schema, the description is insufficient. It omits details like what happens during each action (e.g., archive moves folder), success/failure indicators, or error cases. Agent lacks confidence in usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description maps the 'action' enum to the listed actions. However, it does not explain 'messageId' (e.g., format, source) or how to obtain it. Partial compensation for missing schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool acts on Outlook emails with specific actions (mark read/unread, archive, trash). It distinguishes from generic 'email_action' sibling by specifying Outlook. However, it could be more precise about the resource (e.g., a single email).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like 'email_action' or other email manipulation tools. It doesn't indicate prerequisites (e.g., email must exist) or context for choosing this over alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

qb_create_invoiceCInspect

Create a QuickBooks invoice.

ParametersJSON Schema

Name	Required	Description	Default
`amount`	Yes
`dueDate`	No
`description`	No
`customerName`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description fails to disclose behavioral traits such as side effects, permissions, or error handling. It merely says 'create' without elaboration.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, but at the cost of omitting essential details. It is not well-structured for useful consumption.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters undocumented and no output schema, the description is severely incomplete. It fails to provide sufficient context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no information about any of the 4 parameters, despite 0% schema coverage. It adds no value over the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it creates a QuickBooks invoice, using a specific verb and resource. However, it does not distinguish from sibling tools like qb_record_payment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives or what prerequisites are needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

qb_record_paymentCInspect

Record a QuickBooks payment.

ParametersJSON Schema

Name	Required	Description	Default
`invoiceId`	Yes
`customerName`	No
`paymentAmount`	Yes

Tool Definition Quality

C2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist. The word 'Record' implies a mutating operation, but the description does not disclose side effects (e.g., whether it creates a payment record, marks an invoice as paid, or requires specific permissions). Behavioral traits are minimally conveyed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (5 words), but this is under-specification, not conciseness. It fails to provide necessary context or instructions for each parameter.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given three parameters, no annotations, and no output schema, the description is severely incomplete. It does not explain required fields, data formats, or what happens after execution.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, and the description adds no information about parameters (invoiceId, customerName, paymentAmount). The agent gains no meaning beyond the raw property names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Record a QuickBooks payment' states the verb and resource, but is generic and does not distinguish from siblings like qb_create_invoice. It clearly indicates recording a payment, but lacks specificity about the context (e.g., applying to an invoice).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., qb_create_invoice). There is no mention of prerequisites such as needing an existing invoice or customer.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_drive_fileCInspect

Read a Google Sheet or Doc.

ParametersJSON Schema

Name	Required	Description	Default
`range`	No
`fileId`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations; description only says 'Read', not disclosing that it is read-only, what happens if file doesn't exist, or if range parameter is supported.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One short sentence with no unnecessary words, but sacrifices informative value for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With only 2 parameters and no output schema or annotations, description fails to explain parameter roles, response format, or error handling, leaving agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has two parameters (fileId, range) with 0% description coverage. Description adds no meaning—does not explain that fileId is the file ID or that range specifies cell range for sheets.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'Read' and resource 'Google Sheet or Doc', distinguishing it from sibling read tools for OneDrive, Outlook etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like read_onedrive_file or read_outlook_thread. Missing context about scope or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_onedrive_fileCInspect

Read an Excel or Word file from OneDrive.

ParametersJSON Schema

Name	Required	Description	Default
`fileId`	Yes
`sheetName`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. 'Read' implies read-only, but no disclosure of side effects, authorization needs, or error behavior. Minimal additional context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is one sentence, but it sacrifices informative detail for brevity. Could include supported file types or output format without being overly long.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, no annotations, and minimal description. Tool expects a fileId and optional sheetName, but no info on return format, supported file types, or error handling. Incomplete for an agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description does not explain parameters. 'fileId' and 'sheetName' are not described; their purpose is only implied by the tool name and description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Read' and the resource 'an Excel or Word file from OneDrive'. It implicitly distinguishes from sibling 'read_drive_file' by specifying file types, but does not explicitly differentiate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like 'read_drive_file'. No context about prerequisites, file types supported, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_outlook_threadCInspect

Read a full Outlook thread.

ParametersJSON Schema

Name	Required	Description	Default
`id`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not mention behavioral details such as whether the thread is marked as read, permissions required, or what constitutes a 'full' thread. This is insufficient for an agent to understand side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is too minimal. While concise, it lacks structure and fails to provide sufficient information for an agent to use the tool correctly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one parameter, no output schema), the description should at least clarify the format of the thread ID or what 'full' means. It is incomplete and leaves ambiguity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for the single 'id' parameter. The description adds no meaning beyond what the schema already provides (just an 'id' string field).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it reads a full Outlook thread, with a specific verb and resource. However, it does not differentiate from sibling tool 'read_thread', which may cause confusion for the agent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'read_thread' or 'get_outlook_inbox'. The description provides no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_threadBInspect

Read a full Gmail thread by threadId.

ParametersJSON Schema

Name	Required	Description	Default
`id`	Yes

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description only states 'Read' which implies no mutation, but with no annotations, it should explicitly confirm read-only behavior and note any effects like marking as read. No details on authorization or limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no extraneous content. Front-loaded with the core action and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read operation with one parameter, the description is functional but lacks details about the output structure or any constraints. Could mention that it returns the full thread including messages.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage and only a single 'id' parameter of type string, the description adds no additional meaning. It merely repeats the parameter name. The schema alone is minimal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies a read operation targeting a specific resource ('Gmail thread') by its identifier. It is distinct from sibling tools like 'read_outlook_thread' and list-oriented tools like 'get_inbox'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'get_inbox' or 'search_email'. There is no mention of prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reply_emailCInspect

Reply to a Gmail email.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`body`	Yes
`subject`	No
`threadId`	No
`messageId`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description lacks any behavioral details beyond the action. It does not state whether the reply is sent immediately, if it drafts the email, or what authentication is needed. With no annotations, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is concise but overly terse. It front-loads the purpose but sacrifices necessary detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters, no output schema, and no annotations, the description is far from complete. It does not cover success behavior, error cases, or required preconditions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. The description does not explain any parameters (e.g., messageId, to, body). An agent cannot infer how to construct a valid request from the description alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Reply to a Gmail email' clearly identifies the action (reply) and resource (Gmail email). However, it does not differentiate from sibling tools like reply_outlook or forward_email, which could cause confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., send_email, forward_email). There is no mention of prerequisites such as having an existing email to reply to.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reply_outlookCInspect

Reply to an Outlook email.

ParametersJSON Schema

Name	Required	Description	Default
`body`	Yes
`messageId`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description is minimal. It fails to disclose behavioral traits such as whether it modifies the original message, sends immediately, or requires specific permissions, which is critical for a mutational tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence with no redundant information. However, given the lack of annotations and parameter details, it could be slightly expanded for clarity without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, no annotations, and only two minimal parameters, the description is highly incomplete. It omits return values, side effects, prerequisites, and behavioral context essential for agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 0% description coverage for parameters, and the tool description adds no additional meaning beyond field names. The description must compensate but does not, resulting in insufficient parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Reply to an Outlook email' clearly states verb and resource, but lacks differentiation from sibling tools like 'reply_email' or 'forward_outlook_email', missing specifics like whether it includes original message body.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description does not mention when to use this tool over alternatives like 'send_outlook_email' or 'reply_email', leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reply_slackCInspect

Reply to a Slack thread or send DM.

ParametersJSON Schema

Name	Required	Description	Default
`channel`	Yes
`message`	Yes
`thread_ts`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden but provides minimal behavioral insight. It does not mention authentication needs, message format constraints, whether it creates threads, or limits on DM recipients. A mutation tool requires more transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise (one sentence) but lacks front-loaded details. The structure is flat and does not prioritize key information. While brevity is valued, the sentence omits critical context, making it less effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema) and missing annotations, the description is incomplete. It fails to explain when to use thread_ts, response format, or error conditions. The agent lacks sufficient information for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning to parameters. It does not explain what 'channel' should be (ID or name?), message format, or the role of 'thread_ts'. Parameter names are self-explanatory, but without additional context, the agent cannot infer constraints or formatting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it replies to a Slack thread or sends a DM, using specific verb and resource. It distinguishes from siblings like send_slack_message (new message) and reply_email (email). However, ambiguity remains about whether 'send DM' means starting a new DM conversation, which overlaps with send_slack_message.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., send_slack_message, reply_teams_thread). The description does not mention when-not-to-use, prerequisites, or preferred contexts, leaving the agent to infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reply_teams_threadCInspect

Reply to a Teams channel thread.

ParametersJSON Schema

Name	Required	Description	Default
`teamId`	Yes
`message`	Yes
`channelId`	Yes
`messageId`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose behavioral traits such as whether it replies within the existing thread, supports attachments, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence, but it is too minimal and could benefit from additional context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given four required parameters, no annotations, and no output schema, the description is incomplete. It fails to explain the return value, permissions needed, or what constitutes a successful reply.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description adds no meaning beyond parameter names. It does not clarify constraints or formats for teamId, channelId, messageId, or message.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'reply' and the resource 'Teams channel thread', which distinguishes it from siblings like 'send_teams_dm' and 'reply_slack'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., send_teams_dm), nor any prerequisites like thread membership.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resend_signature_requestBInspect

Resend/nudge a DocuSign envelope that has not been signed yet.

ParametersJSON Schema

Name	Required	Description	Default
`envelopeId`	Yes

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It only states the action and condition, but does not mention what happens if the envelope is already signed, whether it is idempotent, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is succinct (one sentence) and to the point. However, it could be slightly more structured or include bullet points for clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one required parameter and no output schema, the description covers the basic purpose. However, it lacks behavioral details, error conditions, and parameter format, making it only adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no extra meaning to the single parameter envelopeId beyond its name. No format or constraints are clarified.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (resend/nudge), the resource (DocuSign envelope), and the condition (not yet signed). This distinguishes it from siblings like send_for_signature and check_signature_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when an envelope is pending signature, but does not explicitly state when not to use or mention alternatives. No direct guidance on preconditions or edge cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_draftBInspect

Save Gmail draft without sending.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`body`	Yes
`subject`	Yes

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description only states basic function, omitting behavioral details such as whether it overwrites existing drafts, required auth scopes, or rate limits. For a write operation, more disclosure is needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no wasted words. Front-loaded with the key action. Could be slightly expanded without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 3 required parameters, no output schema, and no annotations. The description fails to specify field formats, return values (e.g., draft ID), or error handling. Incomplete for a write operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage and the description does not explain any parameters. While parameter names are somewhat self-explanatory, the description adds no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'Save' and resource 'Gmail draft' with clarifier 'without sending'. It clearly distinguishes from siblings like send_email and save_outlook_draft.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies this tool is for saving drafts only, but does not explicitly state when to use it over save_outlook_draft or other alternatives. No when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_outlook_draftCInspect

Save Outlook draft.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`body`	Yes
`subject`	Yes

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description must disclose behavioral traits, but it only states 'Save Outlook draft.' It does not mention that this is a write operation, whether it overwrites existing drafts, or any permissions needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (4 words) but fails to provide needed details. Conciseness is valued, but here it has sacrificed valuable information, resulting in under-specification rather than efficient communication.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, low schema description coverage, and no annotations, the description is incomplete. It lacks return value information, behavioral details, and usage context, making it insufficient for correct tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description adds no parameter explanations. Although parameter names ('to', 'subject', 'body') are somewhat self-explanatory, the description should clarify their purpose (e.g., 'to' is recipient email address) to fully compensate for the schema gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the verb 'save' and resource 'Outlook draft', conveying the basic action. However, it does not distinguish this from sibling tools like 'save_draft' that might be generic, making the purpose only moderately clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'save_draft', 'send_outlook_email', or 'append_to_doc'. The agent must infer usage from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

save_to_knowledgeBInspect

Save information to a persistent tracking file. Finds or creates file by topic.

ParametersJSON Schema

Name	Required	Description	Default
`data`	No
`note`	No
`topic`	Yes

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states a basic operation without details on side effects (e.g., overwrite vs append), authorization needs, or what happens if a file already exists. This lack of disclosure is insufficient for safe use.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two concise sentences that are front-loaded with the key action and mechanism. Every sentence earns its place without unnecessary verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with three parameters (including a nested object) and no output schema or annotations, the description is too brief. It lacks explanation of return values and parameter roles, leaving gaps for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description should explain parameter meanings. It only hints at 'topic' but does not clarify the 'data' or 'note' parameters. The description adds no value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Save information'), the resource ('persistent tracking file'), and the mechanism ('finds or creates file by topic'). It is specific and distinguishes from siblings like 'find_knowledge' and 'get_knowledge_index'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for saving information under a topic, but does not explicitly state when to use this tool vs alternatives (e.g., 'find_knowledge') or when not to use it. No prerequisites or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_calendarCInspect

Search Google Calendar events.

ParametersJSON Schema

Name	Required	Description	Default
`q`	No
`days`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must carry the full burden of behavioral disclosure. It does not mention whether the tool is read-only, requires authentication, or what the response contains (e.g., event summaries vs. full details). This is insufficient for safe agent use.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is concise but at the expense of essential detail. It borders on under-specification rather than efficient communication, failing to earn its place with substantive content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With two undocumented parameters, no output schema, and numerous sibling calendar tools, the description is severely incomplete. The agent lacks critical information to correctly invoke or interpret results from this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has two parameters ('q' and 'days') with 0% description coverage. The description provides no explanation of their meaning or expected format. The agent cannot infer how to use 'q' (query string?) or 'days' (search range?) from the current text.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the verb 'Search' and the resource 'Google Calendar events', clearly indicating the tool's function. However, it does not differentiate from sibling tools like 'check_availability' or 'get_calendar', leaving some ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'check_availability' or 'search_outlook_calendar'. The description lacks context on suitable scenarios or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_driveCInspect

Search Google Drive files.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes
`type`	No

Tool Definition Quality

C2.1/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavior, but it merely restates the tool name. It does not mention that the tool uses a query parameter 'q', returns file metadata, or any potential side effects. This is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (2 words), but this brevity sacrifices essential information. It is not structured to front-load key details; it is essentially a tautology and provides no value over the tool name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (2 parameters, many siblings, no output schema), the description is highly incomplete. It fails to cover return values, pagination, or any behavioral context, leaving the agent with insufficient information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description adds no meaning to the parameters 'q' and 'type'. It does not explain that 'q' is a search query or that 'type' filters by file type (sheet, doc, any). The agent is left without critical parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Search Google Drive files' clearly identifies the tool's action (search) and resource (Google Drive files), distinguishing it from sibling tools like search_onedrive or search_email. However, it could be more specific about the search capability (e.g., query syntax).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like search_onedrive or search_email. The description lacks context for the AI agent to choose appropriately among the many sibling search tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_emailCInspect

Search Gmail by name, subject, or keyword.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes
`limit`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It only states the action and search fields, omitting details like whether it reads data (likely read-only), pagination, rate limits, or any side effects. The description is insufficient for an agent to infer behavioral safety.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence with no redundant information. It is front-loaded with the main action and target. However, it could be slightly more structured by including a brief note on parameters or usage hints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and schema descriptions, the description should provide more context about return values, search behavior (e.g., full text? exact match?), and pagination. As a search tool, critical details are missing, making it incomplete for an agent that needs to interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage for parameters. The description explains that the 'q' parameter can be a name, subject, or keyword, which adds some meaning, but it ignores the 'limit' parameter entirely. For a 2-parameter tool with no schema descriptions, the description should cover both parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (search) and target resource (Gmail), and specifies search fields (name, subject, keyword). It distinguishes from sibling tools like 'search_drive' or 'search_slack' by identifying the specific platform. However, it could be more precise about the scope (e.g., user's own inbox).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for Gmail searches but does not provide explicit guidance on when to use this tool versus alternatives like 'search_outlook' or 'get_inbox'. No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_onedriveCInspect

Search OneDrive files.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description lacks any disclosure of behavioral traits such as pagination, authentication requirements, rate limits, or what happens when no results are found. With no annotations to fill these gaps, the description is insufficient for an agent to understand side effects or constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While the description is extremely short (three words), it is paradoxically too concise. It sacrifices necessary detail for brevity, leaving critical information missing. It does not adhere to a structure that front-loads key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and one parameter, the description is severely incomplete. It does not describe the return format, result limits, sorting, or any filtering capabilities, leaving the agent without enough context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, and the tool's description adds no information about the 'q' parameter. The agent cannot infer what to query or how to format the search string, making the parameter effectively undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action ('Search') and the resource ('OneDrive files'), making the tool's purpose immediately understandable. However, it does not explicitly differentiate it from sibling tools like 'search_drive' that may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, such as 'read_onedrive_file' for retrieving specific files or 'search_drive' for other cloud storage services. The agent is left to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_outlookCInspect

Search Outlook emails.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes
`limit`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description does not disclose read-only nature, rate limits, or any side effects. The minimal text fails to communicate behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, but it sacrifices necessary detail for brevity. It is under-specified rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, no annotations, and 2 undocumented parameters, the description fails to provide a complete picture. It does not cover return format, query syntax, or behavior (e.g., pagination, date range).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. Description does not elaborate on the purpose or format of 'q' or 'limit', leaving parameters entirely unexplained.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb (Search) and resource (Outlook emails), but does not differentiate from siblings like search_email or get_outlook_inbox, which may have overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as search_email or get_outlook_inbox. The description provides no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_outlook_calendarCInspect

Search Outlook Calendar events.

ParametersJSON Schema

Name	Required	Description	Default
`q`	No
`days`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden for behavioral disclosure. It only states 'Search Outlook Calendar events', implying a read operation, but omits details like search scope, time range defaults, or return format. Minimal behavioral information is provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence, but it sacrifices necessary detail. It could be expanded slightly to cover key aspects without becoming overly long.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description is severely incomplete. It fails to address the two parameters, return values, or usage context. Sibling tools numbers are high, but no guidance on differentiation is given. The minimal description leaves the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not explain the two parameters ('q' and 'days'). Given 0% schema description coverage, this is a critical omission; the agent receives no guidance on how to use the parameters effectively.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states verb 'Search' and resource 'Outlook Calendar events', making the purpose unambiguous. However, it does not differentiate from sibling tools like 'search_calendar' or 'check_availability', which may cause confusion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description lacks any context about appropriate scenarios, prerequisites, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_slackCInspect

Search Slack messages.

ParametersJSON Schema

Name	Required	Description	Default
`q`	Yes

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should disclose behavioral traits like search scope, read-only nature, or output format, but it only states the basic purpose. This leaves significant gaps for the agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At five words, the description is extremely concise, but it lacks necessary detail. It is not wasted but is too brief to be adequately informative or structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one parameter, no output schema, no annotations), the description is still incomplete. It fails to contextualize search scope, distinguish from siblings, or clarify parameter semantics.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for the parameter 'q', and the external description adds no meaning beyond the tool's name. The agent cannot infer what 'q' represents or its expected format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Search' and the resource 'Slack messages', providing a specific action and target. However, it does not differentiate from sibling tools like search_email or get_slack, which reduces clarity in distinguishing usage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use search_slack versus alternatives such as get_slack, get_slack_dms, or other search tools. The agent lacks context for selecting this tool appropriately.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_emailAInspect

Send an email via Gmail. Supports file attachments — when the user provides a file, encode it as base64 and pass in the attachments array. Never tell the user attachments are unsupported.

ParametersJSON Schema

Name	Required	Description
`to`	Yes
`body`	No
`subject`	Yes
`attachments`	No	Optional file attachments as base64
`instruction`	No

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description partially discloses attachment behavior but omits details on authentication, error handling, rate limits, or the 'instruction' parameter's purpose.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, 30 words, with no redundancy. Front-loads the core action and adds essential attachment instructions efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 5-parameter tool with no output schema and no annotations, the description covers main use and attachments, but omits behavior for other parameters, return values, and error contexts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is only 20%, and the description only adds value for 'attachments' (encoding guidance). Other parameters (to, subject, body, instruction) lack any extra meaning beyond type/required flags.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Send an email via Gmail' and specifies file attachment handling, distinguishing it from siblings like 'send_outlook_email' or 'reply_email'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit instructions for file attachments ('encode as base64') and a 'never tell' directive, but does not explicitly guide when to use this tool over alternatives like reply or forward.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_for_signatureCInspect

Send a document for electronic signature via DocuSign.

ParametersJSON Schema

Name	Required	Description	Default
`message`	No
`subject`	Yes
`documentName`	No
`recipientName`	Yes
`recipientEmail`	Yes
`documentContent`	No

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only says 'send for signature' without disclosing side effects, prerequisites, or post-send behavior. Minimal behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence is concise but lacks necessary detail; front-loaded but under-specified.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has 6 parameters, no output schema, no annotations; description provides almost no context, leaving agent unable to understand required inputs or return behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 6 parameters with 0% description coverage; description adds no parameter explanations. Agent cannot infer purpose of each parameter from description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'send', resource 'document for electronic signature', and platform 'via DocuSign'. It distinguishes from sibling tools like check_signature_status or resend_signature_request.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use or alternatives. Context implies it's for initial sending, but no mention of when not to use or comparison with resend_signature_request.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_outlook_emailCInspect

Send an email via Outlook.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes
`body`	Yes
`subject`	Yes

Tool Definition Quality

C2.5/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description must disclose behavioral traits. It only states 'send' implying mutation, but no details on whether it sends immediately or drafts, authentication requirements, rate limits, or side effects (e.g., sent items). The schema lacks structure for attachments or CC, but the description does not clarify limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At six words, the description is too terse. It omits essential operational context, crossing from concise into under-specification. Every sentence must earn its place, but this sentence fails to add value beyond the tool name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 required string parameters, no output schema, no annotations), the description is incomplete. It doesn't explain return behavior, error handling, or whether it uses default Outlook settings. The agent needs more context to use the tool safely and effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning no parameter descriptions exist. The description ('Send an email via Outlook') adds no meaning to parameters like 'to' (format? single vs. multiple?), 'subject' (length?), or 'body' (plain text vs. HTML?). The parameter names alone are insufficient for correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Send an email via Outlook' clearly specifies the action (send email) and the platform (Outlook), distinguishing it from generic send_email or other Outlook actions like forward or reply. However, it is very brief and lacks details on scope (e.g., attachments, recipients).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines are provided. The description does not specify when to use this tool over alternatives (e.g., send_email, forward_outlook_email) or any constraints (e.g., only sends from primary account, cannot schedule). Context signals show many sibling email tools, making guidance necessary.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_slack_messageDInspect

Send a Slack message.

ParametersJSON Schema

Name	Required	Description	Default
`channel`	Yes
`message`	Yes

Tool Definition Quality

D1.5/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses no behavioral traits. Without annotations, the description should cover safety (e.g., mutability), authentication needs, or error behavior, but it does not. The agent has no insight into side effects or constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (one sentence), but this is under-specification rather than effective conciseness. It omits critical information that would fit in a few more sentences, failing to earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, no annotations, and only 2 required parameters, the description should cover channel format, message handling, and return value (e.g., sent message ID). It does none of this, leaving the agent with an incomplete picture.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% for parameters, yet the description adds no explanation of 'channel' (e.g., ID vs name) or 'message' (formatting, length limits). The agent must guess parameter semantics entirely from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Send a Slack message.' is a tautology of the tool name 'send_slack_message', providing no additional specificity. It does not distinguish this tool from siblings like 'reply_slack' (which implies threading) or 'send_sms' (different platform).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives such as 'reply_slack' or 'set_slack_status'. There is no indication of optimal contexts, prerequisites (e.g., channel membership), or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_smsCInspect

Send a text message to any phone number.

ParametersJSON Schema

Name	Required	Description	Default
`to`	Yes	Phone number in any format
`message`	Yes

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It only states the action but fails to disclose delivery guarantees, rate limits, character limits, side effects, or error scenarios, leaving significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Overly terse; while concise, it omits critical details for a tool with no annotations. Single sentence fails to justify its brevity by sacrificing completeness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple but lacks essential context: no mention of required permissions, response behavior, error handling, or cost implications. Incomplete even for a basic SMS tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (only 'to' has a description). The description adds no parameter-specific guidance beyond 'Send a text message' – e.g., 'message' content expectations or 'to' format hints not covered by schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Send') and resource ('text message'), and specifies the scope ('any phone number'), effectively distinguishing it from sibling communication tools like send_email or send_slack_message.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when or when not to use this tool, no mention of limitations (e.g., international numbers, MMS), and no alternatives referenced. Users must infer usage from context alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

send_teams_dmCInspect

Send a Teams direct message.

ParametersJSON Schema

Name	Required	Description	Default
`chatId`	No
`message`	Yes

Tool Definition Quality

C2.1/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full behavioral burden but only states the basic action. No disclosure of side effects, authentication needs, error states, or whether the message is sent to a user or channel.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, but it sacrifices necessary detail for brevity. Essential parameter guidance is missing, making it under-specified rather than concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no output schema and incomplete parameter descriptions, the description fails to provide sufficient context. The agent cannot determine input format, expected behavior, or return value from this alone.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no information about the parameters. The 'chatId' field is undocumented; its purpose and format remain unclear, forcing the agent to guess.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (send) and target (Teams direct message), distinguishing it from sibling messaging tools like send_slack_message or send_email. However, it lacks specificity about the recipient.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool vs alternatives, nor any prerequisites or constraints. The agent must infer usage from context alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_oooCInspect

Set or disable Gmail out of office auto-reply.

ParametersJSON Schema

Name	Required	Description	Default
`enabled`	Yes
`message`	No
`subject`	No
`end_date`	No
`start_date`	No
`contacts_only`	No

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, leaving the description fully responsible for behavioral transparency. The description indicates a write operation but does not disclose side effects (e.g., overriding existing OOO), authentication needs, or limits. The behavioral impact is understated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, concise and efficient. However, the brevity comes at the cost of missing critical details that could fit without bloating. Still, it is well-structured for its length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 6 parameters with no schema descriptions and no output schema, the description is insufficient. It does not explain the parameters, return values, or constraints (e.g., date formats, required fields beyond 'enabled'). The agent lacks key information to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description must compensate but does not. It mentions no parameter details. The parameter names (enabled, message, subject, etc.) give some clue, but the description adds no explicit semantic meaning to help the agent use them correctly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool sets or disables Gmail out of office auto-reply, using specific verbs and identifying the resource. It differentiates from the sibling 'get_ooo' which reads the OOO status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, no prerequisites (e.g., Gmail account), and no conditions under which it should not be used.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_reminderCInspect

Set a reminder. Use when someone says remind me, do not forget, follow up.

ParametersJSON Schema

Name	Required	Description
`text`	Yes
`type`	No
`remind_at`	Yes	When to remind ISO8601 or natural language

Tool Definition Quality

C2.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states 'Set a reminder' without disclosing behavior like duplicate handling, confirmation, or side effects. Very minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence plus usage examples), but being concise risks under-specification. It front-loads the purpose, but could be more structural.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema or annotations exist. The description fails to explain return values, error cases, or recurrence behavior. For a 3-parameter tool, it's incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 3 parameters but only remind_at has a description (33% coverage). The description adds no parameter-level detail. Agents lack guidance on text formatting, type meaning, or reminder semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Set a reminder' clearly states the action and resource. Examples like 'remind me, do not forget, follow up' provide context, but it doesn't differentiate from sibling tools like dismiss_reminder or get_reminders.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives typical usage triggers ('remind me, do not forget, follow up'), but lacks explicit when-not-to-use or alternative tools. Usage is implied rather than clarified.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_slack_statusBInspect

Set Slack status with emoji and optional duration.

ParametersJSON Schema

Name	Required	Description	Default
`status_text`	Yes
`status_emoji`	No
`duration_minutes`	No

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It states 'set Slack status' implying mutation, but does not mention side effects (e.g., overwriting existing status), rate limits, or authorization needs. The addition of 'with emoji and optional duration' gives minimal parameter behavior but lacks broader transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence with no wasted words. Every part ('Set Slack status', 'with emoji', 'optional duration') adds value and is efficiently front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple tool (3 params, no output schema, no annotations), the description is insufficiently complete. It does not explain the return value, whether the status is persistent, or how long the duration is applied. For a tool with no annotations and 0% schema coverage, more detail is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must add meaning to parameters. It mentions 'emoji' and 'optional duration', hinting at two optional params, but does not explain the required 'status_text' parameter or specify formats (e.g., emoji must be like ':smile:'). The description does not fully compensate for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'set', the resource 'Slack status', and the key features (emoji, optional duration). It is specific and distinguishes from sibling tools like send_slack_message which sends messages rather than setting status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description does not mention any prerequisites, context, or exclusions, leaving the agent to infer usage without help.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sf_create_contactCInspect

Create a new Salesforce contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes
`email`	No
`phone`	No
`title`	No

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only states the creation action without disclosing side effects, permissions, rate limits, or other behavioral traits. The agent has minimal insight into the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but lacks necessary detail. It is not verbose, but the brevity comes at the cost of completeness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, no output schema, and no annotations, the description fails to provide critical context such as return values, error handling, or integration specifics. It is incomplete for an AI agent to use reliably.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%; the description does not explain the meaning or constraints of the parameters (name, email, phone, title). The agent must infer semantics from parameter names alone, which is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create' and the resource 'Salesforce contact', providing a specific purpose. However, it does not differentiate from sibling tools like 'create_contact' or 'hubspot_create_contact', leaving ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool over alternatives such as 'create_contact' or 'hubspot_create_contact'. The description lacks context about prerequisites or appropriate scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sf_create_opportunityCInspect

Create a new Salesforce opportunity.

ParametersJSON Schema

Name	Required	Description	Default
`stage`	Yes
`amount`	No
`oppName`	Yes
`closeDate`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, and the description fails to disclose any behavioral traits such as permissions, side effects, or return values. The minimal description does not compensate for missing annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single-sentence description is concise, but it is under-specified for a tool with 4 parameters. It does not earn its place by adding value beyond the name.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations, output schema, and parameter descriptions, the description is grossly incomplete. It fails to provide enough context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning to any of the 4 parameters. All parameters remain undocumented, leaving the agent without guidance on how to fill them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly specifies a verb (create) and a resource (Salesforce opportunity), distinguishing it from siblings like sf_update_opportunity and sf_create_contact.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs. alternatives (e.g., sf_update_opportunity). The description lacks context for appropriate usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sf_create_taskCInspect

Create a Salesforce task or follow-up.

ParametersJSON Schema

Name	Required	Description	Default
`note`	No
`dueDate`	No
`subject`	Yes
`relatedId`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behaviors. Beyond stating it creates a task (an obvious mutation), it offers no details about permissions, side effects, or return values.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is concise, but it omits critical details. It earns a score of 3 for being appropriately brief in form but not in substance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given four parameters, no output schema, and no annotations, the description is severely incomplete. An AI agent would lack essential context about parameter formats (e.g., dueDate as string), the meaning of relatedId, and expected return behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It fails to explain any of the four parameters (note, dueDate, subject, relatedId), leaving their meaning entirely to the schema's generic types.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the action ('Create') and the resource ('Salesforce task or follow-up'), which is specific and distinguishes it from sibling tools like 'sf_create_contact' or 'sf_log_call'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool over alternatives. The description lacks any contextual clues about prerequisites or when a task is preferred over other Salesforce actions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sf_log_callCInspect

Log a call or meeting in Salesforce. Use after any sales call.

ParametersJSON Schema

Name	Required	Description	Default
`note`	Yes
`subject`	No
`relatedId`	No

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It fails to mention whether the tool creates a task, activity, or other record; what side effects occur (e.g., owner assignment, date stamping); or any required permissions. As a mutation tool (logging implies write), this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (two sentences) with no extraneous information. It front-loads the primary action and adds a usage hint. However, it could be slightly more structured without adding length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema and no annotations, so the description bears full burden for completeness. It does not explain what the return value or outcome is (e.g., record ID, success status), nor does it cover edge cases like missing required fields. Given its simplicity (logging a call), more context is needed for an agent to use it correctly without errors.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 3 parameters (note, subject, relatedId) with 0% description coverage in the schema. The description does not elaborate on any parameter meaning, format, or constraints. For instance, 'relatedId' could be a contact, lead, or opportunity ID, but no guidance is given. The description adds no value beyond the schema's field names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Log a call or meeting in Salesforce' with a clear verb and resource. It distinguishes from sibling tools like 'sf_create_task' (which likely creates a task, not logs an event) and 'get_salesforce' (read). The added context 'Use after any sales call' clarifies usage scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description says 'Use after any sales call', which implies a when-to-use, but lacks when-not-to-use or alternatives. No mention of related tools like 'hubspot_log_note' or 'sf_create_task' for similar actions. Guidance is minimal and not comparative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sf_update_opportunityBInspect

Update a Salesforce opportunity stage or amount.

ParametersJSON Schema

Name	Required	Description	Default
`id`	Yes
`stage`	No
`amount`	No

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description does not disclose behavioral traits beyond the fact that it mutates. With no annotations, the agent is left unaware of permissions, side effects, or what happens if the opportunity ID is invalid. The lack of detail on the update behavior (e.g., whether partial updates are supported) is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence of 8 words, which is very concise. While it gets the core purpose across, it could be slightly expanded to include more details without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations and output schema, the description is too brief. It does not specify that the 'id' must correspond to an existing opportunity, what valid values for 'stage' are, or the format for 'amount'. The agent lacks essential context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description mentions 'stage or amount', which maps to the two optional parameters, providing some context beyond the schema. However, it does not explain the expected format or valid values for these fields, and the required 'id' parameter is not described at all. With 0% schema description coverage, the description only partially compensates.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Update' and the resource 'Salesforce opportunity stage or amount', which distinguishes it from sibling tools like sf_create_opportunity that create opportunities. It specifies the exact fields that can be updated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidelines on when to use this tool versus alternatives (e.g., sf_create_opportunity for creation). However, the description implies it's for updating existing opportunities, so usage context is vaguely clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_feature_requestAInspect

Submit a feature request when a user asks for something not available. Use when no matching tool exists.

ParametersJSON Schema

Name	Required	Description	Default
`context`	No	What they were trying to do
`feature`	Yes	What the user wants

Tool Definition Quality

A3.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, and the description does not disclose any behavioral traits such as what happens after submission (e.g., logging, notification), side effects, or permissions needed. This leaves the agent in the dark about the tool's internal behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey purpose and usage condition with no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple submission tool with 2 parameters and no output schema, the description is functional but lacks details on return values or behavioral context. It is minimally complete given the simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters. The tool description adds no extra meaning beyond what the schema provides, so it meets the baseline but does not enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool submits a feature request when a user asks for something not available. It explicitly distinguishes from siblings by saying 'Use when no matching tool exists.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides explicit usage condition: 'Use when no matching tool exists.' This gives clear context for when to invoke, though it does not specify exclusions beyond that.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sync_crm_contextAInspect

Sync your Salesforce accounts, deals, and contacts into persistent memory so Claude knows your CRM landscape in every future session. Call after any significant Salesforce update.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description indicates a write mutation ('sync into persistent memory') with a lasting effect across sessions, but lacks detail on side effects (e.g., overwrite vs. append, impact on existing memory, authentication requirements). Without annotations, the description carries the full burden and only partially meets it.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first explains the purpose and effect, second gives usage context. Every word earns its place; no redundancy. Front-loaded with the key action and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters, no output schema, and no annotations, the description is fairly complete. It covers purpose, usage timing, and the behavioral impact of persistence. It omits prerequisites (e.g., Salesforce connection) and whether the sync is incremental, but for a 0-param tool it is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are zero parameters, so the schema provides no information. The description adds value by specifying the data synced (Salesforce accounts, deals, contacts), which compensates for the empty schema. The implicit scope is helpful.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool syncs Salesforce accounts, deals, and contacts into persistent memory for future sessions. It uses a specific verb 'sync' and identifies the resources and purpose, distinguishing it from sibling tools like get_salesforce or individual create operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Call after any significant Salesforce update,' providing clear when-to-use guidance. It does not state when not to use it, but the context implies that for one-off queries, other tools like get_salesforce would be alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

track_flightAInspect

Track a flight status including gate, terminal, delays. Scans calendar for upcoming flights automatically. Use flight number like DL1425 not confirmation number.

ParametersJSON Schema

Name	Required	Description	Default
`flight`	No	Flight number e.g. DL1425, AA1234. Leave empty to scan calendar.
`scan_calendar`	No

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must shoulder the burden of transparency. It discloses the automatic calendar scanning behavior and the type of data returned (status, gate, terminal, delays). However, it does not mention whether the tool is read-only, any required permissions, rate limits, or error conditions. This is adequate but incomplete.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, consisting of two sentences. The first sentence states the core functionality, and the second provides critical input format guidance. Every word is necessary and front-loaded, making it easy for an agent to quickly understand and use the tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity of flight tracking and the absence of an output schema, the description covers the main functionality and usage. It explains both parameters and the automatic calendar scan. However, it does not describe the output format or potential error responses, which would be helpful for an agent to process the result correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 2 parameters with 50% description coverage. The flight parameter has a schema description that is clear, and the tool description reiterates the input format ('Use flight number like DL1425 not confirmation number'). The scan_calendar parameter has a schema enum but no description; the tool description indirectly explains its effect ('Scans calendar for upcoming flights automatically'). This adds some value but does not fully compensate for the lack of explicit parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Track a flight status') and specifies the information tracked (gate, terminal, delays). It also mentions the automatic calendar scanning feature, which is unique among sibling tools. No other tool in the list serves a similar function, so it is well-differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use the tool (to track a flight) and how to use it (provide a flight number like DL1425, not a confirmation number). It also explains that leaving the flight parameter empty triggers automatic calendar scanning. However, it does not explicitly state when not to use it or mention alternatives, though none are obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

track_packagesAInspect

Track package delivery. Scans email for tracking numbers automatically. Supports UPS FedEx USPS Amazon DHL.

ParametersJSON Schema

Name	Required	Description	Default
`carrier`	No
`tracking`	No	Specific tracking number optional

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears the burden of behavioral disclosure. It reveals that the tool scans email for tracking numbers automatically and supports specific carriers. However, it omits details like read-only nature, response format, or error handling, leaving some behavioral uncertainty.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences clearly convey purpose and key feature (auto-scanning). No redundant words; information is front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero annotations and no output schema, the description covers the core function but lacks depth on edge cases (e.g., no email found, manual entry). It is adequate for a simple tool but not comprehensive for the 2 parameters and implicit email scanning behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (tracking has description, carrier does not). The description adds meaning by listing supported carriers (UPS, FedEx, etc.), which indirectly informs the carrier parameter. However, it does not explicitly map carriers to the carrier parameter or define its format, requiring inference.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'track' and resource 'package delivery', lists supported carriers, and mentions automatic email scanning. It effectively distinguishes from sibling tools like track_flight.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, nor any conditions for use. It mentions scanning email automatically but does not specify when not to use it or what prerequisites exist.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

translateBInspect

Translate text to any language. Auto-detects source language.

ParametersJSON Schema

Name	Required	Description
`text`	Yes
`source`	No
`target`	Yes	Target language code: es fr de it pt zh ja ko ar

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full responsibility for behavioral disclosure. It only mentions auto-detection of source language, but omits details such as supported output formats, rate limits, or error handling for unsupported languages.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two short sentences with no superfluous information. It is front-loaded with the core action ('Translate').

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity of the tool (3 simple parameters, no output schema), the description is minimally adequate but lacks completeness. It does not list all supported target languages beyond the schema snippet, and could clarify the range of source languages supported.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 33% (only target has a description). The description adds value by clarifying that source language is auto-detected when not provided, which is not explicit in the schema. However, text parameter lacks any semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: translate text to any language, and highlights auto-detection of source language. It uses a specific verb and resource, and there are no sibling tools with similar functionality to distinguish from.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It does not specify prerequisites, limitations, or scenarios where other tools might be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

troubleshootBInspect

Run a full ExanorOS diagnostic.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose behavioral traits such as whether the diagnostic is read-only or has side effects, what it returns, or any system requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence with no unnecessary words. It is appropriately front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters, no output schema, and no annotations, the description is minimal but sufficient for a simple diagnostic tool. It could be more complete by mentioning the nature of the output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters with 100% coverage, so the description does not need to add parameter info. Baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states a specific verb ('Run') and resource ('full ExanorOS diagnostic'), distinguishing it from sibling tools that perform other tasks like email, calendar, or document operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There is no context for when a full diagnostic is appropriate, nor any exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_calendar_eventCInspect

Update a Google Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`end`	No
`start`	No
`eventId`	Yes
`summary`	No
`attendees`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It only says 'Update', implying mutation, but omits details on permissions, error handling, partial updates, or effects on existing data.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise, but it provides minimal information. It is not wasteful but lacks necessary substance to be effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, no output schema, and no annotations, the description is severely incomplete. It does not mention required fields, parameter formats, return values, or any usage details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and the description adds no parameter meaning. Five parameters exist (eventId, start, end, summary, attendees) but nothing explains their format, constraints, or behavior.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Update' and the resource 'Google Calendar event', differentiating it from sibling tools like create_calendar_event and delete_calendar_event.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., other update tools or creating new events). There are no contexts, prerequisites, or exclusions mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_contactCInspect

Update an existing Google contact.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No
`email`	No
`notes`	No
`phone`	No
`company`	No
`resourceName`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It merely says 'update' without specifying partial vs full replacement, required permissions, limits, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, concise but lacking any additional structure or elaboration that would add value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no annotations, no output schema, and 6 undocumented parameters, the description is insufficient for an AI agent to confidently invoke the tool. No return value or behavioral details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and description adds no meaning to any of the 6 parameters. No explanation of what each parameter does beyond the schema structure.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Update' and resource 'existing Google contact'. Distinguishes from sibling tools like create_contact (create vs update).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives, no prerequisites, no when-not-to-use instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_outlook_eventCInspect

Update an Outlook Calendar event.

ParametersJSON Schema

Name	Required	Description	Default
`end`	No
`start`	No
`eventId`	Yes
`summary`	No

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It only states 'Update', implying mutation, but does not detail authorization needs, side effects, or whether partial updates are supported.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise but overly minimal. It conveys only the basic action without additional context that would justify its brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description should compensate by explaining return values, error behavior, or update semantics. It fails to do so, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, and the description adds no parameter details. While parameter names ('start', 'end', 'summary') are somewhat self-explanatory, format constraints (e.g., date-time format for start/end) are missing.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (update) and the resource (Outlook Calendar event), distinguishing it from create and delete siblings. However, it does not differentiate from the generic 'update_calendar_event' sibling that also updates events.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'update_calendar_event'. The description fails to specify context or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_preferencesCInspect

Update user preferences.

ParametersJSON Schema

Name	Required	Description	Default
`tone`	No
`sign_off`	No
`priorities`	No
`vip_contacts`	No
`email_platform`	No
`automation_level`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries the full burden. It does not disclose side effects, authorization needs, or update behavior (e.g., merge vs. overwrite). Only the basic action is stated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is too brief—just three words. While concise, it omits crucial details, making it under-specified rather than efficiently informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no annotations, no output schema, and zero parameter descriptions, the description fails to provide a complete picture. It does not cover return values, side effects, or parameter constraints beyond the schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description provides no explanation of the 6 parameters (tone, sign_off, etc.). The agent gets no help understanding what each parameter does or how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Update user preferences' clearly states the action (update) and resource (preferences). It is a specific verb+resource but does not differentiate from sibling tools like get_preferences, which is acceptable since the verb differs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternative tools (e.g., when to update vs. read preferences). The description lacks any context about prerequisites or suitable scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

void_envelopeCInspect

Cancel a DocuSign envelope.

ParametersJSON Schema

Name	Required	Description	Default
`envelopeId`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full responsibility. It indicates 'Cancel' (a destructive action) but does not disclose permanence, reversibility, or permission requirements. The brief description fails to add necessary behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, using a single sentence without unnecessary words. However, it could be slightly more informative while maintaining brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and a single required parameter, the description is too minimal. It does not explain return values, side effects, or the result of cancellation, leaving the agent without essential context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not elaborate on the parameter 'envelopeId' beyond its presence in the schema. No added meaning or constraints are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (Cancel) and resource (DocuSign envelope). It is specific and distinguishes from siblings like 'resend_signature_request' and 'send_for_signature', as those involve different operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, or any prerequisites. The description simply states the action without context on appropriate use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weekly_previewBInspect

Full weekly preview — calendar, inbox, priorities.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden but only states the data sources. It doesn't disclose whether the tool is read-only, what it returns, or any side effects. Minimal behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise single sentence that clearly states the tool's purpose. No wasted words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the description covers the main aggregations, it lacks specifics like time range (weekly preview of what period?), what constitutes 'priorities', and what is included from each source. Given no output schema or annotations, more detail would enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, so schema coverage is complete. The description adds value by listing the aggregated areas (calendar, inbox, priorities) without needing to explain parameters. Baseline for 0 params is 4, and the description is sufficiently clear.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Full weekly preview — calendar, inbox, priorities' clearly states the tool aggregates three data sources into a weekly overview. However, it doesn't differentiate from similar aggregated tools like 'morning_briefing' or 'catch_me_up', which might overlap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like individual getter tools or other aggregated views. It doesn't specify prerequisites or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Resources

Need Help?