macalc
Server Details
The most comprehensive everyday calculator MCP server — 501 tools across 22 categories covering 8 countries' tax systems (FR, BE, CH, CA, US, UK, MA, SN). Finance, health, math, science, construction, conversions, education, sport, cooking, travel, and more. Free, no API key required. Streamable HTTP transport.
- Status: Unhealthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Available Tools
446 tools

calculate_1rm_table (Grade A)
Generate a full 1RM-to-12RM repetition table from a known lift using Epley formula
| Name | Required | Description | Default |
|---|---|---|---|
| reps | Yes | Repetitions performed at that weight | |
| weight | Yes | Weight lifted in kg or lbs | |
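The description names the Epley formula (1RM = w × (1 + r/30)). The server's implementation is not shown, but the table it describes can be sketched by inverting that formula for each target rep count (function names here are illustrative):

```python
def epley_1rm(weight: float, reps: int) -> float:
    """Estimate the one-rep max from a submaximal lift: 1RM = w * (1 + r/30)."""
    return weight * (1 + reps / 30)

def rep_max_table(weight: float, reps: int, max_reps: int = 12) -> dict[int, float]:
    """Invert Epley to estimate the weight liftable for 1..max_reps repetitions."""
    one_rm = epley_1rm(weight, reps)
    return {n: round(one_rm / (1 + n / 30), 1) for n in range(1, max_reps + 1)}
```

For example, 100 kg lifted for 10 reps estimates a 1RM of about 133.3 kg, and the 10RM entry of the table recovers the original 100 kg.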
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses the specific calculation method (Epley formula) and output scope (1RM-to-12RM range), but with no annotations provided, it fails to mention the safety profile, error conditions, or output format details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Core purpose, method, and output scope presented upfront with no redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a simple 2-parameter calculation tool. Describes inputs (implied via 'known lift'), method, and output. An output schema is lacking, but the description compensates by detailing the table contents.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% providing full param documentation. Description adds 'from a known lift' context implying real performance data, but does not substantially elaborate param syntax or formats beyond schema baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Generate' with clear resource '1RM-to-12RM repetition table' and distinguishes from sibling calculate_one_rep_max by emphasizing 'full...table' (range output) versus single value calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage through output description (generates full table), but lacks explicit when-to-use guidance or explicit contrast with calculate_one_rep_max alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_add_hours (Grade A)
Add two time durations and return the total in hours and minutes
| Name | Required | Description | Default |
|---|---|---|---|
| hours1 | Yes | First duration — hours | |
| hours2 | Yes | Second duration — hours | |
| minutes1 | Yes | First duration — minutes | |
| minutes2 | Yes | Second duration — minutes | |
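The rubric notes that the description never says whether minutes overflow into hours. A normalizing implementation would look like this sketch (the overflow behavior is an assumption, not confirmed by the server):

```python
def add_durations(hours1: int, minutes1: int, hours2: int, minutes2: int) -> tuple[int, int]:
    """Sum two durations and normalize so minutes carry into hours (90 min -> 1 h 30 min)."""
    total_minutes = (hours1 * 60 + minutes1) + (hours2 * 60 + minutes2)
    return divmod(total_minutes, 60)  # (hours, minutes), minutes always < 60
```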
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the return format ('hours and minutes') but omits critical behavioral details like whether minutes overflow into hours (e.g., 90 minutes → 1h30), return data structure, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 11-word sentence with zero waste. Front-loaded with action verb, immediately clear what the tool does and returns.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 4-parameter input with complete schema coverage, the description is nearly sufficient. However, without an output schema, it should specify the exact return structure (object fields, string format) and normalization behavior to fully compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions ('First duration — hours', etc.). Description implies parameter pairing (hours1+minutes1 as first duration) but adds no syntax, formatting, or constraint details beyond the schema. Baseline 3 appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Add' and resource 'time durations', plus output format 'hours and minutes'. The specificity distinguishes it from sibling tools like calculate_time_difference (subtraction) and convert_time (conversion).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when/when-not guidance or named alternatives provided. Usage must be inferred from the description and tool name. Given the extensive list of calculation siblings, explicit guidance on when to prefer this over calculate_time_difference or convert_time would help.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_age (Grade B)
Calculate exact age in years, months and days from a birth date
| Name | Required | Description | Default |
|---|---|---|---|
| birth_date | Yes | YYYY-MM-DD — Date of birth | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the output structure (years, months, days) which compensates partially for the missing output schema, but omits details about date validation, timezone handling, or behavior with future dates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of 11 words with zero waste. The core action and output format are front-loaded immediately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple single-parameter calculation tool, the description is reasonably complete. It specifies the output format to compensate for the missing output schema. However, it could be improved by mentioning edge case handling (e.g., future dates) or timezone assumptions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the 'birth_date' parameter fully documented in the schema. The description references 'birth date' which aligns with the parameter, but adds no additional semantic information about format requirements or validation beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource ('age') and expected output format ('years, months and days'). However, it does not explicitly distinguish from the sibling tool 'calculate_age_in_units', leaving ambiguity about which tool to select when.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_age_in_units' or other date-related tools. There are no stated prerequisites, exclusions, or conditions for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_age_in_units (Grade B)
Calculate exact age in multiple units from birth date
| Name | Required | Description | Default |
|---|---|---|---|
| birth_date | Yes | Birth date YYYY-MM-DD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose what 'multiple units' specifically means (e.g., years/months/days vs. other units) or describe the return format/structure. Does not mention idempotency or calculation precision limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb and object. No redundant phrases or filler content. Appropriate length for a single-parameter calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple one-parameter tool without output schema, but lacks specificity about return values (what units are calculated). Given the tool's simplicity and 100% schema coverage, the description is minimally sufficient but could be improved with output format details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Birth date YYYY-MM-DD'). Description mentions 'from birth date' which aligns with the parameter, but adds no semantic meaning, validation rules, or format guidance beyond what the schema already provides. Baseline 3 appropriate for complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate) and resource (age) with scope (in multiple units from birth date). Mentions 'multiple units' which implicitly distinguishes from sibling 'calculate_age', but does not explicitly clarify what units are returned (years, months, days, etc.) or when to choose this over the simpler age calculator.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the sibling 'calculate_age' or other date-related tools. No mention of prerequisites or input format requirements beyond the schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_alcohol_units (Grade A)
Calculate total alcohol units from drinks and compare to UK weekly limit
| Name | Required | Description | Default |
|---|---|---|---|
| drinks | Yes | | |
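The standard UK formula is units = volume (ml) × ABV (%) / 1000, compared against the 14-unit weekly guideline. Using the drink fields the rubric mentions (volume_ml, abv_pct), a sketch of the calculation (return shape is an assumption):

```python
def alcohol_units(drinks: list[dict]) -> dict:
    """Total UK alcohol units (volume_ml * abv_pct / 1000) vs the 14-unit weekly guideline."""
    UK_WEEKLY_LIMIT = 14.0
    total = sum(d["volume_ml"] * d["abv_pct"] / 1000 for d in drinks)
    return {
        "total_units": round(total, 1),
        "weekly_limit": UK_WEEKLY_LIMIT,
        "within_limit": total <= UK_WEEKLY_LIMIT,
    }
```

A pint (568 ml) of 5% beer works out to about 2.8 units.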
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden. It discloses the comparison behavior (against UK limit) beyond pure calculation, but omits the actual limit value (14 units), return format details, and read-only nature.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence (10 words) with zero waste. Purpose is front-loaded and immediately clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool without output schema. Description sufficiently indicates domain (UK units) but could clarify what the comparison returns (boolean, percentage, or raw count vs limit).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage. Description mentions 'drinks' which maps to the parameter name, but fails to compensate by explaining the required drink object structure (volume_ml, abv_pct, type) or the enum values expected.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The specific verb (Calculate), specific resource (alcohol units), and specific jurisdiction (UK weekly limit) clearly distinguish this tool from siblings like calculate_blood_alcohol, which computes BAC for driving safety rather than UK public health units.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies UK health context via 'UK weekly limit' but lacks explicit when-to-use guidance (e.g., health tracking vs driving safety) and does not explicitly reference sibling calculate_blood_alcohol as the alternative for impairment estimation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_altitude_sickness (Grade A)
Assess altitude sickness risk and recommend acclimatization schedule
| Name | Required | Description | Default |
|---|---|---|---|
| altitude_m | Yes | Target altitude in meters | |
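The assessment half of this tool likely maps the target altitude to a risk band. The sketch below is purely illustrative: the thresholds are assumptions loosely based on common acclimatization guidance, not the server's actual model, and no code substitutes for medical advice:

```python
def altitude_risk(altitude_m: float) -> str:
    """Illustrative acute-mountain-sickness risk bands (threshold values are assumptions)."""
    if altitude_m < 2500:
        return "low"       # AMS is uncommon below ~2500 m
    if altitude_m < 3500:
        return "moderate"
    if altitude_m < 5500:
        return "high"
    return "extreme"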
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It indicates the tool generates recommendations/assessments but fails to disclose output format, whether results include medical disclaimers, or if calculations are based on standard physiological models. No contradictions with absent annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of nine words with the core purpose front-loaded. There is no redundant information; every word serves to define the tool's dual function (risk assessment + recommendation generation).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single simple parameter with complete schema documentation and no output schema, the description adequately covers necessary context. However, for a health-related tool, it could benefit from mentioning the nature of the output (risk categories, timeline recommendations) or accuracy limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Target altitude in meters'), the schema fully documents the single parameter. The description adds no additional parameter guidance (e.g., typical acclimatization altitudes, why 8850m is the max), warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Assess', 'recommend') and clearly identifies the resource (altitude sickness risk, acclimatization schedule). It implicitly distinguishes from sibling 'calculate_baking_altitude' by focusing on health/medical outcomes rather than cooking adjustments, though it doesn't explicitly name the sibling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implicit usage guidance through domain-specific terminology ('altitude sickness', 'acclimatization'), making it clear this is for mountaineering/health contexts. However, it lacks explicit when-to-use criteria, prerequisites, or warnings about when medical professional consultation is needed instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_angle_convert (Grade B)
Convert angles between degrees, radians, gradians and turns
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Angle value | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit | |
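All four supported units are fixed fractions of a full circle (360 degrees = 2π radians = 400 gradians = 1 turn), so a conversion can be sketched by normalizing through turns (the server's actual approach is not shown):

```python
import math

# how many of each unit make up one full turn
_PER_TURN = {"degrees": 360.0, "radians": 2 * math.pi, "gradians": 400.0, "turns": 1.0}

def convert_angle(value: float, from_unit: str, to_unit: str) -> float:
    """Convert an angle by normalizing through full turns."""
    return value / _PER_TURN[from_unit] * _PER_TURN[to_unit]
```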
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden but only states the conversion capability. It omits behavioral details such as precision constraints, handling of negative angles, zero values, or whether the operation is idempotent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of nine words with the action verb front-loaded. Every word serves a purpose, specifying both the operation and the complete set of supported units without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and complete input schema, the description is minimally adequate for invocation. However, with no output schema and no annotations, the absence of any description of the return value (converted angle) leaves a minor gap in contextual completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description reinforces the schema by listing the specific enum values (degrees, radians, gradians, turns) but does not add semantic depth regarding parameter relationships or validation logic.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Convert') with a clear resource ('angles') and enumerates the exact supported units (degrees, radians, gradians, turns). However, it does not differentiate from sibling tool 'convert_angle', leaving ambiguity about which tool to prefer for angle conversions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, particularly the sibling 'convert_angle' tool. There are no mentioned prerequisites, exclusions, or contextual triggers for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_annuity_payment (Grade C)
Calculate periodic payment amount for a loan or annuity
| Name | Required | Description | Default |
|---|---|---|---|
| rate | Yes | Annual interest rate percent | |
| periods | Yes | Number of payment periods (months) | |
| principal | Yes | Principal amount EUR | |
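The rubric flags that the description omits the formula. The standard annuity payment formula is PMT = P·i / (1 − (1+i)^−n); since periods are months and the rate is annual, a monthly rate i = rate/100/12 is a reasonable assumption (the server does not confirm its compounding convention):

```python
def annuity_payment(principal: float, rate: float, periods: int) -> float:
    """Periodic payment: PMT = P*i / (1 - (1+i)^-n), assuming monthly rate i = rate/100/12."""
    if rate == 0:
        return round(principal / periods, 2)  # zero-interest loans divide evenly
    i = rate / 100 / 12
    return round(principal * i / (1 - (1 + i) ** -periods), 2)
```

For instance, EUR 200,000 at 3% over 240 months comes to roughly EUR 1,109 per month.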
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses minimal behavioral information. It does not specify the calculation formula (PMT/annuity formula), rounding behavior, currency handling, or that it returns a monetary value versus a schedule. The description implies a single output value but does not confirm this.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficient and front-loaded with the action verb. However, given the lack of annotations and output schema, extreme brevity here constitutes under-specification rather than optimal conciseness—it could accommodate additional context without becoming verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a three-parameter calculation tool with complete schema coverage but no annotations or output schema, the description meets minimum viability by identifying the calculation type. However, it lacks behavioral specifics (rounding, currency assumptions) and sibling differentiation that would make it complete for an agent selecting among many 'calculate_' tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (principal, rate, and periods all have descriptions), the schema carries the semantic load. The description adds no parameter-specific context beyond what the schema already provides (e.g., it doesn't clarify the rate period or compounding assumptions), warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource (periodic payment amount for loans/annuities). However, it fails to distinguish from the sibling tool 'calculate_loan_payment', leaving ambiguity about which tool to use for standard loan calculations versus annuity-specific calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to prefer this tool over siblings like 'calculate_loan_payment' or 'calculate_mortgage', nor does it mention prerequisites or expected contexts (e.g., amortizing loans). Users must infer applicability from the parameter names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_anything (Grade A)
Universal AI-powered calculator — handles any calculation not covered by specialized tools. Requires premium subscription.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Calculation request in natural language (English or French) | |
| context | No | Optional context: units, constraints, domain | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full disclosure burden. It successfully adds two critical behavioral traits: 'AI-powered' (indicating probabilistic/natural language processing vs deterministic calculation) and 'Requires premium subscription' (auth/billing constraint). However, it omits other relevant behaviors like read-only nature, output format (text vs number), or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, zero waste. First sentence front-loads purpose and sibling differentiation; second sentence adds critical constraint (premium). Every word earns its place—'Universal' signals scope, 'AI-powered' signals mechanism, specialized tools reference signals fallback role.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (universal fallback to 200+ specialized tools), the description adequately establishes its role and constraints. With full schema coverage and no output schema, it appropriately leaves return value documentation implicit. Missing only minor details like output format (natural language explanation vs raw value).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with explicit natural language and context documentation. Description adds minimal semantic value beyond the schema, though 'AI-powered' reinforces the natural language input expectation for the 'query' parameter. Baseline 3 appropriate given comprehensive schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: 'Universal AI-powered calculator' defines the tool's function and mechanism, while 'handles any calculation not covered by specialized tools' explicitly distinguishes it from 200+ deterministic siblings. Clear verb (handles) + resource (calculation) + scope differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear selection criteria via 'not covered by specialized tools,' establishing it as the fallback option. However, lacks explicit 'when NOT to use' guidance (e.g., prefer deterministic siblings when available) and doesn't name specific alternatives, though this is impractical given the sibling volume.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_aquarium_volume (Grade C)
Calculate aquarium gross/net volume and stocking capacity
| Name | Required | Description | Default |
|---|---|---|---|
| width_cm | Yes | | |
| height_cm | Yes | | |
| length_cm | Yes | | |
| substrate_cm | No | | |
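The rubric infers that substrate_cm drives the gross/net distinction. A sketch under that assumption (litres = cm³/1000; the stocking rule below is a common rule of thumb, not the server's documented method):

```python
def aquarium_volume(length_cm: float, width_cm: float, height_cm: float,
                    substrate_cm: float = 0.0) -> dict:
    """Gross litres = L*W*H/1000; net subtracts the substrate layer (assumed method)."""
    gross_l = length_cm * width_cm * height_cm / 1000
    net_l = length_cm * width_cm * (height_cm - substrate_cm) / 1000
    # rule-of-thumb stocking guide: roughly 1 cm of adult fish per litre (assumption)
    return {"gross_l": gross_l, "net_l": net_l, "stocking_cm": round(net_l)}
```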
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It usefully indicates the tool computes both 'gross' and 'net' volumes (implying substrate displacement) and 'stocking capacity' (fish population limits), but fails to explain how stocking capacity is calculated, what units are returned, or whether it assumes freshwater vs. saltwater.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded and efficient, but with four undocumented parameters and no output schema, the six-word description is undersized for the tool's complexity. It needs additional sentences explaining parameters and return format.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions, no annotations, and no output schema, the description provides the minimum viable context by identifying the three key outputs (gross volume, net volume, stocking capacity). However, it lacks critical details like measurement units, the relationship between 'substrate_cm' and net volume, and the methodology for calculating stocking capacity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While 'gross/net volume' implicitly explains the 'substrate_cm' parameter (used to calculate net volume), the description mentions no parameters by name or their physical meaning (e.g., that dimensions are in centimeters), leaving all four parameters effectively undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('aquarium'), and distinguishes from generic siblings like 'calculate_volume' by specifying domain-specific outputs: 'gross/net volume' and 'stocking capacity'. However, it doesn't explicitly differentiate from 'calculate_fish_tank_heater' or 'calculate_pool_volume' which appear in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like 'calculate_volume' (generic) or 'calculate_pool_volume'. No prerequisites or contextual triggers are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
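The gross/net relationship the evaluation infers from `substrate_cm` can be made concrete. Below is a minimal sketch of one plausible interpretation, assuming centimeter inputs, liters out, and the common (rough) 1 cm-of-adult-fish-per-liter freshwater stocking heuristic; none of these conventions is confirmed by the server:

```python
def aquarium_volume(length_cm, width_cm, height_cm, substrate_cm=0.0):
    # Gross = the full tank; net = the water column above the substrate.
    gross_l = length_cm * width_cm * height_cm / 1000.0                 # 1 L = 1000 cm^3
    net_l = length_cm * width_cm * (height_cm - substrate_cm) / 1000.0
    # Rough freshwater heuristic: ~1 cm of adult fish per liter of net water.
    return {"gross_l": gross_l, "net_l": net_l, "stocking_cm_of_fish": net_l}
```

For a 100 × 40 × 50 cm tank with 5 cm of substrate this yields 200 L gross and 180 L net, which is the kind of unit and methodology detail the description should state outright.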
calculate_area (C)
Calculate area for common geometric shapes
| Name | Required | Description | Default |
|---|---|---|---|
| d1 | No | Diagonal 1 for rhombus | |
| d2 | No | Diagonal 2 for rhombus | |
| side | No | Side for hexagon | |
| shape | Yes | Shape type | |
| width | No | Width | |
| height | No | Height | |
| length | No | Length or base | |
| radius | No | Radius | |
| semi_major | No | Semi-major axis for ellipse | |
| semi_minor | No | Semi-minor axis for ellipse |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full disclosure burden. It fails to mention units (the output is in the square of the input units), precision handling, or the conditional parameter logic required by different shapes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with no redundancy. However, given the tool's complexity (10 parameters with shape-dependent requirements), the brevity may underserve the user rather than demonstrate conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for tool complexity. With 10 parameters and conditional logic based on the 'shape' enum, the description should explain the parameter mapping (which fields apply to which shapes). No output schema compounds the gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage, establishing baseline of 3. Description adds no additional semantics about conditional requirements (e.g., that 'circle' requires 'radius' while 'rectangle' requires 'width'/'height'), but schema adequately documents individual fields.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (calculate) and resource (area of geometric shapes). Distinguishes broadly from siblings like 'calculate_volume' and 'calculate_perimeter', though it doesn't clarify relationship to specific-shape calculators like 'calculate_ellipse'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this general-purpose tool versus specific alternatives (e.g., 'calculate_ellipse'), nor does it indicate which parameter combinations are required for each shape type.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
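The shape-to-parameter mapping the description omits might look like the following hypothetical dispatch; the specific mapping and supported shape names here are assumptions drawn from the schema's field hints, not the server's documented behavior:

```python
import math

def area(shape, **p):
    # Hypothetical shape -> required-parameter mapping; the real server may differ.
    if shape == "rectangle":
        return p["width"] * p["height"]
    if shape == "triangle":
        return 0.5 * p["length"] * p["height"]          # length doubles as the base
    if shape == "circle":
        return math.pi * p["radius"] ** 2
    if shape == "ellipse":
        return math.pi * p["semi_major"] * p["semi_minor"]
    if shape == "rhombus":
        return 0.5 * p["d1"] * p["d2"]
    if shape == "hexagon":                              # regular hexagon from its side
        return 3 * math.sqrt(3) / 2 * p["side"] ** 2
    raise ValueError(f"unsupported shape: {shape}")
```

Even a one-line version of this table in the tool description ("circle needs radius; rhombus needs d1/d2; ...") would close the first-attempt-success gap the evaluation identifies.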
calculate_auto_entrepreneur (A)
Calculate French auto-entrepreneur (micro-enterprise) net income and social charges
| Name | Required | Description | Default |
|---|---|---|---|
| revenue | Yes | Annual revenue (chiffre d'affaires) in euros | |
| category | No | Activity category: vente (sales), service_bic, service_bnc, liberal | service_bnc |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the calculation outputs (net income and social charges) but omits other behavioral traits like whether this is a simulation vs. official filing, rate limits, or authentication requirements. It meets minimum disclosure for a calculation tool but lacks richness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, efficient sentence with no waste. Front-loaded with the key action ('Calculate'), domain ('French auto-entrepreneur'), and outputs ('net income and social charges'). Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter tool without output schema, the description adequately covers scope by naming the expected outputs (net income and social charges). It could improve by clarifying this is an estimation/simulation tool, but sufficiently covers the tool's functionality given its low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (revenue and category both well-documented in the schema), the baseline is 3. The description provides domain context ('French auto-entrepreneur') that frames the parameters but does not add additional semantic details beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and explicitly identifies the resource ('French auto-entrepreneur/micro-enterprise net income and social charges'). It clearly distinguishes from siblings like 'calculate_french_income_tax' or 'calculate_portage_salarial' by specifying the unique 'auto-entrepreneur' regime.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by specifying the French auto-entrepreneur regime, signaling when to select it (for micro-enterprise calculations). However, it lacks explicit contrasts with alternatives like 'calculate_french_income_tax' or 'calculate_portage_salarial' that handle different employment statuses.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_average (C)
Calculate simple, weighted, or geometric mean
| Name | Required | Description | Default |
|---|---|---|---|
| values | Yes | Array of numbers | |
| weights | No | Optional weights for weighted average |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but discloses nothing about input validation (e.g., that the weights and values arrays must match in length), how the calculation mode is determined, or the return value format. The claim of supporting 'geometric mean' without explaining how to request it creates behavioral opacity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief (six words) and front-loaded with the core action. While efficient, it may be underspecified given the behavioral ambiguity around geometric mean calculation—every word earns its place, but the place needs more words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a two-parameter statistical utility without output schema, the description minimally covers the basic function. However, given the 'geometric mean' ambiguity and lack of error-handling documentation, it falls short of fully equipping an agent to invoke the tool correctly in all claimed scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Array of numbers' and 'Optional weights'), establishing baseline 3. The description adds the terminology 'simple, weighted, or geometric' but fails to explain parameter interactions (e.g., that weights array must match values length) or how geometric calculation is triggered.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool calculates means (simple, weighted, geometric), providing specific verbs and resource types. However, it fails to clarify how the user selects between these three modes given the schema only contains 'values' and 'weights' parameters with no enum or 'type' selector, creating ambiguity about invocation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this generic tool versus sibling alternatives like 'calculate_statistics' or domain-specific calculators. There are no 'when-not' exclusions or prerequisites mentioned despite hundreds of specialized calculation siblings existing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
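The three means the description names can be sketched as follows; how the server actually selects among them is exactly the undocumented part, so this illustration simply computes every mean that applies to the given inputs:

```python
import math

def means(values, weights=None):
    # Sketch of the three means named in the description; the server's
    # mode-selection mechanism is undocumented, so all applicable means are returned.
    simple = sum(values) / len(values)
    geometric = math.prod(values) ** (1 / len(values))   # assumes positive values
    weighted = None
    if weights is not None:
        assert len(weights) == len(values), "weights must match values in length"
        weighted = sum(v * w for v, w in zip(values, weights)) / sum(weights)
    return {"simple": simple, "weighted": weighted, "geometric": geometric}
```

The length-matching assertion makes explicit the weights/values interaction the evaluation flags as missing from both the schema and the description.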
calculate_bac_points (C)
Calculate French Baccalaureat score estimation
| Name | Required | Description | Default |
|---|---|---|---|
| grand_oral | Yes | Grand oral (/20, coeff 10) | |
| philosophy | Yes | Philosophy (/20, coeff 8) | |
| specialty1 | Yes | Specialty 1 (/20, coeff 16) | |
| specialty2 | Yes | Specialty 2 (/20, coeff 16) | |
| french_oral | Yes | French oral exam (/20, coeff 5) | |
| french_written | Yes | French written exam (/20, coeff 5) | |
| continuous_control | Yes | Continuous assessment score (/720 = 40%) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. The only behavioral hint is 'estimation,' implying it's a projection, not an official result. Missing: return value format (weighted total? out of 100?), calculation methodology, and whether this is a pure computation or stores data. For a 7-parameter calculation tool with no annotations, this is insufficient disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at 5 words, front-loaded with the action verb. No wasted text. However, given the complexity (7 required parameters, domain-specific knowledge), it may be overly terse rather than appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite rich input schema, the tool lacks output schema and the description fails to explain what the calculation returns (final weighted score out of 100? pass/fail threshold? mentions?). For a domain-specific calculation with no output schema, the description should specify the return value format and calculation basis.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with excellent inline documentation (scales like /20, coefficients like coeff 10, and percentages like 40%). Description adds no parameter semantics beyond the schema. Baseline 3 is appropriate when the schema fully documents all 7 required parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb (Calculate) + specific resource (French Baccalaureat score) + scope (estimation). Distinguishes from siblings like `calculate_brevet_points` (middle school) and `calculate_parcoursup_points` (university admission) by naming the specific diploma. Could be improved by clarifying this is for the French secondary school leaving certificate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus generic calculators like `calculate_grade_average` or `calculate_average`. Does not mention prerequisite requirements (needing all exam scores) or that it should be used after final exams.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
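Using only the coefficients documented in the schema, one plausible aggregation looks like the sketch below. Treating the /720 continuous-control total as an implicit coefficient of 36 (720 / 20), which makes the overall maximum 1920 points, is an assumption; the server's real return format is undocumented:

```python
EXAM_COEFFS = {            # from the schema's inline docs (each grade is /20)
    "french_written": 5, "french_oral": 5, "philosophy": 8,
    "grand_oral": 10, "specialty1": 16, "specialty2": 16,
}

def bac_points(grades, continuous_control):
    # One plausible reading of the schema: weighted exam points plus the
    # continuous-assessment total (/720). Max under this reading: 1200 + 720 = 1920.
    exam = sum(grades[k] * c for k, c in EXAM_COEFFS.items())   # max 20 * 60 = 1200
    total = exam + continuous_control
    return {"total_points": total, "mean_out_of_20": total / 96}  # 96 = 60 + 720/20
```

Whether the server returns this raw point total, the /20 mean, or a mention threshold is precisely what the evaluation says the description should disclose.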
calculate_baking_altitude (C)
Adjust baking recipe for high altitude cooking
| Name | Required | Description | Default |
|---|---|---|---|
| altitude_m | Yes | Altitude in meters | |
| flour_cups | Yes | Flour in cups | |
| sugar_cups | Yes | Sugar in cups | |
| liquid_cups | Yes | Liquid in cups | |
| oven_temp_c | Yes | Oven temperature °C |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool 'adjusts' recipes but fails to specify what the adjustment entails (e.g., return format like modified ingredient amounts vs. multipliers, text instructions vs. structured data) or whether the operation is idempotent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is terse with no wasted words, achieving brevity. However, given the lack of annotations and output schema, the description is arguably under-sized rather than optimally concise—it front-loads the core purpose but leaves critical behavioral gaps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 5 required parameters, no annotations, and no output schema, the description fails to specify what the tool returns (e.g., adjusted ingredient quantities, temperature modifications, or textual guidance), leaving a significant gap in the agent's ability to predict tool behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('Altitude in meters', 'Flour in cups', etc.), establishing a baseline score of 3. The tool description adds only general context ('baking recipe') which is already implicit in the parameter names, without elaborating on units, valid ranges, or relationships between parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a clear verb ('Adjust'), resource ('baking recipe'), and specific context ('high altitude cooking') that distinguishes it from generic conversion tools. However, it does not explicitly differentiate from similar siblings like 'calculate_baking_conversion' or 'calculate_recipe_scaling'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., when altitude thresholds are met), nor does it mention prerequisites or expected use cases beyond the high-altitude context implied in the name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_baking_conversion (B)
Convert cups to grams for common baking ingredients
| Name | Required | Description | Default |
|---|---|---|---|
| ingredient | Yes | | |
| quantity_cups | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to mention whether the operation is read-only, the return format (since no output schema exists), precision of conversions, or error handling for unsupported ingredients.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is appropriately sized with zero waste. It is front-loaded with the action verb and contains no redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 simple parameters) and clear naming, the description is minimally viable. However, with no output schema and no annotations, it lacks information about the return value and behavioral guarantees that would make it complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description partially compensates by adding semantic context: 'cups to grams' implies the quantity_cups parameter's unit and purpose, and 'common baking ingredients' hints at the ingredient parameter's domain. However, it does not clarify the specific enum values or that quantity_cups must be positive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific action (Convert), specific units (cups to grams), and domain (baking ingredients). It effectively distinguishes this from general cooking conversion siblings like 'calculate_cooking_conversion' by specifying 'baking ingredients,' though it does not explicitly name siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_cooking_conversion' or 'convert_cooking', nor does it mention prerequisites or constraints beyond the implicit domain.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
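A lookup-table sketch shows the shape such a converter likely takes. The ingredient names and grams-per-cup densities below are approximate illustrations (a cup of flour is commonly cited around 120 g, sugar around 200 g, butter around 227 g), not the server's actual table:

```python
# Hypothetical grams-per-cup table; densities are approximate and the
# server's supported ingredient list is undocumented.
GRAMS_PER_CUP = {"flour": 120, "sugar": 200, "butter": 227}

def cups_to_grams(ingredient, quantity_cups):
    try:
        return quantity_cups * GRAMS_PER_CUP[ingredient]
    except KeyError:
        raise ValueError(f"unsupported ingredient: {ingredient}") from None
```

The explicit error path illustrates the unsupported-ingredient behavior the evaluation notes is undisclosed.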
calculate_base_converter (A)
Convert numbers between bases (binary, octal, decimal, hexadecimal, any base 2-36)
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Number as string | |
| to_base | Yes | Target base | |
| from_base | Yes | Source base |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It mentions the valid base range (2-36), which adds useful constraint context beyond the raw schema. However, it lacks details on error handling (e.g., invalid digits for a given base), output format/return type, or whether the result includes prefixes (like '0x').
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the action ('Convert') and uses a parenthetical to provide examples and constraints without waste. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a focused mathematical utility with a simple 3-parameter schema and no output schema, the description covers the essential functional scope (base range 2-36) and typical use cases. While it could mention string format requirements for hex digits (A-Z), the schema already indicates the value is a string, making this acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While the schema has 100% coverage describing the parameters technically, the description adds semantic value by providing concrete examples of common bases (binary=2, octal=8, etc.), helping the agent understand what integer values to pass for typical use cases.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool converts numbers between bases and lists specific examples (binary, octal, decimal, hexadecimal) plus the valid range (2-36). However, it does not differentiate from the sibling tool 'calculate_number_base_convert', which could cause selection confusion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or constraints beyond the base range. The existence of 'calculate_number_base_convert' as a sibling makes this lack of differentiation problematic.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
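The documented 2-36 range matches what Python's built-in `int(value, base)` parser accepts, so a faithful sketch is short. The output conventions shown here (uppercase digits, no '0x'-style prefix) are assumptions, which is the open question the evaluation raises:

```python
import string

DIGITS = string.digits + string.ascii_uppercase   # '0'-'9' then 'A'-'Z': bases 2-36

def convert_base(value, from_base, to_base):
    # Parse with int(), then re-encode by repeated division.
    n = int(value, from_base)                     # raises ValueError on invalid digits
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, to_base)
        out.append(DIGITS[r])
    return "".join(reversed(out))
```

Note that `int()` already rejects digits invalid for the given base, the error-handling detail the description leaves unstated.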
calculate_beam_load (B)
Calculate max bending moment and shear for a beam under uniform distributed load
| Name | Required | Description | Default |
|---|---|---|---|
| span_m | Yes | Beam span in meters | |
| beam_type | No | Support type | simply_supported |
| load_kg_per_m | Yes | Distributed load in kg/m |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure. It successfully identifies what physical values are computed (bending moment and shear), but omits critical behavioral details such as output units (kN·m?), return format structure, read-only safety confirmation, or error handling conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient at 13 words with zero redundancy. Front-loaded with the action verb 'Calculate' followed immediately by the specific outputs and subject. Every word earns its place; no restructuring needed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple flat schema (3 parameters, no nesting) and absence of output schema, the description adequately covers the calculation purpose but leaves gaps regarding the return value structure and units. Sufficient for tool selection but minimal for result interpretation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds contextual meaning by referencing 'uniform distributed load' (aligning with load_kg_per_m) and 'beam' (aligning with span_m and beam_type), but does not extend beyond the schema's documentation with additional semantic constraints or format details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the action (Calculate), the outputs (max bending moment and shear), and the context (beam under uniform distributed load). It effectively distinguishes this structural engineering tool from the many other calculate_* siblings by specifying the specific physics calculations performed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_roof_truss), nor does it mention prerequisites such as required units or structural assumptions. It states only what the tool does, not when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
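The closed-form results for a uniform distributed load are standard textbook formulas: M_max = wL²/8 at midspan and V_max = wL/2 at each support for a simply supported span. A sketch, assuming the kg/m load is converted to force with g = 9.81 and that 'cantilever' is among the supported beam types (both assumptions, since the server documents neither):

```python
G = 9.81  # m/s^2; the schema's load is in kg/m, so convert mass to force

def beam_load(span_m, load_kg_per_m, beam_type="simply_supported"):
    # Textbook closed forms for a uniform distributed load; the server's
    # actual unit conventions and supported beam types are undocumented.
    w = load_kg_per_m * G                       # N/m
    L = span_m
    if beam_type == "simply_supported":
        m_max = w * L**2 / 8                    # at midspan, N*m
        v_max = w * L / 2                       # at each support, N
    elif beam_type == "cantilever":
        m_max = w * L**2 / 2                    # at the fixed end
        v_max = w * L
    else:
        raise ValueError(f"unsupported beam_type: {beam_type}")
    return {"max_moment_nm": m_max, "max_shear_n": v_max}
```

Whether the server returns N·m, kN·m, or kg-force units is exactly the result-interpretation gap the evaluation flags.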
calculate_belgian_car_advantage (B)
Calculate Belgian benefit-in-kind for company car (avantage de toute nature voiture)
| Name | Required | Description | Default |
|---|---|---|---|
| co2 | Yes | CO2 emissions in g/km | |
| fuel_type | No | Fuel type | essence |
| catalog_value | Yes | Catalog value of the vehicle (HTVA) in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Calculate' implies a read-only operation, the description omits critical details such as the output format (monthly vs annual amount), calculation methodology reference, potential error conditions, or whether results are stored. It fails to disclose what the agent receives upon invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Information is front-loaded with the action verb, followed by specific domain identifiers (Belgian, benefit-in-kind, company car), plus clarifying local terminology in parentheses. Every element earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the input schema is fully documented, the tool lacks an output schema and the description fails to explain the calculation result (e.g., monetary amount, taxable base). Given the complexity of Belgian tax regulations, the description should ideally reference the calculation standard or return value interpretation, constituting a notable gap despite adequate parameter coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage (CO2 emissions, catalog value HTVA, fuel type), establishing baseline 3. The description itself adds no parameter-specific semantics, but the complete schema renders additional elaboration unnecessary. The French-language tax term 'HTVA' (hors TVA, i.e., excluding VAT) in the schema description is helpful context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the action (Calculate), jurisdiction (Belgian), tax concept (benefit-in-kind), and asset type (company car). The French translation '(avantage de toute nature voiture)' reinforces local specificity, effectively distinguishing it from siblings like 'calculate_belgian_income_tax' or 'calculate_car_depreciation'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives (e.g., 'calculate_belgian_salary' for general income), nor does it mention prerequisites such as requiring catalog value and CO2 data. Usage must be inferred solely from the tool name and parameter schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_donation (B)
Calculate Belgian donation tax (droits de donation) — Wallonia rates
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Donation amount in euros | |
| relationship | Yes | Relationship: direct_line (parents/children), between_spouses (or cohabitants), others | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully communicates regional specificity (Wallonia rates) but omits whether the tool performs pure calculation (read-only) or has side effects, and fails to describe the output format or structure despite lacking an output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient single-line description with zero redundancy. The French translation 'droits de donation' and Wallonia qualifier earn their place. However, extreme brevity comes at the cost of omitting critical context like output format, preventing a perfect score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tax calculation tool with no output schema and no annotations, the description is incomplete. It fails to specify what the tool returns (tax amount due? rate tables? deductions?), omits regional applicability warnings, and provides no guidance on the legal context of Belgian donation taxes.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear definitions for 'amount' and 'relationship'. The description adds no semantic clarification beyond the schema, meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb (Calculate), resource (Belgian donation tax/droits de donation), and geographic scope (Wallonia rates), distinguishing it from sibling tools like calculate_belgian_income_tax. However, it assumes user knowledge of what 'donation tax' entails without clarifying that it refers to gift/inheritance transfer taxes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While 'Wallonia rates' implies a geographic limitation, the description neither states when to prefer this tool over other regional calculators nor mentions that it excludes Flanders and Brussels rates. No alternative tools are named for other Belgian regions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_income_tax (A)
Calculate Belgian personal income tax (IPP/PB) using 2026 progressive brackets
| Name | Required | Description | Default |
|---|---|---|---|
| income | Yes | Annual taxable income in euros | |
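The '2026 progressive brackets' the description cites imply a standard marginal-rate calculation. The sketch below uses Belgium's long-standing 25/40/45/50% marginal rates but entirely placeholder thresholds; the server's actual 2026 brackets and return format are unknown.

```python
def progressive_tax(income: float, brackets: list[tuple[float, float]]) -> float:
    """Apply marginal rates band by band; each bracket is (upper_bound, rate)."""
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if income <= lower:
            break
        tax += (min(income, upper) - lower) * rate
        lower = upper
    return tax

# Placeholder thresholds -- NOT the real 2026 Belgian figures.
ILLUSTRATIVE_BRACKETS = [
    (15_000, 0.25),
    (26_000, 0.40),
    (45_000, 0.45),
    (float("inf"), 0.50),
]
```

With these placeholder brackets, a €50,000 income would owe €19,200: €3,750 + €4,400 + €8,550 at the lower rates, plus €2,500 on the final €5,000 taxed at the 50% marginal rate.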
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries full burden. It adds valuable calculation context by specifying '2026 progressive brackets' (revealing year-specific methodology), but lacks safety disclosure (read-only status, determinism), return format, or rate limit information despite being a 'calculate' operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste, front-loaded with action verb 'Calculate', includes localization (Belgian), specific tax nomenclature (IPP/PB), and temporal scope (2026) without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculation tool with full schema coverage, the description is appropriately complete. Mentioning the 2026 tax year is crucial contextual information for tax calculations. Minor gap: no mention of return value format despite absent output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage providing baseline 3. The description adds meaningful temporal context by specifying '2026', which constrains the applicability of the income parameter to the 2026 tax year—information not present in the schema but critical for correct usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Calculate' with the clear resource 'Belgian personal income tax', distinguishing it from siblings via country specificity ('Belgian'), tax type ('IPP/PB'), and year ('2026'), and clearly differentiating it from other countries' income tax tools and other Belgian calculators like VAT or social contributions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this specific tool versus related siblings like calculate_belgian_salary or calculate_belgian_social_contributions, nor does it indicate prerequisites or when to prefer this over other tax calculators.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_pension (C)
Estimate Belgian statutory pension (pension legale)
| Name | Required | Description | Default |
|---|---|---|---|
| career_years | Yes | Number of career years | |
| average_salary | Yes | Average annual salary in euros | |
| household_type | No | Pension type: single rate (60%) or household rate (75%) | single |
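The schema's rate hints (60% single, 75% household) together with Belgium's 45-year full-career rule suggest the classic proportional formula: average salary × (career years / 45) × rate. A sketch under that assumption; the server's actual method is unknown.

```python
FULL_CAREER_YEARS = 45  # statutory full career for the Belgian pension legale

RATES = {"single": 0.60, "household": 0.75}

def estimate_pension(career_years: int, average_salary: float,
                     household_type: str = "single") -> float:
    """Annual pension estimate: salary share per career year, capped at a full career."""
    fraction = min(career_years, FULL_CAREER_YEARS) / FULL_CAREER_YEARS
    return average_salary * fraction * RATES[household_type]
```

For example, a full 45-year career at a €40,000 average salary yields €24,000 at the single rate and €30,000 at the household rate under this sketch.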
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. The only behavioral hint is 'Estimate', implying approximation. Missing: the specific pension regimes covered (employee vs. self-employed), calculation methodology, output format, and whether results are gross or net.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise (7 words) with no filler. However, given zero annotations and no output schema, the brevity leaves significant gaps in contextual information. Front-loaded but perhaps insufficiently descriptive for the domain complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Identifies the calculation domain adequately given good parameter schema coverage. However, it lacks an explanation of output values, the relationship to other retirement tools, and specific Belgian pension rules (e.g., the implication of the 45-year maximum career). Minimally viable for a specialized calculator.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for all three parameters including the enum values and their percentage rates (60%/75%). Description adds no parameter-specific context, but baseline 3 is appropriate given complete schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Estimate') and specific resource ('Belgian statutory pension'). The parenthetical French term '(pension legale)' adds local context. Distinguishes from generic pension calculators via 'Belgian' domain specifier, though it doesn't explicitly differentiate from sibling 'calculate_retirement_pension'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus other pension/retirement calculators, or prerequisites such as requiring Belgian employment history. No mention of complementary tools or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_salary (A)
Convert Belgian gross monthly salary to net salary (approximation)
| Name | Required | Description | Default |
|---|---|---|---|
| gross_monthly | Yes | Gross monthly salary in euros | |
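A first-order Belgian gross-to-net approximation deducts the 13.07% employee social-security (ONSS/RSZ) contribution, then withholding tax on the remainder. The flat withholding rate below is a hypothetical placeholder; real Belgian withholding is progressive and situation-dependent, and the server's method is unknown.

```python
SOCIAL_SECURITY_RATE = 0.1307  # Belgian employee ONSS/RSZ contribution on gross pay
FLAT_WITHHOLDING_RATE = 0.25   # hypothetical placeholder -- real withholding is progressive

def gross_to_net(gross_monthly: float) -> float:
    """Crude net pay: strip social security, then flat withholding on the remainder."""
    after_social = gross_monthly * (1 - SOCIAL_SECURITY_RATE)
    return after_social * (1 - FLAT_WITHHOLDING_RATE)
```

Under these assumptions, €3,000 gross comes to roughly €1,956 net; the point of the sketch is the two-stage deduction order, not the exact figures.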
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses '(approximation)' which is valuable behavioral context, but lacks details on which deductions are modeled (social security, taxes, etc.), applicable tax year, or output format/structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, efficient sentence with zero waste. Information is front-loaded with the action ('Convert'), followed by domain specifics, and closes with the important caveat ('approximation').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter conversion tool. The description covers input (gross), output (net), domain (Belgian), and quality (approximation). Minor gap: absent output schema means return structure (number vs object) is unspecified, though 'net salary' implies the value.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Gross monthly salary in euros'). Description mentions 'gross monthly salary' which aligns with the parameter name, but adds no additional semantic detail (e.g., acceptable ranges, examples, or validation rules) beyond what's in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: 'Convert' is a clear verb, 'Belgian gross monthly salary to net salary' identifies the exact resource and transformation, and 'Belgian' distinguishes it from calculate_french_salary/calculate_swiss_salary while 'gross to net' distinguishes it from calculate_belgian_income_tax (which is tax-specific).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus sibling tools like calculate_belgian_income_tax, calculate_belgian_social_contributions, or calculate_salary_hourly_to_annual. Given the crowded calculate_* namespace, explicit context would help selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_social_contributions (B)
Calculate Belgian self-employed social contributions (cotisations INASTI)
| Name | Required | Description | Default |
|---|---|---|---|
| annual_income | Yes | Annual net professional income in euros | |
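Belgian self-employed contributions are levied as a percentage of net professional income. The sketch below applies a single assumed headline rate; actual INASTI rules add income bands, ceilings, and minimum contributions, and the server's exact schedule is unknown.

```python
HEADLINE_RATE = 0.205  # assumed principal rate; real INASTI schedules have bands and ceilings

def estimate_social_contributions(annual_income: float) -> float:
    """First-order estimate: headline rate applied to net professional income."""
    return round(annual_income * HEADLINE_RATE, 2)
```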
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure, yet it only states the calculation purpose without mentioning output format, calculation methodology, or whether results are estimates. The mention of 'INASTI' provides institutional context but fails to disclose behavioral traits like determinism or data freshness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single efficient sentence with seven words that front-loads the core purpose. Every word earns its place with no redundant phrases or tautology, achieving maximum information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculation tool with full schema coverage and no output schema, the description adequately identifies the calculation domain and target population. However, it lacks mention of the output format or calculation year applicability, which would be helpful given the complexity of social contribution rules.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with 'Annual net professional income in euros' already explaining the single parameter thoroughly. The description does not add semantic details beyond the schema, such as how 'net professional income' is specifically defined for Belgian self-employment.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Calculate' and precisely identifies the resource as 'Belgian self-employed social contributions (cotisations INASTI)'. The mention of INASTI specifically targets the Belgian social security institution for self-employed workers, effectively distinguishing it from siblings like calculate_belgian_salary and calculate_belgian_income_tax.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to select this tool versus alternatives such as calculate_belgian_salary for employees. While 'self-employed' implies a specific user category, there is no explicit comparison to sibling tools or guidance on prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_belgian_vat (B)
Calculate Belgian VAT — convert between HT and TTC
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Input mode: ht=before tax, ttc=after tax | ht |
| rate | No | VAT rate: 6%, 12% or 21% | 21 |
| amount | Yes | Amount in euros | |
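The HT↔TTC conversion is plain arithmetic: TTC = HT × (1 + rate/100). A sketch of what such a tool likely computes; the server's actual rounding rules and return shape are unknown.

```python
def belgian_vat(amount: float, rate: float = 21, mode: str = "ht") -> dict:
    """Convert between HT (excl. VAT) and TTC (incl. VAT) at a given percentage rate."""
    factor = 1 + rate / 100
    if mode == "ht":
        ht, ttc = amount, amount * factor
    else:  # mode == "ttc": work backwards to the pre-tax amount
        ht, ttc = amount / factor, amount
    return {"ht": round(ht, 2), "vat": round(ttc - ht, 2), "ttc": round(ttc, 2)}
```

At the standard 21% rate, €100 HT gives €121 TTC; feeding €121 back in TTC mode recovers the €100 base.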
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description bears full burden. It explains the bidirectional conversion logic (HT↔TTC) but fails to disclose output format, whether calculation includes rounding rules specific to Belgian VAT, or safety properties (read-only nature).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise two-clause description with zero redundancy. Purpose front-loaded with 'Calculate', mechanism specified with 'convert between HT and TTC', and no extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculation tool with complete schema coverage. However, lacks output specification (does it return VAT amount, gross/net values, or rate details?) given no output schema exists to cover this gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete enum descriptions ('ht=before tax', etc.). Description adds context that the tool 'convert[s] between HT and TTC', framing the mode parameter's purpose, but doesn't add syntax details beyond the schema's existing documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verbs ('Calculate', 'convert') and resource ('Belgian VAT') with geographic specifier 'Belgian' distinguishing it from sibling VAT tools (french_vat, uk_vat, vat_generic). Lacks explicit comparison to alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this specific Belgian calculator versus generic alternatives like calculate_vat_generic or calculate_vat_reverse. No mention of prerequisites or conditions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_biorhythm (B)
Calculate physical, emotional, and intellectual biorhythm cycles
| Name | Required | Description | Default |
|---|---|---|---|
| birth_date | Yes | Birth date YYYY-MM-DD | |
| target_date | Yes | Target date YYYY-MM-DD | |
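Classic biorhythm theory uses fixed sine waves of 23 (physical), 28 (emotional), and 33 (intellectual) days measured from the birth date. A sketch, assuming the server follows this conventional model:

```python
import math
from datetime import date

CYCLES = {"physical": 23, "emotional": 28, "intellectual": 33}  # period in days

def biorhythm(birth_date: str, target_date: str) -> dict:
    """Value of each cycle in [-1, 1] on the target date."""
    days = (date.fromisoformat(target_date) - date.fromisoformat(birth_date)).days
    return {name: math.sin(2 * math.pi * days / period)
            for name, period in CYCLES.items()}
```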
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to indicate whether this is a read-only operation, what format the results return in, whether calculations are stored, or any rate limiting concerns. The single sentence provides only the operation intent with no behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is exactly one sentence and seven words long. It is front-loaded with the action and subject, contains zero redundancy or filler, and every word serves to define the tool's specific purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has only two well-documented parameters and no output schema, the description is minimally adequate for an AI to select the tool for biorhythm calculations. However, gaps remain: it does not describe the output format, value ranges (e.g., -100% to +100%), or criticality of the date parameters, which would be expected given the lack of annotations or output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input parameters (birth_date, target_date) are already well-documented in the schema itself. The description adds no additional semantic context about the parameters (e.g., explaining that birth_date is the reference point for cycle calculation), so it meets the baseline score for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate') and resource ('biorhythm cycles') and clarifies the scope by listing the three specific cycle types (physical, emotional, intellectual). This distinguishes it sufficiently from other date-based calculation siblings like calculate_age or calculate_moon_phase, though it does not explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus other date calculation tools, nor does it mention prerequisites (e.g., needing valid dates) or expected use cases. It merely states the function without context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_blood_alcohol (A)
Estimate blood alcohol content (BAC) using Widmark formula
| Name | Required | Description | Default |
|---|---|---|---|
| sex | Yes | Biological sex | |
| drinks | Yes | Number of standard drinks (1 drink = 14g pure alcohol) | |
| weight_kg | Yes | Body weight in kilograms | |
| hours_drinking | Yes | Hours elapsed since first drink | |
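The cited Widmark formula is well established: BAC ≈ alcohol mass / (body mass × r), with a distribution ratio r of roughly 0.68 for men and 0.55 for women, minus about 0.015% BAC eliminated per hour. A sketch with these textbook constants; the server's exact values are unknown.

```python
WIDMARK_R = {"male": 0.68, "female": 0.55}  # body-water distribution ratios
GRAMS_PER_DRINK = 14.0                      # one standard drink, per the schema
ELIMINATION_RATE = 0.015                    # % BAC metabolized per hour

def estimate_bac(sex: str, drinks: float, weight_kg: float,
                 hours_drinking: float) -> float:
    """Widmark estimate of BAC as a percentage (g per 100 mL of blood)."""
    alcohol_g = drinks * GRAMS_PER_DRINK
    peak = alcohol_g / (weight_kg * 1000 * WIDMARK_R[sex]) * 100
    return max(0.0, peak - ELIMINATION_RATE * hours_drinking)
```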
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions 'Widmark formula' providing method context and 'Estimate' indicating approximation nature, but lacks disclosure of output format (percentage units), safety disclaimers, or read-only nature that annotations would typically cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient 7-word sentence. Every element earns its place: action (Estimate), subject (BAC), and specific method (Widmark formula). Zero waste, front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and simple input structure, description adequately supports tool selection. However, for a health/legal-adjacent calculation (BAC), it lacks expected disclaimers about individual metabolic variations and legal limits that would make it complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions (e.g., '1 drink = 14g pure alcohol'), establishing baseline 3. Description adds implicit context via 'Widmark formula' (explains why sex/weight matter) but does not explicitly elaborate parameter semantics beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Estimate' plus resource 'blood alcohol content (BAC)' and distinguishes from siblings via 'Widmark formula', which identifies the specific pharmacological method used versus generic calculation tools like calculate_alcohol_units or calculate_bac_points.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through specificity of the Widmark formula (medical/scientific BAC estimation) but provides no explicit when-to-use guidance comparing it to siblings like calculate_bac_points or calculate_alcohol_units.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_bmi (B)
Calculate Body Mass Index (BMI) and weight category
| Name | Required | Description | Default |
|---|---|---|---|
| height_cm | Yes | Height in centimeters | |
| weight_kg | Yes | Weight in kilograms | |
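BMI is weight in kilograms divided by height in metres squared; the 'weight category' presumably follows the standard WHO adult cut-offs. A sketch under that assumption:

```python
def calculate_bmi(height_cm: float, weight_kg: float) -> tuple[float, str]:
    """BMI = kg / m^2 plus the standard WHO adult category."""
    bmi = weight_kg / (height_cm / 100) ** 2
    if bmi < 18.5:
        category = "underweight"
    elif bmi < 25:
        category = "normal"
    elif bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return round(bmi, 1), category
```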
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It mentions 'weight category' as an output hint (useful given no output schema exists), but fails to disclose the calculation method (kg/m²), return value format, or limitations (e.g., athletic builds, pregnancy).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single front-loaded sentence with zero waste. Every word earns its place—immediately states the action (Calculate) and scope (BMI + weight category).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter calculation tool with well-documented inputs. However, lacking an output schema, the description could be improved by specifying what 'weight category' entails (underweight/normal/overweight/obese classifications) or that it returns a numeric BMI value.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Height in centimeters', 'Weight in kilograms'), so the baseline per rubric is 3. The description adds no additional parameter context beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('Body Mass Index'), clearly identifying the tool's function. While it doesn't explicitly differentiate from sibling health calculators (e.g., calculate_body_fat, calculate_bmr), BMI is sufficiently distinct in purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like calculate_body_fat or calculate_ideal_weight, nor does it mention prerequisites (e.g., adult vs. pediatric BMI calculation limitations).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_bmr (A)
Calculate Basal Metabolic Rate using Mifflin-St Jeor equation
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | Age in years | |
| sex | Yes | Biological sex | |
| height_cm | Yes | Height in centimeters | |
| weight_kg | Yes | Weight in kilograms | |
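The Mifflin-St Jeor equation is standard: BMR = 10·weight(kg) + 6.25·height(cm) − 5·age, plus 5 for males or minus 161 for females, in kcal/day. A direct sketch:

```python
def calculate_bmr(age: int, sex: str, height_cm: float, weight_kg: float) -> float:
    """Mifflin-St Jeor resting energy expenditure in kcal/day."""
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age
    return base + (5 if sex == "male" else -161)
```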
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the specific equation used (Mifflin-St Jeor) but omits output format, units (presumably kcal/day), or validation details beyond the schema constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with all essential information. No filler words or redundant phrases. Every token earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 4-parameter calculation tool, but lacks output documentation (no output schema exists). Should mention return value format/units given no output schema is present.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 4 parameters documented), establishing baseline 3. Description adds no parameter-specific guidance (e.g., why biological sex matters for BMR), but high schema coverage makes this acceptable.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with specific resource 'Basal Metabolic Rate' and specifies the exact method 'Mifflin-St Jeor equation'. This clearly distinguishes it from siblings like calculate_bmi or calculate_tdee.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use BMR vs. TDEE (calculate_tdee) or calories burned (calculate_calories_burned). No prerequisites or exclusions mentioned despite many related metabolic calculation siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
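Since the description names the Mifflin-St Jeor equation, the computation it implies can be sketched as follows. The sex labels and the kcal/day unit are assumptions; the server's actual output format is undocumented:

```python
def calculate_bmr(weight_kg: float, height_cm: float, age: int, sex: str) -> float:
    """Mifflin-St Jeor: 10*weight + 6.25*height - 5*age, then +5 (male) or -161 (female)."""
    base = 10 * weight_kg + 6.25 * height_cm - 5 * age
    return base + 5 if sex == "male" else base - 161

# calculate_bmr(80, 180, 30, "male") -> 1780.0 (presumably kcal/day)
```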
calculate_body_fat (A)
Estimate body fat percentage from BMI, age and sex using Deurenberg equation
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | Age in years | |
| bmi | Yes | Body Mass Index | |
| sex | Yes | Biological sex |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It appropriately qualifies the result as an 'estimate' and names the specific equation used, but omits output format details, value ranges, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single 12-word sentence is perfectly front-loaded with zero waste. Every word earns its place by conveying the operation, inputs, and specific methodology used.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple three-parameter calculation tool with complete schema documentation, the description is largely adequate. It implies the output is a body fat percentage, though it could explicitly state the return format (number vs object) given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description merely lists the parameter categories (BMI, age, sex) without adding semantic context, constraints, or relationships beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb (Estimate), clear resource (body fat percentage), and uniquely identifies the methodology (Deurenberg equation), which effectively distinguishes this tool from the sibling `calculate_body_fat_navy`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage by naming the Deurenberg method, providing a hint for selection between this and the Navy method sibling, it offers no explicit when-to-use guidance or comparison of the two methodologies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
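The Deurenberg adult equation named in the description is well known; a sketch, with the sex encoding and rounding assumed rather than server-documented:

```python
def estimate_body_fat(bmi: float, age: int, sex: str) -> float:
    """Deurenberg (adults): %BF = 1.20*BMI + 0.23*age - 10.8*sex - 5.4 (sex: male=1, female=0)."""
    s = 1 if sex == "male" else 0
    return round(1.20 * bmi + 0.23 * age - 10.8 * s - 5.4, 1)

# estimate_body_fat(25, 40, "male") -> 23.0
```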
calculate_bpm_to_ms (B)
Convert BPM tempo to millisecond delay times for different note values
| Name | Required | Description | Default |
|---|---|---|---|
| bpm | Yes | Tempo in beats per minute | |
| note_value | Yes | Musical note value to convert |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full disclosure burden. It specifies output is milliseconds and implies a pure calculation, but omits safety profile (read-only vs destructive), precision/rounding behavior, and input validation details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Noun phrase structure ('Convert X to Y for Z') frontloads the operation, inputs, and outputs immediately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 2-parameter calculation tool, but gaps remain: no output schema exists, yet description doesn't explicitly state the return format (milliseconds as number vs object) or whether it returns values for all note values or just the requested one.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Tempo in beats per minute', 'Musical note value'). The description mentions 'different note values' which maps to the enum but adds minimal semantic value beyond what the schema already provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Convert') and specific domain (BPM tempo to millisecond delay times, note values). Effectively distinguishes from unrelated siblings (tax, BMI calculators), though could explicitly signal audio/music context to differentiate from calculate_reverb_predelay.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus alternatives like calculate_reverb_predelay or calculate_time_signature_beats, nor prerequisites for the calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
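The conversion itself is deterministic: one beat at a given tempo lasts 60000/BPM milliseconds, scaled by the note value relative to a quarter note. A sketch — the note names below are assumed, since the schema's actual enum values are not shown:

```python
# Multipliers relative to a quarter note; the actual enum values are assumed.
NOTE_MULTIPLIERS = {"whole": 4, "half": 2, "quarter": 1, "eighth": 0.5, "sixteenth": 0.25}

def bpm_to_ms(bpm: float, note_value: str) -> float:
    """Delay in milliseconds for one note of the given value at the given tempo."""
    return 60000 / bpm * NOTE_MULTIPLIERS[note_value]

# bpm_to_ms(120, "quarter") -> 500.0
```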
calculate_braking_distance (C)
Reaction + braking distance by road condition
| Name | Required | Description | Default |
|---|---|---|---|
| condition | No | Road | dry |
| speed_kmh | Yes | Speed km/h |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With zero annotations, the description carries the full disclosure burden but fails to specify output format (total distance vs. breakdown?), units (meters?), underlying physics assumptions (reaction time duration, friction coefficients per condition), or calculation standard used. 'Reaction + braking' hints at components but lacks behavioral specifics needed for safe automotive application.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise with no redundancy. However, extreme brevity becomes a liability here—every sentence earns its place, but the single phrase leaves critical gaps that another sentence could fill (output units, reaction time assumption).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Severely incomplete given no output schema and no annotations. A physics-based safety tool should disclose output structure, units (meters/feet?), and assumptions about driver reaction time (industry standard 1.5s-2s). Without this context, agents cannot reliably interpret results or explain limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Speed km/h', 'Road'), establishing baseline 3. The description adds minimal semantic value beyond the schema—mentioning 'by road condition' confirms the parameter affects output but doesn't explain the physical mapping (dry/wet/icy friction coefficients) or that reaction distance depends on speed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Reaction + braking distance by road condition' clearly identifies the calculation domain (physics-based stopping distance) and key variable (road condition). In the context of 200+ sibling calculators, this sufficiently distinguishes the specific automotive safety calculation performed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to select this tool versus siblings like calculate_speed_distance_time or calculate_distance_securite. No mention of prerequisites, input validation limits, or safety-critical usage context required for braking calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
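The standard stopping-distance model the evaluation alludes to combines reaction distance (v·t) with braking distance (v²/2μg). The reaction time and friction coefficients below are illustrative assumptions, not the server's documented values:

```python
G = 9.81  # gravitational acceleration, m/s^2
FRICTION = {"dry": 0.8, "wet": 0.4, "icy": 0.1}  # assumed friction coefficients
REACTION_TIME_S = 1.0                            # assumed driver reaction time

def braking_distance(speed_kmh: float, condition: str = "dry") -> float:
    """Total stopping distance in metres: reaction distance plus braking distance."""
    v = speed_kmh / 3.6                          # km/h to m/s
    reaction = v * REACTION_TIME_S
    braking = v ** 2 / (2 * FRICTION[condition] * G)
    return round(reaction + braking, 1)

# Under these assumptions, 100 km/h on a dry road yields roughly 77 m.
```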
calculate_braquet (B)
Calculate bicycle gear ratio (braquet) and speed at various cadences
| Name | Required | Description | Default |
|---|---|---|---|
| cog_teeth | Yes | Number of teeth on the rear cog | |
| chainring_teeth | Yes | Number of teeth on the front chainring |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Mentions 'speed at various cadences' which hints at output behavior (multiple calculations per cadence step), but lacks details on default cadence values, units, or whether wheel size assumptions are used. No safety annotations but implies read-only calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence (9 words) with zero waste. Purpose front-loaded immediately. Parenthetical '(braquet)' efficiently clarifies terminology without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter calculation tool. Mentions 'speed at various cadences' which partially compensates for missing output schema by hinting at return structure. However, lacks specifics on what 'various' means (range/step) and omits unit information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('Number of teeth...'), establishing clear baseline. Description mentions 'gear ratio' which contextualizes the parameters but doesn't add syntax, validation rules, or domain constraints beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific verb (Calculate) and resource (bicycle gear ratio); including the French cycling term 'braquet' helps identify the domain. However, fails to distinguish from sibling tool 'calculate_gear_ratio' which likely performs similar calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus 'calculate_gear_ratio', 'calculate_cycling_power', or other related cycling calculators. No prerequisites or exclusions mentioned despite many sibling alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
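The gear-ratio arithmetic is simple (chainring teeth divided by cog teeth); converting to speed additionally requires a wheel circumference, which the description never states. A sketch with a 700x25c circumference and cadence steps assumed:

```python
WHEEL_CIRCUMFERENCE_M = 2.105  # assumed 700x25c road wheel

def braquet(chainring_teeth: int, cog_teeth: int, cadences=(60, 80, 100)):
    """Gear ratio plus speed in km/h at each cadence (rpm); cadence steps are assumed."""
    ratio = chainring_teeth / cog_teeth
    speeds = {rpm: round(ratio * WHEEL_CIRCUMFERENCE_M * rpm * 60 / 1000, 1)
              for rpm in cadences}
    return ratio, speeds

# braquet(50, 25, cadences=(90,)) -> (2.0, {90: 22.7})
```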
calculate_bra_size (B)
Calculate bra size in FR, US or UK system from underbust and bust measurements (cm)
| Name | Required | Description | Default |
|---|---|---|---|
| system | Yes | ||
| bust_cm | Yes | ||
| underbust_cm | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It does not disclose the output format, possible error conditions (e.g., invalid measurement ratios), whether the operation is idempotent, or any side effects. With no output schema and no annotations, the lack of behavioral disclosure is a significant gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, no redundancy. Every element serves a purpose: verb, resource, valid systems, and input requirements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, specific domain), the description covers inputs and sizing systems adequately. However, with no output schema and no annotations, it omits expected details like result format. Sibling differentiation is implied but not explicit.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. It mentions all three parameters conceptually ('underbust', 'bust', 'system') and adds the unit '(cm)' which confirms the measurement unit. However, it does not explain semantic distinctions (e.g., difference between underbust and bust) or valid value ranges beyond what's in the schema structure.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Calculate', resource 'bra size', and scope 'FR, US or UK system'. Mentions inputs 'underbust and bust measurements (cm)' which distinguishes it from the sibling tool 'calculate_bra_size_convert' by implying calculation from raw body measurements rather than conversion between size notations, though it does not explicitly name the sibling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no explicit guidance on when to use this tool versus alternatives, particularly the sibling 'calculate_bra_size_convert'. Lacks prerequisites or conditions (e.g., that measurements should be in centimeters, implied only by parameter names and parenthetical).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_bra_size_convert (B)
Convert bra size between FR, US, UK and EU systems
| Name | Required | Description | Default |
|---|---|---|---|
| cup | Yes | Cup letter (A, B, C, D, DD, E, F) | |
| band_size | Yes | Band size in source system (numeric) | |
| from_system | Yes | Source sizing system |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It specifies the four supported sizing systems (FR, US, UK, EU) defining the operational scope. However, it fails to clarify what the tool returns—particularly important since the input schema lacks a 'to_system' parameter, leaving ambiguity whether it converts to all systems or a specific target.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of appropriate length with the action verb front-loaded. No redundant or wasteful language; every word conveys essential scope or function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter conversion tool with complete schema documentation. However, lacks description of return values (no output schema exists) and does not address the missing 'to_system' parameter behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds minimal semantics beyond the schema, merely listing the sizing systems (FR, US, UK, EU) that are already defined in the from_system enum. It does not clarify parameter relationships (e.g., that band_size must correspond to the from_system).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific action (Convert) and resource (bra size) with clear scope (FR, US, UK, EU systems). However, it does not explicitly differentiate from sibling tool 'calculate_bra_size' (which likely calculates from measurements rather than converting between systems).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus 'calculate_bra_size' or general 'calculate_clothing_size_convert'. No mention of prerequisites (e.g., knowing source size) or output expectations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_bread_hydration (C)
Calculate bread dough hydration percentage
| Name | Required | Description | Default |
|---|---|---|---|
| flour_grams | Yes | Flour weight grams | |
| water_grams | Yes | Water weight grams |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided and description discloses no behavioral traits. Does not indicate this is a safe read-only calculation, what return format to expect (e.g., float representing percentage), or that it follows the standard baker's percentage formula (water weight / flour weight × 100).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb. Extremely efficient with no wasted words, though borders on under-specification. Appropriate length for a simple two-parameter calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple calculation with complete schema coverage, but lacks output description (critical since no output schema exists) and calculation methodology explanation that would help an agent predict results or validate inputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('Flour weight grams', 'Water weight grams'), providing complete parameter documentation. Description adds no semantic meaning beyond what schema already provides, warranting baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb (Calculate) and resource (bread dough hydration percentage). Distinguishes from sibling 'calculate_hydration' by specifying 'bread dough', though could clarify that hydration here refers to baker's percentage (water/flour ratio).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus generic 'calculate_hydration' or 'calculate_baking_conversion', or when baker's percentage calculation is appropriate versus other hydration metrics.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
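The baker's-percentage formula cited in the evaluation above is water weight over flour weight; only the rounding here is an assumption:

```python
def bread_hydration(flour_grams: float, water_grams: float) -> float:
    """Baker's percentage: water weight / flour weight * 100."""
    return round(water_grams / flour_grams * 100, 1)

# bread_hydration(500, 350) -> 70.0
```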
calculate_break_even (B)
Calculate break-even point (units and revenue)
| Name | Required | Description | Default |
|---|---|---|---|
| fixed_costs | Yes | Total fixed costs | |
| price_per_unit | Yes | Selling price per unit | |
| variable_cost_per_unit | Yes | Variable cost per unit |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations, description carries full burden. It discloses outputs (units and revenue) but omits computational behavior (e.g., handling when price_per_unit equals variable_cost_per_unit), error conditions, or safety characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient at 6 words. Front-loaded with action verb. Parenthetical output specification is high-value density. The only deduction is that such extreme brevity, given the lack of annotations, leaves behavioral gaps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a straightforward 3-parameter calculation tool with complete schema documentation. However, with no output schema and no annotations, description should ideally explain the break-even formula concept or business logic to ensure correct usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Total fixed costs', 'Selling price per unit', 'Variable cost per unit'), establishing baseline of 3. Description adds no semantic detail beyond schema, but doesn't need to compensate for coverage gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('break-even point'), with specific output clarification ('units and revenue') that distinguishes it from generic calculation tools. Avoids tautology despite name similarity by specifying return values.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this versus siblings like calculate_profit_margin, calculate_roi, or calculate_markup_margin. Given 200+ calculate_* siblings, explicit differentiation or business context is needed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
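The break-even arithmetic, including the degenerate case the evaluation flags (price equal to variable cost), can be sketched as follows; the error handling shown is an assumption about sensible behavior, not the server's documented response:

```python
def break_even(fixed_costs: float, price_per_unit: float, variable_cost_per_unit: float):
    """Break-even units and revenue; undefined when contribution margin is non-positive."""
    margin = price_per_unit - variable_cost_per_unit
    if margin <= 0:
        raise ValueError("price per unit must exceed variable cost per unit")
    units = fixed_costs / margin
    return units, units * price_per_unit

# break_even(10000, 50, 30) -> (500.0, 25000.0)
```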
calculate_breeding_due_date (B)
Calculate expected birth date from mating date for common pets
| Name | Required | Description | Default |
|---|---|---|---|
| animal | Yes | ||
| mating_date | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It does not mention what gestation periods are used for each animal, whether calculations are estimates, error handling behavior, or side effects. The phrase 'Calculate' implies a read-only operation but this is not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of ten words that front-loads the action ('Calculate') and contains no redundant or wasted language. Every word serves to identify the function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description mentions the expected output (birth date) which partially compensates for the missing output schema. However, with no annotations, no output schema, and 0% input schema coverage, the description fails to fully document parameter formats, valid values, or behavioral constraints expected for a breeding calculation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate by documenting the two parameters. It references 'mating date' and 'common pets' (animals) matching the schema parameters, but fails to specify the required YYYY-MM-DD date format or the complete list of supported animals (dog, cat, rabbit, hamster) beyond the vague 'common pets'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates an expected birth date from a mating date for common pets, providing a specific verb, resource, and scope. However, it fails to explicitly distinguish from siblings like 'calculate_dog_pregnancy', 'calculate_cat_pregnancy', or 'calculate_pregnancy_due_date' which appear to offer overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus the more specific pregnancy calculators (calculate_dog_pregnancy, calculate_cat_pregnancy) or the generic calculate_pregnancy_due_date. There is no mention of prerequisites or conditions for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
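A due-date calculation of this kind is a date offset by a species gestation period. The gestation values below are rough published averages, not the server's documented figures, and the ISO date format is an assumption:

```python
from datetime import date, timedelta

# Approximate average gestation periods in days (assumed, not server-documented).
GESTATION_DAYS = {"dog": 63, "cat": 65, "rabbit": 31, "hamster": 18}

def breeding_due_date(animal: str, mating_date: str) -> str:
    """Expected birth date as an ISO string; mating_date assumed YYYY-MM-DD."""
    d = date.fromisoformat(mating_date)
    return (d + timedelta(days=GESTATION_DAYS[animal])).isoformat()

# breeding_due_date("dog", "2024-01-01") -> "2024-03-04"
```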
calculate_brevet_points (B)
Calculate French Brevet (middle school exam) score
| Name | Required | Description | Default |
|---|---|---|---|
| math | Yes | Math exam score (/100) | |
| oral | Yes | Oral exam score (/100) | |
| french | Yes | French exam score (/100) | |
| science | Yes | Science score (/50) | |
| history_geo | Yes | History-Geography score (/50) | |
| socle_commun | Yes | Socle commun points (50-400) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden of behavioral disclosure. Description lacks information about output format (points vs grade), calculation methodology (weighting, coefficients), or error conditions beyond schema validation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 7-word sentence is maximally concise and front-loaded. However, extreme brevity leaves gaps in behavioral and contextual disclosure for a specialized domain tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 6-parameter specialized calculation tool with no annotations or output schema, the description is insufficient. It omits what the calculation returns (total points, final grade, pass/fail status) and lacks domain context about the French Brevet scoring system.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear parameter semantics (e.g., 'Math exam score (/100)'). Description adds no parameter-specific guidance, meeting the baseline expectation for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' identifies the operation, 'French Brevet' specifies the educational domain, and '(middle school exam)' distinguishes it from the sibling calculate_bac_points (high school diploma).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use guidance, prerequisites, or alternative suggestions provided. While 'middle school' implicitly distinguishes from 'bac' tools, it does not explicitly guide selection between siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
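The parameter maxima in the schema (four exams out of 100/100/100/50/50, socle commun up to 400) match the post-2017 Brevet scoring: up to 800 total points, with a pass at 400. A sketch under that assumption; the mention thresholds used below (480/560/640) are the commonly published ones, not something the tool's description confirms:

```python
def brevet_score(math, oral, french, science, history_geo, socle_commun):
    """Total Brevet points out of 800 and the resulting mention."""
    total = math + oral + french + science + history_geo + socle_commun
    if total < 400:
        mention = "non admis"       # below the 400-point pass mark
    elif total < 480:
        mention = "admis"
    elif total < 560:
        mention = "assez bien"
    elif total < 640:
        mention = "bien"
    else:
        mention = "très bien"
    return total, mention

print(brevet_score(80, 90, 75, 40, 35, 320))  # (640, 'très bien')
```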
calculate_brick_count (B)
Calculate bricks/blocks for a wall
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Brick type | parpaing |
| height_m | Yes | Wall height m | |
| length_m | Yes | Wall length m |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to indicate whether the calculation includes mortar joint allowances, waste factors, or returns additional data like weight or cost. It also doesn't disclose if this is a read-only operation (implied but not stated).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a tight, single phrase with zero redundancy. It front-loads the action and object immediately, appropriate for a simple utility function where verbosity would add noise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 primitive parameters) and complete schema documentation, the description is minimally sufficient. However, it omits relevant construction context like mortar gaps or waste percentages that typically affect brick counts, leaving a small but notable gap for a specialized calculator.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents that length_m and height_m are wall dimensions and type refers to brick type. The description adds minimal value beyond confirming the construction domain, meeting the baseline expectation for well-documented schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a clear verb ('Calculate') and specifies the resource ('bricks/blocks') and context ('for a wall'), distinguishing it from sibling calculation tools like calculate_paint_needed or calculate_concrete_mix. However, it slightly falters by conflating 'bricks' and 'blocks' without clarification and omitting the 'quantity' aspect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites (e.g., needing wall dimensions), or when to choose 'standard' vs 'parpaing' brick types. It merely states the function without contextual usage advice.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
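A brick-count estimate is typically wall area times a per-square-metre coverage figure, plus a waste allowance. A sketch assuming illustrative coverage values (a standard 50×20 cm parpaing face works out to roughly 10 per m² including joints; the tool's actual constants and any waste factor are undisclosed, which is exactly the gap the assessment above flags):

```python
import math

# Approximate units per square metre of wall, including mortar joints.
# Illustrative values; real coverage depends on the exact block format.
UNITS_PER_M2 = {"parpaing": 10, "brick": 60}

def brick_count(height_m, length_m, type="parpaing", waste_pct=5):
    """Estimate units needed for a wall, with a waste allowance."""
    area = height_m * length_m
    return math.ceil(area * UNITS_PER_M2[type] * (1 + waste_pct / 100))

print(brick_count(2.5, 8.0))  # 20 m² × 10/m² × 1.05 = 210
```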
calculate_buoyancy (C)
Buoyancy force and floating analysis
| Name | Required | Description | Default |
|---|---|---|---|
| volume_m3 | Yes | Object volume m³ | |
| object_mass | Yes | Object mass kg | |
| fluid_density | No | Fluid density kg/m³ |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify what the output represents (buoyant force in Newtons? floating/sinking boolean? object density?). It does not mention calculation methodology, precision, or edge cases (e.g., zero density).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief at five words. While it avoids verbosity, the extreme brevity results in under-specification rather than efficient information delivery. It is front-loaded but conveys too little actionable meaning to earn its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of annotations and no output schema, the description should compensate by explaining the return value and physical interpretation (e.g., comparing object density to fluid density). It provides only a high-level topic label, leaving significant gaps in the tool's behavioral contract.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (volume_m3, object_mass, and fluid_density are fully documented with units). The description adds no additional parameter context, but the baseline score of 3 is appropriate when the schema carries the semantic load.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Buoyancy force and floating analysis' is largely tautological, restating the tool name ('calculate_buoyancy') with slight variation. It lacks a specific verb (e.g., 'Calculate...') and fails to distinguish from sibling physics calculation tools like calculate_density or calculate_force.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance provided. The description does not indicate when to use this tool versus siblings like calculate_force, calculate_density, or convert_weight, nor does it mention prerequisite knowledge (e.g., Archimedes' principle) or required unit systems.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
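The physics here is Archimedes' principle: the buoyant force equals the weight of displaced fluid, and the object floats when its density is below the fluid's. A minimal sketch of what the tool plausibly computes (its actual return shape is undisclosed):

```python
G = 9.81  # standard gravity, m/s²

def buoyancy(volume_m3, object_mass, fluid_density=1000.0):
    """Archimedes' principle: upthrust equals weight of displaced fluid."""
    force_n = fluid_density * volume_m3 * G   # buoyant force in newtons
    object_density = object_mass / volume_m3  # kg/m³
    floats = object_density < fluid_density
    return force_n, floats

force, floats = buoyancy(0.5, 200.0)  # 0.5 m³ object, 200 kg, in water
print(round(force, 1), floats)        # 4905.0 True
```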
calculate_burn_rate (C)
Startup burn rate and runway
| Name | Required | Description | Default |
|---|---|---|---|
| cash_balance | Yes | Cash in bank EUR | |
| monthly_revenue | No | Monthly revenue EUR | |
| monthly_expenses | Yes | Monthly expenses EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fails to disclose whether this is a pure calculation (read-only), what format the output takes (burn rate per month, runway in months), or any methodological assumptions (net burn vs gross burn).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at only five words. While no words are wasted, the description is potentially too terse to be informative, lacking any structure or front-loaded key information beyond the bare concept.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 3-parameter schema with complete coverage and no nested objects, the description is minimally sufficient. However, it omits explanation of the runway calculation methodology and output format, which would be helpful given the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions for all three parameters. The description adds no additional semantic context beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses noun phrases ('Startup burn rate and runway') without a clear action verb (e.g., 'Calculates', 'Computes'). While it hints at the domain, it lacks specificity about what the tool actually does and barely extends beyond the tool name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific calculator versus other financial calculators (e.g., calculate_break_even). No prerequisites or contextual conditions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
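The conventional formulas are: net burn = monthly expenses minus monthly revenue, and runway = cash balance divided by net burn. A sketch assuming the tool uses net (not gross) burn, which the assessment above notes is unstated:

```python
def burn_rate(cash_balance, monthly_expenses, monthly_revenue=0.0):
    """Net monthly burn and runway in months (None if cash-flow positive)."""
    net_burn = monthly_expenses - monthly_revenue
    runway_months = cash_balance / net_burn if net_burn > 0 else None
    return net_burn, runway_months

print(burn_rate(120_000, 15_000, 5_000))  # (10000, 12.0)
```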
calculate_cable_section (C)
Calculate electrical cable cross-section
| Name | Required | Description | Default |
|---|---|---|---|
| power_w | Yes | Power W | |
| voltage | No | Voltage | |
| length_m | Yes | Cable length m | |
| max_drop_pct | No | Max voltage drop % |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure but provides minimal context. It does not disclose what standard/method it uses (e.g., voltage drop vs. ampacity), what units the result returns in (mm², AWG, etc.), or safety-critical limitations (e.g., does not account for temperature derating or installation methods).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficient with no wasted words and is appropriately front-loaded. However, given the domain complexity (electrical engineering), the extreme brevity borders on under-specification rather than optimal conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For an electrical safety calculation tool with no output schema and no annotations, the description is inadequate. It omits critical domain context: applicable standards, conductor material assumptions (copper/aluminum), installation type considerations, and output format/interpretation (minimum cross-section vs. recommended).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no contextual semantics about the parameters (e.g., that length_m is one-way cable run, that power_w represents the load consumption, or how the default 3% drop relates to specific electrical codes).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a clear verb ('Calculate') and specific resource ('electrical cable cross-section'). However, it fails to differentiate from the sibling tool 'calculate_cable_section_electrical', which creates ambiguity for the agent when selecting between nearly identically named tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, particularly the sibling 'calculate_cable_section_electrical'. No prerequisites, standards references (e.g., NEC, IEC), or material context are mentioned despite this being an electrical safety calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cable_section_electrical (C)
Calculate cable section from power, voltage, distance and max voltage drop
| Name | Required | Description | Default |
|---|---|---|---|
| power_w | Yes | Power in watts | |
| voltage | No | Voltage (default 230V) | |
| distance_m | Yes | One-way cable distance in meters | |
| max_drop_pct | No | Max voltage drop % (default 3) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to specify the output format (mm², AWG, etc.), calculation standards referenced (IEC, NEC), conductor material assumptions (copper vs aluminum), or whether safety margins are applied—all critical for an electrical engineering calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence that front-loads the action verb. While appropriately sized for a simple calculator, it sacrifices necessary technical context for the sake of brevity, preventing a score of 5.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a technical engineering tool with 4 parameters and no output schema, the description lacks crucial contextual details: output units, applicable electrical standards, voltage system assumptions (single/three phase), and environmental installation factors that would allow an agent to validate inputs or interpret results correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with clear definitions for each parameter (power_w, voltage, distance_m, max_drop_pct). The description lists these inputs but adds no semantic value beyond what the schema already provides, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Calculate) and resource (cable section) along with the required inputs (power, voltage, distance, max voltage drop). However, it fails to distinguish itself from the sibling tool 'calculate_cable_section', leaving ambiguity about which electrical-specific factors differentiate this tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (like 'calculate_cable_section'), nor does it mention prerequisites such as electrical system type (AC/DC), conductor material, or installation method that might be required for safe cable sizing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
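Both cable tools expose the inputs of the textbook single-phase voltage-drop formula: S = 2·ρ·L·I / ΔU, with I = P / U and ΔU = U × drop% / 100. A sketch under the assumptions the assessments above flag as undisclosed — copper conductors (ρ ≈ 0.0175 Ω·mm²/m), single-phase AC, one-way length doubled for the return path, and rounding up to IEC-style standard sections:

```python
RHO_COPPER = 0.0175  # Ω·mm²/m, approximate resistivity of copper
STANDARD_MM2 = [1.5, 2.5, 4, 6, 10, 16, 25, 35, 50]

def cable_section(power_w, distance_m, voltage=230.0, max_drop_pct=3.0):
    """Minimum copper section (mm²) keeping single-phase voltage drop
    within the limit, via S = 2·ρ·L·I / ΔU with I = P / U."""
    current = power_w / voltage
    max_drop_v = voltage * max_drop_pct / 100
    min_mm2 = 2 * RHO_COPPER * distance_m * current / max_drop_v
    # Round up to the next standard section.
    return next(s for s in STANDARD_MM2 if s >= min_mm2)

print(cable_section(3000, 25))  # min ≈ 1.65 mm² → next standard size, 2.5
```

Note this covers voltage drop only; a real sizing must also check ampacity, temperature derating, and the applicable code, none of which either tool's description mentions.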
calculate_cac_ltv_ratio (C)
Customer acquisition cost vs lifetime value
| Name | Required | Description | Default |
|---|---|---|---|
| cac | Yes | Customer acquisition cost EUR | |
| ltv | Yes | Customer lifetime value EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but reveals almost nothing about behavior. It doesn't state the calculation formula, whether the result is a decimal or percentage, what constitutes a healthy ratio threshold, or any validation behavior beyond the schema's minimum values.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at 6 words, this represents under-specification rather than efficient conciseness. The single phrase fails to serve as a complete tool description, lacking both action verbs and contextual framing needed for an agent to understand the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter calculation tool with full schema coverage, the description should at minimum explain the mathematical relationship between inputs and the interpretation of output. Without annotations or output schema, the description leaves critical gaps in explaining what value this calculation provides.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear EUR unit specifications for both parameters. The tool description adds no parameter-specific guidance, but with complete schema documentation, this meets the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the two financial metrics involved (CAC and LTV) and implies a comparison via 'vs', but fails to specify what calculation is performed (ratio of CAC/LTV or LTV/CAC?) or what the output represents. It uses a noun phrase rather than an action verb, making the actual operation ambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus other financial calculation tools (like calculate_roi or calculate_profit_margin) or what business decisions this metric informs. No prerequisites or contextual requirements are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
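The metric is conventionally computed as LTV divided by CAC, with a ratio of 3:1 or better commonly cited as healthy. A sketch under that convention (the tool leaves the direction of the division ambiguous, as noted above):

```python
def ltv_cac_ratio(ltv, cac):
    """LTV / CAC; a ratio of 3 or more is commonly viewed as healthy."""
    ratio = ltv / cac
    return round(ratio, 2), ratio >= 3

print(ltv_cac_ratio(1200.0, 300.0))  # (4.0, True)
```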
calculate_caffeine_clearance (C)
Caffeine half-life tracker
| Name | Required | Description | Default |
|---|---|---|---|
| mg | Yes | Caffeine mg consumed | |
| hours | No | Hours since consumption |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description carries the full burden of behavioral disclosure. It fails to explain what the calculation returns (remaining caffeine level, clearance rate, or time to elimination), what units are used, or any pharmacological assumptions (e.g., standard 5-hour half-life).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While the three-word description is technically concise, it constitutes under-specification rather than efficient communication. It fails to front-load critical behavioral information necessary for tool selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
As a calculation tool with no output schema and no annotations, the description should explain the return value and calculation methodology. It provides insufficient context for an AI agent to understand what numeric result to expect or how to interpret it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions for both 'mg' and 'hours' parameters. The description adds no additional semantic information beyond what the schema already provides, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the domain (caffeine) and loosely suggests tracking half-life, but uses the vague noun 'tracker' rather than a specific verb. It fails to distinguish from siblings calculate_caffeine_half_life and calculate_caffeine_intake, leaving ambiguity about what 'clearance' actually calculates (remaining mg, elimination rate, or time to clear).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus the sibling caffeine calculators. No mention of input prerequisites or typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_caffeine_half_life (C)
Calculate remaining caffeine in body after time elapsed
| Name | Required | Description | Default |
|---|---|---|---|
| hours_since | Yes | Hours since consumption | |
| mg_consumed | Yes | Caffeine consumed mg |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose the assumed half-life constant (typically ~5 hours for caffeine), whether individual metabolic variations are considered, or that results are mathematical estimates rather than medical measurements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently structured and front-loaded with the core action. However, given the lack of annotations and presence of sibling tools, the extreme brevity leaves critical gaps rather than demonstrating purposeful minimalism.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter calculation tool with complete schema coverage, the description covers the basic functional contract. However, it omits important context regarding the pharmacokinetic model (half-life decay) and fails to differentiate from related caffeine calculation siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Hours since consumption', 'Caffeine consumed mg'), the schema adequately documents inputs. The description adds no additional parameter semantics beyond the schema, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Calculate), resource (caffeine), and scope (remaining in body after time elapsed). However, it fails to distinguish from siblings like 'calculate_caffeine_clearance' or 'calculate_caffeine_intake' by not mentioning the half-life methodology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'calculate_caffeine_clearance'. No prerequisites or constraints mentioned beyond what's in the schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
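Both caffeine-decay tools presumably apply first-order exponential decay, remaining = dose × 0.5^(t / t½). A sketch assuming the commonly cited ~5-hour average half-life (neither tool discloses its constant, as the assessments note):

```python
def caffeine_remaining(mg_consumed, hours_since, half_life_h=5.0):
    """Exponential decay: remaining = dose × 0.5^(t / t½)."""
    return mg_consumed * 0.5 ** (hours_since / half_life_h)

print(round(caffeine_remaining(200, 10)))  # 50 mg after two half-lives
```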
calculate_caffeine_intake (B)
Calculate total caffeine intake from beverages and compare to safe daily limit
| Name | Required | Description | Default |
|---|---|---|---|
| drinks | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries full disclosure burden. It adds value by specifying the safety comparison feature, but omits what the 'safe daily limit' threshold actually is (400mg standard? weight-adjusted?), whether results are advisory or medical grade, and output format details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no filler words, immediate verb-object clarity. Every word earns its place despite the information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the nested input complexity (array of objects) and lack of output schema, the description covers the core value proposition but leaves operational gaps. It should specify input units, supported beverage types, or return value structure (total mg, percentage of limit, safety status?).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and the description minimally compensates. While 'from beverages' maps to the drinks array, it fails to explain the nested structure requires drink type (enum values) and quantity, nor what unit quantity expects (cups, ml, oz?). Critical ambiguity for a calculation tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (calculate), resource (caffeine intake), source (beverages), and specific behavior (comparison to safe daily limit). However, it does not explicitly differentiate from siblings calculate_caffeine_clearance and calculate_caffeine_half_life, though the names imply distinct purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus the pharmacokinetic siblings (clearance/half_life), or prerequisites like whether quantity should represent servings, milliliters, or caffeine milligrams. No mention of target user groups (general consumers vs. medical contexts).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
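The calculation is a weighted sum of servings compared against the 400 mg/day threshold commonly cited for healthy adults (e.g. by EFSA and the FDA). A sketch with illustrative per-serving contents; the tool's actual beverage list, per-serving values, and limit are all undisclosed:

```python
# Illustrative per-serving caffeine contents in mg; actual values vary
# widely by brand and brew strength.
CAFFEINE_MG = {"coffee": 95, "espresso": 63, "tea": 47,
               "energy_drink": 80, "cola": 34}
SAFE_DAILY_LIMIT_MG = 400  # commonly cited adult guideline

def caffeine_intake(drinks):
    """Total caffeine from [(type, servings), ...] vs the daily limit."""
    total = sum(CAFFEINE_MG[t] * n for t, n in drinks)
    return total, total <= SAFE_DAILY_LIMIT_MG

print(caffeine_intake([("coffee", 3), ("tea", 2)]))  # (379, True)
```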
calculate_calories_burned (A)
Estimate calories burned during physical activity using MET values
| Name | Required | Description | Default |
|---|---|---|---|
| activity | Yes | Type of activity | |
| weight_kg | Yes | Body weight in kilograms | |
| duration_minutes | Yes | Duration in minutes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It adds valuable context by specifying 'MET values' as the calculation method and 'Estimate' indicating approximate output, but lacks details on output format, precision limits, or whether results are stored.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient single sentence (9 words) with zero waste. Front-loaded with the action verb 'Estimate', followed by output, scope, and methodology. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculation tool with complete schema coverage. However, given the crowded namespace of 300+ sibling calculate_* tools (including health-related ones like calculate_bmr, calculate_tdee), it lacks differentiation guidance. No output schema is present, but the output is reasonably inferred from the description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds semantic context by implying the activity parameter maps to MET coefficients, but does not elaborate on parameter interactions, valid ranges beyond schema constraints, or formatting details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Estimate' with clear resource 'calories burned' and scopes it to 'physical activity'. Critically, it includes 'using MET values' which distinguishes this from sibling tools like calculate_bmr or calculate_tdee that use different formulas.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage through specificity (MET values for physical activity), but provides no explicit guidance on when to choose this over similar siblings like calculate_tdee, calculate_bmr, or calculate_dog_walking_calories. No alternatives or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
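The MET-based estimate the description names usually follows the convention that 1 MET ≈ 1 kcal per kg of body weight per hour. The MET coefficients below are illustrative assumptions, not the server's actual table:

```python
# Illustrative MET coefficients; real implementations typically draw on the
# Compendium of Physical Activities, whose exact values may differ.
MET_VALUES = {"walking": 3.5, "running": 9.8, "cycling": 7.5, "swimming": 8.0}

def calories_burned(activity, weight_kg, duration_minutes):
    # 1 MET ~ 1 kcal per kg of body weight per hour
    met = MET_VALUES[activity]
    return met * weight_kg * (duration_minutes / 60)
```

This is the kind of formula the description's 'MET values' phrase implies, not a confirmed reproduction of the server's logic.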
calculate_canada_combined_tax (A)
Calculate combined Quebec + federal income tax with the Quebec federal abatement (16.5%)
| Name | Required | Description | Default |
|---|---|---|---|
| income_cad | Yes | Annual income in CAD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses key calculation detail (16.5% Quebec federal abatement) which affects results behaviorally, but omits return value structure, error handling, or read-only nature.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb 'Calculate'. Zero filler words. Every element serves purpose: jurisdiction (Quebec), scope (combined federal), and specific rule (16.5% abatement).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tax tool with no output schema, description captures the essential domain complexity (Quebec abatement). Would benefit from hinting at return value (e.g., 'returns total tax liability') but covers the critical jurisdictional specificity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear 'Annual income in CAD' description. Description does not mention parameters, but baseline is 3 when schema coverage is high (>80%) as it avoids redundancy.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specifies exact action (Calculate), resource (combined Quebec + federal income tax), and unique scope (Quebec federal abatement). Clearly distinguishes from siblings calculate_canada_federal_tax and calculate_quebec_income_tax by emphasizing the combined nature and specific abatement percentage.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through specificity ('combined' suggests use when needing both taxes), but lacks explicit guidance like 'use calculate_canada_federal_tax instead for federal-only calculations.' Siblings are not referenced.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
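The 16.5% abatement the description discloses can be sketched as below. The bracket functions are caller-supplied stand-ins because the server's rate tables are not documented here:

```python
QC_FEDERAL_ABATEMENT = 0.165  # 16.5% Quebec abatement applied to federal tax

def combined_tax(income_cad, federal_tax_fn, quebec_tax_fn):
    """Combine Quebec and abated federal tax. The two tax functions are
    placeholders for real bracket tables, which this review cannot confirm."""
    federal = federal_tax_fn(income_cad) * (1 - QC_FEDERAL_ABATEMENT)
    return federal + quebec_tax_fn(income_cad)
```

The sketch shows why the abatement disclosure matters behaviorally: an agent summing the two sibling tools' outputs without it would overstate the total.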
calculate_canada_ei (A)
Calculate Canadian Employment Insurance (EI) premiums for Quebec and non-Quebec residents
| Name | Required | Description | Default |
|---|---|---|---|
| province | No | Province: QC (Quebec rate) or other (standard rate) | QC |
| gross_annual_cad | Yes | Gross annual insurable earnings in CAD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions the Quebec rate distinction (useful behavioral trait), but omits other critical details: output structure (employee/employer portions?), calculation methodology (current year rates?), and whether operation is read-only/pure function (implied by 'Calculate' but not stated).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence (11 words) with zero waste. Front-loaded with action verb, immediately communicates scope variation. Appropriate length for a two-parameter calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for low-complexity tool with well-documented schema, but given no output schema exists, description should ideally disclose return value structure (e.g., 'returns employee premium, employer premium, and total'). Currently minimally viable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description reinforces the province parameter semantics by mentioning 'Quebec and non-Quebec residents' alignment, but adds no further syntax guidance or format details beyond what schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Calculate'), precise resource ('Canadian Employment Insurance (EI) premiums'), and explicit scope differentiation ('Quebec and non-Quebec residents') that distinguishes it from sibling Canada tax tools like calculate_canada_federal_tax or calculate_canada_rrq.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through specificity (EI vs other deductions) and mentions the Quebec/non-Quebec distinction which guides province parameter selection, but lacks explicit when-to-use guidance versus alternatives like calculate_canada_combined_tax or calculate_quebec_income_tax.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
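The Quebec/non-Quebec rate split can be sketched as follows. The rates, the maximum insurable earnings, and the 1.4× employer multiple are illustrative figures in the style of recent CRA years, not values confirmed by the server:

```python
# Hypothetical figures for illustration only; consult CRA for the actual
# year's rate and maximum insurable earnings.
EI_RATE = {"QC": 0.0132, "other": 0.0166}  # Quebec has a reduced rate
MAX_INSURABLE_EARNINGS = 63_200

def ei_premium(gross_annual_cad, province="QC"):
    insurable = min(gross_annual_cad, MAX_INSURABLE_EARNINGS)
    rate = EI_RATE["QC"] if province == "QC" else EI_RATE["other"]
    employee = insurable * rate
    # Employer premium is conventionally 1.4x the employee premium
    return {"employee": round(employee, 2), "employer": round(employee * 1.4, 2)}
```

Whether the real tool returns the employee portion, the employer portion, or both is exactly the output-structure gap the review notes.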
calculate_canada_federal_tax (B)
Calculate Canadian federal income tax (CRA) with basic personal amount deduction
| Name | Required | Description | Default |
|---|---|---|---|
| income_cad | Yes | Annual income in Canadian dollars (CAD) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the calculation includes the basic personal amount deduction, but does not mention output format, tax year specificity, read-only nature, or estimation disclaimer.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Every word serves a purpose: the action (Calculate), the domain (Canadian federal/CRA), and the specific calculation method (basic personal amount deduction).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculation tool, the description adequately explains the core logic but lacks output specification (especially important given no output schema) and tax year context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'income_cad' fully documented. The description adds context about the deduction logic but does not expand on parameter semantics beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('Canadian federal income tax'), plus specific feature ('basic personal amount deduction'). However, it does not explicitly distinguish from sibling 'calculate_canada_combined_tax' or note that provincial taxes are excluded.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no explicit guidance on when to use this tool versus 'calculate_canada_combined_tax' or provincial calculators, nor mentions prerequisites like tax year applicability.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_canada_rrq (B)
Calculate Quebec Pension Plan (RRQ) contributions for employee
| Name | Required | Description | Default |
|---|---|---|---|
| gross_annual_cad | Yes | Gross annual earnings in CAD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden, yet provides no behavioral details. Does not disclose whether calculation includes employer+employee portions, what tax year rates apply, maximum contributory earnings limits, or output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 9 words, front-loaded with verb. Efficient with no redundancy. However, given zero annotations and missing output schema, the brevity leaves significant gaps that additional sentences could fill.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool. Identifies the calculation domain clearly. Missing: contribution rates explanation, employer vs employee split details, and output value description. Acceptable but minimal for a regional pension calculation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with clear description 'Gross annual earnings in CAD'. Tool description adds no parameter-specific context (e.g., examples, validation rules, whether to use pre-tax or post-tax gross), warranting baseline score 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' + resource 'Quebec Pension Plan (RRQ) contributions' + scope 'for employee'. Explicitly includes 'Quebec' and 'RRQ' to distinguish from federal Canada tools (calculate_canada_ei, calculate_canada_federal_tax) and from Quebec income tax (calculate_quebec_income_tax) among siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this vs calculate_canada_ei or calculate_quebec_income_tax. Does not mention prerequisites like requiring Quebec employment status or how it relates to CPP (Canada Pension Plan) for non-Quebec residents.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_capital_gains_property (C)
Calculate French property capital gains tax (plus-value immobiliere)
| Name | Required | Description | Default |
|---|---|---|---|
| sale_price | Yes | Sale price in euros | |
| years_held | Yes | Number of years the property was held | |
| purchase_price | Yes | Original purchase price in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With zero annotations provided, the description carries the full burden of behavioral disclosure. It does not indicate whether this performs a pure calculation or accesses external tax databases, what format the return value takes (breakdown vs total), or if results include social charges (prélèvements sociaux) typically associated with French capital gains.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence (7 words) with no redundancy. Front-loaded with the action verb 'Calculate'. However, the brevity becomes a liability given the tool's complexity and the presence of confusable sibling tools.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for basic identification but incomplete given domain complexity. No output schema exists, yet the description doesn't hint at the return structure (tax amount, effective rate, taper relief details). The lack of differentiation from 'calculate_property_capital_gains_fr' is a significant contextual gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all three parameters (purchase_price, sale_price, years_held). The description itself adds no parameter-specific context, but baseline 3 is appropriate given the schema's completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly identifies the specific calculation (French property capital gains tax) using both English and the French legal term 'plus-value immobiliere'. However, it fails to distinguish from the nearly identical sibling tool 'calculate_property_capital_gains_fr' that appears in the same toolkit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus the sibling 'calculate_property_capital_gains_fr' or other property tax calculators. No mention of prerequisites (e.g., owning property) or input requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_carbon_footprint (C)
Estimate annual carbon footprint
| Name | Required | Description | Default |
|---|---|---|---|
| kwh | No | Electricity kWh/year | |
| km_car | No | Car km/year | |
| km_plane | No | Flight km/year | |
| meat_kg_week | No | Meat kg/week |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to mention output format (CO2e units?), calculation methodology, data sources, or idempotency. Only 'Estimate' hints at non-deterministic/calculated behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief (4 words) with no waste, but arguably undersized for the tool's complexity and lack of annotations. Front-loading is moot given brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema disclosure (what units/format does it return?), lacks behavioral context, and provides no usage guidance. While input parameters are documented in schema, the description fails to compensate for missing output documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 4 parameters have clear descriptions), establishing baseline 3. The description adds no parameter semantics beyond the schema (no mention of valid ranges, unit constraints, or interdependencies between kwh/km_car).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Estimate') and resource ('annual carbon footprint'), clearly indicating the tool's function. However, it does not differentiate from similarly named siblings like 'calculate_carbon_sequestration' or environmental calculators.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description contains no guidance on when to use this tool versus alternatives (e.g., calculate_electricity_cost for just energy, or calculate_carbon_sequestration), nor any prerequisites for the estimation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
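A plausible shape for the estimate, with every emission factor a loudly hypothetical placeholder (rough published averages, not the server's undocumented factors), would be:

```python
# Hypothetical emission factors in kg CO2e per unit; rough global averages
# chosen only to make the sketch concrete.
FACTORS = {
    "kwh": 0.4,            # per kWh of electricity
    "km_car": 0.17,        # per km by car
    "km_plane": 0.15,      # per km by plane
    "meat_kg_week": 27.0,  # per kg of meat (beef-weighted)
}

def carbon_footprint(kwh=0, km_car=0, km_plane=0, meat_kg_week=0):
    annual_meat_kg = meat_kg_week * 52  # weekly input scaled to a year
    total_kg = (kwh * FACTORS["kwh"]
                + km_car * FACTORS["km_car"]
                + km_plane * FACTORS["km_plane"]
                + annual_meat_kg * FACTORS["meat_kg_week"])
    return round(total_kg / 1000, 2)  # tonnes CO2e per year
```

The unit choice (tonnes vs. kg) made explicit here is precisely the disclosure the real description omits.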
calculate_carbon_sequestration (C)
Estimate CO2 sequestration by trees over their lifetime
| Name | Required | Description | Default |
|---|---|---|---|
| count | No | Number of trees (default 1) | |
| age_years | Yes | Age of the trees in years | |
| tree_type | Yes | Species of tree |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. While 'Estimate' implies approximation and 'over their lifetime' defines scope, it fails to disclose output units (kg? tons?), return format, data source methodology, or whether the result is per-tree or total.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is tightly worded and front-loaded with the action verb. However, given the lack of annotations and output schema, the extreme brevity becomes a liability rather than a virtue.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description inadequately compensates. It omits critical output details like units (kg CO2 vs. tons), temporal breakdown (annual vs. cumulative), and accuracy caveats needed to interpret the calculation result.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear descriptions for 'tree_type', 'age_years', and 'count'. The description mentions 'trees' generally but adds no semantic clarifications beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Estimate') and resource ('CO2 sequestration by trees') and clarifies the temporal scope ('over their lifetime'). However, it does not explicitly differentiate from the sibling tool 'calculate_carbon_footprint' (emissions vs. capture).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, no prerequisites (e.g., mature tree assumptions), and no warnings about the estimation methodology.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_card_draw_probability (A)
Calculate hypergeometric probability of drawing specific cards from a deck
| Name | Required | Description | Default |
|---|---|---|---|
| deck_size | No | Total number of cards in the deck (default 52) | |
| draw_count | Yes | Number of cards drawn | |
| target_cards | Yes | Number of target cards wanted in the draw | |
| cards_in_deck_matching | Yes | Number of target cards in the deck |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the safety burden. It adds valuable context by specifying 'hypergeometric' distribution (sampling without replacement), which explains the mathematical model used. However, it lacks output format details (probability as decimal vs percentage) and validation behavior (e.g., handling impossible draws).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. The key terms ('hypergeometric', 'deck') are front-loaded, and every word earns its place. Appropriate length for the complexity level.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters with full schema coverage and no output schema, the description is minimally complete but has a gap regarding return value format. For a mathematical tool, specifying whether the output is a probability (0-1) or percentage would be helpful context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear definitions for all 4 parameters (deck_size, draw_count, target_cards, cards_in_deck_matching). Since the schema fully documents the parameters, the description doesn't need to add param-specific details. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Calculate' with precise resource 'hypergeometric probability' and scope 'drawing specific cards from a deck'. It effectively distinguishes from siblings like calculate_dice_probability and calculate_lottery_odds by specifying the mathematical method (hypergeometric) and domain (deck/cards).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus sibling probability calculators (calculate_dice_probability, calculate_lottery_odds, calculate_poker_hand_probability). No prerequisites or conditions mentioned despite the specific mathematical domain.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
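The hypergeometric model named in the description is mathematically unambiguous. Assuming the tool computes the probability of drawing exactly `target_cards` matches (rather than at least that many, which the description leaves open), a reference implementation is:

```python
from math import comb

def card_draw_probability(draw_count, target_cards, cards_in_deck_matching,
                          deck_size=52):
    """P(exactly k successes) under the hypergeometric distribution:
    C(K, k) * C(N-K, n-k) / C(N, n)."""
    N, K, n, k = deck_size, cards_in_deck_matching, draw_count, target_cards
    if k > n or k > K or n - k > N - K:
        return 0.0  # impossible draw; the real tool's validation is undocumented
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)
```

For example, the chance of exactly one ace in a five-card draw from a standard deck is about 0.299; whether the server returns that as a decimal or a percentage is the open question the review raises.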
calculate_car_depreciation (A)
Calculate car residual value: Y1:-25%, Y2:-15%, Y3:-10%, Y4-5:-8%, Y6+:-5%
| Name | Required | Description | Default |
|---|---|---|---|
| age_years | Yes | Age in years | |
| purchase_price | Yes | Original price |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full behavioral disclosure burden and succeeds by revealing the specific step-function depreciation rates applied to each year bracket. This allows the agent to predict exactly what computation will occur, though it lacks information about return format or geographic applicability of these rates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient single-line description front-loaded with the action verb, followed immediately by the specific calculation parameters. No redundant words or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter calculation tool, the description adequately explains the computation logic but lacks output format specification (currency, numeric value, formatted string) given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema description coverage, the description adds valuable semantic context by mapping the abstract 'age_years' parameter to concrete year brackets (Y1, Y2, Y3, Y4-5, Y6+) and indicating how purchase_price factors into the percentage calculations.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Calculate car residual value) and provides the exact depreciation algorithm (Y1:-25%, Y2:-15%, etc.), distinguishing it from sibling tools like calculate_car_lease_vs_buy through its specific focus on residual value calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_car_lease_vs_buy) or prerequisites. It only presents the calculation formula without contextual usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
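Because the description publishes its exact step-function rates, the computation can be reproduced. How the yearly rates compound is still an assumption; the server may apply them differently:

```python
def car_residual_value(purchase_price, age_years):
    """Apply the rates from the tool description, compounded year by year:
    Y1 -25%, Y2 -15%, Y3 -10%, Y4-5 -8%, Y6+ -5%."""
    value = purchase_price
    for year in range(1, age_years + 1):
        if year == 1:
            rate = 0.25
        elif year == 2:
            rate = 0.15
        elif year == 3:
            rate = 0.10
        elif year <= 5:
            rate = 0.08
        else:
            rate = 0.05
        value *= (1 - rate)
    return round(value, 2)
```

A 30,000 purchase held three years comes out to 17,212.50 under this reading; the review's point stands that the output currency and format are never stated.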
calculate_car_lease_vs_buy (C)
Compare car leasing versus buying with loan
| Name | Required | Description | Default |
|---|---|---|---|
| car_price | Yes | Car purchase price EUR | |
| loan_rate | Yes | Loan annual rate percent | |
| loan_months | Yes | Loan duration months | |
| lease_months | Yes | Lease duration months | |
| lease_monthly | Yes | Monthly lease payment EUR | |
| residual_value | Yes | Car residual value at lease end EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of behavioral disclosure, yet it fails to mention that this is a safe read-only calculation, what the output format contains (e.g., cost breakdown, recommendation), or any limitations like currency assumptions (EUR implied by schema but not stated in description).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded with zero waste words, efficiently communicating the core purpose. However, given the complexity of a 6-parameter financial comparison tool with no output schema, this extreme brevity approaches underspecification rather than optimal conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex financial tool with six interacting parameters and no output schema or annotations, the description lacks crucial context about the comparison methodology (e.g., net present value, total cost of ownership), output structure, and whether tax implications or fees are included in the calculation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents all 6 parameters (car_price, lease_monthly, etc.). The description adds no additional semantic context about how the lease parameters interact with the loan parameters, earning only the baseline score for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action 'Compare' and the specific financial scenario 'car leasing versus buying with loan', distinguishing it from generic loan calculators or depreciation tools in the sibling list. However, it does not specify what metric is compared (total cost, monthly cash flow, etc.), slightly limiting its clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like calculate_loan_payment, calculate_car_depreciation, or calculate_mortgage. There are no prerequisites mentioned, such as requiring specific financial inputs or scenarios where this comparison is most valid.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
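Since the comparison methodology is undisclosed, here is one plausible total-cost sketch: standard loan amortization on the buy side, with the residual value credited back as resale proceeds. The function name and the output dictionary shape are assumptions, not the tool's documented contract.

```python
def lease_vs_buy(car_price, loan_rate, loan_months, lease_months,
                 lease_monthly, residual_value):
    """Total-cost comparison sketch; the real tool's method is undocumented."""
    r = loan_rate / 100 / 12                    # monthly loan rate
    if r == 0:
        payment = car_price / loan_months
    else:
        payment = car_price * r / (1 - (1 + r) ** -loan_months)
    # Buying: sum of loan payments, net of the car's resale value at the end
    buy_cost = payment * loan_months - residual_value
    lease_cost = lease_monthly * lease_months
    return {"lease_total": round(lease_cost, 2),
            "buy_total": round(buy_cost, 2),
            "cheaper": "lease" if lease_cost < buy_cost else "buy"}
```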
calculate_carpet_flooring (C)
Calculate flooring cost with waste
| Name | Required | Description | Default |
|---|---|---|---|
| width_m | Yes | Width m | |
| length_m | Yes | Length m | |
| price_m2 | No | EUR/m² | |
| waste_pct | No | Waste % |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must carry full burden. While it mentions 'with waste', it does not explain how waste is applied (added to area vs. cost), what the output format is, or that this is a pure calculation with no side effects. Insufficient for a financial calculation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at 5 words. No redundant or wasted language. Front-loaded with action verb. Appropriate density given the schema completeness, though arguably too terse given lack of annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists and no annotations provided, yet description fails to explain what values are returned (total cost? cost per square meter? breakdown?). For a calculation tool with no supporting metadata, the description should disclose the calculation result format and methodology.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 4 parameters documented). Description adds no parameter semantics beyond the schema, but baseline 3 is appropriate when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate) and resource (flooring cost) and includes key behavioral detail (with waste). However, it does not explicitly differentiate from siblings like calculate_floor_area or calculate_tile_quantity, though the 'carpet' in the name helps distinguish it.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the numerous sibling calculation tools (e.g., calculate_tile_quantity, calculate_fabric_needed, calculate_floor_area). No prerequisites or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
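A minimal sketch of the likely arithmetic, assuming waste is applied to the area rather than the cost (the description does not say which). The defaults for `price_m2` and `waste_pct` are illustrative guesses, since the schema leaves them unspecified.

```python
def flooring_cost(width_m, length_m, price_m2=20.0, waste_pct=10.0):
    """Area-times-price flooring estimate, with waste added to the area."""
    area = width_m * length_m * (1 + waste_pct / 100)   # m2 incl. waste
    return {"area_m2": round(area, 2),
            "cost_eur": round(area * price_m2, 2)}
```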
calculate_cat_age (C)
Cat age in human years
| Name | Required | Description | Default |
|---|---|---|---|
| cat_years | Yes | Cat age in years |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses no behavioral traits (e.g., whether this uses a simple linear formula or distinct life-stage multipliers, precision of result, or that it is read-only). The description carries the full burden and provides minimal context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely terse at five words; while there is no word waste, the fragment format borders on under-specification for a tool lacking an output schema. A complete sentence would improve clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculator, the description hints at the return value (human years), but without an output schema, the description should explicitly state the return format and calculation method. Adequate but minimal.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is appropriately met. The description adds implicit context that the output is human years, but provides no additional semantic detail about the cat_years parameter beyond the schema's 'Cat age in years'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The phrase 'Cat age in human years' implies a conversion calculation but lacks a verb and is ambiguous about input/output direction. It does not distinguish from sibling tools like calculate_dog_age or calculate_pet_age, which likely have nearly identical descriptions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific tool versus calculate_pet_age or calculate_dog_age, nor any mention of prerequisites or constraints.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
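Since the formula is undocumented, the sketch below uses the commonly cited veterinary rule of thumb (first year roughly 15 human years, second year adds 9, each further year adds 4) rather than the tool's actual method.

```python
def cat_to_human_years(cat_years: float) -> float:
    """Rule-of-thumb conversion; the actual tool's formula is undocumented."""
    if cat_years <= 1:
        return 15 * cat_years          # first year ~15 human years
    if cat_years <= 2:
        return 15 + 9 * (cat_years - 1)  # second year adds ~9
    return 24 + 4 * (cat_years - 2)      # each later year adds ~4
```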
calculate_cat_food (C)
Calculate daily cat food quantity based on weight, age and lifestyle
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | | |
| indoor | No | | |
| weight_kg | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. States it calculates based on inputs but fails to disclose output units (grams/cups), whether result is an estimate, read-only nature, or any safety caveats about pet feeding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb. Efficient but extremely brief. Given the lack of annotations and the schema's 0% description coverage, a longer description covering parameter and behavioral details would be appropriate.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers basic intent for a simple 3-parameter calculator, but lacks output format details, units, or distinguishing context from similar pet calculators. Minimum viable given low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Maps 'lifestyle' to the boolean 'indoor' parameter, adding useful semantic context. With 0% schema coverage, however, it omits explaining age enum values (kitten/adult/senior), that indoor defaults to true, or weight_kg constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Calculate' and resource 'daily cat food quantity'. Mentions key inputs (weight, age, lifestyle). Distinguishes from siblings by specifying 'cat' (differentiating from calculate_dog_food), but doesn't clarify relationship to calculate_pet_food_portion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use versus alternatives (calculate_pet_food_portion) or prerequisites. No warnings about veterinary consultation or feeding limits.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
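One way such a calculator could work, using the standard veterinary resting-energy-requirement formula RER = 70 × kg^0.75. The life-stage multipliers are illustrative assumptions, and the tool's real logic (including the undocumented age enum) is unknown.

```python
def daily_cat_calories(weight_kg: float, age: str, indoor: bool = True) -> int:
    """Sketch: kcal/day from RER = 70 * kg^0.75 times a life-stage factor."""
    rer = 70 * weight_kg ** 0.75       # resting energy requirement
    if age == "kitten":
        factor = 2.5                   # growing kittens need far more energy
    elif indoor:
        factor = 1.2                   # typical indoor adult (assumption)
    else:
        factor = 1.4                   # more active outdoor adult (assumption)
    return round(rer * factor)
```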
calculate_cat_pregnancy (A)
Calculate cat due date from mating date
| Name | Required | Description | Default |
|---|---|---|---|
| mating_date | Yes | Mating date YYYY-MM-DD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It establishes the tool performs a calculation but omits details such as the assumed gestation period (approximately 63-67 days for cats), output format, or whether the result is an estimation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with no redundant words. Every element serves a purpose: the action ('Calculate'), the domain ('cat due date'), and the required input ('from mating date').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one required parameter, 100% schema coverage, no nested objects), the description adequately covers the primary use case. However, it could be improved by mentioning the output format or gestation assumptions given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Mating date YYYY-MM-DD'), establishing a baseline of 3. The description references 'mating date' to explain the parameter's role in the calculation, adding minimal semantic context beyond the schema but not additional constraints or format details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Calculate'), resource ('cat due date'), and input requirement ('from mating date'). It clearly distinguishes from siblings like 'calculate_dog_pregnancy' and 'calculate_pregnancy_due_date' by specifying 'cat' as the target species.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives such as 'calculate_breeding_due_date' or 'calculate_dog_pregnancy'. There are no stated prerequisites or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
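The arithmetic is straightforward once a gestation length is fixed; the sketch below assumes 65 days, the midpoint of the typical feline range of 63-67 days noted above.

```python
from datetime import date, timedelta

def cat_due_date(mating_date: str, gestation_days: int = 65) -> str:
    """Estimate a due date from a YYYY-MM-DD mating date (65-day assumption)."""
    start = date.fromisoformat(mating_date)
    return (start + timedelta(days=gestation_days)).isoformat()
```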
calculate_cheque_repas (B)
Calculate Belgian meal voucher (cheque-repas / maaltijdcheque) benefit
| Name | Required | Description | Default |
|---|---|---|---|
| days_per_month | No | Working days per month | |
| employee_contribution | No | Employee contribution per voucher (min 1.09 EUR) | |
| employer_contribution | No | Employer contribution per voucher (max 6.91 EUR) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It fails to mention whether this uses current legal rates, if results are estimates, whether the calculation is deterministic, or what return format to expect (especially since no output schema is provided).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence front-loaded with the action verb. Every element serves a purpose: 'Belgian' specifies jurisdiction, the parenthetical provides the French and Dutch terms for searchability in two of Belgium's official languages, and 'benefit' clarifies the output intent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and only 3 optional parameters, the tool is low complexity. However, lacking an output schema and annotations, the description should ideally specify what the 'benefit' represents (monthly total EUR value? annual? employer cost vs employee value?) and that it follows Belgian social regulations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete descriptions ('Employee contribution per voucher (min 1.09 EUR)', etc.). The description provides framing context that these parameters calculate a 'benefit', but does not add syntax details or calculation methodology beyond what the schema already specifies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'Belgian meal voucher (cheque-repas / maaltijdcheque) benefit'. The inclusion of French/Dutch terminology ('cheque-repas / maaltijdcheque') precisely targets the Belgian context, distinguishing it from siblings like calculate_belgian_salary or calculate_belgian_income_tax.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other Belgian benefit calculators (e.g., calculate_belgian_salary, calculate_belgian_car_advantage) or prerequisites like requiring knowledge of official contribution limits.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
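A sketch of the likely arithmetic, assuming the 'benefit' is the employer-paid share of each voucher. The contribution defaults mirror the legal bounds from the schema; `days_per_month = 20` and the output shape are guesses, not documented behavior.

```python
def meal_voucher_benefit(days_per_month: int = 20,
                         employee_contribution: float = 1.09,
                         employer_contribution: float = 6.91) -> dict:
    """Belgian meal voucher sketch: face value and monthly amounts."""
    face_value = employee_contribution + employer_contribution
    return {
        "face_value_per_voucher": round(face_value, 2),
        # net benefit assumed to be the employer-paid share
        "monthly_net_benefit": round(days_per_month * employer_contribution, 2),
        "monthly_employee_cost": round(days_per_month * employee_contribution, 2),
    }
```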
calculate_child_support (B)
Estimate French child support (pension alimentaire) based on income, custody and number of children
| Name | Required | Description | Default |
|---|---|---|---|
| income | Yes | Net monthly income of the paying parent in euros | |
| custody | No | Custody type: full (garde principale), alternating (alternee), reduced (visite et hebergement) | full |
| children_count | Yes | Number of children (1-6) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It adds the key behavioral trait that this is an 'Estimate' (not a guaranteed legal determination) and specifies French jurisdiction. However, it lacks disclosure of the calculation methodology (official grid vs. approximation), the output currency, and whether results are per month or per year.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action and domain, no redundancy. Efficiently incorporates French legal term in parentheses. Slight deduction because brevity sacrifices necessary behavioral context for a financial/legal tool with no output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, a legal/financial estimation tool requires more disclosure: output format (amounts, frequency), disclaimer that this is not legal advice, or reference to French legal standards applied. Description is insufficient for agent to understand the full contract of the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description maps 'income, custody and number of children' to parameters but adds minimal semantic value beyond schema descriptions. Schema already defines units (euros) and French legal custody terms (garde principale, alternee), so description doesn't need to repeat this.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Estimate') and resource ('French child support'), with jurisdictional specificity ('French', 'pension alimentaire') that distinguishes it from generic calculators and other financial tools in the large sibling set. Falls short of 5 only by not explicitly stating the use case context (legal proceedings, family law) or contrasting with other family-related calculators.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use versus alternatives, prerequisites (e.g., legal separation status), or whether this applies to specific French legal regimes. Simply lists inputs without workflow context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
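The likely structure (income times a per-child rate that varies by custody type) can be sketched as below. The rates are hypothetical placeholders, NOT the official French table de référence, and the function is illustrative only.

```python
# Hypothetical per-child rates by custody type, indexed by children_count
# (1-6). These are NOT the official French reference table values.
HYPOTHETICAL_RATES = {
    "full":        [0.135, 0.115, 0.100, 0.088, 0.080, 0.073],
    "alternating": [0.090, 0.078, 0.068, 0.060, 0.054, 0.050],
    "reduced":     [0.180, 0.155, 0.133, 0.117, 0.105, 0.094],
}

def estimate_child_support(income: float, children_count: int,
                           custody: str = "full") -> float:
    """Monthly support: per-child rate * net income * number of children."""
    rate = HYPOTHETICAL_RATES[custody][children_count - 1]
    return round(income * rate * children_count, 2)
```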
calculate_chinese_zodiac (A)
Determine Chinese zodiac animal and element from birth year
| Name | Required | Description | Default |
|---|---|---|---|
| birth_year | Yes | Birth year |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully discloses that the tool returns two distinct outputs ('animal and element'), adding behavioral context. However, it lacks details on return format, error handling for edge years, or that Chinese zodiac cycles every 12/60 years.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 9 words with zero redundancy. Front-loaded with action verb 'Determine'. Every word earns its place; no filler or repetition of structured data.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool. Mentions both output components (animal and element) which compensates somewhat for missing output schema. Could benefit from mentioning the cyclical nature or specific return structure, but sufficient for basic invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'birth_year' fully described in schema constraints (1900-2100). Description mentions 'birth year' but adds no additional semantic context (e.g., 'Gregorian calendar year', 'four-digit format'). Baseline 3 appropriate given schema sufficiency.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Determine' with clear resource 'Chinese zodiac animal and element' and scope 'from birth year'. Among 400+ sibling tools, this is the only one handling Chinese zodiac, making it clearly distinguishable.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use guidance, prerequisites, or alternatives mentioned. While the domain is unique among siblings, the description lacks direction on when this calculation is appropriate versus other date-based tools like age or birth-based calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
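The underlying arithmetic is standard sexagenary-cycle math (4 AD was a Wood Rat year): animals repeat every 12 years and elements every 10, two years per element. Note that the real cycle turns at the lunar new year, a subtlety a birth-year-only input cannot capture.

```python
ANIMALS = ["Rat", "Ox", "Tiger", "Rabbit", "Dragon", "Snake",
           "Horse", "Goat", "Monkey", "Rooster", "Dog", "Pig"]
ELEMENTS = ["Wood", "Fire", "Earth", "Metal", "Water"]

def chinese_zodiac(birth_year: int) -> tuple:
    """Animal from the 12-year cycle, element from the 10-year stem cycle."""
    animal = ANIMALS[(birth_year - 4) % 12]
    element = ELEMENTS[(birth_year - 4) % 10 // 2]
    return animal, element
```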
calculate_churn_rate (C)
Customer churn rate
| Name | Required | Description | Default |
|---|---|---|---|
| period_months | No | Period in months | |
| lost_customers | Yes | Customers lost | |
| start_customers | Yes | Customers at period start |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description fails to disclose any behavioral traits. It does not state whether this is a pure calculation, what formula is used (lost/start), or what format/value the tool returns. The agent cannot determine if this mutates state or is read-only from the description alone.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (3 words), this represents under-specification rather than proper conciseness. The single phrase does not earn its place as it provides no actionable information beyond the tool name itself.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite the simplicity of the tool (3 primitive parameters, no nesting), the description is inadequate. With no output schema provided, the description fails to explain the return value (percentage? decimal? string?), leaving the agent uninformed about the calculation result format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds no additional semantic information about parameters (e.g., explaining that period_months affects the rate calculation period or clarifying business definitions of 'lost customers').
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Customer churn rate' is essentially a tautology that restates the tool name without adding a verb or action. It does not specify that the tool performs a calculation, nor does it distinguish this from sibling calculate_* tools (e.g., calculate_burn_rate).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance is provided. There is no indication of when to use this versus other business metrics calculators, prerequisites for the data inputs, or expected value ranges for the parameters.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
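The lost/start formula inferred above is trivial to sketch. The per-month normalization and the output shape are assumptions, since neither an output schema nor the role of `period_months` is documented.

```python
def churn_rate(lost_customers: int, start_customers: int,
               period_months: int = 1) -> dict:
    """Churn as a percentage of starting customers, plus a monthly average."""
    rate = lost_customers / start_customers * 100
    return {"churn_pct": round(rate, 2),
            "monthly_churn_pct": round(rate / period_months, 2)}
```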
calculate_clothing_size_convert (C)
Convert clothing size between EU, US and UK systems
| Name | Required | Description | Default |
|---|---|---|---|
| sex | Yes | Sex | |
| size | Yes | Size number in source system | |
| garment | Yes | Type of garment | |
| from_system | Yes | Source system |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It fails to specify what the tool returns (e.g., converted values for all target systems or just one), whether conversions are approximate, or any standardization limitations. The mutation/safety profile is not addressed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of nine words with no redundancy. While efficient, the extreme brevity contributes to information gaps. Structure is front-loaded with the verb, though additional sentences would have improved utility.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With four required parameters, zero annotations, and no output schema, the 9-word description leaves significant gaps. It omits the specific garment types supported (critical given the enum constraint), doesn't explain the output format, and provides no behavioral context for users to understand the conversion logic.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions the three sizing systems (EU, US, UK) which aligns with the from_system enum values, but adds no further context about the garment parameter constraints (shirt/pants/dress only) or why sex is required for the calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action (convert), resource (clothing size), and specific scope (EU, US, UK systems). However, it fails to differentiate from close siblings like calculate_bra_size_convert or calculate_shoe_size_convert, which also handle clothing/apparel sizing conversions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like calculate_bra_size_convert, calculate_shoe_size_convert, or the generic convert_* tools. No mention of prerequisites or constraints beyond the implicit system names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_compost_volume (Grade: B)
Calculate the volume and weight of compost needed for a garden surface
| Name | Required | Description | Default |
|---|---|---|---|
| depth_cm | No | Compost layer depth in centimeters (default 5cm) | |
| surface_m2 | Yes | Surface area in square meters | |
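The arithmetic the description implies is simple enough to sketch in Python; the bulk density behind the weight figure is an assumption of this sketch, since the server does not disclose the value it uses:

```python
def compost_needed(surface_m2: float, depth_cm: float = 5.0,
                   density_kg_m3: float = 600.0) -> tuple[float, float]:
    """Return (volume_m3, weight_kg) for a compost layer.

    density_kg_m3 = 600 is an assumed bulk density for mature compost;
    the server does not document the density it actually applies.
    """
    volume_m3 = surface_m2 * (depth_cm / 100.0)  # convert depth cm -> m
    weight_kg = volume_m3 * density_kg_m3
    return volume_m3, weight_kg
```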
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that weight is calculated (implying density assumptions), but fails to disclose what density value is used, what units are returned for weight, or whether the calculation accounts for settling/compaction. This minimal disclosure meets baseline expectations but leaves operational assumptions opaque.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of 12 words with no redundant content. The verb-object structure ('Calculate the volume and weight...') immediately conveys purpose without preamble, making it appropriately front-loaded for quick agent comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should disclose return units (cubic meters for volume, likely kilograms for weight) and the density assumption for compost weight calculation. Without these details, agents cannot fully predict the output format or validate result plausibility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage (surface_m2 and depth_cm are well documented). The description adds no semantic information about parameter interactions or acceptable ranges beyond the schema, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific calculation performed (volume and weight of compost) and the target context (garden surface). However, it does not explicitly distinguish from similar siblings like 'calculate_garden_soil' or 'calculate_raised_bed_soil', which could cause selection ambiguity given the large number of garden-related calculators available.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'calculate_garden_soil' or 'calculate_fertilizer_npk'. There are no prerequisites mentioned, no exclusions, and no indication of whether this is for new gardens, top-dressing, or specific compost types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_compound_interest (Grade: B)
Calculate compound interest: A = P(1+r/n)^(nt)
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Investment duration in years | |
| principal | Yes | Initial amount | |
| annual_rate | Yes | Annual interest rate in % | |
| compounds_per_year | No | Compounding frequency per year | |
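The formula in the description maps directly onto the parameters; a minimal sketch, treating the undocumented compounds_per_year default as 12 (the review's inference, not confirmed server behavior):

```python
def compound_interest(principal: float, annual_rate: float,
                      years: float, compounds_per_year: int = 12) -> float:
    """A = P(1 + r/n)^(n*t), with annual_rate given in percent."""
    r = annual_rate / 100.0  # percent -> decimal
    n = compounds_per_year
    return principal * (1.0 + r / n) ** (n * years)
```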
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While the formula explains the calculation logic, it omits operational traits such as read-only safety, return format/value, precision handling, or idempotency of the calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (single sentence) and front-loaded with the action verb. It wastes no words, though the extreme brevity may be insufficient for proper tool selection given the large sibling set with similar names.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 4-parameter calculation tool with no output schema, the description fails to explain what value is returned (presumably the accumulated amount A) or highlight that compounds_per_year defaults to 12. Critical missing context is the differentiation from 'calculate_compound_interest_monthly', which is essential for correct tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions, establishing a baseline of 3. The formula adds value by mapping mathematical notation (P, r, n, t) to the parameter names (principal, annual_rate, compounds_per_year, years), clarifying the semantic relationship between the domain concept and implementation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates compound interest and provides the specific mathematical formula (A = P(1+r/n)^(nt)), which identifies it as the general compounding frequency variant. However, it does not explicitly differentiate from the sibling tool 'calculate_compound_interest_monthly' or clarify that it handles variable compounding frequencies.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'calculate_compound_interest_monthly' or 'calculate_simple_interest'. There are no prerequisites, exclusions, or selection criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_compound_interest_monthly (Grade: C)
Calculate final amount with monthly contributions and compound interest
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Number of years | |
| principal | Yes | Initial capital EUR | |
| annual_rate | Yes | Annual interest rate percent | |
| monthly_contribution | Yes | Monthly contribution EUR | |
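The standard future-value-of-an-annuity formula is the likely implementation; monthly compounding and end-of-month contribution timing are assumptions here, since the server documents neither:

```python
def compound_interest_monthly(principal: float, annual_rate: float,
                              years: float, monthly_contribution: float) -> float:
    """FV = P(1+i)^m + C * ((1+i)^m - 1) / i, with i = r/12, m = 12*years.

    Assumes monthly compounding and end-of-month contributions, which
    the server description does not confirm.
    """
    i = annual_rate / 100.0 / 12.0  # monthly rate as a decimal
    m = round(12 * years)           # number of monthly contributions
    if i == 0:
        return principal + monthly_contribution * m
    growth = (1.0 + i) ** m
    return principal * growth + monthly_contribution * (growth - 1.0) / i
```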
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Calculate' implies a read-only operation, the description fails to specify compounding frequency (monthly vs annual?), rounding behavior, or whether the output includes interest earned separately versus only the total.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is appropriately front-loaded with the action verb and contains no redundant words. However, given the tool's complexity and lack of annotations, extreme brevity results in under-specification rather than optimal conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the core calculation purpose and mentions the monthly contribution feature essential for this variant. However, given zero annotations, no output schema, and a crowded sibling space with similar financial calculators, it lacks necessary context about compounding methodology and result format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (all 4 parameters have detailed descriptions including units like EUR and percent), the baseline score applies. The description adds minimal semantic value beyond the schema but correctly highlights the monthly contribution aspect.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb (Calculate) and identifies the resource (final amount) with the key differentiator 'monthly contributions' distinguishing it from the generic calculate_compound_interest sibling tool. However, it does not fully clarify distinctions from other siblings like calculate_future_value or calculate_savings_goal.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions 'monthly contributions' which implicitly suggests usage for recurring investment scenarios, but provides no explicit guidance on when to use this tool versus alternatives (calculate_compound_interest, calculate_future_value) or when-not conditions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_concrete_mix (Grade: B)
Calculate concrete ingredients for a given volume
| Name | Required | Description | Default |
|---|---|---|---|
| volume_m3 | Yes | Volume in m³ | |
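As the review notes, the mix standard is undisclosed. The sketch below uses illustrative per-cubic-metre quantities for a common ~350 kg/m³ cement-dosed mix; these figures are assumptions for the example and almost certainly not the server's exact ratios:

```python
# Illustrative quantities for 1 m³ of concrete at a ~350 kg/m³ cement dosage.
# Assumed values for this sketch, not the server's actual mix standard.
MIX_PER_M3 = {
    "cement_kg": 350.0,
    "sand_kg": 800.0,
    "gravel_kg": 1050.0,
    "water_l": 175.0,
}

def concrete_mix(volume_m3: float) -> dict[str, float]:
    """Scale the per-m³ quantities to the requested volume."""
    return {name: qty * volume_m3 for name, qty in MIX_PER_M3.items()}
```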
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose what specific ingredients are calculated (cement, sand, water ratios), what units results are in (kg, liters, bags), or what mix standard is assumed. No mention of whether results include wastage factors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient at 7 words. Single sentence delivers core purpose immediately with no filler or redundancy. Front-loaded with action verb.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool without output schema, but minimal. Fails to characterize the return value (ingredient quantities), which would be essential for agent to know how to use results. Acceptable but clear gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (volume_m3 fully documented as 'Volume in m³'). Description references 'volume' which aligns with the parameter name, but adds no syntax details, examples, or clarification beyond the schema's minimum constraint (0.01). Baseline 3 appropriate when schema does heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate), resource (concrete ingredients), and scope (for a given volume). Distinguishes likely sibling calculate_concrete_stairs by focusing on 'ingredients' rather than structural dimensions. However, lacks detail on what specific ingredients (cement, sand, aggregate) are returned.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus alternative calculation tools like calculate_gravel_quantity or calculate_concrete_stairs. No mention of prerequisites (e.g., knowing target volume) or assumptions (standard mix ratios).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_concrete_stairs (Grade: B)
Calculate concrete stair dimensions, volume and materials using Blondel's formula
| Name | Required | Description | Default |
|---|---|---|---|
| width_m | No | Stair width in meters (default 0.9m) | |
| height_m | Yes | Total stair height to climb in meters | |
| num_steps | Yes | Number of steps | |
| thickness_cm | No | Slab thickness under each tread in cm (default 15cm) | |
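Blondel's rule says 2 × riser + going should land near 60-64 cm. A sketch of the dimension side of the calculation, using 63 cm as an assumed comfort target; the server's exact constant and its volume/material model are undocumented:

```python
def stair_dimensions(height_m: float, num_steps: int,
                     blondel_cm: float = 63.0) -> dict[str, float]:
    """Riser and going from Blondel's rule: 2*riser + going ~ 60-64 cm.

    blondel_cm = 63 is an assumed comfort target for this sketch.
    """
    riser_cm = height_m * 100.0 / num_steps   # each step climbs this much
    going_cm = blondel_cm - 2.0 * riser_cm    # tread depth from Blondel's rule
    return {"riser_cm": riser_cm, "going_cm": going_cm}
```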
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions Blondel's formula, providing methodological context, but lacks information on whether this is a read-only calculation, what the return format contains, or any error conditions (e.g., invalid stair proportions).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of 10 words with zero waste. It front-loads the action verb and packs in the resource type, outputs, and calculation method without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich schema (100% covered, 4 simple parameters) and lack of output schema, the description adequately lists the calculation outputs (dimensions, volume, materials) and specifies the formula used. It could be improved by briefly describing the return structure (e.g., cubic meters, step dimensions).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the structured fields already document all four parameters (width_m, height_m, num_steps, thickness_cm) including units and defaults. The description adds minimal semantic context beyond the schema, mentioning 'materials' which loosely maps to thickness_cm, warranting the baseline score for complete schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Calculate), resource (concrete stair dimensions, volume and materials), and specific method (Blondel's formula). This distinguishes it from the sibling 'calculate_staircase' by specifying the concrete context and formula-driven approach, though it could explicitly mention the construction use-case versus interior stairs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_staircase', nor does it mention prerequisites (e.g., having stair measurements ready) or constraints (e.g., maximum dimensions).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_condominium_charges (Grade: C)
Calculate individual share of condominium charges
| Name | Required | Description | Default |
|---|---|---|---|
| total_charges | Yes | Total annual condominium charges EUR | |
| ownership_share_pct | Yes | Ownership share (tantièmes) percent | |
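The calculation itself is a one-liner, which makes the undisclosed output details (rounding, currency) the only real unknowns; rounding to euro cents here is an assumption of the sketch:

```python
def condo_share(total_charges: float, ownership_share_pct: float) -> float:
    """Individual annual share: total charges * tantiemes percent / 100.

    Rounding to 2 decimals (euro cents) is an assumption; the server's
    rounding behavior is undocumented.
    """
    return round(total_charges * ownership_share_pct / 100.0, 2)
```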
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to mention output format, rounding behavior, currency handling (though implied by schema), or validation constraints beyond what the schema defines. It does not disclose whether this is a pure calculation or has side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence with no redundancy. However, given the absence of annotations and output schema, it is arguably too terse—front-loading essential details like the calculation method or output type would improve utility without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a financial calculation tool with no annotations and no output schema, the description is insufficient. It does not describe the return value (the calculated amount in EUR), nor does it explain domain-specific terms like 'tantièmes' (which appears only in the schema).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema adequately documents both parameters ('Total annual condominium charges EUR' and 'Ownership share (tantièmes) percent'). The description references these concepts ('condominium charges', 'individual share') but adds no semantic detail beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('condominium charges'), but 'individual share' is ambiguous regarding the calculation method. It does not clarify that this uses percentage-based ownership shares (tantièmes) versus equal splits or other methods, which could lead to confusion with tools like calculate_tip_split.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. Given the large number of financial calculation siblings (calculate_property_tax_fr, calculate_rental_yield, etc.), explicit context about when this applies (e.g., 'for co-owned properties') would be necessary for proper selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cone (Grade: C)
Cone volume and surface area
| Name | Required | Description | Default |
|---|---|---|---|
| height | Yes | Height | |
| radius | Yes | Base radius | |
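The two values presumably returned are the classic right-circular-cone formulas. Whether the surface figure includes the base is undocumented, so this sketch assumes the total (base plus lateral):

```python
import math

def cone(radius: float, height: float) -> tuple[float, float]:
    """Right circular cone: (volume, total surface area)."""
    slant = math.hypot(radius, height)               # slant height
    volume = math.pi * radius ** 2 * height / 3.0    # V = pi*r^2*h/3
    area = math.pi * radius * (radius + slant)       # base + lateral surface
    return volume, area
```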
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While it mentions what is calculated (volume and surface area), it fails to specify whether the tool returns both values simultaneously, the output format structure, error handling behavior, or mathematical assumptions (e.g., right circular cone).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief (5 words) with zero redundancy or filler content. However, this borders on underspecification rather than ideal conciseness, as it sacrifices completeness for brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 parameters, simple math) and complete schema documentation, the description meets minimum viability by identifying the calculation domain. However, without an output schema, it should ideally specify the return value structure (e.g., that it returns both metrics in a single call).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Height', 'Base radius'), establishing a baseline score of 3. The description adds no supplementary parameter semantics, units clarification, or usage examples beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the geometric shape (cone) and the specific calculations performed (volume and surface area), which distinguishes it from siblings like calculate_cylinder or calculate_sphere. However, it lacks an action verb (e.g., 'Calculates') and is merely a noun phrase, falling short of explicit purpose statement.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to select this tool versus the numerous sibling calculation tools (e.g., calculate_cylinder for cylindrical shapes). There are no prerequisites, constraints, or alternative suggestions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_confidence_interval (Grade: C)
Confidence interval for a mean
| Name | Required | Description | Default |
|---|---|---|---|
| std_dev | Yes | Standard deviation | |
| confidence | No | Confidence level | 95 |
| sample_mean | Yes | Sample mean | |
| sample_size | Yes | Sample size | |
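The z-versus-t question the review raises matters in practice. A sketch of the normal-approximation (z) interval follows; that the server uses a z-interval at all, rather than Student's t, is an assumption:

```python
import math

# Two-sided critical values of the standard normal distribution.
Z = {90: 1.645, 95: 1.960, 99: 2.576}

def confidence_interval(sample_mean: float, std_dev: float,
                        sample_size: int,
                        confidence: int = 95) -> tuple[float, float]:
    """mean +/- z * s / sqrt(n); assumes a z-interval, not Student's t."""
    margin = Z[confidence] * std_dev / math.sqrt(sample_size)
    return sample_mean - margin, sample_mean + margin
```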
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure, yet reveals nothing about output format (returns bounds? margin of error?), statistical assumptions (requires normal distribution?), or whether std_dev refers to sample or population standard deviation. It provides zero behavioral transparency beyond the tool name itself.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at only 5 words, this represents under-specification rather than effective conciseness. For a 4-parameter statistical tool with complex domain logic and no supporting annotations, this length is inappropriately minimal. No information is provided about return values, error conditions, or statistical validity constraints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of annotations and output schema, plus the moderate complexity of statistical computation (4 parameters including enums), the description is significantly incomplete. It omits statistical methodology, output structure, and domain assumptions necessary for correct invocation in the 'calculate_*' tool ecosystem.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no semantic value beyond what's already in the schema property descriptions ('Sample mean', 'Standard deviation', etc.). It fails to clarify critical statistical distinctions, such as whether std_dev should be the sample standard deviation (with n-1 denominator) given the presence of sample_size.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Confidence interval for a mean' identifies the statistical concept but uses noun phrasing instead of an action verb (e.g., 'Calculates...'). While it loosely maps to the tool name, it fails to distinguish from statistical siblings like calculate_statistics or calculate_sample_size, and omits key domain context like whether it uses z-distribution or t-distribution.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives (e.g., when to use calculate_sample_size instead), no prerequisites for valid input (e.g., normality assumptions), and no warnings about statistical misuse. This leaves the agent without any decision-making framework.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cooking_conversion (Grade: C)
Convert cooking measurements between common units
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to convert | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit | |
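A volume-only sketch shows why the density question the review raises matters: without an ingredient density, only same-type conversions are well-defined. The millilitre equivalents below (5 ml tsp, 15 ml tbsp, 240 ml cup) are assumed US-style conventions; the server may use different ones:

```python
# Millilitre equivalents; US-style values assumed, the server may differ.
ML = {"ml": 1.0, "l": 1000.0, "tsp": 5.0, "tbsp": 15.0, "cup": 240.0}

def convert_cooking(amount: float, from_unit: str, to_unit: str) -> float:
    """Convert between volume units via a millilitre base.

    Volume-to-weight conversions (e.g. cups to grams) would additionally
    need an ingredient density, which this sketch deliberately omits.
    """
    return amount * ML[from_unit] / ML[to_unit]
```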
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Critically fails to disclose how it handles conversions between volume units (cups, ml) and weight units (g, oz), which require ingredient density assumptions for accuracy in cooking contexts.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single six-word sentence is appropriately concise and front-loaded, though minimalism leaves significant gaps in explanation given the tool's behavioral complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without annotations or output schema, description should explain conversion logic (e.g., water density assumptions) and return value format. Current description insufficient for a tool handling unit conversions across measurement types.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for amount, from_unit, and to_unit. Description adds only the phrase 'common units' which provides minimal semantic context beyond the schema's explicit enum lists.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States a clear verb+resource (convert cooking measurements) and mentions scope (common units). However, fails to distinguish from sibling tool `convert_cooking` or `calculate_baking_conversion` in the crowded calculator namespace.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the similar `convert_cooking` sibling, nor any prerequisites or constraints about valid conversion types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cooking_time (Grade: C)
Estimate cooking time based on food type, weight and method
| Name | Required | Description | Default |
|---|---|---|---|
| food | Yes | | |
| method | Yes | | |
| weight_kg | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to mention whether the calculation includes resting time, what safety standards are used (e.g., minimum internal temperatures), or what unit/format the result is returned in (minutes, hours, etc.). The word 'Estimate' suggests approximation but lacks specific caveats.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that immediately conveys the core function without filler. However, given the complete lack of schema descriptions and annotations, it is arguably too concise—it should expand to compensate for the missing structured metadata.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 parameters with 0% schema coverage, no annotations, no output schema, and the existence of a highly similar sibling tool (calculate_meat_cooking_time), the description is inadequate. It requires explicit sibling differentiation, parameter enum documentation, and output format information to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description mentions 'food type, weight and method' which maps to the three parameters, serving as basic documentation. However, it fails to explain the supported enum values (beef/chicken/pork/fish/lamb for food; oven/grill/boil for method) or clarify that weight is expected in kilograms, leaving significant semantic gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
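To make the gap concrete, here is a hypothetical sketch of the kind of lookup the tool presumably performs. The per-kg minutes, the food/method pairs, and the minutes-as-output assumption are all illustrative, not the server's documented behavior:

```python
# Hypothetical sketch of what calculate_cooking_time may compute.
# The per-kg minutes below are illustrative assumptions only.
MINUTES_PER_KG = {
    ("beef", "oven"): 40,
    ("chicken", "oven"): 45,
    ("fish", "grill"): 20,
}

def estimate_cooking_time(food: str, method: str, weight_kg: float) -> float:
    """Return an estimated cooking time in minutes (assumed unit)."""
    try:
        rate = MINUTES_PER_KG[(food, method)]
    except KeyError:
        raise ValueError(f"unsupported food/method pair: {food}/{method}")
    return rate * weight_kg
```

A description that named the supported enum pairs and the output unit would let an agent rule out unsupported combinations before calling.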
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool estimates cooking time based on three specific factors (food type, weight, method), providing a specific verb and resource. However, it fails to differentiate from the sibling tool 'calculate_meat_cooking_time', which creates ambiguity about whether this tool handles non-meat foods or when to prefer one over the other.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'calculate_meat_cooking_time'. There are no prerequisites, warnings about food safety, or indications that the user should verify temperatures with a thermometer. The description implies usage but provides no explicit when/when-not guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cost_per_use (Grade C)
Calculate cost per use to evaluate purchase value
| Name | Required | Description | Default |
|---|---|---|---|
| item_price | Yes | Item purchase price | |
| expected_uses | Yes | Expected number of uses |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses no behavioral traits. It does not indicate whether this is a read-only mathematical operation (presumed), what data format is returned, or whether there are precision limits or rounding behaviors for the calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is front-loaded with the action verb and contains zero wasted words. However, extreme brevity in a densely populated tool server leaves it underspecified, though this is technically a completeness issue rather than a conciseness failure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool without an output schema, the description meets minimum viability by stating the core calculation purpose. However, given the presence of 400+ sibling calculation tools, it lacks the contextual differentiation and output format disclosure that would help an agent confidently select and integrate this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Item purchase price', 'Expected number of uses'), the baseline is 3. The description implies the mathematical relationship (cost divided by uses) by naming the output metric, but does not add syntax details, example values, or unit requirements beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
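The presumed computation is a single division. A minimal Python sketch, assuming no rounding and a positive-uses guard (neither is documented by the server):

```python
def cost_per_use(item_price: float, expected_uses: int) -> float:
    """Presumed formula: purchase price divided by expected uses."""
    if expected_uses <= 0:
        raise ValueError("expected_uses must be positive")
    return item_price / expected_uses
```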
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') with the resource ('cost per use') and states the goal ('evaluate purchase value'). However, with hundreds of similar calculation siblings (e.g., calculate_unit_price, calculate_cost_price), it fails to explicitly differentiate when to use 'cost per use' versus price-per-unit metrics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like calculate_unit_price or calculate_break_even. There are no prerequisites, constraints, or scenarios mentioned to help the agent select this over sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_cost_price (Grade B)
Calculate unit cost price from raw materials, labor, and overhead
| Name | Required | Description | Default |
|---|---|---|---|
| labor | Yes | Labor cost | |
| overhead | Yes | Overhead/indirect costs | |
| quantity | Yes | Number of units produced | |
| raw_materials | Yes | Raw material cost |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to explain the calculation formula (sum of costs divided by quantity), the return value format, or whether the calculation includes any standard adjustments. The word 'Calculate' implies a deterministic computation but specifics are missing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
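A minimal Python sketch of the formula the review infers (sum of the three cost inputs divided by quantity); the server's actual rounding and validation behavior are undocumented:

```python
def unit_cost_price(raw_materials: float, labor: float,
                    overhead: float, quantity: int) -> float:
    """Presumed formula: (raw_materials + labor + overhead) / quantity."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return (raw_materials + labor + overhead) / quantity
```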
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundant words. The action verb leads the description, followed by the output type and inputs, making it appropriately front-loaded for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (four flat numeric parameters) and complete schema coverage, the description is minimally adequate. However, it could be improved by mentioning the quantity parameter, indicating the output is a per-unit monetary value, or noting that this applies to manufacturing/production contexts.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions three of the four parameters (omitting quantity) and adds minimal semantic context beyond the schema. While 'unit cost' implies the division by quantity, the description does not explicitly clarify this relationship or add format guidance beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool's purpose (calculating unit cost price) and specifies the three primary cost inputs (raw materials, labor, overhead), which distinguishes it from sibling calculators like calculate_cost_per_use or calculate_hourly_cost. However, it lacks explicit differentiation regarding when to choose this over similar manufacturing cost tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites (such as having total costs ready), or specific use cases. There is no mention of when not to use it or what business context it's designed for.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_crop_factor (Grade C)
Calculate camera crop factor and equivalent focal length based on sensor width
| Name | Required | Description | Default |
|---|---|---|---|
| sensor_width_mm | Yes | Camera sensor width in millimeters (full frame = 36mm) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure, yet it fails to explain the return format (does it return a ratio, a multiplier, a focal length value?), whether it assumes a 35mm full-frame reference standard (implied by 'full frame = 36mm' in schema but not description), or what mathematical operation is performed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no filler words. Every element earns its place: the action verb, the dual outputs (crop factor/equivalent focal length), and the input dependency (sensor width). Front-loaded and appropriately sized for a single-parameter calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite low complexity (1 parameter, simple calculation), there is a significant gap between the description's claim of calculating 'equivalent focal length' and the schema's provision of only sensor width. Without an output schema or explanation of how focal length is derived (e.g., assuming 50mm standard, returning a formula), the description leaves the agent uncertain about actual return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
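The ambiguity is easiest to see in code. The crop factor itself follows from the schema's full-frame hint, but an equivalent focal length also needs a lens focal length the schema never accepts; `equivalent_focal_length` below is a hypothetical helper, included only to illustrate the missing input:

```python
FULL_FRAME_WIDTH_MM = 36.0  # full-frame reference, per the schema hint

def crop_factor(sensor_width_mm: float) -> float:
    """Presumed formula: full-frame width divided by sensor width."""
    return FULL_FRAME_WIDTH_MM / sensor_width_mm

def equivalent_focal_length(focal_length_mm: float,
                            sensor_width_mm: float) -> float:
    """Hypothetical: requires a focal length the tool's schema
    does not accept, which is exactly the documented ambiguity."""
    return focal_length_mm * crop_factor(sensor_width_mm)
```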
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, documenting the single parameter with units and context (full frame reference). The description adds no additional syntax details or format requirements beyond what the schema already provides, meeting the baseline expectation for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and identifies the domain (camera crop factor), but it claims to calculate 'equivalent focal length' based solely on 'sensor width'—a calculation that typically requires both sensor dimensions and an input focal length value. Given the schema only provides sensor_width_mm, this creates ambiguity about what the tool actually outputs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this versus sibling photography tools like calculate_depth_of_field or calculate_exposure_triangle, nor when to prefer it over manual calculation. No prerequisites or input constraints beyond the schema are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_crypto_profit_loss (Grade A)
Calculate profit or loss on a cryptocurrency trade including trading fees
| Name | Required | Description | Default |
|---|---|---|---|
| quantity | Yes | Quantity of cryptocurrency traded | |
| buy_price | Yes | Purchase price per unit in fiat currency | |
| sell_price | Yes | Sale price per unit in fiat currency | |
| buy_fee_pct | No | Buy transaction fee percentage (default 0.1%) | |
| sell_fee_pct | No | Sell transaction fee percentage (default 0.1%) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It adds valuable context that trading fees are factored into calculations ('including trading fees'), but omits critical behavioral details like return format (absolute vs percentage), whether it returns gross vs net values, or fee application methodology (subtracted from proceeds vs added to cost basis).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
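One plausible fee treatment, sketched in Python; whether the server adds fees to the cost basis, subtracts them from proceeds, or both is exactly what the description leaves open:

```python
def crypto_profit_loss(quantity: float, buy_price: float, sell_price: float,
                       buy_fee_pct: float = 0.1,
                       sell_fee_pct: float = 0.1) -> float:
    """Assumed treatment: buy fee inflates cost basis,
    sell fee reduces proceeds. Returns net profit (or loss)."""
    cost = quantity * buy_price * (1 + buy_fee_pct / 100)
    proceeds = quantity * sell_price * (1 - sell_fee_pct / 100)
    return proceeds - cost
```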
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Information is front-loaded with the core action and domain immediately clear. No redundancy with structured data fields.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters with complete schema coverage and no output schema, the description adequately explains the calculation purpose but gaps remain. It should ideally describe what the tool returns (profit amount, percentage, fee breakdown) to compensate for missing output schema, especially given financial calculation context where return format is critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing detailed descriptions for all 5 parameters. The tool description implies the semantic relationship between parameters (buy_price, sell_price, quantity, fees) but does not add syntax details, validation rules, or format guidance beyond what the schema already provides. Baseline 3 is appropriate for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Calculate' with clear resource 'profit or loss on a cryptocurrency trade' and scope 'including trading fees'. It effectively distinguishes from siblings like calculate_crypto_tax_fr (tax-specific), calculate_mining_profitability (mining operations), and calculate_impermanent_loss (DeFi liquidity provision) by specifying standard trading P&L with fees.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like calculate_crypto_tax_fr or calculate_capital_gains_property. No prerequisites mentioned (e.g., requiring completed buy and sell transactions) or exclusions for incomplete trades.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_crypto_tax_fr (Grade A)
Calculate French flat tax (30% PFU) on cryptocurrency capital gains at withdrawal
| Name | Required | Description | Default |
|---|---|---|---|
| total_gains_eur | Yes | Total unrealized gains in the portfolio in EUR | |
| withdrawal_amount_eur | Yes | Amount being withdrawn/sold in EUR | |
| total_portfolio_value_eur | Yes | Total current portfolio value in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It successfully communicates the specific 30% PFU tax rate and French jurisdiction—critical behavioral context for tax calculation. However, it omits output format details, whether this is a simulation vs. official calculation, and any legal disclaimers typical for tax tools.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single 12-word sentence with zero waste. Front-loads all critical identifiers: jurisdiction (French), tax type (flat tax/PFU), rate (30%), domain (cryptocurrency), and action context (withdrawal). Maximum information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and no output schema provided, the description adequately covers the input scope and tax type. However, given the complexity of tax law and absence of output schema, it should ideally indicate that it returns estimated tax liability in EUR to fully prepare the agent for result handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
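A sketch of the proportional method the three parameters suggest (prorating the portfolio's overall gain by the withdrawn share, then applying the 30% flat rate); this is an inference from the schema, not documented server behavior, and real French tax treatment has additional rules a tool like this would simplify away:

```python
PFU_RATE = 0.30  # 12.8% income tax + 17.2% social contributions

def crypto_tax_fr(total_gains_eur: float,
                  withdrawal_amount_eur: float,
                  total_portfolio_value_eur: float) -> float:
    """Presumed method: taxable gain is the overall gain prorated
    by the share of the portfolio being withdrawn."""
    share = withdrawal_amount_eur / total_portfolio_value_eur
    taxable_gain = total_gains_eur * share
    return PFU_RATE * taxable_gain
```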
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter definitions already provided in the schema. The description adds contextual linkage by mentioning 'withdrawal' which aligns with the withdrawal_amount_eur parameter, but adds no additional syntax or format guidance beyond the schema baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description specifies exact verb (Calculate), jurisdiction (French), specific tax regime (flat tax/PFU at 30%), asset class (cryptocurrency), and trigger event (withdrawal). Clearly distinguishes from siblings like calculate_crypto_profit_loss (general P&L tracking) and calculate_french_income_tax (general income tax).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'at withdrawal' provides implicit contextual guidance on when to use this versus general portfolio tracking tools. However, it lacks explicit when-not guidance or direct comparison to calculate_crypto_profit_loss for non-taxable gain calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_currency_cross_rate (Grade A)
Calculate cross exchange rate between two currencies via USD
| Name | Required | Description | Default |
|---|---|---|---|
| rate_a_usd | Yes | Units of currency A per 1 USD | |
| rate_b_usd | Yes | Units of currency B per 1 USD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the calculation methodology ('via USD'), indicating this performs triangular cross-rate computation. However, it lacks details on output format, precision, or behavior when inputs approach zero (despite schema validation).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero redundancy. Every word earns its place—'cross' indicates indirect calculation, 'via USD' indicates the triangular method.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriately complete for a simple 2-parameter calculation tool. Minor gap: could explicitly state that it returns the derived cross-rate value (units of B per 1 A or vice versa), though this is somewhat implied by the name.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (rate_a_usd and rate_b_usd well-documented), establishing baseline 3. The description adds valuable semantic context with 'via USD', clarifying that inputs should be exchange rates against USD, not direct rates between A and B.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
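The triangular method implied by 'via USD' reduces to one division. A sketch, assuming the result is expressed as units of B per unit of A (the direction is not documented):

```python
def cross_rate(rate_a_usd: float, rate_b_usd: float) -> float:
    """Units of currency B per 1 unit of currency A, derived via
    USD: (B per USD) / (A per USD). Direction is an assumption."""
    return rate_b_usd / rate_a_usd
```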
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific action (Calculate) and resource (cross exchange rate) with methodology (via USD). However, it fails to distinguish from sibling tool 'calculate_currency_exchange' which likely handles direct exchange calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus 'calculate_currency_exchange' or other currency calculators. No mention of prerequisites (needing both USD rates) or when this triangular method is preferable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_currency_exchange (Grade B)
Calculate currency exchange with bank margin and show fees lost
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to exchange in source currency | |
| to_rate | Yes | Target currency rate vs USD (e.g. JPY=150) | |
| from_rate | Yes | Source currency rate vs USD (e.g. EUR=1.08) | |
| bank_margin_pct | No | Bank/exchange margin percentage (default 2.5%) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully indicates the tool reveals 'fees lost' beyond simple conversion, which is valuable behavioral context. However, it lacks disclosure about output structure, whether the calculation is deterministic (read-only), or if there are any rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at ten words with zero redundancy. Key information is front-loaded with the action verb 'Calculate', and every word contributes to understanding the tool's specific value proposition (bank margin + fee disclosure).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and no output schema, the description is minimally adequate. It hints at output content ('show fees lost') but remains silent on the USD-baseline assumption mentioned in parameter descriptions and the calculation methodology. It meets the threshold for use but leaves significant gaps for an agent seeking to understand the full utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'bank margin' which loosely maps to the bank_margin_pct parameter, but adds no additional semantic context regarding the USD-based rate expectation (implied only in schema descriptions) or the significance of the default 2.5% margin.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
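A sketch of one plausible reading, assuming both rates are quoted as units of currency per USD; the EUR=1.08 example in the schema suggests the server may instead use conventional market quotes, which is precisely the ambiguity the description fails to resolve:

```python
def currency_exchange(amount: float, from_rate: float, to_rate: float,
                      bank_margin_pct: float = 2.5) -> tuple[float, float]:
    """Presumed method: convert via USD at the mid-market cross
    rate, then deduct the bank margin from the received amount.
    Returns (amount_received, fees_lost) in the target currency."""
    mid_market = amount / from_rate * to_rate
    received = mid_market * (1 - bank_margin_pct / 100)
    return received, mid_market - received
```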
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate', 'show') and clearly identifies the resource (currency exchange) with specific scope elements ('bank margin', 'fees lost'). While it implies distinction from siblings like calculate_exchange_rate_margin (which likely calculates only the margin), it does not explicitly clarify when to use this tool versus calculate_currency_cross_rate or other exchange-related siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus the numerous sibling currency calculators (e.g., calculate_currency_cross_rate, calculate_exchange_rate_margin, calculate_exchange_margin). It omits prerequisites such as required rate formats or when this calculation is preferred over simple exchange.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_curtain_fabric (Grade C)
Calculate fabric needed for curtains
| Name | Required | Description | Default |
|---|---|---|---|
| num_panels | No | Number of curtain panels | |
| fullness_ratio | No | Fullness ratio (2 = double fullness) | |
| window_width_cm | Yes | Window width cm | |
| window_height_cm | Yes | Window height cm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose output format, units (meters vs yards), whether results include seam allowances, or if the calculation is read-only (safe to retry).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 5 words, but given lack of annotations and output schema, this brevity leaves critical gaps. Appropriately front-loaded but undersized for the information burden it must carry.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete for a 4-parameter calculation tool with no annotations or output schema. Does not explain return value format, units, or calculation methodology (e.g., whether it accounts for pattern matching).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions, establishing baseline. Description adds no semantic context beyond schema (e.g., how fullness_ratio interacts with window width or typical values for curtains).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
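A sketch of the fullness/width interaction the schema leaves implicit; the per-panel split is an assumption, and hems, headers, and pattern repeats (which a real calculation would add) are deliberately omitted:

```python
def curtain_fabric_width_cm(window_width_cm: float,
                            fullness_ratio: float = 2.0,
                            num_panels: int = 2) -> float:
    """Presumed relationship: total fabric width is the window
    width times the fullness ratio, split evenly across panels.
    Returns the fabric width per panel in cm (assumed unit)."""
    total_width = window_width_cm * fullness_ratio
    return total_width / num_panels
```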
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb and resource ('Calculate fabric needed for curtains') but fails to distinguish from siblings calculate_fabric_needed, calculate_fabric_yardage, or calculate_curtain_width. Also vague on what 'needed' means (length, area, cost?) or output units.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no when/when-not guidance, prerequisites (e.g., needing window measurements), or alternatives. Agent cannot determine when to use this versus calculate_fabric_yardage or calculate_curtain_width.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_curtain_width (Grade C)
Calculate curtain width by fullness
| Name | Required | Description | Default |
|---|---|---|---|
| fullness | No | Fullness | standard |
| window_cm | Yes | Window width cm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose what the calculation actually does (e.g., applies multiplication factors based on fullness level), what units are returned, or whether it returns the total width or per-panel width.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
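A sketch of the presumed multiplier scheme; the multiplier values and any enum members beyond 'standard' are illustrative assumptions, since the server documents neither:

```python
# Illustrative fullness multipliers; the server's actual mapping
# for its 'fullness' enum is undocumented.
FULLNESS_MULTIPLIER = {"standard": 2.0, "deluxe": 2.5, "sheer": 3.0}

def curtain_width(window_cm: float, fullness: str = "standard") -> float:
    """Presumed formula: window width times a fullness multiplier.
    Returns total curtain width in cm (assumed unit)."""
    return window_cm * FULLNESS_MULTIPLIER[fullness]
```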
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 5 words with verb-first structure. No wasted words, but the extreme brevity contributes to the lack of completeness. Appropriate density if other dimensions were richer.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the calculate_curtain_fabric sibling and lack of output schema, the description should explain what distinguishes this tool (returns width measurements vs fabric requirements) and what the return value represents (e.g., total curtain width in cm). Currently insufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (window_cm: 'Window width cm', fullness: 'Fullness'), establishing baseline 3. The description mentions 'fullness', which reinforces the enum parameter, but adds no behavioral context about what window_cm represents or how the parameters interact.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly identifies the action (calculate), resource (curtain width), and key method (by fullness). However, 'by fullness' assumes domain knowledge and doesn't explicitly distinguish from the sibling calculate_curtain_fabric tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus calculate_curtain_fabric or other related tools. No mention of prerequisites, such as measuring window width first, or when not to use this tool.
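The 'multiplication factors based on fullness level' that the review infers can be made concrete. A minimal Python sketch of the presumed behavior; the multiplier values and the total-width return convention are assumptions, not the server's confirmed values:

```python
# Hypothetical fullness multipliers -- illustrative, not the server's actual table.
FULLNESS_FACTORS = {"standard": 2.0, "deluxe": 2.5, "sheer": 3.0}

def calculate_curtain_width(window_cm: float, fullness: str = "standard") -> float:
    """Return the total curtain width in cm for a given window width."""
    return window_cm * FULLNESS_FACTORS[fullness]
```

For a 120 cm window at standard fullness this sketch yields 240 cm of total width.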
calculate_cycling_power (Grade B)
Estimate cycling power output considering gradient, speed and total mass
| Name | Required | Description | Default |
|---|---|---|---|
| speed_kmh | Yes | Speed in km/h | |
| weight_kg | Yes | Rider weight in kilograms | |
| gradient_pct | Yes | Road gradient in percent (positive = uphill) | |
| bike_weight_kg | No | Bike weight in kilograms | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. The term 'Estimate' properly sets expectation that this is an approximation, but lacks disclosure of calculation method, output units (watts), idempotency, or precision limitations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Front-loaded with verb and subject, every word earns its place. Appropriate length for a straightforward 4-parameter calculation tool.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate but incomplete. With complete input schema coverage and no output schema, the description should ideally specify output units (watts) and confirm it returns mechanical power. As-is, the agent knows the inputs but not the output format or the precise calculation scope.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% establishing baseline 3. Description adds valuable semantic context by referring to 'total mass', implying the relationship between weight_kg and bike_weight_kg parameters (that they sum to total system mass), which aids agent understanding beyond raw schema definitions.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Estimate' and resource 'cycling power output'. Clearly identifies domain (cycling) and key physics inputs (gradient, speed, mass) which distinguishes it from generic power calculators like calculate_electrical_power or calculate_power_unit_convert among siblings.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no explicit guidance on when to use versus alternatives (e.g., calculate_force for general physics) or prerequisites. Usage is purely implied by the parameter list mentioned in the description.
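The physics behind such an estimate is the standard road-cycling power model: (gravity + rolling resistance + aerodynamic drag) times velocity. A hedged sketch; the rolling-resistance, drag-area, and air-density constants and the 10 kg default bike weight are illustrative assumptions, not server values:

```python
import math

def estimate_cycling_power(speed_kmh: float, weight_kg: float, gradient_pct: float,
                           bike_weight_kg: float = 10.0,
                           crr: float = 0.005, cda: float = 0.32,
                           rho: float = 1.225) -> float:
    """Estimate mechanical power in watts for steady riding."""
    v = speed_kmh / 3.6                       # speed in m/s
    mass = weight_kg + bike_weight_kg         # the 'total mass' the description implies
    g = 9.81
    angle = math.atan(gradient_pct / 100.0)
    f_gravity = mass * g * math.sin(angle)    # climbing force (negative downhill)
    f_rolling = mass * g * crr * math.cos(angle)
    f_aero = 0.5 * rho * cda * v ** 2
    return (f_gravity + f_rolling + f_aero) * v
```

A 70 kg rider at 30 km/h on the flat comes out around 150 W under these constants; any positive gradient raises the figure.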
calculate_cylinder (Grade C)
Cylinder volume and surface area
| Name | Required | Description | Default |
|---|---|---|---|
| height | Yes | Height | |
| radius | Yes | Radius | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full disclosure burden. It fails to indicate whether the tool returns volume, surface area, or both values; the output format; or whether the calculation is purely mathematical with no side effects. Only slightly better than complete absence by identifying the mathematical domain.
Is the description appropriately sized, front-loaded, and free of redundancy?
The five-word fragment is terse to the point of underspecification. While not verbose, it lacks the content needed to count as efficiently concise: every word is present, but too little information is conveyed for a tool with mathematical output.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple two-parameter tool with no output schema requires description to explain return values, yet it omits what the tool actually produces. Missing critical context for an agent to know whether to expect a scalar, object with 'volume' and 'surface_area' keys, or something else.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both radius and height have descriptions, albeit minimal ones: 'Radius' and 'Height'). The description adds no parameter-specific semantics, but with full schema coverage, the baseline score of 3 applies.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Cylinder volume and surface area' is a noun phrase without a specific verb (e.g., 'Calculate'). It minimally identifies the domain and distinguishes from siblings like calculate_cone or calculate_sphere, but reads as a tautology restating the tool name's implied function.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific geometry tool versus other shape calculators (calculate_cone, calculate_sphere) or generic calculate_volume. No prerequisites or alternative selection criteria mentioned.
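The review's central complaint is the unknown return shape. One plausible implementation, assuming the tool returns both figures as a keyed object (the key names here are guesses):

```python
import math

def calculate_cylinder(radius: float, height: float) -> dict:
    """Return volume and total surface area for a right circular cylinder."""
    return {
        "volume": math.pi * radius ** 2 * height,
        "surface_area": 2 * math.pi * radius * (radius + height),
    }
```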
calculate_daily_protein (Grade A)
Calculate recommended daily protein intake based on weight and fitness goal
| Name | Required | Description | Default |
|---|---|---|---|
| goal | Yes | Fitness goal | |
| weight_kg | Yes | Body weight in kilograms | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, and description fails to disclose calculation methodology (grams per kg ratios), output units (grams), or whether results are estimates. Carries full burden but omits behavioral specifics.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Information is front-loaded with the action verb and clearly structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 2-parameter calculator with no nested objects. Minor gap: lacking output units (grams) since no output schema exists, but sufficient for tool selection.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete parameter descriptions. Description mentions 'weight and fitness goal' confirming the schema semantics but adds no additional syntax guidance, format details, or constraints beyond what the schema already provides.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb 'Calculate' with specific resource 'recommended daily protein intake'. Scope 'based on weight and fitness goal' precisely distinguishes from siblings like calculate_bmr, calculate_calories_burned, and calculate_daily_vitamins.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implicit usage is clear from the specific domain (protein calculation), but lacks explicit when-to-use guidance or named alternatives for users who might need calculate_daily_vitamins or calculate_tdee instead.
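The 'grams per kg ratios' the review mentions are typically keyed by goal. A sketch with illustrative ratios; the goal names and multipliers are assumptions, not the server's enum or values:

```python
# Hypothetical goal-to-ratio table (grams of protein per kg of body weight).
PROTEIN_G_PER_KG = {"maintain": 0.8, "muscle_gain": 1.8, "fat_loss": 2.0}

def calculate_daily_protein(weight_kg: float, goal: str) -> float:
    """Return the recommended daily protein intake in grams."""
    return round(weight_kg * PROTEIN_G_PER_KG[goal], 1)
```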
calculate_daily_vitamins (Grade B)
Return recommended daily allowances (RDA) for key vitamins and minerals by age and sex
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | | |
| sex | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. While 'Return' implies read-only, it lacks details on what specific vitamins/minerals are included, data sources/standards, regional applicability, response format, or error handling for edge cases.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, well-structured sentence that is appropriately front-loaded with the action verb. No redundant words; every element contributes to understanding the tool's function.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimal but adequate for a simple lookup tool. However, given zero annotations and no output schema, the description should elaborate on return value structure or list covered nutrients to be fully complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description must compensate. It mentions 'age and sex' mapping to the two parameters, explaining their purpose (determining RDA), but omits units for age (years) and valid values for sex despite the enum constraint.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Return' with clear resource 'recommended daily allowances (RDA) for key vitamins and minerals' and filtering scope 'by age and sex'. It distinguishes from siblings like calculate_daily_protein by specifying vitamins/minerals, though it could clarify what 'key' encompasses or the standard used (FDA/WHO).
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use guidance or alternatives mentioned. Given numerous sibling health calculators (calculate_daily_protein, calculate_bmi, calculate_bmr), the description fails to guide selection between this and other nutritional tools.
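A static lookup table is the most likely shape for an RDA tool. The sketch below uses two commonly cited US NIH adult values purely as illustration; the server's nutrient list, standards body, and age brackets are unknown:

```python
# Illustrative adult RDA values (commonly cited US NIH figures); the server's
# table, nutrient coverage, and age brackets may differ.
RDA_TABLE = {
    ("male", "adult"): {"vitamin_c_mg": 90, "iron_mg": 8},
    ("female", "adult"): {"vitamin_c_mg": 75, "iron_mg": 18},
}

def calculate_daily_vitamins(age: int, sex: str) -> dict:
    """Return RDAs keyed by sex and age bracket (adults only in this sketch)."""
    bracket = "adult" if age >= 19 else "child"  # child rows omitted here
    return RDA_TABLE[(sex, bracket)]
```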
calculate_data_transfer_time (Grade B)
Calculate file transfer time at a given connection speed
| Name | Required | Description | Default |
|---|---|---|---|
| speed_mbps | Yes | Connection speed in Mbps | |
| file_size_gb | Yes | File size in GB | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to specify the output unit (seconds, minutes, hours), whether the calculation assumes theoretical maximum speed or accounts for network overhead, or the return format. It only states that a calculation occurs, lacking transparency into the computation's assumptions or result schema.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single, efficient sentence with no redundant words. It is appropriately front-loaded with the action verb and contains zero waste, making it easy to scan.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has only 2 simple numeric parameters with 100% schema coverage and no output schema, the description is minimally viable. However, it lacks critical contextual information such as the units of the returned time value and whether the result includes overhead calculations, which are necessary for a user to interpret results correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('File size in GB', 'Connection speed in Mbps'), the schema already adequately documents the parameters. The description reinforces the relationship between parameters and purpose but does not add syntax details, examples, or constraints beyond what the schema's 'exclusiveMinimum' and types already provide. Baseline 3 is appropriate when schema coverage is high.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource being computed ('file transfer time') and the key input ('connection speed'). While it implies the networking domain, it does not explicitly differentiate from the sibling tool 'calculate_speed_distance_time' or clarify that this targets data/network transfers specifically versus generic physics calculations.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites (e.g., having both file size and speed values ready) or expected input formats. Given the large number of sibling calculation tools, exclusion criteria or usage context would be valuable but are absent.
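The gaps the review lists (output unit, overhead assumptions) hinge on one unit conversion: gigabytes in, megabits per second in. A sketch of the theoretical-maximum calculation, assuming decimal gigabytes, seconds as the output unit, and no protocol overhead:

```python
def calculate_data_transfer_time(file_size_gb: float, speed_mbps: float) -> float:
    """Return the theoretical transfer time in seconds at full line rate."""
    megabits = file_size_gb * 8_000   # 1 decimal GB = 8,000 megabits
    return megabits / speed_mbps
```

1 GB at 100 Mbps gives 80 seconds; real transfers run longer once TCP and protocol overhead are counted.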
calculate_day_of_week (Grade C)
Find the day of the week for any date
| Name | Required | Description | Default |
|---|---|---|---|
| date | Yes | Date in YYYY-MM-DD format | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of behavioral disclosure. It fails to indicate the return format (string name vs. integer), locale handling, or whether the operation is read-only/safe, leaving significant behavioral gaps.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 9 words with no redundant phrases. The single sentence efficiently conveys core intent without waste, though its brevity contributes to the informational gaps.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacking both annotations and output schema, the description omits critical details like return value format (e.g., 'Monday' vs '1'), supported date ranges, or error behaviors. For a calculation tool with many siblings, this context is necessary to ensure correct invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (date parameter fully documented), establishing a baseline of 3. The description adds no additional parameter semantics, examples, or format clarification beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Find') and resource ('day of the week'), establishing a specific computational purpose. However, it does not explicitly differentiate from date-related siblings like calculate_days_between or calculate_time_difference.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus other date/calculation siblings (e.g., calculate_days_between). No preconditions, formats, or alternatives are mentioned.
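The 'string name vs. integer' ambiguity is easy to make concrete. A sketch assuming the tool returns an English weekday name:

```python
import datetime

def calculate_day_of_week(date: str) -> str:
    """Return the English weekday name for a YYYY-MM-DD date string."""
    return datetime.date.fromisoformat(date).strftime("%A")
```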
calculate_days_between (Grade B)
Calculate days, weeks, approximate months and working days between two dates
| Name | Required | Description | Default |
|---|---|---|---|
| end_date | Yes | YYYY-MM-DD — End date | |
| start_date | Yes | YYYY-MM-DD — Start date | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Mentions 'approximate months' which hints at estimation methodology, but omits critical behavioral details: working day definition (which locale/holidays?), whether date boundaries are inclusive/exclusive, and return structure (object with multiple fields vs single value). With no annotations and no output schema, the description carries full disclosure burden but leaves significant gaps.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb, no redundant words. Every element earns its place: action (Calculate), specific outputs (days/weeks/months/working days), and scope (between two dates). No structural waste despite missing information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter calculation tool with simple inputs, but incomplete given lack of output schema. Should specify what 'working days' means (weekdays only? holidays excluded?) and whether it returns multiple values or requires a mode selection. Sufficient for basic invocation but leaves operational ambiguity.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear YYYY-MM-DD format descriptions for both parameters. Description refers to 'two dates' which aligns with parameters but adds no additional semantic context (e.g., validation rules, ordering constraints). Baseline 3 appropriate given complete schema documentation.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific outputs (days, weeks, approximate months, working days) and scope (between two dates) with clear verb 'Calculate'. However, fails to differentiate from sibling tools like 'calculate_working_days' or 'calculate_time_difference' which suggests overlap without guidance on which to choose.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this multi-output tool versus specific siblings like 'calculate_working_days' or generic tools like 'calculate_time_difference'. Does not mention prerequisites (e.g., start_date must precede end_date) or locale requirements for working day calculations.
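A sketch of the multi-metric return the description promises. The working-day rule (Monday to Friday, end date exclusive, no public holidays) and the 30.44-day month approximation are assumptions the server may not share:

```python
import datetime

def calculate_days_between(start_date: str, end_date: str) -> dict:
    """Return day, week, approximate-month, and working-day counts."""
    start = datetime.date.fromisoformat(start_date)
    end = datetime.date.fromisoformat(end_date)
    days = (end - start).days
    working = sum(
        1 for offset in range(days)
        if (start + datetime.timedelta(days=offset)).weekday() < 5  # Mon-Fri only
    )
    return {
        "days": days,
        "weeks": days / 7,
        "months_approx": days / 30.44,   # mean Gregorian month length
        "working_days": working,
    }
```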
calculate_debt_capacity (Grade A)
Calculate maximum loan capacity using French HCSF 35% debt ratio rule
| Name | Required | Description | Default |
|---|---|---|---|
| rate | No | Annual interest rate in % (default 3.5) | |
| duration_years | No | Loan duration in years (default 25) | |
| existing_debts | No | Existing monthly debt payments in EUR (default 0) | |
| monthly_income | Yes | Net monthly income in EUR | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the specific 35% debt ratio calculation method, which is crucial behavioral context. However, it omits safety properties (idempotent/pure function), output units (EUR), or whether results include insurance/fees.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 12-word sentence with zero redundancy. Front-loaded with action verb. Every element (French, HCSF, 35%, debt ratio) adds vital distinguishing information among numerous financial calculator siblings.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a calculation tool with 100% schema coverage and no output schema. Specifies the calculation methodology and domain. Minor gap: doesn't specify return value format (currency, max amount vs breakdown).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. The description mentions the 35% rule which contextualizes how existing_debts and monthly_income interact, but adds no explicit parameter relationships or format details beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate), resource (maximum loan capacity), and precise methodology (French HCSF 35% debt ratio rule). The explicit mention of 'French HCSF' clearly distinguishes this from sibling tools like calculate_debt_service_ratio or calculate_mortgage.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear domain context (French HCSF regulations) that implies when to use it, but lacks explicit 'when not to use' guidance or named alternatives. The regulatory specificity helps the agent select it for French lending scenarios.
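The HCSF rule caps total monthly debt service at 35% of net income; the maximum loan is then the principal whose annuity payment fits under that cap. A sketch under those assumptions, ignoring insurance and fees (which the review notes the description leaves undisclosed):

```python
def calculate_debt_capacity(monthly_income: float, existing_debts: float = 0.0,
                            rate: float = 3.5, duration_years: int = 25) -> float:
    """Return the maximum loan principal in EUR under a 35% debt-service cap."""
    max_payment = 0.35 * monthly_income - existing_debts
    if max_payment <= 0:
        return 0.0
    r = rate / 100 / 12                            # monthly interest rate
    n = duration_years * 12                        # number of monthly payments
    return max_payment * (1 - (1 + r) ** -n) / r   # annuity present value
```

With 3,000 EUR net income and no existing debts, the defaults give roughly 210,000 EUR of capacity.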
calculate_debt_service_ratio (Grade B)
Calculate debt-to-income ratio and maximum additional loan capacity
| Name | Required | Description | Default |
|---|---|---|---|
| monthly_debts | Yes | Existing monthly debt payments EUR | |
| monthly_income | Yes | Net monthly income EUR | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It reveals that the tool computes two distinct metrics (the ratio and additional capacity), which is useful behavioral context. However, it omits details about the return format, calculation assumptions (e.g., standard DTI thresholds), or whether the operation is stateless/idempotent.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero redundancy. Every word earns its place by identifying both the primary calculation (debt-to-income) and secondary output (loan capacity), delivering maximum information density in minimal space.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculator with two well-documented parameters and no output schema, the description adequately covers the tool's purpose by naming both calculated outputs. While an output schema would improve completeness, the description is sufficient for an AI to select and invoke the tool given the high schema coverage and straightforward calculation domain.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Net monthly income EUR', 'Existing monthly debt payments EUR'), so the structured data already defines parameter semantics. The description adds no supplementary parameter guidance, warranting the baseline score of 3 for high-coverage schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and identifies the resources being processed ('debt-to-income ratio' and 'maximum additional loan capacity'), clearly stating what the tool computes. However, given siblings like 'calculate_debt_to_income' and 'calculate_debt_capacity', it fails to differentiate when this specific combination tool should be preferred over the individual alternatives.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus its siblings (calculate_debt_to_income, calculate_debt_capacity), prerequisites for invocation, or expected use cases. Agents must infer applicability solely from the name and parameter schema without contextual cues.
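A sketch that makes both named outputs explicit. The 35% cap used to derive additional capacity is an assumption borrowed from common French practice, not a documented server constant, and the key names are guesses:

```python
def calculate_debt_service_ratio(monthly_income: float, monthly_debts: float,
                                 max_ratio: float = 0.35) -> dict:
    """Return the debt-to-income ratio and remaining monthly payment headroom."""
    ratio = monthly_debts / monthly_income
    headroom = max(0.0, max_ratio * monthly_income - monthly_debts)
    return {
        "debt_to_income_pct": round(ratio * 100, 1),
        "additional_capacity_eur": round(headroom, 2),
    }
```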
calculate_debt_to_income (Grade D)
Calculate debt-to-income ratio
| Name | Required | Description | Default |
|---|---|---|---|
| monthly_debt | Yes | Total monthly debt payments | |
| monthly_income | Yes | Gross monthly income | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided and description discloses nothing about side effects, return format (percentage vs decimal), validation errors, or calculation methodology (gross vs net income handling). Despite being a pure calculation function, the agent receives zero behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (four words), this is under-specification masquerading as conciseness. The single sentence squanders the description field's opportunity to provide context that would help an agent select this tool over its siblings.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 2-parameter structure with full schema coverage, a minimal description could suffice, but the presence of conceptually similar sibling calculators (debt_service_ratio, debt_capacity) creates an unmet responsibility to differentiate. The absence of an output schema increases the burden on the description to explain what 'ratio' means (e.g., 36% vs 0.36).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters fully documented ('Total monthly debt payments', 'Gross monthly income'). The description adds no semantic value beyond the schema, but baseline 3 is appropriate when schema carries the load.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Calculate debt-to-income ratio' is tautological—it merely restates the tool name with a verb. It fails to distinguish this from siblings like calculate_debt_service_ratio or calculate_debt_capacity, and doesn't explain what DTI measures or its significance (e.g., loan qualification metric).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus calculate_debt_service_ratio or calculate_loan_to_value. No mention of prerequisites, input validation requirements, or typical use cases (e.g., mortgage applications).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
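To illustrate the 36% vs 0.36 ambiguity flagged above, here is a minimal sketch with the choice made explicit (the `as_percentage` flag is hypothetical and not part of the tool's schema):

```python
def debt_to_income(monthly_debt: float, monthly_income: float,
                   as_percentage: bool = True) -> float:
    """DTI = monthly debt / gross monthly income. Returning 36.0 vs 0.36
    is exactly the ambiguity an undocumented return format creates."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    ratio = monthly_debt / monthly_income
    return ratio * 100 if as_percentage else ratio
```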
calculate_delivery_cost (Grade: B)
Estimate shipping cost from weight, distance and service (standard vs express)
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Service level | standard |
| weight_kg | Yes | Package weight kg | |
| distance_km | Yes | Delivery distance km | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure. It qualifies the operation as an 'estimate' which indicates the result is approximate, but it does not disclose whether the calculation is deterministic, if it requires external API calls, rate limits, or the expected return value structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of eleven words. It front-loads the verb 'Estimate' and immediately specifies the resource and required inputs without redundant phrases or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple three-parameter calculation tool without output schema, the description provides minimal viable coverage of inputs. However, it omits any indication of the return value format (currency, numeric value, object) and does not clarify the calculation scope or methodology.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters (weight_kg, distance_km, type) clearly documented in the schema. The description mentions 'weight, distance and service' which aligns with the schema, but adds no additional semantic context (e.g., valid ranges, format details) beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool estimates shipping costs using specific inputs (weight, distance, service level). It identifies the resource (shipping cost) and action (estimate) with specific parameters, though it does not explicitly differentiate from sibling tools like calculate_international_shipping or calculate_shipping_volumetric.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as calculate_international_shipping or calculate_shipping_volumetric. There are no stated prerequisites, exclusions, or conditions for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
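Because the server discloses neither its rates nor its formula, any model here is guesswork; the sketch below uses an invented base-plus-linear tariff purely to show the shape of the calculation. Every constant is hypothetical.

```python
def delivery_cost(weight_kg: float, distance_km: float,
                  service: str = "standard") -> float:
    """Illustrative only: base fee plus linear weight/distance charges,
    scaled by a service multiplier. All rates are invented."""
    multiplier = {"standard": 1.0, "express": 1.5}[service]
    base, per_kg, per_km = 5.0, 0.8, 0.05  # hypothetical tariff
    return round(multiplier * (base + per_kg * weight_kg
                               + per_km * distance_km), 2)
```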
calculate_density (Grade: C)
Calculate density, mass, or volume
| Name | Required | Description | Default |
|---|---|---|---|
| density | No | kg/m³ | |
| mass_kg | No | Mass kg | |
| volume_m3 | No | Volume m³ | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It fails to disclose that this tool computes the third variable using the physics formula ρ=m/V (requiring exactly two inputs), or what happens if all three are provided. No mention of validation behavior or output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The five-word description wastes no space and is appropriately front-loaded. However, it is underspecified for a tool with inverse calculation logic: conciseness becomes a liability when critical behavioral details are omitted.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter calculation tool with no output schema and no annotations, the description is incomplete. It omits the essential physics relationship, input requirements (exactly 2 of 3), units context, and return value description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (kg/m³, Mass kg, Volume m³), the baseline is 3. The description lists the three calculable entities (density, mass, volume) matching the parameters, but adds no semantic context about their relationship or which two are required to calculate the third.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool calculates 'density, mass, or volume' using a clear verb and resources. However, it fails to differentiate from siblings like calculate_volume or calculate_density_convert, and doesn't explain that it solves for the missing variable given two inputs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus calculate_volume, calculate_density_convert, or other physics calculators. Given that all three parameters are optional (0 required), the lack of usage constraints is problematic.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
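The 'exactly two of three' behavior the review infers can be made explicit in a few lines. This is a sketch under that assumption; the server's actual validation behavior is undocumented.

```python
def solve_density(density=None, mass_kg=None, volume_m3=None):
    """Solve rho = m / V for whichever variable is omitted; requires
    exactly two of the three inputs."""
    provided = sum(v is not None for v in (density, mass_kg, volume_m3))
    if provided != 2:
        raise ValueError("provide exactly two of density, mass_kg, volume_m3")
    if density is None:
        return mass_kg / volume_m3
    if mass_kg is None:
        return density * volume_m3
    return mass_kg / density  # solving for volume_m3
```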
calculate_density_convert (Grade: B)
Convert density between kg/m³, g/cm³, lb/ft³, oz/in³
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Density value | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully lists the supported density units, which constrains behavior, but omits other important details like return format, error handling for invalid unit combinations, or confirmation that this is a pure calculation without side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence of nine words with no filler. It is appropriately front-loaded with the action verb. It could earn a 5 by including sibling differentiation without sacrificing brevity, but as-is it is efficient and unambiguous.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple three-parameter conversion tool without output schema, the description covers the essential use case and supported units. However, it lacks the sibling differentiation needed to prevent confusion with 'calculate_density', which would be necessary for completeness given the tool's presence in a crowded calculator namespace.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the baseline is 3. The description adds significant value by mapping the schema's underscored enum values (kg_m3, g_cm3) to their proper physical unit representations (kg/m³, g/cm³), helping the agent interpret parameter semantics beyond the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool converts density between specific units (kg/m³, g/cm³, lb/ft³, oz/in³), providing specific verb and resource. However, it fails to differentiate from sibling tool 'calculate_density', which likely computes density from other inputs rather than converting between density units.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_density' or the generic 'convert' tools. It does not mention prerequisites, when not to use it, or that it requires physical density values rather than mass/volume inputs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
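The conversion itself pivots through a base unit. The factors below follow from the standard pound and inch definitions; the enum spellings (kg_m3, etc.) are taken from the review's reading of the schema and are assumptions here.

```python
# Factors to kg/m^3, derived from 1 lb = 0.45359237 kg and 1 in = 25.4 mm
TO_KG_M3 = {
    "kg_m3": 1.0,
    "g_cm3": 1000.0,
    "lb_ft3": 16.018463,
    "oz_in3": 1729.994044,
}

def convert_density(value: float, from_unit: str, to_unit: str) -> float:
    """Convert by normalizing to kg/m^3, then dividing out the target unit."""
    return value * TO_KG_M3[from_unit] / TO_KG_M3[to_unit]
```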
calculate_depth_of_field (Grade: B)
Calculate depth of field, near/far focus limits and hyperfocal distance for a camera lens
| Name | Required | Description | Default |
|---|---|---|---|
| aperture | Yes | Lens aperture (f-number, e.g. 2.8) | |
| distance_m | Yes | Subject distance in meters | |
| focal_length_mm | Yes | Lens focal length in millimeters | |
| sensor_width_mm | No | Camera sensor width in mm (default 36 for full frame) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. Description carries the burden by listing what calculations are performed (DOF, focus limits, hyperfocal), which hints at the return values. However, lacks details on precision, units of output, or calculation methodology (circle of confusion assumptions).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with the action and domain. No redundant words. Efficiently lists the three calculated outputs without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description compensates partially by enumerating the calculated concepts (near/far limits, hyperfocal distance). Could be improved by stating the return format or units, but adequate given the self-explanatory calculation domain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear units (mm, f-number, meters) and defaults (sensor_width_mm). Description does not add parameter syntax or validation details beyond the schema, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resources (depth of field, near/far focus limits, hyperfocal distance) for camera lenses. Clear and specific, but does not explicitly distinguish from sibling `calculate_hyperfocal_distance` which overlaps on one output.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to choose this comprehensive DOF tool versus the sibling `calculate_hyperfocal_distance` or when to use each. No prerequisites or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
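The optics here is standard even if the server's circle-of-confusion assumption is not disclosed. A sketch using the conventional 0.03 mm full-frame CoC; that constant is precisely the undocumented assumption the review flags.

```python
def hyperfocal_mm(focal_mm: float, aperture: float,
                  coc_mm: float = 0.03) -> float:
    """H = f^2 / (N * c) + f, with c the circle of confusion."""
    return focal_mm ** 2 / (aperture * coc_mm) + focal_mm

def dof_limits_m(focal_mm: float, aperture: float, distance_m: float,
                 coc_mm: float = 0.03) -> tuple[float, float]:
    """Near/far focus limits via the usual thin-lens approximations;
    far is infinite when the subject is at or beyond the hyperfocal."""
    h = hyperfocal_mm(focal_mm, aperture, coc_mm)
    s = distance_m * 1000.0  # work in mm
    near = h * s / (h + (s - focal_mm))
    far = h * s / (h - (s - focal_mm)) if s - focal_mm < h else float("inf")
    return near / 1000.0, far / 1000.0
```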
calculate_dew_point (Grade: C)
Dew point temperature
| Name | Required | Description | Default |
|---|---|---|---|
| temp_c | Yes | Temperature °C | |
| humidity_pct | Yes | Relative humidity % | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but offers nothing. It does not indicate whether this is a pure calculation, what formula/approximation is used, valid input ranges beyond the schema minimums, or what output format to expect (float, integer, object).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At three words, it is brief, but this represents under-specification rather than efficient communication. The single fragment fails to earn its place by communicating meaningful operational context beyond the tool name itself.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite being a simple two-parameter calculation tool, the description is insufficient given the absence of an output schema and annotations. It should at minimum state that it calculates the dew point from temperature and relative humidity, and indicate the return value type.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (temp_c and humidity_pct are both documented), the baseline score is 3. The description adds no additional semantics about the relationship between temperature and humidity or what dew point represents, but it doesn't need to compensate for schema gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Dew point temperature' is a noun phrase that restates the concept from the tool name (calculate_dew_point) without the action verb. It fails to specify what the tool actually does (calculates, converts, returns) or how it relates to the input parameters.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus meteorologically-related siblings like calculate_heat_index, calculate_humidity, or calculate_wind_chill. No prerequisites, constraints, or alternative tools are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
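For reference, dew point is typically computed with the Magnus approximation; whether this server uses it is unknown, which is the disclosure gap noted above. A sketch with one common set of Magnus-Tetens constants:

```python
import math

def dew_point_c(temp_c: float, humidity_pct: float) -> float:
    """Magnus approximation: Td = b*g / (a - g), where
    g = ln(RH/100) + a*T/(b+T). Constants a=17.27, b=237.7 are one
    common parameterization, assumed here."""
    a, b = 17.27, 237.7
    g = math.log(humidity_pct / 100.0) + a * temp_c / (b + temp_c)
    return b * g / (a - g)
```

At 100% humidity the dew point equals the air temperature, a useful sanity check for any implementation.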
calculate_dice_probability (Grade: B)
Calculate dice roll probability for exact values, minimum or maximum targets
| Name | Required | Description | Default |
|---|---|---|---|
| target | Yes | Target value to calculate probability for | |
| num_dice | Yes | Number of dice to roll | |
| num_sides | No | Number of sides on each die (default d6) | |
| comparison | Yes | Comparison type: exact match, at least target, or at most target | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to mention the return format (percentage vs decimal), validation limits (max 20 dice/100 sides), or safety characteristics (read-only calculation). 'Calculate' implies non-destructive behavior but does not explicitly confirm it.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It front-loads the action verb and immediately follows with the scope. However, extreme brevity comes at the cost of completeness given the lack of annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description is incomplete for a calculation tool. It fails to describe the output format, default dice behavior (d6), or error handling behavior that an agent would need to properly invoke and interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description maps 'minimum or maximum targets' to the comparison enum values, adding user-friendly context, but does not add significant semantic detail beyond what the schema already provides for the four parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('dice roll probability'), and clearly scopes the functionality to 'exact values, minimum or maximum targets'. It effectively distinguishes this from siblings like calculate_card_draw_probability or calculate_lottery_odds by specifying 'dice'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists the three specific comparison modes (exact, minimum, maximum), providing implied guidance on when to use the tool. However, it lacks explicit when-to-use/when-not-to-use guidance or references to alternative tools like calculate_probability_binomial.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
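The three comparison modes reduce to queries over the distribution of the dice sum, which can be built by dynamic programming. A sketch; the enum spellings like 'at_least' are guesses at the schema's actual values.

```python
def dice_probability(num_dice: int, num_sides: int, target: int,
                     comparison: str = "exact") -> float:
    """Exact probability computed over the full sum distribution."""
    counts = {0: 1}  # sum -> number of ways
    for _ in range(num_dice):
        nxt: dict[int, int] = {}
        for total, ways in counts.items():
            for face in range(1, num_sides + 1):
                nxt[total + face] = nxt.get(total + face, 0) + ways
        counts = nxt
    outcomes = num_sides ** num_dice
    if comparison == "exact":
        hits = counts.get(target, 0)
    elif comparison == "at_least":
        hits = sum(w for s, w in counts.items() if s >= target)
    else:  # "at_most"
        hits = sum(w for s, w in counts.items() if s <= target)
    return hits / outcomes
```

Note this returns a 0-1 probability rather than a percentage, which is itself the unstated return-format choice the review raises.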
calculate_dilution (Grade: C)
Dilution formula C1V1=C2V2
| Name | Required | Description | Default |
|---|---|---|---|
| c1 | No | Initial concentration | |
| c2 | No | Final concentration | |
| v1 | No | Initial volume | |
| v2 | No | Final volume | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, and description fails to disclose: which of the 4 variables gets calculated (since 0 parameters are required), validation constraints (e.g., positive values only), or what the output format looks like.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Only three words (essentially 'Dilution formula' plus the equation). While not verbose, this is under-specification rather than true conciseness: it lacks the structure needed to explain a four-variable calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, no annotations, and complex interdependent parameters, the description should explain the calculation logic, constraints, and return value. Currently inadequate for agent to invoke correctly without guessing which parameters to provide.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (all 4 parameters documented). Description adds only the formula notation C1V1=C2V2 which maps to parameter names, but adds no semantic detail beyond the schema's 'Initial/Final concentration/volume' labels.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the domain (dilution) and specific formula (C1V1=C2V2), but doesn't clarify what the tool calculates (which variable is solved for when 3 are provided) or how it differs from sibling 'calculate_solution_dilution'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus 'calculate_solution_dilution' or other concentration calculators. No mention of prerequisites or input requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
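The solve-for-the-missing-variable behavior implied by four optional parameters can be pinned down in a few lines. This is a sketch; the server's actual validation is unknown.

```python
def solve_dilution(c1=None, v1=None, c2=None, v2=None):
    """C1*V1 = C2*V2: given exactly three values, return the fourth."""
    given = {"c1": c1, "v1": v1, "c2": c2, "v2": v2}
    missing = [k for k, v in given.items() if v is None]
    if len(missing) != 1:
        raise ValueError("provide exactly three of c1, v1, c2, v2")
    k = missing[0]
    if k == "c1":
        return c2 * v2 / v1
    if k == "v1":
        return c2 * v2 / c1
    if k == "c2":
        return c1 * v1 / v2
    return c1 * v1 / c2  # solving for v2
```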
calculate_discount (Grade: B)
Calculate discounted price with optional successive discounts
| Name | Required | Description | Default |
|---|---|---|---|
| discount_pct | Yes | First discount percentage | |
| discount2_pct | No | Optional second successive discount | |
| original_price | Yes | Original price | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully indicates the tool handles successive (chained) discounts, but fails to disclose output format, rounding behavior, currency handling, or whether it returns just the final price or also intermediate values.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundancy. The key differentiator 'optional successive discounts' is included without waste, making it appropriately front-loaded for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculation tool with well-documented parameters, the description is minimally viable. It implies the output is the final discounted price, but given the lack of output schema, it should explicitly state the return value format and how successive discounts are applied (multiplicatively vs additively).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds minimal semantic value beyond the schema, though it emphasizes the 'successive' nature of the optional second discount which aligns with the discount2_pct parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the verb (Calculate) and resource (discounted price) and distinguishes this from simple percentage calculators by mentioning 'optional successive discounts'. However, it fails to differentiate from the sibling tool 'calculate_discount_effective', which likely calculates the effective discount rate rather than final price.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While 'optional successive discounts' implicitly suggests when to use the tool (for chained discounts), there is no explicit guidance on when to choose this over alternatives like 'calculate_discount_effective' or other pricing tools, and no 'when-not-to-use' guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
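The multiplicative-vs-additive ambiguity called out above is worth a concrete check. A sketch assuming the usual retail convention of multiplicative compounding:

```python
def discounted_price(original_price: float, discount_pct: float,
                     discount2_pct: float = 0.0) -> float:
    """Successive discounts compound multiplicatively here (assumed):
    20% then 10% off 100 gives 72, not the 70 additive stacking would."""
    price = original_price * (1 - discount_pct / 100.0)
    return price * (1 - discount2_pct / 100.0)
```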
calculate_discount_effective (Grade: C)
Effective discount with stacked discounts
| Name | Required | Description | Default |
|---|---|---|---|
| discount_1_pct | Yes | First discount % | |
| discount_2_pct | No | Second discount % | |
| original_price | Yes | Original price | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions 'stacked discounts' implying multiple discounts are combined, but fails to disclose HOW they stack (additive vs multiplicative), what 'effective' means (final price vs percentage), or what the return value represents.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at six words and front-loaded with the key concept ('Effective discount'). While efficient, the extreme brevity leaves semantic gaps that a sentence or two could fill.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter calculation tool with no output schema, the description inadequately explains the calculation methodology, return format, and distinction from similar tools. 'Effective' and 'stacked' terms lack definitional clarity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. Description adds semantic context 'stacked discounts' implying relationship between discount_1_pct and discount_2_pct, but does not fully explain interaction logic or calculation order.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool calculates 'effective discount with stacked discounts', which identifies the specific domain (discounts) and distinctive feature (stacking/effective rate). It implicitly distinguishes from sibling 'calculate_discount' by mentioning stacking, though it could be more explicit about the calculation purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus the sibling 'calculate_discount' tool, or under what conditions stacking applies. No mention of prerequisites or expected usage patterns.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
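As the assessment notes, 'stacked' is never defined. The conventional interpretation is sequential application, where the effective rate is less than the sum of the two rates. A minimal sketch under that assumption (parameter names taken from the schema; the server's actual logic is unverified):

```python
def effective_discount(discount_1_pct: float, discount_2_pct: float) -> float:
    """Effective single-discount equivalent of two sequentially applied discounts."""
    # Each discount multiplies the remaining price fraction.
    remaining = (1 - discount_1_pct / 100) * (1 - discount_2_pct / 100)
    return round((1 - remaining) * 100, 2)
```

Two stacked discounts of 20% and 10% yield 28%, not 30%, because the second discount applies to the already-reduced price.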
calculate_distance_2d (C)
Distance between two 2D points
| Name | Required | Description | Default |
|---|---|---|---|
| x1 | Yes | X1 | |
| x2 | Yes | X2 | |
| y1 | Yes | Y1 | |
| y2 | Yes | Y2 | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description carries the full burden without disclosing behavioral details such as the Euclidean formula used, expected units, return value type, or precision constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at only five words with no redundancy. However, the extreme brevity borders on under-specification for behavioral context, preventing a perfect score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple mathematical calculation with four numeric inputs, the description plus schema provides minimal viable context. However, lacking output schema and return value description leaves gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with minimal descriptions ('X1', 'Y1', etc.). The description adds semantic context by implying parameters form two coordinate pairs, but does not explain coordinate system expectations or units beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates distance between two 2D points, using specific resource terms. It distinguishes from the sibling 'calculate_distance_3d' by specifying '2D', though it could strengthen further by specifying 'Euclidean' distance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus alternatives like 'calculate_pythagoras' or 'calculate_distance_3d', nor any prerequisites or unit requirements mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
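The review suggests naming 'Euclidean' distance explicitly. For reference, the standard formula such a tool presumably implements (units are whatever the caller's coordinates are in):

```python
import math

def calculate_distance_2d(x1: float, y1: float, x2: float, y2: float) -> float:
    """Euclidean distance between points (x1, y1) and (x2, y2)."""
    return math.hypot(x2 - x1, y2 - y1)
```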
calculate_distance_3d (C)
Distance between two 3D points
| Name | Required | Description | Default |
|---|---|---|---|
| x1 | Yes | | |
| x2 | Yes | | |
| y1 | Yes | | |
| y2 | Yes | | |
| z1 | Yes | | |
| z2 | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to specify the distance formula used (Euclidean vs. Manhattan), the return value units, or that this is a pure read-only calculation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The five-word description is efficiently structured with no redundancy. However, given the complete lack of schema descriptions and annotations, it may be overly terse rather than appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a pure mathematical function with no output schema, the description adequately conveys the basic operation but lacks completeness regarding the calculation method (Euclidean distance) and expected return format. The high parameter count with zero documentation in schema limits completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% with six undocumented numeric parameters. The description partially compensates by mentioning 'two 3D points', which explains the semantic grouping of parameters (x1,y1,z1 as first point, x2,y2,z2 as second). However, it provides no details on coordinate systems, units, or value constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Distance between two 3D points' clearly identifies the core operation and resource (3D points). It distinguishes itself from sibling calculate_distance_2d by explicitly mentioning '3D'. However, it uses a noun phrase rather than an action verb (e.g., 'Calculate'), which slightly reduces precision.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided for when to use this tool versus alternatives (like calculate_distance_2d). While the '3D' qualifier implies use for spatial coordinates with z-values, there are no explicit when-to-use or when-not-to-use exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
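The 3D case extends the same Euclidean formula with a z term — presumably what this tool computes, though the description never says so:

```python
import math

def calculate_distance_3d(x1: float, y1: float, z1: float,
                          x2: float, y2: float, z2: float) -> float:
    """Euclidean distance between points (x1, y1, z1) and (x2, y2, z2)."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
```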
calculate_distance_securite (A)
Calculate safe following distance using the 2-second rule (French highway code)
| Name | Required | Description | Default |
|---|---|---|---|
| speed_kmh | Yes | Vehicle speed in km/h | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It successfully reveals the calculation method ('2-second rule'), but omits output format/units, assumptions (e.g., dry conditions), or whether the result includes reaction time vs physical braking distance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single short sentence with action-oriented front-loading ('Calculate...'). Zero redundancy; every word earns its place by conveying the specific calculation, method, and legal framework.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculator with no output schema, the description is reasonably complete by specifying the algorithm and jurisdiction. However, it could enhance completeness by indicating the output unit (meters) or that results are advisory estimates.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with 'Vehicle speed in km/h' already documented. The description mentions no additional parameter constraints, edge cases, or semantic nuances beyond what the schema provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate'), clear resource ('safe following distance'), distinctive method ('2-second rule'), and jurisdiction ('French highway code'), clearly distinguishing it from generic siblings like calculate_distance_2d or calculate_braking_distance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'French highway code' reference implies jurisdictional context, but the description lacks explicit guidance on when to prefer this over calculate_braking_distance or other traffic safety tools, and mentions no prerequisites or limitations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
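The 2-second rule itself is simple to reconstruct: convert the speed to metres per second and multiply by two. A sketch of the likely computation (output unit assumed to be metres — exactly the detail the assessment says the description omits):

```python
def safe_following_distance(speed_kmh: float) -> float:
    """Distance in metres travelled in 2 seconds at the given speed."""
    return round(speed_kmh / 3.6 * 2, 1)  # km/h -> m/s, then 2-second gap
```

At 90 km/h the recommended gap is 50 m; at 130 km/h it is about 72 m.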
calculate_distance_to_horizon (C)
Calculate the distance to the horizon from a given height
| Name | Required | Description | Default |
|---|---|---|---|
| height_m | Yes | Observer height above ground in metres | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description must carry the full burden of behavioral disclosure. It fails to specify the output units (kilometers, meters, miles), physical assumptions (Earth curvature model, refraction), or that the result represents the visible horizon line-of-sight distance. This leaves critical behavioral traits undocumented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is appropriately sized and front-loaded with the core action. However, while concise, it is arguably under-specified rather than efficiently dense; one additional sentence clarifying output units or physical model would significantly improve value without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single simple parameter) and lack of output schema, the description is minimally adequate. However, for a calculation tool returning a physical measurement, the omission of output units and calculation assumptions (spherical Earth approximation) represents a meaningful completeness gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (height_m is well-documented as 'Observer height above ground in metres'), establishing a baseline of 3. The description references 'from a given height' which aligns with the parameter, but adds no additional semantic context about valid ranges, units clarification, or the physical meaning of the height above ground.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clear resource ('distance to the horizon'), establishing what the tool does. However, it does not explicitly distinguish this from sibling tools like calculate_distance_2d or calculate_distance_3d, which could cause confusion given the extensive list of calculate_* siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., when to use general distance calculators vs. this horizon-specific calculation). There are no prerequisites, constraints, or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
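The geometric horizon formula the tool likely uses — a spherical-Earth approximation that ignores atmospheric refraction, which are exactly the undocumented assumptions the assessment flags:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def distance_to_horizon_km(height_m: float) -> float:
    """Geometric line-of-sight distance to the horizon, in kilometres."""
    return math.sqrt(2 * EARTH_RADIUS_M * height_m) / 1000
```

For eye-level heights this reduces to the familiar rule of thumb d_km ≈ 3.57·√h_m.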
calculate_dog_age (C)
Dog age in human years (modern method)
| Name | Required | Description | Default |
|---|---|---|---|
| size | No | Dog size | medium |
| dog_years | Yes | Dog age in years | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. While it mentions 'modern method' (hinting at non-linear calculation), it fails to explain what this method entails, the logarithmic nature of dog aging, or that results are estimates based on biological research rather than precise conversion.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely terse at 7 words with zero redundancy. The key differentiator ('modern method') is present. However, it borders on under-specification given the lack of annotations or output schema that would provide additional behavioral context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero annotations and no output schema, the description should explain the 'modern method' calculation approach, clarify that size significantly impacts aging rates (small dogs age slower), and indicate what numeric output to expect. Currently incomplete for the tool's behavioral complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, clearly documenting both dog_years and size parameters with their types and constraints. The description adds no semantic layer beyond the schema, but none is strictly required given the complete schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the resource (dog age), the target format (human years), and distinguishes the approach (modern method vs. simple multiplication). It implicitly differentiates from sibling tools like calculate_cat_age and calculate_pet_age through species-specific naming.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus calculate_pet_age (generic) or calculate_cat_age. No explanation of when the optional 'size' parameter should be provided or how it affects accuracy.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
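The 'modern method' most plausibly refers to the logarithmic formula popularized by a 2019 canine-epigenetics study (human_age ≈ 16·ln(dog_years) + 31), which replaced the old multiply-by-seven rule. How the optional size parameter modifies the result is not documented anywhere, so the sketch below omits it:

```python
import math

def dog_age_human_years(dog_years: float) -> float:
    """Logarithmic 'modern method'; meaningful for dogs roughly 1 year and older."""
    return round(16 * math.log(dog_years) + 31, 1)
```

A one-year-old dog maps to about 31 human years and a two-year-old to about 42 — far from the linear 7 and 14 of the old rule.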
calculate_dog_food (B)
Calculate daily dog food quantity based on weight, age and activity level
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | | |
| activity | Yes | | |
| weight_kg | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description fails to disclose the return value format (grams? cups?), the calculation methodology, or whether outputs are rough estimates rather than veterinary-grade recommendations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single front-loaded sentence with no redundancy, though its brevity contributes to the deficits noted in other dimensions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The absence of an output schema is not compensated for: the description states what is calculated but not the return structure, and the inputs are only minimally covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description minimally compensates by listing the three parameters (weight, age, activity level) but provides no semantic detail on enum values (puppy/adult/senior) or units.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'daily dog food quantity' and explicitly names 'dog', distinguishing from siblings like calculate_cat_food and calculate_pet_food_portion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to prefer this over calculate_pet_food_portion or calculate_cat_food, or prerequisites like breed-specific applicability.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
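Calculators of this kind typically start from the resting energy requirement (RER = 70 × kg^0.75 kcal/day) and scale it by life-stage and activity multipliers. The puppy/adult/senior enum appears in the assessment above; the activity labels, the multiplier values, and the 3.5 kcal/g dry-food density below are illustrative assumptions, not the server's verified behavior:

```python
def daily_dog_food_grams(weight_kg: float, age: str, activity: str,
                         kcal_per_gram: float = 3.5) -> int:
    """Daily dry-food ration in grams: RER scaled by illustrative multipliers."""
    rer = 70 * weight_kg ** 0.75                       # resting energy, kcal/day
    age_factor = {"puppy": 2.0, "adult": 1.6, "senior": 1.2}[age]
    activity_factor = {"low": 0.9, "moderate": 1.0, "high": 1.2}[activity]
    return round(rer * age_factor * activity_factor / kcal_per_gram)
```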
calculate_dog_pregnancy (B)
Calculate dog due date from mating date
| Name | Required | Description | Default |
|---|---|---|---|
| mating_date | Yes | Mating date YYYY-MM-DD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to mention that this calculates an estimated due date based on average canine gestation (typically ~63 days), does not account for breed-specific variations, and provides no information about output format or precision limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is just seven words with zero redundancy. It front-loads the action verb and immediately specifies the domain (dog) and function (due date calculation), making it maximally scannable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool with complete schema coverage, the description provides the minimum viable information to invoke the tool correctly. However, given the domain complexity (gestation biology), it lacks important context about calculation methodology and assumptions that would aid interpretation of results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Mating date YYYY-MM-DD'), the schema fully documents the single parameter. The description adds semantic context by linking 'mating date' to 'dog due date', establishing the calculation domain, but adds no formatting guidance beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb (Calculate), resource (dog due date), and input requirement (mating date). It distinguishes from the sibling 'calculate_cat_pregnancy' by specifying 'dog', though it does not differentiate from potentially generic alternatives like 'calculate_breeding_due_date' or clarify gestational assumptions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description offers no guidance on when to use this tool versus alternatives (e.g., 'calculate_pregnancy_due_date' for humans or 'calculate_breeding_due_date' for generic breeding). It omits prerequisites such as confirming pregnancy or veterinary considerations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
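As the assessment notes, the calculation is almost certainly the mating date plus the ~63-day average canine gestation (real litters arrive roughly 58-68 days after mating):

```python
from datetime import date, timedelta

CANINE_GESTATION_DAYS = 63  # population average; breed variation is ignored

def dog_due_date(mating_date: str) -> str:
    """Estimated whelping date from a YYYY-MM-DD mating date."""
    due = date.fromisoformat(mating_date) + timedelta(days=CANINE_GESTATION_DAYS)
    return due.isoformat()
```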
calculate_dog_walking_calories (C)
Calculate calories burned by walker and dog during a walk
| Name | Required | Description | Default |
|---|---|---|---|
| pace | Yes | | |
| duration_min | Yes | | |
| dog_weight_kg | Yes | | |
| walker_weight_kg | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds valuable context that the calculation covers BOTH walker and dog simultaneously, which is not obvious from the name alone. However, it lacks disclosure on safety, idempotency, or return format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single ten-word sentence is efficiently structured and front-loaded with the verb. However, given the 0% schema coverage and lack of annotations, this brevity leaves significant gaps rather than being optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, no annotations, and 0% parameter description coverage, the nine-word description is insufficient. It fails to explain the calculation methodology or what inputs are expected for a tool with four required parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate, but it fails to do so. No mention of required parameters (weights, duration, pace) or units (kg, minutes), leaving critical semantics undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (calculate calories) and target entities (walker and dog) clearly, distinguishing it from the generic sibling 'calculate_calories_burned'. However, it does not explicitly differentiate why to use this over other fitness calculators.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like 'calculate_calories_burned' or 'calculate_bmr', and does not mention prerequisites (e.g., needing both weights).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
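A plausible reconstruction uses the standard MET formula (kcal = MET × weight_kg × hours) for both participants, treating the dog's burn the same way as a rough proxy. The pace labels and MET values here are hypothetical — nothing in the schema documents them:

```python
# Illustrative MET values per pace; the server's actual coefficients are unknown.
PACE_MET = {"slow": 2.8, "moderate": 3.5, "brisk": 5.0}

def walk_calories(walker_weight_kg: float, dog_weight_kg: float,
                  duration_min: float, pace: str) -> dict:
    """kcal burned by walker and dog via the MET formula kcal = MET * kg * hours."""
    hours = duration_min / 60
    met = PACE_MET[pace]
    return {
        "walker_kcal": round(met * walker_weight_kg * hours),
        "dog_kcal": round(met * dog_weight_kg * hours),
    }
```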
calculate_dollar_cost_average (C)
Calculate DCA portfolio value and performance for recurring crypto investments
| Name | Required | Description | Default |
|---|---|---|---|
| periods | Yes | Number of investment periods | |
| average_price | Yes | Average purchase price per unit over all periods | |
| current_price | Yes | Current market price per unit | |
| investment_per_period | Yes | Amount invested per period in fiat currency | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Does not disclose whether this fetches live market data or performs pure calculation (input parameters suggest pure math), what specific 'performance' metrics are returned, or any side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, no redundancy. However, extreme brevity limits informational value given the absence of output schema and annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema and description fails to specify what 'performance' includes (ROI, absolute return, percentage gain?) or return data structure. For a calculation tool with 100% input coverage, omitting output specification is a critical gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage. The description adds 'DCA' context which semantically links parameters like investment_per_period and periods to the dollar-cost averaging strategy, but adds no format, validation rules, or examples beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Calculate' with specific resource 'DCA portfolio value and performance' and scope 'recurring crypto investments'. The DCA specificity distinguishes it from sibling calculate_crypto_profit_loss.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No when-to-use guidance, prerequisites, or alternatives mentioned. Fails to clarify when to use this versus calculate_crypto_profit_loss or other investment calculators in the large sibling set.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
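With average price and per-period amount as inputs, the arithmetic is fully determined even though the output structure is not documented. The field names below are illustrative, not the server's actual return shape:

```python
def dca_summary(investment_per_period: float, periods: int,
                average_price: float, current_price: float) -> dict:
    """Portfolio value and return for a recurring (dollar-cost-averaged) investment."""
    total_invested = investment_per_period * periods
    units = total_invested / average_price          # units accumulated overall
    value = units * current_price
    return {
        "total_invested": round(total_invested, 2),
        "units": round(units, 8),
        "current_value": round(value, 2),
        "return_pct": round((value / total_invested - 1) * 100, 2),
    }
```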
calculate_dpe_energy_class (A)
Determine French DPE energy class from primary energy consumption
| Name | Required | Description | Default |
|---|---|---|---|
| kwh_m2_year | Yes | Primary energy consumption in kWh/m2/year | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify if the operation is read-only, what the output format is (e.g., letter grades A-G), or any specific domain constraints beyond the input parameter. The term 'Determine' implies a calculation but does not explicitly confirm idempotency or safety.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence of nine words with zero waste. Every word earns its place by conveying the action, domain specificity, and input requirement.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (one simple parameter) and complete schema coverage, the description adequately covers the tool's purpose. However, without an output schema, the description could be enhanced by specifying the return value format (energy class letters), which is absent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% description coverage, establishing a baseline of 3. The description mentions 'primary energy consumption' which aligns with the parameter kwh_m2_year, but adds minimal semantic meaning beyond what the schema already provides (unit and description).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Determine' with the clear resource 'French DPE energy class', explicitly stating what the tool does. The mention of 'French DPE' effectively distinguishes it from numerous sibling calculation tools like calculate_energy_physics or calculate_electricity_cost.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implicit usage guidance through the specific domain term 'French DPE', indicating this is for French energy performance diagnostics. However, it lacks explicit when-to-use guidance, prerequisites, or named alternatives among the 300+ sibling calculate tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
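The French DPE grades dwellings from A to G. The thresholds below are the published 2021 primary-energy bands (kWh/m²/year); note the official label also applies a greenhouse-gas criterion, which a single-input tool like this necessarily ignores — plausibly why the reviewer wants the output format spelled out:

```python
# 2021 French DPE primary-energy thresholds (energy criterion only).
DPE_THRESHOLDS = [(70, "A"), (110, "B"), (180, "C"),
                  (250, "D"), (330, "E"), (420, "F")]

def dpe_energy_class(kwh_m2_year: float) -> str:
    """Letter grade A-G from primary energy consumption."""
    for limit, letter in DPE_THRESHOLDS:
        if kwh_m2_year <= limit:
            return letter
    return "G"  # anything above 420 kWh/m2/year
```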
calculate_drain_slope (A)
Calculate minimum drain pipe slope according to French DTU norms
| Name | Required | Description | Default |
|---|---|---|---|
| fixture_type | Yes | Type of sanitary fixture being drained | |
| pipe_diameter_mm | No | Drain pipe diameter in millimeters | 100 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions compliance with technical standards (DTU) but fails to disclose whether this is read-only, what units/format the result returns (percent, degrees, ratio), or potential error conditions (invalid fixture/diameter combinations).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb 'Calculate', zero redundancy. Every word earns its place by specifying domain (French DTU norms) and resource (drain pipe slope).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a low-complexity calculation tool with 2 parameters and complete schema coverage, the description adequately explains the function. Minor gap: lacks description of return value format (slope percentage/ratio) given no output schema exists.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with fixture_type and pipe_diameter_mm fully documented in the JSON schema. The description adds minimal semantic context beyond implying the fixture_type relates to drains, so baseline 3 is appropriate when schema does the heavy lifting.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'drain pipe slope' and distinguishes from generic siblings (calculate_slope) by specifying domain 'according to French DTU norms'. Among hundreds of calculate_* tools, this precisely identifies the plumbing/compliance context.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The mention of 'French DTU norms' implies usage context (French building/plumbing projects), but provides no explicit when-to-use guidance, prerequisites, or distinctions from general slope calculation tools.
calculate_dress_alterations (Grade C)
Calculate alteration adjustments needed for a dress
| Name | Required | Description | Default |
|---|---|---|---|
| target_size | Yes | Target FR dress size | |
| measurement_bust | Yes | Actual bust measurement cm | |
| measurement_hips | Yes | Actual hip measurement cm | |
| measurement_waist | Yes | Actual waist measurement cm | |
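The evaluation below faults the description for never explaining what the "alteration adjustments" actually are. One plausible reading, sketched here in Python, is the signed difference between actual body measurements and a standard FR size chart; the chart values and the sign convention are assumptions, not taken from the server.

```python
# Hypothetical FR size chart in cm -- illustrative values, not the server's data.
FR_SIZE_CHART = {
    38: {"bust": 88, "waist": 70, "hips": 96},
    40: {"bust": 92, "waist": 74, "hips": 100},
}

def dress_alterations(target_size: int, measurement_bust: float,
                      measurement_waist: float, measurement_hips: float) -> dict:
    """Signed cm per zone: positive = let out, negative = take in (assumed)."""
    std = FR_SIZE_CHART[target_size]
    return {
        "bust": measurement_bust - std["bust"],
        "waist": measurement_waist - std["waist"],
        "hips": measurement_hips - std["hips"],
    }
```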
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are absent, so the description carries full burden. While 'Calculate' implies a read-only operation, the description fails to disclose output format (e.g., does it return centimeters to add/remove, percentage adjustments, or sewing instructions?) or that it specifically targets French (FR) dress sizing as indicated by the enum values.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 7 words. Front-loaded with the action verb. However, the brevity comes at the cost of missing behavioral and contextual details that would help an agent understand what the calculation produces.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description should explain the calculation results (e.g., positive/negative cm values indicating 'take in' or 'let out'). It omits that this uses French sizing standards and doesn't clarify the practical meaning of 'alteration adjustments' in the output.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 4 parameters have clear descriptions including units and that target_size is FR-specific), so the baseline is 3. The description adds no additional parameter guidance, but the schema sufficiently documents semantics.
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific verb ('Calculate') and resource ('dress'), and implies the domain (sewing/tailoring). However, it does not explicitly distinguish from sibling `calculate_clothing_size_convert`, which converts between size systems rather than calculating fabric adjustments.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or when-not-to-use guidance is provided. No mention of prerequisites (e.g., requiring actual body measurements vs. standard size charts) or comparison to sibling tools like `calculate_fabric_needed` or `calculate_clothing_size_convert`.
calculate_due_date (Grade B)
Calculate estimated due date using Naegele's rule and return trimester milestone dates
| Name | Required | Description | Default |
|---|---|---|---|
| last_period_date | Yes | YYYY-MM-DD — First day of last menstrual period | |
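The description does name its algorithm, so the core computation can be reconstructed with confidence: Naegele's rule adds one year, subtracts three months, and adds seven days to the first day of the last menstrual period. A minimal Python sketch (the trimester-milestone dates the description also promises are omitted here):

```python
from datetime import date, timedelta

def naegele_due_date(last_period: date) -> date:
    """Naegele's rule: due date = LMP + 1 year - 3 months + 7 days."""
    month = last_period.month - 3
    year = last_period.year + 1
    if month < 1:          # subtracting 3 months wrapped into the previous year
        month += 12
        year -= 1
    # note: raises ValueError if the LMP day does not exist in the target
    # month (e.g. the 31st); a production tool would clamp the day
    return date(year, month, last_period.day) + timedelta(days=7)
```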
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden of disclosure. It reveals the algorithm (Naegele's rule) and return structure (trimester dates), but lacks error handling, constraint, or side-effect information.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single front-loaded sentence with zero filler. However, brevity contributes to ambiguity with pregnancy-related siblings; a few additional words clarifying scope would improve utility without sacrificing structure.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequately explains the single input and implied outputs for a basic calculation tool, but incomplete regarding sibling differentiation. No output schema exists, though description partially compensates by listing trimester milestones.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('First day of last menstrual period'), establishing baseline 3. Description implies parameter usage through Naegele's rule context but adds no explicit parameter semantics beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculation method (Naegele's rule) and outputs (trimester milestones), but does not distinguish from sibling `calculate_pregnancy_due_date` despite identical-sounding purpose.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives, particularly the similarly-named `calculate_pregnancy_due_date` or `calculate_breeding_due_date` siblings.
calculate_earthquake_energy (Grade B)
Calculate energy released by an earthquake from its magnitude
| Name | Required | Description | Default |
|---|---|---|---|
| magnitude | Yes | Richter/moment magnitude | |
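The critique below asks whether the Gutenberg-Richter formula is used; the description does not say. Assuming it is, the standard energy-magnitude relation would look like this in Python:

```python
def earthquake_energy_joules(magnitude: float) -> float:
    # Gutenberg-Richter energy-magnitude relation: log10(E) = 1.5 * M + 4.8,
    # with E in joules. Each whole magnitude step releases ~31.6x more energy.
    return 10 ** (1.5 * magnitude + 4.8)
```

Each unit of magnitude multiplies released energy by 10^1.5 ≈ 31.6, which is exactly the kind of scale information the evaluation says the description should disclose.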
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose critical behavioral traits: output units (Joules, TNT equivalent?), calculation method (Gutenberg-Richter formula?), and whether result is an approximation. The agent cannot infer the scale or format of the returned energy value.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 9-word sentence with zero waste. Purpose is front-loaded immediately. No redundant phrases or tautologies. Appropriate length for a single-parameter calculation tool.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 1-parameter tool but missing expected details for a scientific calculator. With no output schema, the description should mention what the calculation returns (units, scale). Currently minimal but functional.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Richter/moment magnitude'), establishing baseline score of 3. Description mentions 'from its magnitude' but adds no syntax guidance, valid ranges explanation, or clarifying examples beyond the schema's minimum/maximum constraints.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and specific resource ('energy released by an earthquake'). Successfully identifies the seismology domain. However, it lacks explicit differentiation from the sibling calculate_energy_physics tool, which covers general kinetic and potential energy.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives. Does not mention prerequisites (e.g., needing a magnitude value) or when calculate_energy_physics might be preferred for general kinetic/potential energy calculations.
calculate_ects_credits (Grade C)
Estimate ECTS credit workload
| Name | Required | Description | Default |
|---|---|---|---|
| weeks | Yes | Number of weeks | |
| hours_per_week | Yes | Study hours per week | |
| hours_per_credit | No | Hours per ECTS credit (standard: 25-30) | |
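The likely arithmetic is simple enough to reconstruct: total study hours divided by hours per credit. This sketch assumes that formula and a hypothetical 27.5-hour default (the midpoint of the 25-30 range the schema mentions); the server's actual default is not documented.

```python
def ects_credits(weeks: float, hours_per_week: float,
                 hours_per_credit: float = 27.5) -> float:
    # credits = total workload in hours / hours per ECTS credit
    # 27.5 is an assumed midpoint of the 25-30 h standard, not the server's default
    return weeks * hours_per_week / hours_per_credit
```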
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so description carries full burden for behavioral disclosure. It fails to explain what the tool returns (number of credits? total hours?), the calculation methodology, or whether results are rounded. No mention of validation behavior or edge cases.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is extremely brief (4 words) with no filler or redundancy. While not wasteful, it is insufficiently informative for tool selection given the lack of annotations and output schema.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, no annotations, and no explanation of the input-to-output relationship, the description fails to provide adequate context. It should explain that it converts study hours into ECTS credit values and what the return value represents.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions (hours_per_week, weeks, hours_per_credit). The description adds no semantic meaning beyond what the schema already provides, meeting baseline expectations for well-documented schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the general function (estimating workload) but uses vague verb 'Estimate' instead of clarifying the calculation logic. With hundreds of calculate_* siblings including education-related tools like calculate_scholarship_comparison, it fails to distinguish when to use this specific tool versus others.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description provides no guidance on when to use this tool versus alternatives, prerequisites for inputs, or expected use cases. No mention of when the default hours_per_credit value applies.
calculate_electrical_power (Grade C)
Electrical power mono/tri-phase
| Name | Required | Description | Default |
|---|---|---|---|
| phase | No | Phase | mono |
| cos_phi | No | Power factor | |
| current | Yes | Amps | |
| voltage | Yes | Volts | |
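The standard single-phase and three-phase active-power formulas are well known, so the probable behaviour can be sketched. Whether the server actually returns active power in watts (as assumed here), apparent power, or something else is exactly the ambiguity the critique below flags; the "tri" enum value is also an assumption inferred from the description.

```python
import math

def electrical_power_w(voltage: float, current: float,
                       phase: str = "mono", cos_phi: float = 1.0) -> float:
    # Active power: P = U * I * cos(phi) for single-phase;
    # P = sqrt(3) * U * I * cos(phi) for three-phase (line-to-line voltage).
    factor = math.sqrt(3) if phase == "tri" else 1.0
    return voltage * current * cos_phi * factor
```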
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description carries the full burden. It mentions 'mono/tri-phase' implying different calculation modes but does not explain what the tool returns (watts, VA, kW), the significance of the power factor (cos_phi), or any constraints.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extreme brevity results in an incomplete sentence fragment ('Electrical power mono/tri-phase') that sacrifices clarity for terseness. While not verbose, it constitutes under-specification rather than efficient conciseness, failing to front-load critical information like calculation output.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should explain the calculation result and units. It provides no information on return values, calculation methodology (apparent vs real power), or when to use cos_phi. Inadequate for a domain-specific engineering tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description references 'mono/tri-phase' which maps to the phase parameter, but this adds minimal value since the schema already lists the enum values and descriptions. No additional context provided for cos_phi (power factor) implications.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the domain (electrical power) and indicates support for mono/tri-phase systems, which adds some specificity beyond the tool name. However, it lacks a clear verb (e.g., 'Calculate') and fails to distinguish from electrical siblings like calculate_ohms_law or calculate_electricity_cost.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to select this tool versus alternatives like calculate_ohms_law or calculate_cable_section_electrical. No mention of prerequisites or use cases.
calculate_electricity_cost (Grade C)
Calculate electricity cost for an appliance
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Days | |
| power_w | Yes | Watts | |
| hours_day | Yes | Hours/day | |
| price_kwh | No | EUR/kWh | 0.2516 |
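The parameter-coverage note further down spells out the likely formula (power_w × hours_day × days × price_kwh / 1000) and the apparent defaults (30 days, 0.2516 EUR/kWh). A Python sketch under those assumptions:

```python
def electricity_cost_eur(power_w: float, hours_day: float,
                         days: float = 30, price_kwh: float = 0.2516) -> float:
    # energy in kWh = W * h / 1000; cost = kWh * price
    # the defaults (30 days, 0.2516 EUR/kWh) are inferred, not documented
    kwh = power_w * hours_day * days / 1000
    return kwh * price_kwh
```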
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It omits critical behavioral context: the implied EUR currency (from price_kwh default 0.2516), the default 30-day calculation period, and the fact that this is a pure calculation with no side effects. Does not describe output format.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely terse at 6 words. Front-loaded with clear action. No wasted words, but arguably too minimal—omits useful context that would justify the single sentence's existence (currency, timeframe).
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculator with 4 parameters, the description is insufficient. It lacks mention of EUR currency, the default monthly timeframe (30 days), and what value is returned, and the sibling tool 'calculate_electricity_cost_appliance' introduces an ambiguity the description leaves unresolved.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. The description mentions 'appliance' which provides minor semantic context for the 'power_w' parameter, but adds no detail on parameter relationships (e.g., that cost = power_w × hours_day × days × price_kwh / 1000).
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('electricity cost'), with scope ('for an appliance'). However, it fails to differentiate from sibling tool 'calculate_electricity_cost_appliance', which has an almost identical semantic meaning.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus 'calculate_electricity_cost_appliance' or other energy-related calculators. No prerequisites or conditions mentioned.
calculate_electricity_cost_appliance (Grade C)
Annual electricity cost of an appliance
| Name | Required | Description | Default |
|---|---|---|---|
| power_w | Yes | Power in watts | |
| hours_day | Yes | Hours used per day | |
| price_kwh | No | EUR per kWh | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses no behavioral traits such as whether this performs a pure calculation, what currency assumptions apply (beyond the EUR mention in parameter schema), or what return format to expect.
Is the description appropriately sized, front-loaded, and free of redundancy?
The six-word description is efficiently structured without filler, though its extreme brevity leaves significant contextual gaps that could have been addressed in one additional sentence.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With full schema coverage and three straightforward parameters, the description is minimally adequate. However, given no output schema and no annotations, it should ideally specify the calculation methodology or return value format.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, documenting power consumption, daily usage hours, and price per kWh. The description adds no supplemental parameter guidance, meeting the baseline for high-coverage schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool calculates annual electricity costs specifically for appliances. However, it fails to distinguish this tool from the sibling 'calculate_electricity_cost', leaving ambiguity about when to use this specific variant.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'calculate_electricity_cost'. No prerequisites, conditions, or explicit use cases are mentioned.
calculate_ellipse (Grade C)
Ellipse area and circumference
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | Semi-major axis | |
| b | Yes | Semi-minor axis | |
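The critique below asks whether the circumference comes from the exact elliptic integral or from Ramanujan's approximation; the tool doesn't say. A sketch using the closed-form area and Ramanujan's first approximation:

```python
import math

def ellipse_metrics(a: float, b: float) -> dict:
    # exact area; Ramanujan's first approximation for the circumference
    area = math.pi * a * b
    circumference = math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
    return {"area": area, "circumference": circumference}
```

For a circle (a == b == r) the approximation collapses to the exact 2πr, which makes it easy to sanity-check.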
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided and description carries full burden. Fails to disclose calculation method (Ramanujan approximation vs exact formula), units, precision, or return structure.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at four words with zero redundancy or waste. However, the extreme brevity results in underspecification rather than optimal conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimal but functional for a simple two-parameter geometric tool. Absence of output schema or return value description leaves minor gap, though tool complexity is low.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Semi-major axis', 'Semi-minor axis'). Description adds no parameter semantics beyond schema, warranting baseline score.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Ellipse area and circumference' lacks an action verb (calculate/compute) and merely labels the outputs. It does not explicitly state what the tool does or distinguish from generic calculators like calculate_area.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other geometric area calculators (e.g., calculate_area) or prerequisites for the semi-axis inputs.
calculate_emergency_fund (Grade C)
Calculate recommended emergency fund target
| Name | Required | Description | Default |
|---|---|---|---|
| dependents | Yes | Number of dependents | |
| job_stability | Yes | Job stability level | |
| monthly_expenses | Yes | Monthly expenses EUR | |
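The critique below guesses at a "3-6 months of expenses adjusted for stability/dependents" methodology. A purely illustrative sketch of such a rule follows; the base months, the adjustments, the cap, and the 'unstable' enum value are all assumptions, not the server's actual logic.

```python
def emergency_fund_target(monthly_expenses: float, job_stability: str,
                          dependents: int) -> float:
    # Hypothetical rule: 3 months base, +2 months for an unstable job,
    # +0.5 month per dependent, capped at 12 months of expenses.
    months = 3.0
    if job_stability == "unstable":  # assumed enum value
        months += 2.0
    months += 0.5 * dependents
    return monthly_expenses * min(months, 12.0)
```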
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to describe the calculation methodology (e.g., 3-6 months of expenses adjusted for stability/dependents), output format (EUR amount), or safety characteristics (read-only operation), leaving significant behavioral gaps.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently structured with the action verb first and no redundant words. However, given the absence of annotations and output schema, this extreme brevity becomes under-specification rather than effective conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having 100% input schema coverage, the tool lacks annotations and output schema. For a financial calculation implementing specific risk-adjusted logic, the description omits critical context: the calculation formula, output currency/amount semantics, and interpretation guidelines, rendering it incomplete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Monthly expenses EUR', 'Number of dependents', 'Job stability level'), the schema fully documents parameters. The description adds no supplemental parameter guidance (e.g., how job_stability weights the calculation), meriting the baseline score of 3.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource ('recommended emergency fund target'), distinguishing it from generic savings calculators like 'calculate_savings_goal'. However, it misses the opportunity to clarify the scope (e.g., time-based coverage in months) or methodology that would fully differentiate it from other financial planning tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_savings_goal' or 'calculate_retirement_savings_gap'. There are no prerequisites, conditions, or exclusions mentioned that would help an agent determine applicability.
calculate_employer_cost_fr (Grade D)
Total employer cost France
| Name | Required | Description | Default |
|---|---|---|---|
| gross_monthly | Yes | Monthly gross salary EUR | |
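As the critique notes, neither the methodology nor the included charges are disclosed. The simplest possible model is a flat employer-contribution rate on gross salary; the 42 % rate below is an assumed figure in the typical French range, not the server's actual rate schedule.

```python
def employer_cost_fr(gross_monthly: float, employer_rate: float = 0.42) -> float:
    # total cost = gross salary + employer social contributions
    # 0.42 is an assumed flat rate; real French charges vary by salary band
    return gross_monthly * (1 + employer_rate)
```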
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure yet reveals nothing about the calculation methodology, what charges are included (social security contributions, pension, transport levy, etc.), or the return value format. It does not indicate whether this is a complex estimate or a fixed-rate calculation.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of only four words. This represents under-specification rather than effective conciseness, as no meaningful information is conveyed beyond the tool name itself. The front-loading is moot given the total lack of detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of numerous sibling calculation tools including country-specific payroll tools, the description should clarify how this differs from calculating gross/net salary. With no output schema and no return value description, the definition is incomplete even for a single-parameter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the single parameter 'gross_monthly' fully documented as 'Monthly gross salary EUR'. The description adds no semantic information about this parameter (e.g., whether it's annualized, if there's a minimum threshold), but the baseline score of 3 applies due to high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Total employer cost France' restates the tool name with minimal expansion. It fails to specify what components constitute 'employer cost' (e.g., gross salary + social contributions + taxes) and does not distinguish this tool from the sibling 'calculate_french_salary' or 'calculate_auto_entrepreneur', leaving the scope ambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'calculate_french_salary'. There is no mention of prerequisites, expected input format constraints beyond the schema, or scenarios where this calculation would be needed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_energy_physics (Grade: C)
Calculate kinetic (½mv²), potential (mgh), mass-energy (E=mc²), or work (F·d)
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Energy type | |
| force_n | No | Force N (work) | |
| mass_kg | No | Mass in kg | |
| height_m | No | Height m (potential) | |
| distance_m | No | Distance m (work) | |
| velocity_ms | No | Velocity m/s (kinetic) |
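The four formulas named in the description can be sketched directly. Everything below assumes SI inputs and an output in joules; the `emc2` enum value is taken from the quality notes rather than from documented server behaviour:

```python
C = 299_792_458.0  # speed of light in m/s

def calculate_energy(type: str, mass_kg=None, velocity_ms=None,
                     height_m=None, force_n=None, distance_m=None) -> float:
    """Return energy in joules, raising if the fields the chosen formula
    needs are missing (the conditional logic the review says the
    description leaves undocumented)."""
    g = 9.81  # standard gravity in m/s^2
    if type == "kinetic":
        if mass_kg is None or velocity_ms is None:
            raise ValueError("kinetic needs mass_kg and velocity_ms")
        return 0.5 * mass_kg * velocity_ms ** 2
    if type == "potential":
        if mass_kg is None or height_m is None:
            raise ValueError("potential needs mass_kg and height_m")
        return mass_kg * g * height_m
    if type == "emc2":
        if mass_kg is None:
            raise ValueError("emc2 needs mass_kg")
        return mass_kg * C ** 2
    if type == "work":
        if force_n is None or distance_m is None:
            raise ValueError("work needs force_n and distance_m")
        return force_n * distance_m
    raise ValueError(f"unknown energy type: {type}")
```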
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose output format/units, error handling for invalid parameter combinations (e.g., missing force for work calculation), or whether results include intermediate steps. No mention of side effects or state mutation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence efficiently packs four formulas and the resource type. Front-loaded with action verb 'Calculate'. Compact but sacrifices necessary context about parameter dependencies and output format due to extreme brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and conditional parameter logic (different required fields per energy type), the description should explain return values and parameter relationships. Currently silent on output units (Joules?), conditional requirements, and error cases, leaving significant gaps for a 6-parameter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description maps enum values (kinetic, potential, emc2, work) to physics formulas, adding conceptual context. However, it does not explain conditional parameter requirements (which fields are needed for each calculation type) beyond the basic schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculation types (kinetic, potential, mass-energy, work) with formulas, making the multi-purpose nature clear. Distinguishes implicitly from single-purpose sibling `calculate_kinetic_energy` by showing it handles four physics domains, though explicit differentiation would strengthen this.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this multi-purpose tool versus specialized siblings like `calculate_kinetic_energy` or `convert_energy`. No mention of prerequisites or parameter dependencies (e.g., which fields are required for each energy type).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_equation (Grade: B)
Solve 1st degree (ax+b=0) or 2nd degree (ax²+bx+c=0) equations
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | Coefficient a | |
| b | Yes | Coefficient b | |
| c | No | Coefficient c (for degree 2) | |
| degree | Yes | Equation degree |
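A sketch of the solving logic the description implies, making explicit the behaviours the review flags as undisclosed: real roots only, and a=0 rejected. The server's actual handling of complex roots is unknown:

```python
import math

def solve_equation(a: float, b: float, degree: int, c: float = 0.0) -> list:
    """Solve ax+b=0 (degree 1) or ax^2+bx+c=0 (degree 2).
    Returns real roots only; whether the server reports complex roots
    or the discriminant is undocumented."""
    if a == 0:
        raise ValueError("coefficient a must be non-zero")
    if degree == 1:
        return [-b / a]
    if degree == 2:
        disc = b * b - 4 * a * c
        if disc < 0:
            return []  # no real solutions in this sketch
        r = math.sqrt(disc)
        return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])
    raise ValueError("degree must be 1 or 2")
```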
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description offers minimal behavioral disclosure beyond input formatting. It omits critical computation details: return format (roots vs discriminant), handling of complex/imaginary solutions, division-by-zero protection, or whether results are approximate/exact.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single dense sentence with zero waste; front-loaded with the core operation. However, the behavioral gaps make it questionable whether the sizing is truly appropriate, though that deficiency falls more under Contextual Completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage, the parametric side is addressed, but the description remains incomplete regarding output schema (absent) and sibling differentiation. Adequate as a minimal reference, but clear gaps exist for safe invocation without surprise outputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While schema coverage is 100% (baseline 3), the description adds meaningful context by presenting the equation structures 'ax+b=0' and 'ax²+bx+c=0', clarifying the mathematical relationship between parameters a, b, and c that raw field descriptions lack.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a clear verb ('Solve') and specific resource ('1st degree... or 2nd degree... equations') with mathematical notation. However, it fails to explicitly distinguish from the sibling tool 'calculate_quadratic_equation' despite handling overlapping 2nd-degree functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus the specialized 'calculate_quadratic_equation' sibling, nor does it mention prerequisites (e.g., handling cases where a=0 or discriminant <0).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_ev_charging_cost (Grade: C)
Electric vehicle charging cost and time
| Name | Required | Description | Default |
|---|---|---|---|
| price_kwh | No | Price per kWh | |
| target_pct | No | Target charge % | |
| battery_kwh | Yes | Battery capacity kWh | |
| current_pct | Yes | Current charge % |
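With no charging-power parameter in the schema, the advertised "time" output must rest on a hidden assumption, exactly as the review notes. A sketch with that assumption made explicit; the 7.4 kW charger power and 0.25/kWh default price are placeholders, not server defaults:

```python
def ev_charging_cost(battery_kwh: float, current_pct: float,
                     target_pct: float = 100.0, price_kwh: float = 0.25,
                     charger_kw: float = 7.4) -> dict:
    """Energy to add, its cost, and a naive constant-power charging time.
    Real charging curves taper near full charge, so the time is a
    lower-bound estimate."""
    energy_kwh = battery_kwh * (target_pct - current_pct) / 100
    return {
        "energy_kwh": energy_kwh,
        "cost": round(energy_kwh * price_kwh, 2),
        "hours": round(energy_kwh / charger_kw, 2),
    }
```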
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Claims to calculate 'time' in addition to cost, but with no charging power parameter provided, the agent cannot verify this without hidden assumptions. Since no annotations exist, the description should disclose calculation methodology or assumed constants (e.g., charging rate) to be transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief (6 words) with no redundancy. Front-loaded with key terms. However, brevity sacrifices necessary behavioral context for a tool claiming dual outputs.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimal but adequate for a 4-parameter calculator. Mentions both output values (cost and time), compensating for the missing output schema, though it omits the calculation's assumptions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description does not add param semantics, but baseline 3 applies per guidelines for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the domain (EV charging) and outputs (cost and time) but uses a noun phrase without an action verb (e.g., 'Calculate'). Distinguishes from siblings like calculate_electricity_cost by specifying 'Electric vehicle'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use versus siblings like calculate_electricity_cost or calculate_fuel_consumption, nor any prerequisites or assumptions required for the calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_excavation (Grade: B)
Calculate excavation and swelled volume for earthwork
| Name | Required | Description | Default |
|---|---|---|---|
| depth_m | Yes | Depth in meters | |
| width_m | Yes | Width in meters | |
| length_m | Yes | Length in meters | |
| soil_type | No | Soil type (swell: normal=1.25, rocky=1.50, clay=1.30) | normal |
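Because the schema documents the swell factors, the likely calculation can be reconstructed. The sketch below assumes a simple rectangular cut, which the description does not confirm:

```python
SWELL = {"normal": 1.25, "rocky": 1.50, "clay": 1.30}  # factors from the schema

def excavation_volumes(length_m: float, width_m: float, depth_m: float,
                       soil_type: str = "normal") -> dict:
    """Bank (in-place) volume and swelled (loose) volume in cubic metres
    for a rectangular excavation."""
    bank_m3 = length_m * width_m * depth_m
    return {"bank_m3": bank_m3, "swelled_m3": bank_m3 * SWELL[soil_type]}
```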
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions 'swelled volume' (implying soil expansion behavior) but does not explain the calculation methodology, swell factors, or what the tool returns. It does not mention side effects, authentication requirements, or rate limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence (7 words) with the action verb front-loaded. There is no redundant or wasted text, making it highly concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 4-parameter input with 100% schema coverage and no output schema, the description adequately covers the core purpose but lacks information about return values (units, format) or calculation specifics that would complete the agent's understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description does not add parameter semantics beyond the schema (which already documents units and soil type swell factors), but this is acceptable given the comprehensive schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates 'excavation and swelled volume for earthwork,' providing a specific verb (calculate), resources (excavation/swelled volume), and domain (earthwork). It implicitly distinguishes itself from generic volume calculators by referencing 'swelled volume,' though it could explicitly differentiate from siblings like calculate_volume.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as calculate_volume or calculate_concrete_volume. There are no 'when to use,' 'when not to use,' or prerequisite instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_exchange_margin (Grade: D)
Hidden exchange rate margin
| Name | Required | Description | Default |
|---|---|---|---|
| bank_rate | Yes | Bank/bureau rate | |
| market_rate | Yes | Mid-market rate |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides none. It does not explain what 'hidden margin' means in output terms (percentage points? percentage markup? cost analysis?), what formula is used, or whether the result indicates favorable/unfavorable rates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at only four words, this is underspecification rather than efficient conciseness. No sentence earns its place because the fragment provides insufficient actionable information for an agent to understand the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a financial calculation tool with 2 parameters and no output schema or annotations, the description is grossly incomplete. It should explain the calculation methodology, what the 'hidden' aspect refers to (cost concealed in poor exchange rates), and what the return value represents.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (bank_rate and market_rate both described in schema). The description adds no parameter information, but the baseline is 3 when schema coverage is high, per the rubric.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Hidden exchange rate margin' is a noun phrase lacking an action verb (calculate, compute, compare). It repeats the concept from the tool name without clarifying what calculation is performed (percentage markup? absolute spread? markup analysis?), and fails to differentiate from siblings like calculate_currency_exchange or calculate_markup_margin.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus calculate_currency_exchange or other financial calculation tools. No mention of prerequisites (e.g., needing both rates) or expected use cases (comparing bank offers, detecting hidden fees).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_exchange_rate_margin (Grade: B)
Calculate the margin charged on a currency exchange
| Name | Required | Description | Default |
|---|---|---|---|
| bank_rate | Yes | Rate offered by bank/exchange | |
| mid_market_rate | Yes | Mid-market (real) exchange rate |
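Both exchange-margin tools leave the formula undisclosed. One plausible interpretation, expressing the margin as a percentage of the mid-market rate (an assumption for illustration, not the server's confirmed method):

```python
def exchange_rate_margin(bank_rate: float, mid_market_rate: float) -> float:
    """Percentage by which the bank's offered rate deviates from the
    mid-market rate, i.e. the hidden cost of the exchange."""
    return round(abs(mid_market_rate - bank_rate) / mid_market_rate * 100, 2)
```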
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Fails to specify what 'margin' represents (percentage spread, basis points, absolute difference), whether calculation is idempotent, or expected output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 8-word sentence with zero waste. Purpose stated immediately at the front. No redundant or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers basic calculation intent for simple 2-parameter tool, but lacks output specification (what units/format the margin is returned in) which would complete the contract given no output schema exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear definitions ('Rate offered by bank/exchange', 'Mid-market (real) exchange rate'). Description adds no parameter semantics, but baseline 3 applies since schema does the work.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resource (margin on currency exchange). However, fails to differentiate from nearly identical sibling tool 'calculate_exchange_margin' available in the same server.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus siblings like 'calculate_exchange_margin' or 'calculate_currency_exchange'. No prerequisites or alternative selection criteria provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_expected_value_bet (Grade: B)
Calculate expected value and profitability of a bet or investment decision
| Name | Required | Description | Default |
|---|---|---|---|
| bet_cost | No | Upfront cost to enter the bet (default 0) | |
| win_amount | Yes | Net amount won if outcome is positive | |
| loss_amount | Yes | Net amount lost if outcome is negative | |
| win_probability | Yes | Probability of winning (0 to 1) |
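The expected-value arithmetic itself is standard; what the review flags as ambiguous is how bet_cost interacts with the net win/loss amounts. A sketch that makes one reading explicit (amounts net of stake, cost subtracted separately, per the schema wording):

```python
def expected_value_bet(win_probability: float, win_amount: float,
                       loss_amount: float, bet_cost: float = 0.0) -> float:
    """EV = p * win - (1 - p) * loss - cost, treating win_amount and
    loss_amount as net outcomes per the schema descriptions. Whether the
    server combines bet_cost the same way is an assumption."""
    if not 0 <= win_probability <= 1:
        raise ValueError("win_probability must be between 0 and 1")
    return (win_probability * win_amount
            - (1 - win_probability) * loss_amount
            - bet_cost)
```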
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, leaving full burden on description. The description fails to disclose the calculation methodology (how expected value is computed from the inputs), whether outputs are raw numbers or formatted strings, or how bet_cost interacts with win_amount (gross vs net). No mention of return value structure given the absence of output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero redundancy. Front-loaded with specific action and object, earning full marks for appropriate sizing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 4-parameter calculation tool with complete input schema, but gaps remain: no output schema exists, yet description does not specify what values are returned (expected value only? profitability percentage? both?). Missing behavioral details given zero annotation coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The tool description adds minimal semantic value beyond the schema—while it frames the parameters within 'bet or investment decision' context, it does not clarify parameter relationships (e.g., whether win_amount includes the bet_cost return or just net profit). Baseline 3 appropriate when schema carries the definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('expected value and profitability') with specific domain context ('bet or investment decision'). It distinguishes from generic financial calculators by specifying the EV/profitability focus, though it could explicitly differentiate from siblings like calculate_roi or calculate_break_even.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this versus alternatives like calculate_roi, calculate_profit_margin, or calculate_break_even. The description only states the domain (bet/investment) without conditions, prerequisites, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_exposure_triangle (Grade: A)
Calculate the missing exposure value (aperture, shutter speed or ISO) given the other two
| Name | Required | Description | Default |
|---|---|---|---|
| iso | Yes | ISO sensitivity value | |
| aperture | Yes | Aperture f-number | |
| shutter_speed | Yes | Shutter speed in seconds (e.g. 0.004 for 1/250s) |
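The review's point about the input contradiction (all three parameters required, yet one is "missing") can be illustrated with the standard exposure-value relation EV = log2(N²/t) - log2(ISO/100). Whether the server uses this formula is unconfirmed; the sketch computes an EV and solves for an equivalent shutter speed:

```python
import math

def exposure_value(aperture: float, shutter_speed: float, iso: float) -> float:
    """Exposure value normalised to ISO 100:
    EV = log2(N^2 / t) - log2(ISO / 100)."""
    return math.log2(aperture ** 2 / shutter_speed) - math.log2(iso / 100)

def equivalent_shutter(ev: float, new_aperture: float, new_iso: float) -> float:
    """Shutter speed in seconds that preserves a given EV at a new
    aperture and ISO."""
    return new_aperture ** 2 * 100 / (2 ** ev * new_iso)
```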
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full disclosure burden. It fails to state whether the operation is read-only (though implied), how to specify which of the three parameters should be calculated when the schema requires all three inputs, what the return format looks like, or any error conditions for invalid exposure combinations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. The core value proposition (calculate missing exposure value) is front-loaded immediately, followed by the specific parameter list and usage logic ('given the other two').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the description covers the core calculation logic, it leaves ambiguity regarding the input requirement contradiction (schema requires 3 params, description implies providing 2) and provides no hint about the return value structure, which is necessary given the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for each parameter. The description adds crucial semantic context by explaining that these three parameters form an interdependent 'exposure triangle' where one can be derived from the others—a conceptual relationship not captured in the isolated schema field descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (Calculate), the target resource (missing exposure value), and the exact parameters involved (aperture, shutter speed, ISO). It effectively distinguishes this photography-specific tool from the hundreds of other calculate_* siblings by referencing the 'exposure triangle' domain concept.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'given the other two' implies usage context (when you know two of three values), providing implicit guidance. However, it lacks explicit when-not-to-use rules, prerequisite explanations (e.g., that exactly two values must be valid/known), or references to sibling photography tools like calculate_depth_of_field.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_fabric_needed (Grade: C)
Calculate fabric meters needed for a garment
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | Garment size | |
| garment_type | Yes | Garment type | |
| fabric_width_cm | Yes | Fabric roll width cm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. While it specifies the output unit (meters), it omits critical domain context: whether calculations include seam allowances, pattern matching, nap/directional requirements, or if it simply computes basic area division. No mention of return format or precision.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 7 words with no filler. However, the brevity comes at the cost of omitting useful domain context (e.g., 'including pattern layout considerations' or 'returns linear meters'). Appropriate length for the schema complexity but could front-load key assumptions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a garment calculation tool with 3 parameters, the description covers the basic operation but inadequately addresses the domain complexity. Should specify if this calculates linear meters (length) of fabric required given the roll width, and whether it accounts for pattern pieces layout, matching, or shrinkage—critical information for garment construction planning.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear parameter descriptions. The description mentions 'meters' (output unit) which complements the schema's 'fabric_width_cm' (input unit), but provides no additional semantic context about garment sizing conventions or fabric width standards.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Calculate), resource (fabric meters), and target (garment). However, it fails to distinguish from the sibling tool `calculate_fabric_yardage` (which likely calculates in yards vs. meters) or `calculate_curtain_fabric` despite 'curtains' being a valid enum value for garment_type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific tool versus alternatives like `calculate_fabric_yardage` or domain-specific tools like `calculate_curtain_fabric`. No mention of prerequisites or assumptions about pattern layout.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_fabric_yardage (A)
Calculate fabric needed for a garment in meters (includes 10% for pattern matching)
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | | |
| garment | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and successfully discloses the 10% automatic buffer for pattern matching and the output unit (meters). However, it omits mention of read-only safety or behavior with unsupported garment types.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence with the core action front-loaded. The parenthetical '(includes 10% for pattern matching)' adds crucial behavioral detail without verbosity—every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool with no output schema or annotations, the description covers the primary calculation logic and units but is incomplete regarding parameter semantics, which is critical given the 0% schema coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fails to sufficiently compensate: it implicitly references the 'garment' parameter but provides no context for the 'size' parameter, its enum values (S, M, L, XL), or how these dimensions affect the calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Calculate'), resource ('fabric'), and unit ('meters'), while distinguishing itself from sibling tools like 'calculate_curtain_fabric' and 'calculate_fabric_needed' by specifying it's for 'garment' and including a unique 10% buffer for pattern matching.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage through the 'garment' focus and the 10% buffer note, but lacks explicit guidance on when to use this versus siblings like 'calculate_fabric_needed' or prerequisites for the pattern matching allowance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_factorial_permutation (C)
Calculate factorial, permutations P(n,r), and combinations C(n,r)
| Name | Required | Description | Default |
|---|---|---|---|
| n | Yes | n value | |
| r | No | r value for P(n,r) and C(n,r) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description fails to disclose key behavioral traits: it doesn't specify whether the tool returns all three calculations or selects based on input presence, doesn't warn about the computational limit (n ≤ 170), and doesn't describe output format or handling of invalid inputs (e.g., r > n).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no redundancy or filler. However, the lack of structure (e.g., separating the three calculation modes or noting parameter conditionality) makes it dense and less scannable for an agent trying to determine input requirements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description inadequately describes what the tool returns or how the optional r parameter triggers different calculation modes. It should clarify that factorial requires only n, while P and C require both n and r.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with basic descriptions ('n value', 'r value for P(n,r) and C(n,r)'). The description mentions P(n,r) and C(n,r) which connects to the r parameter semantics, but adds minimal meaningful context about the relationship between parameters (n ≥ r) or that r is optional for factorial-only mode.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies three specific mathematical operations (factorial, permutations P(n,r), combinations C(n,r)) that distinguish this tool from the many generic calculate_* siblings. However, it lacks explicit differentiation from general mathematical tools like calculate_equation or calculate_anything.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this vs. alternatives, nor does it explain that r is optional and required only for permutations/combinations but not for factorial calculation. No mention of constraints like r ≤ n.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_fertilizer_npk (B)
Calculate NPK fertilizer quantities needed based on crop type and soil type
| Name | Required | Description | Default |
|---|---|---|---|
| crop_type | Yes | Type of crop to fertilize | |
| soil_type | Yes | Type of soil | |
| surface_m2 | Yes | Surface area in square meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose what the output contains (e.g., kg of Nitrogen/Phosphorus/Potassium), units of measurement, rate limits, or whether the calculation includes application frequency guidance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 12-word sentence. Front-loaded action verb ('Calculate'), zero redundancy, efficient structure. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with well-documented inputs, but lacks description of the return values (critical given no output schema exists). Does not mention that results are likely in weight units (kg) or specify the NPK ratio format returned.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions 'crop type and soil type' which adds semantic context mapping to two parameters, though it omits `surface_m2`. Since the schema fully documents all three parameters, no additional compensation is required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Uses specific verb 'Calculate' and resource 'NPK fertilizer quantities' with clear inputs (crop type, soil type). However, it does not explicitly differentiate from siblings like `calculate_garden_soil` or `calculate_compost_volume`, which could be confused with this soil-amendment tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus related agricultural calculators like `calculate_garden_soil`, `calculate_seed_quantity`, or `calculate_soil_ph_amendment`. No alternatives or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
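One plausible shape for the calculation (per-m² nutrient rates adjusted by a soil factor, scaled by area) is sketched below. Every number in these tables is a placeholder, since real agronomic rates vary by crop variety and region, and the server's actual data is undocumented:

```python
# Hypothetical nutrient requirements in g/m²; illustrative only, not agronomic advice.
CROP_NEEDS_G_M2 = {"tomato": {"N": 20, "P": 10, "K": 25}, "lawn": {"N": 15, "P": 5, "K": 10}}
SOIL_FACTOR = {"clay": 1.1, "sandy": 1.2, "loam": 1.0}

def npk_kg(crop_type: str, soil_type: str, surface_m2: float) -> dict:
    # Scale per-m² rates by soil factor and area; return kilograms per nutrient.
    factor = SOIL_FACTOR[soil_type]
    return {k: round(v * factor * surface_m2 / 1000, 2)
            for k, v in CROP_NEEDS_G_M2[crop_type].items()}

print(npk_kg("tomato", "loam", 100))  # {'N': 2.0, 'P': 1.0, 'K': 2.5}
```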
calculate_fish_tank_heater (B)
Calculate aquarium heater wattage needed
| Name | Required | Description | Default |
|---|---|---|---|
| room_temp_c | Yes | Room temperature °C | |
| target_temp_c | Yes | Target water temperature °C | |
| volume_liters | Yes | Tank volume liters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure. It fails to mention calculation assumptions (heat loss factors, safety margins), whether the result is approximate, or the output format/structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient at five words with zero redundancy. The description is appropriately front-loaded and sized for a straightforward calculation tool with clear inputs.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple three-parameter calculation tool, but minimal given the lack of output schema and annotations. It does not explain the return value format or calculation methodology, which would help for a domain-specific utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all three parameters are documented), establishing baseline 3. The description adds no parameter-specific context, but none is required given the complete schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('aquarium heater wattage'), clearly stating the tool's function. However, it does not explicitly differentiate from siblings like 'calculate_aquarium_volume' or 'calculate_water_heater_size', though the domain is distinct.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives, nor any prerequisites or conditions for use. The description states only what the tool does, not when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
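A common sizing heuristic multiplies tank volume by the temperature lift and a heat-loss coefficient. The coefficient below (0.05 W per litre per °C) is an assumption for illustration; the server's methodology is undisclosed, and real sizing should include a safety margin:

```python
def heater_watts(volume_liters: float, room_temp_c: float, target_temp_c: float,
                 k: float = 0.05) -> float:
    """k: assumed heat-loss coefficient in W per litre per degree C of lift."""
    delta_t = max(target_temp_c - room_temp_c, 0)  # no heating needed if room is warmer
    return volume_liters * delta_t * k

print(heater_watts(100, 20, 26))  # ~30 W for a 100 L tank with a 6 degree lift
```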
calculate_flight_distance (B)
Calculate great-circle distance between two coordinates
| Name | Required | Description | Default |
|---|---|---|---|
| lat1 | Yes | Departure latitude | |
| lat2 | Yes | Arrival latitude | |
| lon1 | Yes | Departure longitude | |
| lon2 | Yes | Arrival longitude |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It identifies the mathematical model ('great-circle') which implies a spherical Earth approximation, but fails to specify the return value format, units (nautical miles, km, etc.), precision, or whether the calculation is read-only/safe.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently front-loaded with the verb 'Calculate' and contains no redundant words. Every term serves a purpose: 'great-circle' specifies the algorithm and 'two coordinates' identifies the inputs.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complete input schema (4 well-documented parameters) and lack of output schema, the description is minimally adequate but incomplete. It should specify the output units and format (e.g., 'returns distance in kilometers') to fully inform agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear 'Departure/Arrival' labels for each lat/lon pair. The description adds only the phrase 'two coordinates' which maps to the four parameters implicitly but adds no syntax clarification, validation rules, or formatting details beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies the exact algorithm ('great-circle distance') and the operands ('two coordinates'), which distinguishes it from sibling tools like calculate_distance_2d or calculate_distance_3d. However, it does not explicitly state that these are geographic coordinates or that this is intended for aviation/flight planning, which would further clarify scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to select this tool versus alternatives such as calculate_distance_2d (Cartesian) or calculate_distance_3d. It omits the typical use case (flight planning, navigation) and does not mention prerequisites or expected input ranges beyond what the schema enforces.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
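The great-circle model named in the description is typically implemented with the haversine formula on a spherical Earth. A sketch, with kilometres assumed as the output unit since the tool does not state one:

```python
import math

def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance on a sphere of mean Earth radius 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

print(great_circle_km(48.85, 2.35, 40.71, -74.01))  # Paris to New York, roughly 5800 km
```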
calculate_floor_area (B)
Calculate total floor area and Carrez habitable area from a list of rooms
| Name | Required | Description | Default |
|---|---|---|---|
| rooms | Yes | Rooms with length and width in meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully indicates the dual-output nature (total floor area AND Carrez area), but lacks disclosure on what 'Carrez' entails (e.g., height exclusions <1.80m), whether the calculation is immutable, or what the return structure looks like (object with two fields? array?).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero redundancy. Every word serves to identify the operation ('Calculate'), the dual outputs ('total floor area and Carrez habitable area'), and the input mechanism ('from a list of rooms').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single array parameter) and high schema coverage, the description is functionally complete. It appropriately mentions both calculated values since no output schema exists to document the return fields. A minor gap is the lack of explanation for Carrez-specific calculation rules, which may be necessary for legal compliance contexts.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'from a list of rooms', which aligns with the schema but does not supplement it with semantic details about room validity (e.g., whether dimensions should include walls) or the expected units beyond what the schema property names and descriptions already convey.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and resources ('total floor area', 'Carrez habitable area'), and identifies the input source ('list of rooms'). It distinguishes itself from generic siblings like 'calculate_area' by specifying the French Carrez legal standard. However, it does not explicitly differentiate from the sibling 'calculate_surface_carrez', which could create selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., 'calculate_surface_carrez' or 'calculate_area'). It omits prerequisites, such as whether room dimensions must be net or gross, or whether intermediate spaces like hallways should be included.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
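The dual output can be sketched as two sums over the room list. The `habitable_fraction` field is invented here to stand in for the Carrez rule (floor area under ceilings below 1.80 m is excluded); the tool may implement that rule differently:

```python
def floor_areas(rooms: list[dict]) -> dict:
    """Total = sum of length * width; Carrez applies an assumed habitable fraction per room."""
    total = round(sum(r["length"] * r["width"] for r in rooms), 2)
    carrez = round(sum(r["length"] * r["width"] * r.get("habitable_fraction", 1.0)
                       for r in rooms), 2)
    return {"total_m2": total, "carrez_m2": carrez}

rooms = [{"length": 5, "width": 4},
         {"length": 3, "width": 3, "habitable_fraction": 0.8}]  # e.g. sloped ceiling
print(floor_areas(rooms))  # {'total_m2': 29, 'carrez_m2': 27.2}
```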
calculate_flow_rate_convert (B)
Convert flow rate between L/min, L/h, m³/h, GPM, CFM
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Flow rate value | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses nothing about precision/rounding behavior, validation constraints beyond the schema, or whether the operation is idempotent. Merely stating 'Convert' is minimal behavioral disclosure for a tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 9 words. No wasted words, but arguably too minimal given lack of annotations and output schema. Front-loaded with the action but could benefit from a second sentence covering output or constraints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate but minimal. With no output schema, description omits what the tool returns (converted value). With no annotations, omits safety/read-only status. Sufficient for a simple conversion utility but leaves gaps for agent reasoning about result interpretation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds value by mapping technical enum codes (l_min, m3_h) to human-readable display units (L/min, m³/h), clarifying the parameter semantics beyond the schema's generic 'Source unit' and 'Target unit' descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Convert' and resource 'flow rate'. Lists supported units (L/min, L/h, etc.) which clarifies scope. However, lacks explicit differentiation from sibling 'calculate_pipe_flow_rate' (which calculates flow rate from pipe parameters) and 'convert_volume' (which handles static volume).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like 'convert_volume' or when flow rate conversion is needed versus other calculation tools. No mention of prerequisites or appropriate contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
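The supported units map naturally onto a table of factors to a common base (litres per minute). The enum codes follow the `l_min`/`m3_h` spellings the review mentions, and GPM is assumed to mean US gallons:

```python
# Conversion factors from each unit to litres per minute.
TO_L_MIN = {
    "l_min": 1.0,
    "l_h": 1.0 / 60.0,
    "m3_h": 1000.0 / 60.0,
    "gpm": 3.785411784,      # US gallons per minute
    "cfm": 28.316846592,     # cubic feet per minute
}

def convert_flow_rate(value: float, from_unit: str, to_unit: str) -> float:
    # Normalize to the base unit, then divide out the target unit's factor.
    return value * TO_L_MIN[from_unit] / TO_L_MIN[to_unit]

print(convert_flow_rate(1, "m3_h", "l_min"))  # 16.666...
```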
calculate_food_cost_per_serving (B)
Calculate recipe cost per serving from ingredient prices and quantities
| Name | Required | Description | Default |
|---|---|---|---|
| servings | Yes | | |
| ingredients | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses input requirements (prices and quantities) but omits behavioral details like error handling (e.g., if used_quantity > total_quantity), output format, or whether results are cached. Adequate but minimal for a calculation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero redundancy. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions and nested object complexity (ingredients with 4 required sub-fields), the description is minimally viable but incomplete. It fails to explain the relationship between total_quantity and used_quantity, and lacks output schema coverage, leaving gaps for a tool of this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, requiring description to compensate. It mentions 'prices and quantities' which semantically maps to parameters, but doesn't clarify the nested ingredient structure (array of objects) or distinguish between 'total_quantity' and 'used_quantity', leaving significant ambiguity despite providing basic semantic context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('recipe cost per serving'). The phrase 'from ingredient prices and quantities' specifies the domain, though it doesn't explicitly distinguish from siblings like 'calculate_recipe_nutrition' or 'calculate_recipe_scale'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. While it implies prerequisite data (ingredient prices and quantities), it lacks explicit 'when to use' or 'when not to use' instructions relative to the many sibling calculation tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
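The proration the reviews describe (only the used share of each ingredient's purchase price counts toward the recipe) can be sketched as follows. The field names `price`, `total_quantity`, and `used_quantity` follow the sub-fields the reviews cite; `name` is cosmetic and assumed:

```python
def cost_per_serving(ingredients: list[dict], servings: int) -> float:
    """Prorate each ingredient's price by the fraction actually used, then divide by servings."""
    recipe_cost = sum(i["price"] * i["used_quantity"] / i["total_quantity"]
                      for i in ingredients)
    return recipe_cost / servings

ingredients = [
    {"name": "flour", "price": 2.0, "total_quantity": 1000, "used_quantity": 500},
    {"name": "butter", "price": 3.0, "total_quantity": 250, "used_quantity": 125},
]
print(cost_per_serving(ingredients, 4))  # (1.0 + 1.5) / 4 = 0.625
```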
calculate_force (C)
Newton's 2nd law: F=ma
| Name | Required | Description | Default |
|---|---|---|---|
| force_n | No | Newtons | |
| mass_kg | No | Mass kg | |
| acceleration | No | m/s² |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden, but only provides the formula notation. It fails to explain critical behavior: that all parameters are optional (required: 0) suggests the tool solves for the missing variable, but this calculation logic, error handling for insufficient inputs, or output format remain undisclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise with zero redundancy and front-loaded formula. However, given the complexity of a three-parameter physics solver with optional inputs, this level of brevity sacrifices necessary behavioral clarity, making it arguably too terse rather than appropriately sized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for a tool with three optional parameters and no output schema. The description fails to explain the calculation logic (solving for any missing variable given the other two), expected precision, or return structure. Critical gaps exist regarding how the tool handles partial input combinations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (force_n: Newtons, mass_kg: Mass kg, acceleration: m/s²), so baseline score applies. The description maps F=ma to these parameters implicitly but adds no additional semantic context, validation rules, or value constraints beyond the schema's type information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the underlying physics principle (Newton's 2nd law: F=ma) which implies calculating force, but doesn't explicitly clarify the tool's specific function or distinguish from sibling physics calculators like calculate_kinetic_energy or calculate_energy_physics. It hints at the domain but lacks specificity about what exactly gets computed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives, nor does it explain prerequisites for the calculation (e.g., needing any two variables to solve for the third). No mention of related tools or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
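The solve-for-the-missing-variable behavior inferred from the all-optional parameters can be sketched like this; it is an assumption, since the server's actual logic and error handling are undocumented:

```python
def solve_f_ma(force_n=None, mass_kg=None, acceleration=None):
    """Given any two of F, m, a, return the third via F = m * a (assumed behavior)."""
    given = {"force_n": force_n, "mass_kg": mass_kg, "acceleration": acceleration}
    missing = [k for k, v in given.items() if v is None]
    if len(missing) != 1:
        raise ValueError("provide exactly two of force_n, mass_kg, acceleration")
    if force_n is None:
        return {"force_n": mass_kg * acceleration}
    if mass_kg is None:
        return {"mass_kg": force_n / acceleration}
    return {"acceleration": force_n / mass_kg}

print(solve_f_ma(mass_kg=10, acceleration=5))  # {'force_n': 50}
```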
calculate_fraction (C)
Perform fraction operations: add, subtract, multiply, divide, simplify
| Name | Required | Description | Default |
|---|---|---|---|
| den1 | Yes | Denominator 1 | |
| den2 | No | Denominator 2 | |
| num1 | Yes | Numerator 1 | |
| num2 | No | Numerator 2 | |
| operation | Yes | Operation |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full disclosure burden. It fails to mention behavioral traits such as error handling (e.g., division by zero), whether results are automatically simplified, return value format, or the idempotent/safe nature of the operations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundant words. However, it may be overly terse for the tool's complexity—it lacks the second sentence needed to explain behavioral nuances or parameter dependencies, preventing a score of 5.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should explain return values and operation-specific parameter requirements. It minimally identifies the tool's domain but leaves significant gaps regarding how the 5 parameters interact differently across the 5 operation types.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema adequately documents individual parameters (num1, den1, operation, etc.). The description lists the operations matching the enum but adds no semantic depth regarding parameter relationships (e.g., that num1/den1 form the first fraction) or conditional requirements.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Perform' and clearly identifies the resource 'fraction operations' along with the specific supported operations (add, subtract, multiply, divide, simplify). However, it fails to distinguish this tool from the sibling 'calculate_fraction_operations', leaving ambiguity about which tool to select.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus 'calculate_fraction_operations' or other calculate_* tools. It also omits critical usage constraints, such as which parameters are required for specific operations (e.g., that 'simplify' only requires the first fraction while binary operations require both).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_fraction_operationsCInspect
Perform arithmetic operations on two fractions
| Name | Required | Description | Default |
|---|---|---|---|
| den1 | Yes | Denominator of first fraction | |
| den2 | Yes | Denominator of second fraction | |
| num1 | Yes | Numerator of first fraction | |
| num2 | Yes | Numerator of second fraction | |
| operation | Yes | Operation to perform |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose return format (simplified fraction? decimal?), error handling for zero denominators, or confirmation that this is a read-only calculation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of appropriate length. No wasted words, though given the lack of annotations and output schema, the extreme brevity leaves significant documentation gaps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without an output schema, the description fails to specify what the tool returns (e.g., reduced fraction, improper/mixed number, decimal). Additionally provides no behavioral details despite zero annotation coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all 5 parameters (num1, den1, num2, den2, operation). The description adds no additional semantic context beyond the schema (e.g., constraints on values, format expectations), warranting baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Perform' and resource 'fractions', and inherently distinguishes from sibling 'calculate_fraction' by specifying 'two fractions' and 'operations' (plural), implying binary operations vs single fraction manipulation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus 'calculate_fraction' or other arithmetic tools. No mention of prerequisites (e.g., non-zero denominators) or when to prefer decimal calculations instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
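The behavior the review flags as undocumented (whether results are simplified, how zero denominators are handled) can be pinned down with a small sketch. This is an illustrative guess at the server's logic, not its actual implementation; parameter names mirror the schema (num1, den1, num2, den2, operation), and reduction to lowest terms is an assumption.

```python
from math import gcd

def fraction_op(num1, den1, num2, den2, operation):
    """Combine two fractions; result reduced to lowest terms (assumed behavior)."""
    if den1 == 0 or den2 == 0:
        raise ValueError("denominators must be non-zero")
    if operation == "add":
        n, d = num1 * den2 + num2 * den1, den1 * den2
    elif operation == "subtract":
        n, d = num1 * den2 - num2 * den1, den1 * den2
    elif operation == "multiply":
        n, d = num1 * num2, den1 * den2
    elif operation == "divide":
        if num2 == 0:
            raise ValueError("cannot divide by a zero fraction")
        n, d = num1 * den2, den1 * num2
    else:
        raise ValueError(f"unsupported operation: {operation}")
    g = gcd(n, d)
    n, d = n // g, d // g
    if d < 0:  # normalize: keep any negative sign on the numerator
        n, d = -n, -d
    return n, d
```

For example, `fraction_op(1, 2, 1, 3, "add")` yields `(5, 6)`. A description disclosing exactly this (reduced output, zero-denominator errors) would resolve most of the gaps noted above.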
calculate_freezer_durationCInspect
Return maximum recommended freezer storage duration for a food type
| Name | Required | Description | Default |
|---|---|---|---|
| food_type | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden but fails to specify return value format (days? months? string?), data sources for recommendations, or safety caveats. The term 'Return' is vague regarding output structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Short single sentence front-loaded with the action verb 'Return'. No redundant text, though extreme brevity contributes to information gaps in other dimensions.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero schema descriptions, no output schema, and no annotations, the description is insufficient. It omits return value type/units, the basis for 'recommended' durations (regulatory vs. general), and enum semantics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While it references 'food type' implying the parameter, it does not describe the enum values (raw_meat, cooked_meat, etc.) or explain that these categories have different storage limits. Baseline compensation is minimal.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'maximum recommended freezer storage duration' for a 'food type'—specific verb and resource. However, it does not explicitly distinguish from sibling 'calculate_freezer_thaw_time', which could be confused as related functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_freezer_thaw_time) or prerequisites. It simply states the function without contextual usage boundaries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_freezer_thaw_timeBInspect
Estimate thawing time for frozen food by weight and method
| Name | Required | Description | Default |
|---|---|---|---|
| method | Yes | Thawing method | |
| weight_kg | Yes | Food weight kg |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. Its verb 'Estimate' at least signals that outputs are approximations, but it fails to specify output format (hours/minutes), safety-critical behaviors (e.g., that microwave thawing requires immediate cooking), or whether results follow USDA/FDA guidelines.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence efficiently conveys the tool's purpose and required inputs. Information is front-loaded with no redundant phrases or tautological restatements of the tool name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a low-complexity tool with two well-documented parameters, but leaves gaps given the absence of annotations and output schema. Specifically lacks disclosure of return units and food safety warnings that would be essential for a thawing time calculator.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear property descriptions ('Food weight kg', 'Thawing method') and enum constraints. The description references parameters ('by weight and method') but adds no semantic clarification beyond what the schema already explicitly defines, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Estimate' and clearly identifies the resource as 'thawing time for frozen food'. It implicitly distinguishes from siblings like calculate_freezer_duration (storage time) and calculate_cooking_time via domain-specific terminology 'thawing', though it lacks explicit cross-referencing to siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., cooking from frozen), nor does it mention prerequisites or safety considerations required for thawing food. It only states what the tool calculates without contextual usage boundaries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
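An estimator of this shape likely reduces to a per-kg rate keyed by method. The sketch below is purely illustrative: the rates, the method enum spellings, and the hour-based output are all hypothetical placeholders, not the server's actual data.

```python
# Hypothetical per-kg rates and method names; the server's real data source is unknown.
THAW_HOURS_PER_KG = {
    "refrigerator": 12.0,   # assumed: roughly a day per 2 kg
    "cold_water": 1.0,      # assumed: with water changed every 30 minutes
    "microwave": 0.15,      # assumed: minutes per kg; food must be cooked immediately
}

def thaw_time_hours(weight_kg, method):
    """Linear estimate: thaw time scales with weight for a given method."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if method not in THAW_HOURS_PER_KG:
        raise ValueError(f"unknown method: {method}")
    return weight_kg * THAW_HOURS_PER_KG[method]
```

The linear model is the design choice worth documenting: if the server instead uses a non-linear curve or USDA tables, its description should say so.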
calculate_french_income_taxAInspect
Calculate French income tax (IR) for 2026 using progressive brackets per Article 197 CGI with family quotient system
| Name | Required | Description | Default |
|---|---|---|---|
| parts | No | Number of fiscal shares (1=single, 2=married, +0.5 per child) | |
| income | Yes | Annual net taxable income in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses calculation methodology (progressive brackets, family quotient system) but omits other behavioral traits like idempotency, data persistence, rate limits, or authentication requirements. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single dense sentence with zero waste. Front-loads the action ('Calculate French income tax') and packs essential qualifying details (year, legal article, calculation method) without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a 2-parameter calculation tool with 100% schema coverage. Provides sufficient domain specificity for tool selection, though lacks output schema description (not required if absent) and behavioral disclosures that annotations would normally cover.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with full parameter descriptions. The description adds valuable domain context by mentioning 'family quotient system', which semantically enriches understanding of the 'parts' parameter beyond the schema's technical description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with explicit resource 'French income tax (IR)' and distinguishes from siblings via specific legal references (Article 197 CGI), year (2026), and unique method (family quotient system), clearly targeting French tax versus Belgian, US, or Canadian variants.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear domain context ('French', '2026', 'Article 197 CGI') that implicitly guides selection for French tax calculations, but lacks explicit 'when-not-to-use' or named sibling alternatives (e.g., does not reference calculate_belgian_income_tax as an alternative).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
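The family quotient mechanics the description references work in three steps: divide taxable income by the number of fiscal shares, run the per-share amount through the progressive brackets, then multiply the resulting tax back by the shares. A sketch with placeholder bracket thresholds (the real 2026 Article 197 CGI figures are indexed yearly and are not reproduced here):

```python
# Placeholder brackets: (lower bound in EUR, marginal rate). NOT the real 2026 values.
BRACKETS = [(0, 0.00), (11_500, 0.11), (29_000, 0.30), (83_000, 0.41), (180_000, 0.45)]

def french_income_tax(income, parts=1.0):
    """Family quotient: tax the per-share income, then scale back up by the shares."""
    quotient = income / parts
    tax = 0.0
    for i, (low, rate) in enumerate(BRACKETS):
        high = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if quotient > low:
            # each rate applies only to the slice of income inside its bracket
            tax += (min(quotient, high) - low) * rate
    return tax * parts
```

With these placeholder brackets, a married couple (parts=2) earning 40,000 EUR is taxed on a 20,000 EUR quotient, so only the 11% slice applies before doubling back up.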
calculate_french_salaryAInspect
Convert French gross salary to net salary for 2026 (cadre, non-cadre, or civil servant). Returns monthly/annual net, social contributions, employer cost
| Name | Required | Description | Default |
|---|---|---|---|
| status | No | Employment status | cadre |
| gross_monthly | Yes | Gross monthly salary in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully discloses the output structure ('Returns monthly/annual net, social contributions, employer cost') which compensates for missing output schema, and specifies the 2026 tax regime. However, it omits other behavioral details like mutability, idempotency, or error handling characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two tightly constructed clauses with zero redundancy. Front-loaded with the core action, followed by scope limitations (2026, categories) and output description. Every word serves a purpose; appropriate length for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 simple parameters, no annotations, and no output schema, the description is nearly complete. It explains the calculation domain (2026 French rules) and output values. Could marginally improve by noting the required nature of gross_monthly or default status value, but schema covers these adequately.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters fully described. The description adds value by translating the enum value 'fonction_publique' to 'civil servant' for clarity and reinforcing the 'gross monthly' concept. The baseline is 3; this earns a 4 for providing contextual meaning (2026 applicability) and human-readable translations of enum values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific action ('Convert') and resource ('French gross salary to net salary') with clear scope including year (2026) and employee categories (cadre, non-cadre, civil servant). Implicitly distinguishes from other geographic salary calculators (Belgian, Swiss) but does not explicitly differentiate from related French tools like calculate_employer_cost_fr.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides temporal context ('for 2026') and target audiences (specific employment statuses), implying when to use it. However, lacks explicit when-not-to-use guidance or named alternatives for other salary-related calculations (e.g., 'use calculate_french_income_tax for tax-specific queries').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
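A gross-to-net conversion of this shape reduces to subtracting status-dependent employee contributions and adding employer contributions on top. The flat rates below are illustrative placeholders only; real French payroll involves many separate contribution lines, and the server's 2026 rates are not documented here. The returned keys mirror the outputs the description promises.

```python
# Illustrative flat rates only, keyed by the schema's status enum (assumed spellings).
EMPLOYEE_RATE = {"cadre": 0.25, "non_cadre": 0.22, "fonction_publique": 0.17}
EMPLOYER_RATE = {"cadre": 0.42, "non_cadre": 0.40, "fonction_publique": 0.30}

def french_salary(gross_monthly, status="cadre"):
    """Gross-to-net sketch: net = gross minus employee contributions;
    employer cost = gross plus employer contributions."""
    net_monthly = gross_monthly * (1 - EMPLOYEE_RATE[status])
    return {
        "net_monthly": round(net_monthly, 2),
        "net_annual": round(net_monthly * 12, 2),
        "social_contributions": round(gross_monthly - net_monthly, 2),
        "employer_cost": round(gross_monthly * (1 + EMPLOYER_RATE[status]), 2),
    }
```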
calculate_french_vatAInspect
Calculate French VAT (TVA) — convert between HT (before tax) and TTC (after tax). Supports all 4 French VAT rates
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Input mode: ht=before tax, ttc=after tax | ht |
| rate | No | VAT rate percentage | 20 |
| amount | Yes | Amount in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Explains the conversion behavior (HT↔TTC) and scope (4 French rates), but omits return value format given no output schema exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action, em-dash for secondary detail, zero waste. Every clause earns its place by defining scope or behavior.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a straightforward calculation tool with 100% schema coverage. Missing output description is a minor gap, but the calculation purpose is sufficiently clear from the description and schema combined.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions. Tool description adds valuable context linking HT/TTC terminology to the mode parameter and specifying the 4 French rates constraint, enriching semantic understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb (Calculate) + resource (French VAT/TVA), with French-specific terminology (HT/TTC) that distinguishes it from generic VAT tools (calculate_vat_generic) and other country-specific siblings (calculate_belgian_vat, etc.).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear contextual boundaries (French-specific, 4 rates), implying when to use it, but does not explicitly name alternatives like calculate_vat_generic or calculate_vat_reverse for reverse calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
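The HT/TTC conversion itself is a single multiplication or division by (1 + rate/100). A minimal sketch, assuming the four French rates are the standard 20%, intermediate 10%, reduced 5.5% and super-reduced 2.1%, and that the tool rounds to cents (the actual output format is undocumented):

```python
FRENCH_VAT_RATES = (20.0, 10.0, 5.5, 2.1)  # standard, intermediate, reduced, super-reduced

def french_vat(amount, rate=20.0, mode="ht"):
    """Convert between HT (before tax) and TTC (after tax) amounts."""
    factor = 1 + rate / 100
    if mode == "ht":          # amount given before tax
        ht, ttc = amount, amount * factor
    elif mode == "ttc":       # amount given after tax
        ht, ttc = amount / factor, amount
    else:
        raise ValueError("mode must be 'ht' or 'ttc'")
    return {"ht": round(ht, 2), "ttc": round(ttc, 2), "vat": round(ttc - ht, 2)}
```

For example, 100 EUR HT at the default 20% rate gives 120 EUR TTC, and the same call in 'ttc' mode inverts it.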
calculate_frequency_noteAInspect
Calculate the frequency of a musical note based on equal temperament tuning
| Name | Required | Description | Default |
|---|---|---|---|
| note | Yes | Note name in chromatic scale | |
| octave | Yes | Octave number (A4 = concert pitch reference) | |
| tuning_reference | No | Tuning reference frequency in Hz (default A4=440Hz) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so description carries full disclosure burden. It specifies 'equal temperament tuning' (a specific 12-tone mathematical system), but fails to disclose return value format, units (Hz), precision/rounding behavior, or that this is a pure computation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single short sentence with zero redundancy. Every word earns its place; front-loaded with the action (Calculate) and specific domain context (musical note, equal temperament).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Input parameters are fully documented in schema with 100% coverage, but tool lacks output schema and description fails to specify return value format, units, or error conditions. Adequate for simple calculation but has clear gaps regarding output specification.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3 per rubric. Description adds minimal parameter-specific context beyond the schema, though mentioning 'equal temperament' implicitly clarifies the tuning_reference parameter's purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'frequency of a musical note' and specifies the tuning system 'equal temperament', distinguishing it from the many generic calculation siblings like calculate_wavelength_frequency or calculate_bpm_to_ms.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus similar frequency or musical calculation siblings, nor any mention of prerequisites. Agent must infer usage solely from the name and schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
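Equal temperament fixes every semitone at a frequency ratio of 2**(1/12), so the whole calculation is one exponential anchored at the tuning reference. A sketch assuming sharp-only note names (the server's accepted spellings and rounding are not documented):

```python
# Chromatic scale with sharp-only spellings (an assumption about the note enum).
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_frequency(note, octave, tuning_reference=440.0):
    """f = reference * 2**(n/12), where n is the semitone distance from A4."""
    semitones_from_a4 = NOTES.index(note) - NOTES.index("A") + 12 * (octave - 4)
    return tuning_reference * 2 ** (semitones_from_a4 / 12)
```

A4 returns the reference itself (440 Hz by default) and each octave doubles the frequency, so A5 is 880 Hz and middle C (C4) lands near 261.63 Hz.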
calculate_fuel_consumptionBInspect
Calculate fuel consumption in L/100km and MPG from distance and fuel used
| Name | Required | Description | Default |
|---|---|---|---|
| distance_km | Yes | Distance in km | |
| fuel_liters | Yes | Fuel consumed in liters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully indicates that the tool returns calculations in both L/100km and MPG formats, hinting at dual outputs. However, it does not specify precision, rounding behavior, or whether both values are always returned versus selectable output formats.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the action ('Calculate'), specifies both output units (L/100km and MPG), and indicates the inputs ('distance and fuel used') with zero extraneous text or wasted clauses.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool with no output schema, the description is reasonably complete by specifying the dual output formats. However, given the existence of multiple related fuel calculation siblings, it could better clarify its specific role in computing from raw trip data versus converting existing efficiency values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both distance_km and fuel_liters have descriptions in the schema). The description adds semantic context by labeling these as 'distance' and 'fuel used', confirming they are the primary inputs, but does not add syntax details or constraint explanations beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific calculation (fuel consumption in L/100km and MPG) and the required inputs (distance and fuel used). By specifying raw trip data as inputs, it implicitly distinguishes this from sibling unit-conversion tools like calculate_fuel_economy_conversion, though it does not explicitly name those alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus the available sibling conversion tools (calculate_fuel_economy_conversion, convert_fuel_consumption). It does not mention prerequisites or when this calculation is appropriate versus simple unit conversion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
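The dual-output calculation the description promises is two one-liners: L/100km from the raw trip data, then MPG (US) via the constant 235.215, which follows from 1 mi = 1.609344 km and 1 US gal of about 3.785 L. The rounding and dict-shaped return below are assumptions, since no output schema exists.

```python
def fuel_consumption(distance_km, fuel_liters):
    """Compute fuel efficiency in both metric and US units from raw trip data."""
    if distance_km <= 0 or fuel_liters <= 0:
        raise ValueError("distance and fuel must both be positive")
    l_per_100km = fuel_liters / distance_km * 100
    mpg_us = 235.215 / l_per_100km  # 1 mi = 1.609344 km, 1 US gal ~ 3.78541 L
    return {"l_per_100km": round(l_per_100km, 2), "mpg_us": round(mpg_us, 2)}
```

For instance, 35 L over 500 km works out to 7.0 L/100km, about 33.6 MPG (US).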
calculate_fuel_costCInspect
Calculate fuel cost for a journey
| Name | Required | Description | Default |
|---|---|---|---|
| fuel_price | Yes | Price/liter | |
| consumption | Yes | L/100km | |
| distance_km | Yes | Distance km |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full disclosure burden. It fails to specify the return format (currency amount? total cost?), precision level, or computation method. It implies a pure calculation (no side effects) but doesn't confirm this.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at six words with no redundancy. However, it may be excessively minimal given the crowded tool namespace—one additional sentence for sibling differentiation or output specification would improve utility without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks critical context given no output schema exists: it doesn't specify what the tool returns (total cost? cost per kilometer?). Also fails to leverage the 'journey' context to explain the relationship between the three required parameters or distinguish from consumption-focused siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions (Distance km, L/100km, Price/liter). The description adds no semantic clarification beyond the schema, meeting the baseline for well-documented schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the clear verb 'Calculate' and resource 'fuel cost' with scope 'for a journey'. However, it fails to explicitly distinguish from sibling tool 'calculate_fuel_consumption', leaving potential ambiguity between cost (monetary) and consumption (volume) calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus siblings like 'calculate_fuel_consumption' or 'calculate_fuel_economy_conversion'. No prerequisites, exclusions, or selection criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
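Assuming the tool returns a single total in whatever currency fuel_price is expressed in (the output format is undocumented), the journey cost is litres used times price per litre:

```python
def fuel_cost(distance_km, consumption, fuel_price):
    """Litres used = distance_km / 100 * consumption (L/100km);
    total cost = litres used * price per litre."""
    return distance_km / 100 * consumption * fuel_price
```

A 400 km trip at 6.5 L/100km and 1.80/L comes to 46.80, which also illustrates the cost-versus-consumption distinction the review says the description should draw against calculate_fuel_consumption.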
calculate_fuel_economy_conversionBInspect
Convert fuel economy between L/100km, MPG (US), MPG (UK) and km/L
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Fuel economy value to convert | |
| to_unit | Yes | Target unit of fuel economy | |
| from_unit | Yes | Source unit of fuel economy |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description carries the full burden of behavioral disclosure. It discloses none of the following: whether the operation is pure/stateless, expected precision or rounding behavior, or how the inverse relationship between consumption and economy is handled. For a mathematical conversion tool with no side effects indicated, this is minimal disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the action ('Convert') and immediately lists all supported units. There is no redundant or wasted text; every element serves to clarify scope.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple conversion utility with three well-documented parameters and no output schema, the description is functionally complete. The units are explicitly listed, the value parameter is self-explanatory, and the expected output (converted fuel economy value) is implicit in the tool's stated purpose.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% coverage with basic descriptions ('Source unit', etc.), but the description adds critical semantic value by mapping the enum codes (l_100km, mpg_us) to human-readable formats ('L/100km', 'MPG (US)'). This helps the agent understand the valid options without parsing the schema enums.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as a converter for fuel economy metrics and enumerates the four specific supported units (L/100km, MPG US/UK, km/L). However, it fails to distinguish from the sibling tool `convert_fuel_consumption`, which likely handles volume-per-distance (consumption) rather than distance-per-volume (economy) conversions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus siblings like `calculate_fuel_consumption` or `convert_fuel_consumption`. Users must infer that this is specifically for 'fuel economy' (efficiency) conversions based on the listed units, with no explicit mention of prerequisites or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
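The reviews above note that the inverse relationship between consumption and economy units is never spelled out. A minimal sketch of the arithmetic such a tool presumably performs, using the enum spellings the review mentions (`l_100km`, `mpg_us`) and standard unit constants (1 US gal = 3.785411784 L, 1 UK gal = 4.54609 L, 1 mile = 1.609344 km); the function names are hypothetical:

```python
# Convert fuel economy figures between L/100km, MPG (US), MPG (UK) and km/L.
# Strategy: normalise to km/L first, then convert to the target unit.
MILE_KM = 1.609344
US_GAL_L = 3.785411784
UK_GAL_L = 4.54609

def _to_km_per_l(value: float, unit: str) -> float:
    if unit == "km_l":
        return value
    if unit == "l_100km":           # inverse relationship: volume per distance
        return 100.0 / value
    if unit == "mpg_us":
        return value * MILE_KM / US_GAL_L
    if unit == "mpg_uk":
        return value * MILE_KM / UK_GAL_L
    raise ValueError(f"unknown unit: {unit}")

def convert_fuel_economy(value: float, from_unit: str, to_unit: str) -> float:
    km_l = _to_km_per_l(value, from_unit)
    if to_unit == "km_l":
        return km_l
    if to_unit == "l_100km":
        return 100.0 / km_l
    if to_unit == "mpg_us":
        return km_l * US_GAL_L / MILE_KM
    if to_unit == "mpg_uk":
        return km_l * UK_GAL_L / MILE_KM
    raise ValueError(f"unknown unit: {to_unit}")
```

For example, 5 L/100km normalises to 20 km/L and converts to roughly 47.0 MPG (US).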
calculate_future_valueCInspect
Calculate future value of a present sum
| Name | Required | Description | Default |
|---|---|---|---|
| rate | Yes | Annual rate percent | |
| years | Yes | Number of years | |
| present_value | Yes | Present value EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify whether this uses simple or compound interest, the compounding frequency (despite schema implying annual), output format, or any side effects. It only restates the operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While the single sentence is not wasteful, it is inappropriately terse for a financial calculation tool with 100+ siblings. Key context is omitted; the brevity constitutes under-specification rather than efficient communication.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter financial tool with no output schema and no annotations, the description fails to disclose calculation methodology (simple vs compound interest), currency handling specifics, or return value structure. Critical domain context is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions for all three parameters ('Present value EUR', 'Annual rate percent', 'Number of years'). The description adds no semantic detail beyond the schema, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a clear action ('Calculate') and resource ('future value of a present sum'), but it does not distinguish this tool from siblings like 'calculate_compound_interest' or 'calculate_present_value' despite having over 100 calculate_* siblings. It minimally expands the tool name without clarifying scope or specific use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'calculate_compound_interest' or 'calculate_present_value'. Given the sibling density and overlapping financial domains, explicit usage criteria are absent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
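The central criticism above is the simple-vs-compound ambiguity. The two readings diverge quickly; a sketch of both, with annual compounding assumed for the compound case and hypothetical helper names:

```python
def future_value_simple(present_value: float, rate_percent: float, years: float) -> float:
    """Simple interest: interest accrues on the principal only."""
    return present_value * (1 + rate_percent / 100 * years)

def future_value_compound(present_value: float, rate_percent: float, years: int) -> float:
    """Compound interest, compounded once per year (an assumption;
    the tool's compounding frequency is undocumented)."""
    return present_value * (1 + rate_percent / 100) ** years
```

At 5% over 10 years, EUR 1000 grows to EUR 1500 under simple interest but to roughly EUR 1628.89 compounded annually, which is why the undisclosed methodology matters.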
calculate_garden_soilCInspect
Calculate soil volume and bags needed
| Name | Required | Description | Default |
|---|---|---|---|
| width_m | Yes | Width m | |
| depth_cm | Yes | Depth cm | |
| length_m | Yes | Length m |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description discloses nothing about calculation methodology, bag size assumptions, output units, or that this is a read-only calculation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief (6 words) with no wasted content, though arguably underspecified given lack of annotations and output schema. Front-loaded with the core action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers the basic calculation intent but leaves critical gaps: undefined 'bag' standard size, no output format disclosure, and missing context to distinguish from similar soil/volume calculators in the bundle.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with basic descriptions ('Width m', 'Depth cm'). Description adds no additional meaning beyond the schema, meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific outputs (soil volume and bags) with clear verb 'Calculate', but fails to differentiate from sibling tool 'calculate_raised_bed_soil' which likely performs similar calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus siblings like 'calculate_raised_bed_soil' or 'calculate_compost_volume', nor any prerequisites or assumptions (e.g., standard bag size).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
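As the review notes, the bag-size assumption is undisclosed. The volume arithmetic itself is straightforward; a sketch assuming a 40 L bag (a common garden-soil size, not a documented default):

```python
import math

def garden_soil(length_m: float, width_m: float, depth_cm: float,
                bag_size_l: float = 40.0) -> tuple[float, int]:
    """Return (volume in litres, whole bags needed).

    bag_size_l is an assumption: 40 L bags are common, but the tool's
    actual bag standard is undocumented."""
    volume_l = length_m * width_m * depth_cm * 10.0   # 1 m^2 x 1 cm depth = 10 L
    # Round before ceil so float noise can't add a phantom bag.
    bags = math.ceil(round(volume_l / bag_size_l, 9))
    return volume_l, bags
```

A 2 m by 1 m bed filled 10 cm deep needs 200 L, i.e. five 40 L bags.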
calculate_garden_sunlight_hoursBInspect
Estimate effective daily sunlight hours for a garden based on latitude, month and orientation
| Name | Required | Description | Default |
|---|---|---|---|
| month | Yes | Month number (1=January, 12=December) | |
| latitude | Yes | Latitude in degrees (-90 to 90) | |
| orientation | Yes | Garden orientation / aspect |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. The term 'effective' adds valuable context implying the calculation accounts for solar angle/obliqueness rather than just daylight duration. However, it omits output format, calculation methodology details (astronomical vs empirical), precision limits, or safety classifications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero redundancy. It is front-loaded with the action and subject, and every word serves a purpose. Appropriate length for a straightforward calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (three primitive parameters with 100% schema coverage) and lack of output schema, the description provides sufficient context for selection and invocation. It appropriately delegates parameter details to the schema, though mention of return value format (hours as decimal) would strengthen completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions the three parameters by name but adds no additional semantic context, syntax guidance, or relationship explanations beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a clear verb ('Estimate') and resource ('effective daily sunlight hours for a garden') and specifies key inputs. It loses a point because it doesn't explicitly differentiate from siblings like 'calculate_sun_exposure' or 'calculate_solar_panel_output', though 'for a garden' provides implicit domain scoping.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to select this tool versus alternatives such as 'calculate_sun_exposure' (general exposure) or 'calculate_solar_panel_output' (energy focus). The description does not mention prerequisites, expected use cases, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
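The astronomical component of such an estimate can be sketched with the standard sunrise equation. How the tool weights orientation is unknown, so this covers daylight duration only; the mid-month day approximation and the simple declination model are both simplifications:

```python
import math

def daylight_hours(latitude_deg: float, month: int) -> float:
    """Approximate daylight duration at mid-month.

    Uses the standard sunrise equation cos(H) = -tan(lat) * tan(decl);
    the declination model 23.44 * sin(...) is a common approximation."""
    day_of_year = 30.44 * (month - 1) + 15            # rough mid-month day number
    decl = math.radians(23.44) * math.sin(2 * math.pi / 365.0 * (day_of_year - 81))
    lat = math.radians(latitude_deg)
    cos_h = -math.tan(lat) * math.tan(decl)
    cos_h = max(-1.0, min(1.0, cos_h))                # clamp for polar day/night
    return 2 * math.degrees(math.acos(cos_h)) / 15.0  # hour angle -> hours
```

At the equator this yields roughly 12 hours year-round, while at 48° N it gives long June days and short December ones, as expected.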
calculate_garden_water_needsBInspect
Calculate weekly and monthly water needs for a garden based on plant type and season
| Name | Required | Description | Default |
|---|---|---|---|
| season | Yes | Current season | |
| plant_type | Yes | Type of plants in the garden | |
| surface_m2 | Yes | Garden surface area in square meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions 'weekly and monthly' timeframes but omits critical behavioral details: output units (liters/gallons), whether it accounts for rainfall, evaporation rates, or that results are estimates based on typical conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action and subject. No redundant words or structural waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculation tool, but lacks output value description (units, format) which would be helpful given no output schema is defined. No dangerous gaps, but not rich either.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (3 parameters fully documented). Description mentions 'plant type and season' which aligns with schema but adds no semantic depth beyond what schema already provides (baseline 3).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb ('Calculate') and resource ('weekly and monthly water needs'). Implicitly distinguishes from sibling garden tools (soil, sunlight) by specifying 'water', though does not explicitly reference siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use versus alternatives (e.g., calculate_garden_soil) or prerequisites like climate data requirements. Merely states what it does.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_gas_fee_ethCInspect
Calculate Ethereum transaction gas fee in ETH and USD
| Name | Required | Description | Default |
|---|---|---|---|
| gas_limit | No | Gas limit for the transaction (default 21000 for simple transfer) | |
| eth_price_usd | No | Current ETH price in USD (default 3000) | |
| gas_price_gwei | Yes | Gas price in Gwei |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of behavioral disclosure. It fails to indicate whether this requires external API calls, if results are cached, rate limits, or whether the calculation is purely mathematical with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded and free of waste, appropriate for the tool's simplicity. However, it borders on under-specification given the lack of output schema and annotations—slightly more detail would improve utility without harming conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description identifies what is calculated but lacks critical context given the missing output schema: it does not describe the return format (e.g., whether it returns an object with separate ETH and USD fields or a single value), nor does it explain the calculation methodology (gas_limit * gas_price_gwei / 1e9).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions 'ETH and USD' which maps to implied outputs but does not add semantic context beyond the schema's descriptions (e.g., explaining the relationship between gas_price_gwei and gas_limit or noting that 21000 is the standard transfer limit).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Calculate), resource (Ethereum transaction gas fee), and output currencies (ETH and USD). However, it does not explicitly differentiate from sibling tools like calculate_crypto_profit_loss or calculate_mining_profitability, which also deal with cryptocurrency calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., when to use calculate_gas_fee_eth vs calculate_crypto_profit_loss) nor mentions any prerequisites like needing current gas prices or ETH price data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
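The methodology the review infers (gas_limit * gas_price_gwei / 1e9) follows from 1 gwei being 1e-9 ETH. A sketch using the schema's documented defaults; the returned field names are hypothetical, since no output schema exists:

```python
def gas_fee(gas_price_gwei: float, gas_limit: int = 21_000,
            eth_price_usd: float = 3000.0) -> dict:
    """1 gwei = 1e-9 ETH, so fee_eth = gas_limit * gas_price_gwei * 1e-9.

    Defaults mirror the schema: 21000 gas for a simple transfer and
    a 3000 USD ETH price."""
    fee_eth = gas_limit * gas_price_gwei * 1e-9
    return {"fee_eth": fee_eth, "fee_usd": fee_eth * eth_price_usd}
```

A simple transfer at 20 gwei costs 0.00042 ETH, about 1.26 USD at the default price. The calculation is purely local arithmetic; no live price feed is involved.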
calculate_gcd_lcmAInspect
Calculate GCD (PGCD) and LCM (PPCM) of two integers using Euclidean algorithm
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | First integer | |
| b | Yes | Second integer |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It adds valuable behavioral context by specifying the 'Euclidean algorithm' implementation method, but fails to disclose output format (whether it returns an object with both values, an array, etc.), error conditions, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, efficiently structured sentence that front-loads the action verb ('Calculate'), specifies the mathematical concepts with parenthetical translations (PGCD/PPCM), identifies the inputs, and notes the algorithm. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 primitive parameters, 100% schema coverage) but absence of an output schema, the description adequately covers inputs but lacks specification of what the tool returns (both GCD and LCM values, presumably, but in what structure?).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% description coverage ('First integer', 'Second integer'), establishing a baseline of 3. The description mentions 'two integers' but adds minimal semantic meaning beyond the schema, such as whether order matters or the mathematical relationship between the inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific mathematical operation (Calculate GCD and LCM), the inputs (two integers), and the algorithmic method (Euclidean). It distinguishes itself from the hundreds of sibling calculation tools by explicitly naming these specific number-theoretic functions and the algorithm used.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description doesn't explicitly state 'use this when...' or name alternatives, the specificity of 'GCD (PGCD) and LCM (PPCM)' provides clear implied usage context for mathematical operations on integers. However, it lacks explicit guidance on when to prefer this over generic calculation tools or conversion utilities in the large sibling set.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
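Since the description names the Euclidean algorithm, the computation is unambiguous; only the output structure is. A sketch returning both values as a tuple (the actual return shape is undocumented):

```python
def gcd_lcm(a: int, b: int) -> tuple[int, int]:
    """Euclidean algorithm for the GCD; the LCM follows from |a*b| = gcd * lcm."""
    x, y = abs(a), abs(b)
    while y:
        x, y = y, x % y
    gcd = x
    lcm = 0 if gcd == 0 else abs(a * b) // gcd
    return gcd, lcm
```

Order of the inputs does not matter: gcd_lcm(12, 18) and gcd_lcm(18, 12) both yield (6, 36).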
calculate_gear_ratioCInspect
Gear ratio and torque multiplier
| Name | Required | Description | Default |
|---|---|---|---|
| driven_teeth | Yes | Driven gear teeth | |
| driving_teeth | Yes | Driving gear teeth |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but reveals no behavioral traits. It does not disclose whether the tool performs a simple division, whether it returns a decimal or a fraction, how results are rounded, or whether the 'torque multiplier' is the inverse of the gear ratio.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At only 4 words, the description is underspecified rather than appropriately concise. It lacks a front-loaded action verb, and it avoids wasted words only because it contains almost no substantive guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter calculation tool without an output schema, the description should explain the calculation logic (output = driven/driving) and what the numeric result represents. It is currently incomplete, leaving the relationship between the inputs and the dual outputs (ratio vs multiplier) undefined.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Driving gear teeth', 'Driven gear teeth'), the baseline is 3. The description adds minimal semantic value—it does not clarify the relationship between parameters (which divides into which) or expected units beyond what the schema already documents.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Gear ratio and torque multiplier' is a noun phrase that restates concepts from the tool name without a clear action verb (e.g., 'calculate'). It fails to distinguish from the 400+ other calculate_* sibling tools or specify what the tool actually returns.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus other calculation tools, no explanation of which gear ratio formula is used (driven/driving vs driving/driven), and no mention of prerequisites or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
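The review flags that the formula direction is undisclosed. Under the conventional definition (ratio = driven / driving, which for an ideal gear pair also equals the torque multiplier), a sketch looks like this; the dict keys are hypothetical:

```python
def gear_ratio(driving_teeth: int, driven_teeth: int) -> dict:
    """Conventional definition: ratio = driven / driving.

    For an ideal (lossless) pair, output torque is multiplied by the
    ratio and output speed is divided by it."""
    ratio = driven_teeth / driving_teeth
    return {"ratio": ratio,
            "torque_multiplier": ratio,
            "speed_multiplier": 1 / ratio}
```

A 20-tooth pinion driving a 60-tooth gear gives a 3:1 ratio, tripling torque while cutting speed to a third.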
calculate_glycemic_loadCInspect
Calculate glycemic load (GL) per food and total for a meal
| Name | Required | Description | Default |
|---|---|---|---|
| foods | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose output format (returns array + total? object?), whether calculation is pure/idempotent, or that it requires no external database lookups. Only hints at dual output (per-food and total).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action. Highly efficient wording, though excessive brevity contributes to inadequate parameter documentation. No redundant or wasted sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for tool complexity. Nested object structure with four required fields and specific domain formula (GL = (GI × carbs per portion) / 100) warrants explanation of inputs. No output schema exists, yet description omits return value structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%. Description mentions 'per food' but fails to explain the four sub-parameters (gi, carbs_g, portion_g, name) or their units/relationships. User cannot infer from description that gi means glycemic index (0-100) or that portion_g is required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculation (glycemic load/GL) and scope (per food and meal total). Clear verb-resource pairing. However, does not explicitly differentiate from nutrition siblings like calculate_daily_protein or calculate_recipe_nutrition in the text itself.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use versus alternatives. Does not indicate prerequisites (e.g., needing GI values) or that users must calculate GI elsewhere if not known.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
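Using the formula the review cites (GL = GI × carbs per portion / 100) and the sub-fields it names (gi, carbs_g, portion_g, name), the calculation can be sketched as follows; treating carbs_g as carbohydrate per 100 g of food is an assumption, since the schema leaves it undocumented:

```python
def glycemic_load(foods: list[dict]) -> dict:
    """Per-food and total glycemic load: GL = GI * carbs_per_portion / 100.

    Assumption: carbs_g is grams of carbohydrate per 100 g of food,
    so the carbs actually eaten are carbs_g * portion_g / 100."""
    per_food = {}
    for f in foods:
        carbs_eaten = f["carbs_g"] * f["portion_g"] / 100.0
        per_food[f["name"]] = f["gi"] * carbs_eaten / 100.0
    return {"per_food": per_food, "total": sum(per_food.values())}
```

A 150 g portion of white rice (GI 73, 28 g carbs per 100 g) scores a GL of about 30.7, comfortably in the "high" band (GL > 20 per food is the usual threshold).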
calculate_gpa_frenchCInspect
Convert French school grades (out of 20) to GPA and academic mention
| Name | Required | Description | Default |
|---|---|---|---|
| note_1 | Yes | Grade 1 out of 20 | |
| note_2 | Yes | Grade 2 out of 20 | |
| note_3 | No | Grade 3 out of 20 (optional) | |
| note_4 | No | Grade 4 out of 20 (optional) | |
| note_5 | No | Grade 5 out of 20 (optional) | |
| coeff_1 | No | Coefficient for grade 1 | |
| coeff_2 | No | Coefficient for grade 2 | |
| coeff_3 | No | Coefficient for grade 3 (0 if unused) | |
| coeff_4 | No | Coefficient for grade 4 (0 if unused) | |
| coeff_5 | No | Coefficient for grade 5 (0 if unused) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden but omits critical behavioral details: it fails to explain the weighted average calculation (implied by coeff_* parameters), the GPA scale used (4.0 vs 5.0), or the thresholds for French mention classifications (e.g., Assez Bien, Très Bien).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single brief sentence is front-loaded with the action verb and efficiently conveys the core purpose. However, extreme brevity results in insufficient detail for a 10-parameter calculation tool; one additional sentence explaining the weighted nature would improve utility without excess.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 100% schema coverage and lack of output schema, the description adequately identifies inputs and outputs but remains incomplete regarding calculation methodology. It does not explain that the tool computes a weighted average or clarify what constitutes an 'academic mention' in the French system (Passable, Bien, etc.).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing "Grade X out of 20" and "Coefficient for grade X" for all 10 parameters. The description adds no additional semantic context beyond the schema (e.g., doesn't mention that coefficients enable weighted calculations or that 5 grades maximum are supported), earning the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the conversion action ("Convert"), input resource ("French school grades"), and dual outputs ("GPA and academic mention"). It distinguishes from siblings like calculate_grade_average by specifying the French 20-point scale context, though it doesn't clarify how it differs from calculate_bac_points or calculate_brevet_points.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like calculate_grade_average or the French-specific exam calculators. No mention of prerequisites (e.g., having valid 0-20 scale grades) or when the weighted coefficient feature should be used.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_grade_averageBInspect
Calculate simple or weighted grade average
| Name | Required | Description | Default |
|---|---|---|---|
| grades | Yes | Array of grades | |
| coefficients | No | Optional array of coefficients/weights |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure, yet it fails to mention whether the grade and coefficient arrays must be equal in length, what format the return value takes, or how validation errors surface. The term 'Calculate' implies a pure function, but specific behavioral contracts are absent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact and front-loaded with no redundant words, efficiently conveying the core functionality in six words. However, given the presence of many sibling calculation tools, the extreme brevity leaves insufficient room for necessary disambiguation and usage context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While adequate for basic invocation with well-documented schema parameters, the description is incomplete given the tool's context: it omits validation rules (e.g., coefficient/grade array parity), output format expectations, and sibling differentiation that would be necessary for reliable agent operation without trial and error.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage providing basic parameter descriptions, the description adds valuable semantic context by framing the operation as 'simple or weighted,' which clarifies the relationship between the 'grades' and 'coefficients' parameters. This helps the agent understand that coefficients are optional weights rather than mandatory inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific action ('Calculate') and resource ('grade average'), and distinguishes from generic average calculation by specifying 'grade' and supporting both 'simple or weighted' modes. However, it does not explicitly differentiate from the sibling tool 'calculate_average', which could cause selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'simple or weighted' implies usage patterns—use without coefficients for simple averages and with coefficients for weighted averages. However, it lacks explicit guidance on when to prefer this over 'calculate_average' or constraints like array length matching requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
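The 'simple or weighted' contract and the array-parity validation the review flags as undocumented can be sketched in a few lines. This is a plausible reading of the tool's behavior, not its actual implementation:

```python
def grade_average(grades, coefficients=None):
    """Simple mean when coefficients is omitted; weighted mean otherwise.

    The equal-length requirement is an assumed validation rule the
    description leaves unstated.
    """
    if coefficients is None:
        return sum(grades) / len(grades)
    if len(coefficients) != len(grades):
        raise ValueError("coefficients must match grades in length")
    return sum(g * c for g, c in zip(grades, coefficients)) / sum(coefficients)
```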
calculate_grade_needed (B)
Calculate the grade needed on remaining exams to reach target average
| Name | Required | Description | Default |
|---|---|---|---|
| exams_done | Yes | Number of exams completed | |
| exams_total | Yes | Total number of exams | |
| target_average | Yes | Target final average | |
| current_average | Yes | Current average out of 20 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to state that this is a read-only calculation, does not describe error behavior when targets are impossible (e.g., requiring >20/20), and omits any mention of the return value format or scale.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence with zero unnecessary words. Every word earns its place by conveying the exact mathematical operation performed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a calculation tool with four well-documented parameters but no output schema, the description adequately explains the operation but falls short of completeness by not describing the output (the numeric grade required) or noting that results are constrained by the 0-20 scale mentioned in the schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'remaining exams' which implies the relationship between exams_done and exams_total, adding slight semantic value, but does not explain parameter interactions, units (out of 20), or validation constraints beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb (Calculate) and clearly identifies the resource (grade needed on remaining exams) and purpose (to reach target average). It implicitly distinguishes from sibling 'calculate_grade_average' by focusing on forward-looking prediction rather than backward-looking computation, though it does not explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus siblings like 'calculate_grade_average', nor does it mention prerequisites such as needing to know current average or having completed at least one exam (implied by parameters but not stated).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
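The underlying algebra is straightforward: the required grade on each remaining exam is the one that lifts the overall mean to the target. A minimal sketch, including the impossible-target case (result above 20) that the review notes is undisclosed:

```python
def grade_needed(current_average, exams_done, exams_total, target_average):
    """Grade required on each remaining exam to reach target_average."""
    remaining = exams_total - exams_done
    if remaining <= 0:
        raise ValueError("no exams remaining")
    needed = (target_average * exams_total
              - current_average * exams_done) / remaining
    # May exceed 20, meaning the target is unreachable on a 0-20 scale.
    return round(needed, 2)
```

With a current average of 12 over 4 of 6 exams and a target of 14, each remaining exam needs an 18.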
calculate_gravel_quantity (C)
Calculate gravel volume and weight
| Name | Required | Description | Default |
|---|---|---|---|
| width_m | Yes | Width m | |
| depth_cm | Yes | Depth cm | |
| length_m | Yes | Length m |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden but fails to specify assumed gravel density, output units (cubic meters vs tons), return format, or whether volume and weight are both calculated or selectable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At five words, it is extremely brief and front-loaded, but it is under-specified rather than efficiently concise. The single sentence provides minimal value beyond the tool name itself.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema and annotations, the description should explain return values, calculation methodology, and unit assumptions. It provides none of these essential details for a calculation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (Width m, Depth cm, Length m). The description adds no additional parameter context, syntax, or relationships, but the baseline score of 3 applies for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the verb (Calculate), resource (gravel), and outputs (volume and weight). While it lacks explicit sibling differentiation from other material calculators like calculate_concrete_mix, the specific material target provides adequate clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites such as needing area dimensions or use cases (e.g., driveway construction vs aquarium substrate).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
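The review's point about an unstated density assumption is easy to illustrate. A sketch of a plausible implementation; the 1.5 t/m³ bulk density is an assumed typical value, not a figure from the server:

```python
GRAVEL_DENSITY_T_PER_M3 = 1.5  # assumed typical bulk density; varies by gravel type

def gravel_quantity(length_m, width_m, depth_cm):
    """Volume in cubic meters and weight in tonnes for a rectangular bed."""
    volume_m3 = length_m * width_m * depth_cm / 100  # depth converted cm -> m
    return {"volume_m3": round(volume_m3, 3),
            "weight_t": round(volume_m3 * GRAVEL_DENSITY_T_PER_M3, 3)}
```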
calculate_grocery_unit_comparison (A)
Compare unit prices of grocery items — normalizes g→kg, mL/cL→L
| Name | Required | Description | Default |
|---|---|---|---|
| items | Yes | Items: name, price, quantity, unit (kg/g/L/mL/cl/unit) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully discloses the normalization behavior (converting grams to kilograms, milliliters/centiliters to liters), but omits other critical behaviors such as whether the tool is read-only, what the comparison output format is (sorted list, cheapest item, normalized values), or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient two-clause structure. Front-loads the core purpose ('Compare unit prices') and uses the em-dash to append the key behavioral detail without waste. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Acceptable for the input side given 100% schema coverage, but incomplete regarding outputs. With no output schema provided and no annotations, the description should ideally explain what the comparison returns (e.g., sorted results, cheapest item, normalized price list). Currently silent on return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed parameter descriptions in the schema itself (baseline 3). The description adds valuable semantic context by specifying which unit conversions are automatically handled (g→kg, mL/cL→L), clarifying how the 'unit' parameter values will be interpreted and normalized.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Compare) and resource (unit prices of grocery items). The normalization detail (g→kg, mL/cL→L) adds specificity about the comparison method, though it doesn't explicitly distinguish when to use this versus the sibling `calculate_unit_price` tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implied usage context through 'grocery items' domain and normalization examples, but lacks explicit guidance on when to select this over `calculate_unit_price` or other conversion tools. No exclusions or prerequisites mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
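The disclosed normalization (g→kg, mL/cL→L) maps naturally onto a factor table. A sketch of the comparison logic; returning a list sorted cheapest-first is an assumption, since the tool's output format is undocumented:

```python
# Factors to a base unit: kg for mass, L for volume; "unit" passes through.
TO_BASE = {"kg": 1, "g": 0.001, "l": 1, "ml": 0.001, "cl": 0.01, "unit": 1}

def compare_unit_prices(items):
    """items: dicts with name, price, quantity, unit.
    Returns items with a normalized unit_price, cheapest first (assumed)."""
    ranked = []
    for it in items:
        base_qty = it["quantity"] * TO_BASE[it["unit"].lower()]
        ranked.append({**it, "unit_price": round(it["price"] / base_qty, 4)})
    return sorted(ranked, key=lambda it: it["unit_price"])
```

For example, 500 g at 2.00 normalizes to 4.00/kg, so a 1 kg item at 3.50 ranks cheaper.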
calculate_harvest_date (C)
Estimate harvest date for vegetables based on sowing date and region
| Name | Required | Description | Default |
|---|---|---|---|
| region | Yes | Growing region: north (+10 days), south (-10 days), mediterranean (-15 days) | |
| plant_type | Yes | Type of vegetable | |
| sowing_date | Yes | Sowing date in ISO format (YYYY-MM-DD) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. While it states inputs (sowing date, region), it fails to disclose output format (date string?), calculation methodology (days to maturity + regional offset?), or whether results are stored. Schema reveals regional adjustments (+10/-10 days), but description omits this behavioral detail.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient 11-word sentence with no redundancy. Front-loaded with action and object. However, given zero annotations and no output schema, extreme brevity may underserve the agent's need for behavioral context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate given excellent schema coverage (enums fully documented, format specified), but lacks output description (critical since no output schema exists) and omits logical prerequisites (e.g., assumes valid sowing dates). Acceptable for simple calculation domain but has gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, baseline is 3. Description mentions 'sowing date and region' but adds no syntax details beyond schema (which already documents ISO format and regional day offsets). Plant_type parameter functionality is implied but not elaborated in description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Estimate') and resource ('harvest date') with specific domain ('vegetables'). Distinguishes itself from financial/physical calculator siblings and other gardening tools like calculate_garden_soil or calculate_seed_quantity, though it doesn't explicitly contrast with them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use versus other gardening calculators (e.g., calculate_garden_water_needs), no prerequisites mentioned, and no alternatives suggested. Simply states inputs without context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
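The schema-documented regional offsets (+10/-10/-15 days) suggest a days-to-maturity lookup plus an adjustment. A sketch under that assumption; the maturity table values are hypothetical placeholders, as the server's crop data is unknown:

```python
from datetime import date, timedelta

REGION_OFFSET_DAYS = {"north": 10, "south": -10, "mediterranean": -15}
# Hypothetical days-to-maturity values; the server's real table is unknown.
DAYS_TO_MATURITY = {"tomato": 120, "carrot": 75, "radish": 30}

def harvest_date(plant_type, sowing_date, region):
    """Estimated harvest date as an ISO string (assumed output format)."""
    days = DAYS_TO_MATURITY[plant_type] + REGION_OFFSET_DAYS[region]
    return (date.fromisoformat(sowing_date) + timedelta(days=days)).isoformat()
```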
calculate_hat_size (A)
Calculate hat size in FR/EU, US/UK systems and standard S/M/L/XL from head circumference (cm)
| Name | Required | Description | Default |
|---|---|---|---|
| head_circumference_cm | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the calculation scope (converting to multiple sizing systems) but omits operational details like idempotency, side effects, or error conditions. Adequate but not rich behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Front-loaded with action verb, immediately communicates input requirement and output variations. Every clause earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 1-parameter calculation tool with no annotations and no output schema, the description adequately covers the essential information needed for an agent to invoke it correctly. Missing output schema is compensated by describing the return format in the description text.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% (parameter lacks description field). The description compensates by specifying 'head circumference (cm)', clarifying both the semantic meaning (head circumference vs other measurements) and the unit (cm), which are critical for correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Calculate' with resource 'hat size', explicitly scopes the output to FR/EU, US/UK, and S/M/L/XL systems, and distinguishes from siblings like calculate_ring_size or calculate_shoe_size through this specific domain focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or alternative guidance is provided. However, the specificity of 'hat size' versus the extensive list of sibling calculate_* tools provides implicit context for when to select this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
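A sketch of the three sizing systems the description names, using the usual conventions: FR/EU size equals head circumference in cm, US/UK size is the diameter in inches (circumference divided by π, rounded to eighths). Those conventions and the S/M/L/XL breakpoints are assumptions, not the server's documented rules:

```python
import math

def hat_size(head_circumference_cm):
    fr = round(head_circumference_cm)  # FR/EU size ~ circumference in cm
    # US/UK size ~ diameter in inches, rounded to the nearest 1/8 inch.
    us = round((head_circumference_cm / 2.54) / math.pi * 8) / 8
    if head_circumference_cm < 55:       # S/M/L/XL breakpoints are assumed
        letter = "S"
    elif head_circumference_cm < 57:
        letter = "M"
    elif head_circumference_cm < 59:
        letter = "L"
    else:
        letter = "XL"
    return {"fr_eu": fr, "us_uk": us, "letter": letter}
```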
calculate_heart_rate_zones (A)
Calculate heart rate training zones Z1-Z5, optionally using Karvonen method
| Name | Required | Description | Default |
|---|---|---|---|
| max_hr | Yes | Maximum heart rate in bpm | |
| resting_hr | No | Resting heart rate for Karvonen method (bpm) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the algorithmic variant (Karvonen) and output scope (Z1-Z5), but lacks behavioral details like output format, whether zones are percentages of max HR, or error conditions. Adequate but not rich.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence (10 words). Front-loaded with action verb, no filler text. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, so description should ideally sketch return values. While it mentions 'Z1-Z5' (the conceptual output), it does not describe the structure (e.g., 'returns zone ranges as percentages'). Sufficient for tool selection but gaps remain for invocation confidence.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions. The description adds value by stating 'optionally using Karvonen method', clarifying that resting_hr is optional and linking it to the method choice—context not fully explicit in the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb (Calculate) and specific resource (heart rate training zones Z1-Z5). The mention of 'Karvonen method' adds specificity. However, it does not explicitly distinguish from sibling calculate_training_zones_running (likely pace-based), though the 'heart rate' specificity helps differentiate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides minimal implicit guidance: 'optionally using Karvonen method' hints that providing resting_hr triggers this algorithm. However, lacks explicit when-to-use (e.g., 'use when you need HR zones for training') or when-to-choose over standard calculation, and doesn't explain what Z1-Z5 represent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
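The resting_hr/Karvonen link the review praises can be made explicit in code. In the Karvonen method the target rate is resting HR plus an intensity fraction of heart-rate reserve (max minus resting); without resting_hr, plain percent-of-max applies. The five zone boundaries below are common splits, assumed rather than confirmed:

```python
# Common five-zone intensity splits; the server's exact boundaries are assumed.
ZONE_BOUNDS = [(0.50, 0.60), (0.60, 0.70), (0.70, 0.80),
               (0.80, 0.90), (0.90, 1.00)]

def hr_zones(max_hr, resting_hr=None):
    """Percent-of-max zones by default; Karvonen when resting_hr is given:
    target = resting + intensity * (max - resting)."""
    zones = {}
    for i, (lo, hi) in enumerate(ZONE_BOUNDS, start=1):
        if resting_hr is None:
            zones[f"Z{i}"] = (round(lo * max_hr), round(hi * max_hr))
        else:
            reserve = max_hr - resting_hr
            zones[f"Z{i}"] = (round(resting_hr + lo * reserve),
                              round(resting_hr + hi * reserve))
    return zones
```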
calculate_heat_index (B)
Calculate the apparent temperature (heat index) from temperature and humidity
| Name | Required | Description | Default |
|---|---|---|---|
| humidity_pct | Yes | Relative humidity in percent | |
| temperature_c | Yes | Air temperature in degrees C |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It adequately states the conceptual output (apparent temperature) but omits implementation details such as the formula used, valid input ranges beyond schema constraints, precision, or output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfectly concise single sentence (10 words) with zero waste. The action ('Calculate') is front-loaded, followed immediately by the output and inputs. No redundant phrases or repetition of the tool name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (2 primitive parameters, no nested objects) and high input schema coverage, the description is minimally adequate. However, lacking an output schema, it should ideally mention the return type or units (e.g., 'returns apparent temperature in degrees Celsius').
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters fully documented). The description mentions 'temperature and humidity' which aligns with but does not extend beyond the schema's 'Air temperature in degrees C' and 'Relative humidity in percent'. Baseline 3 is appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource/output ('apparent temperature (heat index)'). While it implicitly identifies the domain via 'heat index', it does not explicitly differentiate from similar meteorological tools like calculate_wind_chill or calculate_dew_point.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., wind chill for cold conditions vs heat index for hot/humid conditions), nor does it mention prerequisites or preconditions beyond the implied need for input values.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
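The review's point that the formula is undisclosed matters because heat index has a de facto standard: the NWS Rothfusz regression, defined in Fahrenheit and valid roughly above 80 °F. A sketch assuming that regression (the server may use a different or simplified formula):

```python
def heat_index_c(temperature_c, humidity_pct):
    """Apparent temperature in deg C via the NWS Rothfusz regression,
    which is defined in Fahrenheit (assumed to be what the tool uses)."""
    t = temperature_c * 9 / 5 + 32
    r = humidity_pct
    hi_f = (-42.379 + 2.04901523 * t + 10.14333127 * r
            - 0.22475541 * t * r - 6.83783e-3 * t * t
            - 5.481717e-2 * r * r + 1.22874e-3 * t * t * r
            + 8.5282e-4 * t * r * r - 1.99e-6 * t * t * r * r)
    return round((hi_f - 32) * 5 / 9, 1)
```

At 32.2 °C and 70% humidity this yields roughly 41 °C, matching the NWS chart's 90 °F / 70% cell of about 105 °F.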
calculate_heat_pump_cop (C)
Heat pump coefficient of performance
| Name | Required | Description | Default |
|---|---|---|---|
| indoor_temp | No | Indoor target °C | |
| outdoor_temp | Yes | Outdoor temperature °C |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description adds no behavioral context. It does not disclose the calculation method (theoretical Carnot vs. practical), expected output format/range, or temperature constraints (e.g., minimum operational outdoor temp).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (5 words), it is not effectively front-loaded. It resembles a category label rather than a descriptive sentence. The extreme brevity renders it uninformative rather than efficiently concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no annotations and no output schema, the description should explain what value is returned (COP ratio) and its significance. Currently it provides only the subject noun without explaining the calculation purpose or result interpretation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (both parameters have clear descriptions in the schema), the baseline is 3. The description adds no parameter-specific context (e.g., that indoor_temp defaults to 20°C for room heating scenarios), but the schema carries the load.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Heat pump coefficient of performance' is a noun phrase that states the topic but omits the action (calculate/compute/estimate). It distinguishes the domain from sibling tools but fails to communicate what the tool actually does with this concept.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance is provided. Given the large number of sibling calculation tools, there is no indication of when to use this specific tool versus other energy/physics calculators or what inputs are required for valid results.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
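The theoretical-vs-practical ambiguity the review raises can be shown directly. The Carnot limit is COP = T_hot / (T_hot - T_cold) in kelvin; real heat pumps reach a fraction of it. The 20 °C indoor default and 0.45 practical factor below are illustrative assumptions:

```python
def heat_pump_cop(outdoor_temp, indoor_temp=20.0, carnot_efficiency=0.45):
    """Carnot COP scaled by an assumed practical efficiency factor.
    Both default values are assumptions, not the server's constants."""
    t_hot = indoor_temp + 273.15   # temperatures in kelvin
    t_cold = outdoor_temp + 273.15
    if t_hot <= t_cold:
        raise ValueError("indoor temperature must exceed outdoor temperature")
    return round(t_hot / (t_hot - t_cold) * carnot_efficiency, 2)
```

Note how sharply COP falls with temperature lift: heating to 20 °C from 0 °C outdoors gives roughly double the COP of heating from -10 °C.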
calculate_horse_weight (A)
Estimate horse weight using Carroll formula
| Name | Required | Description | Default |
|---|---|---|---|
| body_length_cm | Yes | Body length cm | |
| heart_girth_cm | Yes | Heart girth circumference cm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It identifies the specific formula used ('Carroll formula'), which hints at the calculation method, but fails to describe the return value format, units (kg, lbs), precision, or any constraints on the input values beyond what the schema defines.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single six-word sentence with zero fluff. Every word earns its place by identifying the action, subject, and methodology without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a two-parameter calculation tool with no output schema, the description minimally suffices by stating the transformation purpose. However, it lacks specification of output units and does not describe the result structure, leaving gaps that could cause ambiguity in interpreting the calculated weight value.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for both required parameters ('Body length cm' and 'Heart girth circumference cm'). The description does not add parameter-specific guidance, syntax details, or contextual relationships between the measurements, warranting the baseline score for complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Estimate') and resource ('horse weight') and identifies the unique methodology ('Carroll formula'), clearly distinguishing it from sibling tools like calculate_bmi or calculate_ideal_weight which target different subjects or use different formulas.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., when only measurements are available vs. having access to a scale), nor does it mention prerequisites or constraints for the Carroll formula's applicability.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_hourly_cost (D)
Hourly cost to company
| Name | Required | Description | Default |
|---|---|---|---|
| work_days | No | Working days/year | |
| charges_pct | No | Employer charges % | |
| annual_gross | Yes | Annual gross salary EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description adds zero behavioral context. It does not disclose whether the tool makes external calls, what format the output takes (numeric? object?), whether results are cached, or any other operational characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a four-word fragment that severely under-specifies the tool: its brevity reflects under-specification rather than appropriate conciseness, and it front-loads no actionable information beyond the tool's name.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given three parameters (one required), zero annotations, and no output schema, the four-word description is completely inadequate. It omits the calculation algorithm, output format, and regional applicability (generic vs. country-specific like siblings).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (all three parameters have descriptions), the baseline score applies. The description adds no additional parameter context (e.g., expected value ranges, format constraints), but the schema adequately covers semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Hourly cost to company' is essentially tautological, restating the tool name with minimal addition. It fails to specify the calculation method (annual gross + employer charges divided by working days/hours) and does not distinguish from siblings like calculate_employer_cost_fr or calculate_salary_hourly_to_annual.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides absolutely no guidance on when to use this tool versus alternatives. No mention of prerequisites (e.g., needing annual gross salary), nor when to prefer this over calculate_employer_cost_fr or other regional salary calculators in the extensive sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
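The computation the assessment above infers (annual gross plus employer charges, spread over annual working hours) can be sketched as follows. The default charge percentage, working days, and hours-per-day figures are illustrative assumptions, not values documented by the server:

```python
def hourly_cost(annual_gross: float, charges_pct: float = 45.0,
                work_days: int = 218, hours_per_day: float = 7.0) -> float:
    """Hourly cost to company: annual gross plus employer charges,
    spread over annual working hours. All defaults are assumptions."""
    total_employer_cost = annual_gross * (1 + charges_pct / 100)
    return total_employer_cost / (work_days * hours_per_day)
```

With these assumed defaults, a 40,000 EUR gross salary comes out to roughly 38 EUR per hour.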
calculate_housing_aid (B)
Estimate French housing aid (APL — Aide Personnalisee au Logement)
| Name | Required | Description | Default |
|---|---|---|---|
| rent | Yes | Monthly rent in euros | |
| city_zone | No | City zone: 1 (Paris/IDF), 2 (large cities), 3 (rural) | 2 |
| household_size | No | Number of people in household (1-6) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden but only notes this is an 'Estimate'. It fails to disclose limitations (the schema lacks income fields which real APL calculations require), whether results are monthly amounts, or if it covers colocation scenarios.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently structured with the essential identification front-loaded. However, given the complexity of French housing benefits, the extreme brevity leaves it under-specified for the domain (a completeness issue, not verbosity).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a government benefit calculation tool with no output schema or annotations, the description is insufficient. It omits return value description (monthly aid amount?), calculation methodology (CAF rules), and critical eligibility constraints that APL requires.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions for all three parameters. Since the schema fully documents rent, city_zone, and household_size, the description earns the baseline 3 without needing to repeat parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action ('Estimate'), the specific resource ('French housing aid'), and distinguishes it by name ('APL — Aide Personnalisee au Logement'), making it unambiguous among the sibling French benefit calculators like calculate_prime_activite.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_prime_activite for work benefits), nor does it mention prerequisites like tenant status or required income thresholds for APL eligibility.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_housing_loan_comparison (B)
Compare multiple mortgage offers sorted by total cost
| Name | Required | Description | Default |
|---|---|---|---|
| offers | Yes | List of mortgage offers to compare | |
| loan_amount | Yes | Loan amount in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Given no annotations and no output schema, the description carries full disclosure burden but only mentions sorting behavior. It lacks crucial details: what 'total cost' comprises (principal+interest+insurance?), return format structure, amortization assumptions, or whether fees are considered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief single sentence with no filler. However, given the complex nested input schema and lack of output specification, this brevity borders on under-specification rather than optimal conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for the tool's complexity. With rich nested parameters (offers containing 4 required sub-fields) and no output schema or annotations, the description should explain comparison methodology, output structure, and cost calculation components. As written, it is insufficient for an agent to predict behavior or validate inputs effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline score applies. Description mentions 'multiple mortgage offers' which aligns with the offers array structure, but adds no semantic clarification beyond schema (e.g., clarifying APR vs nominal rate, or insurance calculation methodology).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description provides specific verb ('Compare'), clear resource ('mortgage offers'), and distinguishing scope ('multiple' offers 'sorted by total cost'), clearly differentiating it from single-calculation siblings like calculate_mortgage or calculate_loan_payment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to select this tool versus alternatives like calculate_mortgage, calculate_loan_payment, or calculate_loan_to_value. No prerequisites or exclusion criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
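A comparison "sorted by total cost" presumably amortizes each offer and ranks by total repayment. A hedged sketch using the standard annuity formula; the offer field names (rate_pct, years, fees) are hypothetical, since the actual sub-fields of the offers array are not shown here:

```python
def monthly_payment(principal: float, annual_rate_pct: float, years: int) -> float:
    """Standard annuity formula for a fixed-rate loan."""
    r = annual_rate_pct / 100 / 12
    n = years * 12
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)

def compare_offers(loan_amount: float, offers: list[dict]) -> list[dict]:
    """Rank offers by total repayment (all payments plus flat fees).
    Field names are illustrative assumptions, not the server's schema."""
    ranked = []
    for o in offers:
        pay = monthly_payment(loan_amount, o["rate_pct"], o["years"])
        total = pay * o["years"] * 12 + o.get("fees", 0.0)
        ranked.append({**o, "total_cost": round(total, 2)})
    return sorted(ranked, key=lambda o: o["total_cost"])
```

Whether the real tool also folds in insurance or APR-style fee treatment is exactly the kind of detail the review says the description omits.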
calculate_hydration (B)
Calculate daily water intake needs based on weight, activity and climate
| Name | Required | Description | Default |
|---|---|---|---|
| climate | No | Climate/environment | temperate |
| weight_kg | Yes | Body weight in kilograms | |
| activity_minutes | No | Daily exercise duration in minutes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. While it states the inputs, it fails to disclose the output format (e.g., milliliters vs liters), calculation methodology, or whether it returns a single value or a breakdown. Missing behavioral traits expected for a calculation tool without output schema coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single short sentence, front-loaded with an action verb and zero redundancy. Every word directly contributes to understanding the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculator with full schema documentation. However, given the absence of an output schema and annotations, the description should specify the return value units/format (e.g., 'returns recommended liters per day') to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema adequately documents all three parameters. The description lists the factors ('weight, activity and climate') which aligns with the schema but adds no additional semantic depth (e.g., explaining how activity minutes modify the baseline calculation). Baseline score appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('daily water intake needs') with specific input factors (weight, activity, climate). However, it lacks explicit differentiation from the sibling tool 'calculate_water_intake', which creates potential selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the sibling 'calculate_water_intake' or other health calculators. No prerequisites, exclusions, or alternative suggestions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
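One plausible implementation, using common hydration rules of thumb (roughly 35 ml per kg baseline, extra for exercise, a surcharge for hot climates). These factors are assumptions for illustration, not the server's documented formula:

```python
def daily_water_ml(weight_kg: float, activity_minutes: float = 0,
                   climate: str = "temperate") -> int:
    """Heuristic daily water intake: ~35 ml/kg baseline,
    ~12 ml per minute of exercise, +10% in hot climates.
    All factors are common rules of thumb, not confirmed behavior."""
    base = weight_kg * 35
    base += activity_minutes * 12
    if climate == "hot":
        base *= 1.10
    return round(base)
```

The output-unit ambiguity the review flags (milliliters vs. liters) is resolved here only by assumption.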
calculate_hydraulic_pressure (D)
Hydraulic system pressure
| Name | Required | Description | Default |
|---|---|---|---|
| force_n | Yes | Force N | |
| area_cm2 | Yes | Piston area cm² |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to state the formula used (P=F/A), the output units (Pascals/bar), precision behavior, or validation rules beyond the schema minimums. It does not mention that this is a mathematical calculation operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At three words, the description is technically concise, but this is under-specification rather than efficient communication: the fragment provides no actionable information to the agent and so fails to earn its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple two-parameter schema and lack of output schema or annotations, the description should disclose the return value format and units. It omits critical information: what unit of pressure is returned and the relationship between inputs and output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both force_n and area_cm2 have descriptions). The description adds no parameter guidance beyond the schema, but with complete schema coverage, the baseline score of 3 is appropriate as the schema adequately documents the physics parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Hydraulic system pressure' is a tautological noun phrase that restates the tool name without clarifying what calculation is performed (pressure = force/area). It fails to distinguish from siblings like calculate_pressure_convert or convert_pressure, which also deal with pressure but perform different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus siblings (calculate_pressure_convert, convert_pressure) or other physics calculators like calculate_force. No prerequisites, input requirements, or alternative tools are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
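The formula the review names (P = F/A) is a one-liner once the cm² input is converted to m². Whether the server returns pascals or bar is undocumented, so this sketch assumes pascals:

```python
def hydraulic_pressure_pa(force_n: float, area_cm2: float) -> float:
    """P = F / A, with piston area converted from cm^2 to m^2
    so the result is in pascals (assumed output unit)."""
    return force_n / (area_cm2 * 1e-4)
```

For example, 1000 N on a 10 cm² piston gives about 1 MPa (10 bar).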
calculate_hyperfocal_distance (B)
Calculate hyperfocal distance and near/far sharp limits for a lens and aperture
| Name | Required | Description | Default |
|---|---|---|---|
| coc_mm | No | Circle of confusion in mm (default 0.03 for full frame) | |
| aperture | Yes | Aperture f-number | |
| focal_length_mm | Yes | Lens focal length in millimeters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses what values are computed (hyperfocal distance, near/far limits) but omits return format, units (meters vs feet), error conditions, or whether results are approximate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb 'Calculate', no redundancy. Every word earns its place by identifying the operation and target values.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for identifying the computation but lacks domain context (photography/optics) and output specification given no output schema exists. For a 3-parameter specialized tool, it meets minimum viability but leaves gaps for unfamiliar users.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete descriptions for focal_length_mm ('Lens focal length'), aperture ('Aperture f-number'), and coc_mm ('Circle of confusion'). The description adds minimal semantic value beyond mapping 'lens' to focal_length_mm. Baseline 3 is appropriate given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculation (hyperfocal distance, near/far sharp limits) and maps inputs to 'lens' and 'aperture'. However, it fails to distinguish from sibling 'calculate_depth_of_field', which calculates related but different optical values, leaving the selection criteria ambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus alternatives (like calculate_depth_of_field), nor prerequisites such as needing specific photography knowledge or sensor size considerations (implied by coc_mm default).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
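The standard optics formula is H = f²/(N·c) + f, with the near sharp limit at H/2 when focused at the hyperfocal distance. A sketch in millimetres; the server's actual output units and structure are not documented:

```python
def hyperfocal_mm(focal_length_mm: float, aperture: float,
                  coc_mm: float = 0.03) -> float:
    """Hyperfocal distance H = f^2 / (N * c) + f, in millimetres."""
    return focal_length_mm ** 2 / (aperture * coc_mm) + focal_length_mm

def near_limit_mm(focal_length_mm: float, aperture: float,
                  coc_mm: float = 0.03) -> float:
    """Near limit of acceptable sharpness when focused at H."""
    return hyperfocal_mm(focal_length_mm, aperture, coc_mm) / 2
```

A 50 mm lens at f/8 with the full-frame 0.03 mm circle of confusion yields H of roughly 10.5 m.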
calculate_ideal_gas (A)
Solve PV=nRT. Provide any 3 of: pressure_pa, volume_m3, moles, temperature_k. R=8.314
| Name | Required | Description | Default |
|---|---|---|---|
| moles | No | Amount in mol | |
| volume_m3 | No | Volume in m³ | |
| pressure_pa | No | Pressure in Pa | |
| temperature_k | No | Temperature in K |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions the gas constant R=8.314 and that any 3 inputs are needed, but omits error behavior (what happens with <3 inputs?), output format, or that this is a safe read-only calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three precise sentences with zero redundancy: equation identification, input constraint, and constant specification. Perfectly front-loaded and sized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description should state the return value (which variable is calculated). Input constraints are well covered, but the error-handling and output-specification gaps prevent a higher score for a 4-parameter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, baseline is 3. The description adds crucial semantic constraint 'Provide any 3 of...' which compensates for the schema having zero required fields. Also clarifies the constant R value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States 'Solve PV=nRT' which clearly identifies the specific physics equation being handled, distinguishing it from hundreds of sibling calculation tools. However, it assumes the agent recognizes the ideal gas law formula rather than explicitly stating 'Calculate ideal gas law properties'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit input constraint 'Provide any 3 of...' which is critical for correct invocation given zero required parameters in schema. Lacks explicit 'when to use' guidance relative to other physics calculators, but the formula specificity serves as implicit context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
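The description's contract (solve PV = nRT given any three of the four variables, with R = 8.314) translates directly. The ValueError below is an assumed error behavior, since the server's handling of under-specified input is undocumented:

```python
R = 8.314  # J/(mol*K), as stated in the tool description

def solve_ideal_gas(pressure_pa=None, volume_m3=None,
                    moles=None, temperature_k=None):
    """Solve PV = nRT for the one missing variable. Raising on
    bad input counts is an assumption, not documented behavior."""
    vals = {"pressure_pa": pressure_pa, "volume_m3": volume_m3,
            "moles": moles, "temperature_k": temperature_k}
    missing = [k for k, v in vals.items() if v is None]
    if len(missing) != 1:
        raise ValueError("provide exactly 3 of the 4 variables")
    target = missing[0]
    if target == "pressure_pa":
        return moles * R * temperature_k / volume_m3
    if target == "volume_m3":
        return moles * R * temperature_k / pressure_pa
    if target == "moles":
        return pressure_pa * volume_m3 / (R * temperature_k)
    return pressure_pa * volume_m3 / (moles * R)
```

One mole at 273.15 K and 101,325 Pa solves to the familiar ~22.4 L molar volume.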
calculate_ideal_weight (C)
Estimate ideal body weight using Lorentz and Devine formulas
| Name | Required | Description | Default |
|---|---|---|---|
| sex | Yes | Biological sex | |
| height_cm | Yes | Height in centimeters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. While it names the formulas used, it fails to disclose output format (kg/lbs?), whether it returns results from both formulas or averages them, or any limitations/accuracy notes for these medical estimates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is efficiently front-loaded with the key verb and subject. However, extreme brevity sacrifices necessary behavioral context for a health calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without output schema or annotations, the description inadequately covers what the tool returns (units, formula outputs, format). Missing crucial distinction from 'calculate_ideal_weight_range' and fails to explain Lorentz/Devine formula differences for users unfamiliar with medical terminology.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with both 'height_cm' and 'sex' well-documented in the JSON schema. The description adds no additional parameter context, which is acceptable given the baseline score for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action ('Estimate'), resource ('ideal body weight'), and method ('Lorentz and Devine formulas'). However, it does not distinguish from close sibling 'calculate_ideal_weight_range', leaving ambiguity about whether this returns a single value or range.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus siblings like 'calculate_ideal_weight_range', 'calculate_bmi', or 'calculate_bmr'. No prerequisites or alternatives mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
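The textbook versions of the two named formulas, in kilograms. Whether the server returns both values, an average, or different variants is unknown; the Lorentz divisor for women also varies by source (2.5 is used here):

```python
def ideal_weight(height_cm: float, sex: str) -> dict:
    """Textbook Lorentz and Devine formulas (kg). Output shape is an
    assumption; the server's return format is undocumented."""
    if sex == "male":
        lorentz = height_cm - 100 - (height_cm - 150) / 4
        devine = 50 + 2.3 * (height_cm / 2.54 - 60)  # 2.3 kg per inch over 5 ft
    else:
        lorentz = height_cm - 100 - (height_cm - 150) / 2.5
        devine = 45.5 + 2.3 * (height_cm / 2.54 - 60)
    return {"lorentz_kg": round(lorentz, 1), "devine_kg": round(devine, 1)}
```

Returning both values side by side would resolve the single-value-vs-range ambiguity the review raises against the sibling tool.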
calculate_ideal_weight_range (C)
Calculate ideal weight range using multiple methods
| Name | Required | Description | Default |
|---|---|---|---|
| sex | Yes | Biological sex | |
| frame | Yes | Body frame size | |
| height_cm | Yes | Height in cm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. While 'Calculate' implies read-only operation, the description doesn't clarify what 'multiple methods' means, what output format to expect (range boundaries? multiple values?), or any limitations of the calculation methods.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero filler words. Efficiently front-loaded. However, extreme brevity borders on under-specification for a tool with no annotations or output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no annotations, no output schema, and a 3-parameter calculation tool, the description needs to disclose return format, calculation methodology, or result interpretation. It currently provides only a minimal purpose statement.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all three parameters have descriptions), establishing a baseline of 3. The description mentions 'using multiple methods' which implies the 'frame' parameter is relevant (as different formulas use frame size), but doesn't explicitly add syntax, units, or validation details beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb ('Calculate') and resource ('ideal weight range'). Mentions 'range' which distinguishes it from sibling 'calculate_ideal_weight', though it doesn't explain the distinction between using a single formula vs multiple methods.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus the sibling 'calculate_ideal_weight' or 'calculate_bmi' tools. No prerequisites or exclusion criteria mentioned despite having many related calculation siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_impermanent_loss (A)
Calculate impermanent loss for a DeFi liquidity pool position when price ratio changes
| Name | Required | Description | Default |
|---|---|---|---|
| price_ratio_change | Yes | Price ratio change multiplier (e.g. 2.0 if token doubled in price, 0.5 if halved) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It explains the conceptual behavior (impermanent loss calculation when price ratio changes) but omits operational traits: read-only nature, calculation methodology (constant product formula), whether external market data is fetched, or return value units/format. Safe for a pure calculation tool but lacks complete transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb. Every word earns its place: 'DeFi liquidity pool position' precisely scopes the domain without verbosity. No redundant phrases or tautology.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple single-parameter tool with complete schema coverage, but lacks output schema. Description does not compensate by explaining return value format (percentage? absolute token amount? decimal vs percentage representation?). For calculation tools without output schemas, return value description is expected.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with excellent parameter description including examples (2.0, 0.5). Description mentions 'when price ratio changes' which conceptually aligns with the parameter but adds no syntax/format details beyond schema. Baseline 3 appropriate when schema does heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'impermanent loss' and domain context 'DeFi liquidity pool position'. Effectively distinguishes from 400+ generic calculate_* siblings (e.g., calculate_crypto_profit_loss, calculate_staking_rewards) by specifying the liquidity pool mechanics and price ratio dependency.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when/when-not or alternatives stated. However, the specific DeFi domain reference ('DeFi liquidity pool') provides implied usage context for selecting this over generic financial calculators. Lacks explicit guidance on prerequisites (e.g., needing initial price ratio vs final price ratio).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
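A minimal sketch of the constant-product impermanent-loss formula the answers above reference (an assumption about the method — the server does not disclose its implementation):

```python
def impermanent_loss(price_ratio: float) -> float:
    """Fractional loss vs. simply holding, for a 50/50 constant-product pool.

    price_ratio: final price of token A (in token B) divided by the price
    at deposit time, e.g. 2.0 means the price doubled.
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Standard result for x*y = k pools: IL = 2*sqrt(r)/(1+r) - 1
    return 2 * price_ratio ** 0.5 / (1 + price_ratio) - 1
```

impermanent_loss(2.0) is about -0.057, i.e. a ~5.7% loss versus holding, and the result is symmetric in price_ratio and 1/price_ratio.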
calculate_inflation_adjusted_value (C)
Calculate real purchasing power after inflation
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Number of years | |
| amount | Yes | Amount in EUR | |
| inflation_rate | Yes | Annual inflation rate percent |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description lacks behavioral details. It does not specify whether the calculation compounds annually, what format the output takes (adjusted currency amount, percentage loss, or purchasing power index), or the direction of adjustment (projecting forward vs. adjusting backward).
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is efficiently worded and front-loaded, but given the presence of similarly-named sibling tools and the complexity of financial calculations, it is undersized for proper agent selection. The brevity creates ambiguity rather than clarity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without an output schema or annotations, the description should compensate by explaining the return value format and calculation methodology. It fails to do so. Additionally, the existence of semantically similar siblings ('calculate_inflation_adjustment') makes this description insufficiently complete for confident tool selection.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description adds no specific guidance about the parameters (e.g., noting that 'amount' should be in EUR as specified in the schema, or clarifying that 'inflation_rate' expects a percentage value), but the schema adequately documents them without additional description support.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the general intent (calculating purchasing power adjusted for inflation) but fails to distinguish this tool from sibling tools 'calculate_inflation_adjustment' and 'calculate_purchasing_power', which appear to perform similar functions. The scope and direction of the calculation (future vs. past value) remain ambiguous.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific tool versus the similar 'calculate_inflation_adjustment' or 'calculate_purchasing_power' siblings. Given the crowded calculate_* namespace and overlapping financial concepts, this omission forces the agent to guess based on parameter schemas alone.
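For reference, the interpretation the review flags as undisclosed — annual compounding, deflating a nominal EUR amount to present purchasing power — can be sketched as follows (an assumption, not the server's confirmed method):

```python
def inflation_adjusted_value(amount: float, inflation_rate: float, years: int) -> float:
    """Real purchasing power of `amount` EUR after `years` of inflation,
    assuming annual compounding at `inflation_rate` percent."""
    return amount / (1 + inflation_rate / 100) ** years
```

1,000 EUR at 2% inflation over 10 years deflates to roughly 820 EUR of today's purchasing power.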
calculate_inflation_adjustment (C)
Adjust an amount for inflation over time
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Number of years | |
| amount | Yes | Original amount | |
| inflation_rate | Yes | Annual inflation rate in % |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description fails to disclose whether the tool performs a simple or compound adjustment calculation, whether it returns the adjusted amount or the difference (adjustment value), or any constraints like maximum year limits.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely terse at only seven words. While not verbose, it is undersized for a financial calculation tool with potential sibling ambiguity. The single-sentence structure is efficient but fails to front-load the critical distinction (adjustment vs. adjusted value).
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of a confusingly similar sibling tool and lack of output schema, the description inadequately specifies the calculation methodology (compound vs. simple) or return value semantics. High schema coverage for inputs does not compensate for missing behavioral context.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no additional semantic context beyond the schema (e.g., that 'amount' refers to present value, or that the return is future value). No compensation needed for schema gaps, but no value added either.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the core action ('Adjust') and resource ('amount'), but remains vague regarding directionality (forward vs. backward adjustment) and fails to distinguish from the sibling tool 'calculate_inflation_adjusted_value'. The distinction between calculating an 'adjustment' (delta) versus an 'adjusted value' (final amount) is critical but unexplained.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'calculate_inflation_adjusted_value' or 'calculate_compound_interest'. No mention of prerequisites (e.g., whether rate should be actual or percentage) or exclusion criteria.
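The delta-vs-final-value ambiguity called out in this review can be made concrete; both readings are sketched below (assumptions — the server discloses neither):

```python
def adjusted_value(amount: float, inflation_rate: float, years: int) -> float:
    """Reading 1: the final (nominal) amount after compounding inflation forward."""
    return amount * (1 + inflation_rate / 100) ** years

def adjustment(amount: float, inflation_rate: float, years: int) -> float:
    """Reading 2: only the delta that inflation adds to the original amount."""
    return adjusted_value(amount, inflation_rate, years) - amount
```

At 3% over one year, 100 becomes 103 under reading 1 but the tool would return just 3 under reading 2 — a description should say which.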
calculate_inheritance_tax (A)
Calculate French inheritance tax (droits de succession) based on relationship and amount
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Inherited amount in euros | |
| relationship | Yes | Relationship to deceased: spouse, child, sibling, other |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. 'Calculate' implies a read-only computation, but description lacks disclosure of return format, whether it includes tax brackets/bands, rate schedules, or any currency/rounding behaviors.
Is the description appropriately sized, front-loaded, and free of redundancy?
Zero waste in single sentence. Domain (French), function (calculate), legal term (droits de succession), and key inputs (relationship/amount) front-loaded efficiently.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter calculator but minimal. Lacks output description (currency? total vs breakdown? tax brackets?), though this is somewhat expected for simple calculation tools without output schemas.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (both parameters have descriptions). Description mentions 'relationship and amount' acknowledging the parameters exist, but adds no semantic depth beyond the schema's 'Inherited amount in euros' and relationship enum values.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' + specific resource 'French inheritance tax (droits de succession)' clearly identifies the function. The French legal term 'droits de succession' and domain specification distinguish it from siblings like calculate_belgian_donation and calculate_french_income_tax.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage by naming the specific tax domain, but lacks explicit 'when to use' guidance or comparison to alternatives like calculate_property_transfer_tax or calculate_belgian_donation. No prerequisites or exclusions stated.
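A progressive-bracket sketch of the kind of computation involved. Every allowance and rate below is a hypothetical placeholder, not the real droits de succession schedule:

```python
# HYPOTHETICAL allowances and brackets — illustration only, not French law.
ALLOWANCE = {"spouse": float("inf"), "child": 100_000, "sibling": 15_000, "other": 1_500}
BRACKETS = [(8_000, 0.05), (12_000, 0.10), (16_000, 0.15), (float("inf"), 0.20)]

def inheritance_tax(amount: float, relationship: str) -> float:
    """Apply the relationship allowance, then tax each slice at its bracket rate."""
    taxable = max(0.0, amount - ALLOWANCE[relationship])
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if taxable <= lower:
            break
        tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return tax
```

Under these placeholder numbers, a spouse owes nothing and a child inheriting 150,000 is taxed on 50,000 across four slices.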
calculate_insulation_r (A)
Calculate thermal resistance R = thickness/lambda
| Name | Required | Description | Default |
|---|---|---|---|
| lambda | Yes | Conductivity W/(m.K) | |
| thickness_mm | Yes | Thickness mm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses the calculation logic by providing the exact formula (thickness/lambda), which explains how the output is derived. However, it does not specify the output units (m²·K/W) or error behavior beyond the schema constraints.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with zero redundancy. Every word serves a purpose: the action verb ('Calculate'), the resource ('thermal resistance R'), and the behavioral logic ('thickness/lambda'). Appropriately front-loaded for a simple calculation tool.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool with 100% schema coverage and no output schema, the description is minimally sufficient. The formula explains the calculation intent, but the lack of distinction from 'calculate_insulation_r_value' and absence of output unit specification leave minor gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description references 'thickness' and 'lambda' in the formula, reinforcing their relationship, but does not add semantic information (units, valid ranges, formats) beyond what the schema already provides.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates 'thermal resistance R' and provides the specific formula (R = thickness/lambda). However, it does not distinguish from the sibling tool 'calculate_insulation_r_value', which appears to calculate a closely related metric.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, particularly the sibling 'calculate_insulation_r_value'. There are no stated prerequisites, constraints, or exclusions beyond the parameter validation in the schema.
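The formula itself is trivial; the mm-to-m conversion and the output unit the description omits can be made explicit (the parameter is renamed `lam` here because `lambda` is a Python keyword):

```python
def insulation_r(thickness_mm: float, lam: float) -> float:
    """R = thickness / lambda. Thickness is converted mm -> m, so the
    result is in m²·K/W — the unit the description never states."""
    if lam <= 0:
        raise ValueError("conductivity must be positive")
    return (thickness_mm / 1000) / lam
```

200 mm of mineral wool (lambda ≈ 0.035 W/(m·K)) gives R ≈ 5.71 m²·K/W.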
calculate_insulation_r_value (A)
Calculate thermal R-value: R = thickness/lambda. Compare with RE2020 targets
| Name | Required | Description | Default |
|---|---|---|---|
| lambda | Yes | Conductivity W/(m·K) — mineral wool ~0.035, polyurethane ~0.025 | |
| thickness_mm | Yes | Insulation thickness in mm |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses the calculation formula (R = thickness/lambda) and reveals that the tool performs comparison against RE2020 standards, hinting at output behavior. However, it omits details about return format, unit handling, or whether it provides pass/fail guidance against regulations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely efficient at two short sentences. Front-loaded with the action ('Calculate thermal R-value'), followed immediately by implementation detail (formula) and value-add feature (RE2020 comparison). Zero redundancy or generic filler.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 100% input schema coverage, the description adequately covers inputs. However, lacking an output schema, it only hints at the RE2020 comparison output without explaining what data is returned (e.g., whether it includes compliance recommendations, chart data, or just the raw R-value).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters (thickness_mm, lambda) fully documented including examples (mineral wool ~0.035). The description adds the mathematical relationship between them but does not add semantic context beyond what the schema already provides, warranting the baseline score for high-coverage schemas.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the domain (thermal insulation R-value) and operation (Calculate), distinguishing it from hundreds of generic sibling calculators. It references the specific formula 'R = thickness/lambda' and mentions 'RE2020 targets' (French energy regulation), which uniquely identifies this as a building/energy efficiency tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the mention of 'RE2020 targets' implies usage for building energy compliance checking, there is no explicit guidance on when to use this versus other tools, prerequisites (e.g., needing thermal conductivity values), or what RE2020 refers to for users unfamiliar with French regulations.
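A sketch of what the RE2020 comparison the description hints at might return. The target values are hypothetical placeholders, not actual RE2020 thresholds:

```python
# HYPOTHETICAL R-value targets (m²·K/W) per building element — not RE2020 figures.
TARGETS = {"wall": 3.7, "roof": 6.0, "floor": 3.0}

def r_value_report(thickness_mm: float, lam: float, element: str = "wall") -> dict:
    """Compute R = thickness/lambda (mm converted to m) and compare to a target."""
    r = (thickness_mm / 1000) / lam
    target = TARGETS[element]
    return {"r_value": round(r, 2), "target": target, "meets_target": r >= target}
```

This is the kind of structured output (value, target, pass/fail) the description could usefully promise.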
calculate_insurance_estimate (C)
Estimate annual car insurance from vehicle value, driver age and bonus-malus
| Name | Required | Description | Default |
|---|---|---|---|
| driver_age | Yes | Driver age | |
| vehicle_value | Yes | Vehicle value EUR | |
| bonus_malus_coefficient | No | Bonus-malus (0.5=best, 3.5=worst) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description fails to disclose output format (currency, annual vs monthly), calculation methodology, or whether this applies to specific countries/regions. 'Estimate' implies read-only but output details are missing.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, front-loaded sentence with no redundant words. Efficiently communicates purpose and required inputs without extraneous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Missing critical context for a calculation tool: output schema is absent yet description doesn't specify return value units (EUR?), precision level, or regional applicability (car insurance laws vary by jurisdiction).
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed param descriptions already present. Description maps natural language terms to parameters but adds minimal semantic value beyond what the schema provides (e.g., doesn't explain that bonus-malus is a French/European insurance rating system).
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Estimate') and resource ('annual car insurance'), with specific inputs listed. Distinguishes from travel insurance siblings by specifying 'car', though could more explicitly differentiate from calculate_travel_insurance_estimate.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this versus calculate_travel_insurance_estimate or other insurance calculators. No mention of prerequisites like driver's license requirements or regional restrictions.
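A toy rating model, illustration only — the server's methodology is entirely undisclosed. Only the final multiplication reflects how the French bonus-malus coefficient is conventionally applied to a premium; the base rate and young-driver surcharge are invented:

```python
def insurance_estimate(vehicle_value: float, driver_age: int,
                       bonus_malus: float = 1.0) -> float:
    """Annual premium in EUR under invented base/age factors (toy model)."""
    base = 300 + 0.03 * vehicle_value             # hypothetical base premium
    age_factor = 1.8 if driver_age < 25 else 1.0  # hypothetical young-driver surcharge
    # The bonus-malus coefficient multiplies the premium directly (0.5 = best).
    return round(base * age_factor * bonus_malus, 2)
```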
calculate_international_shipping (B)
Calculate international shipping cost using volumetric weight and destination zone
| Name | Required | Description | Default |
|---|---|---|---|
| from_zone | Yes | Destination zone | |
| weight_kg | Yes | Actual parcel weight in kg | |
| dimensions_cm | Yes | Parcel dimensions in cm (length, width, height) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses the volumetric weight calculation method, explaining why dimensions are required. However, it lacks details on output format (currency? object structure?), side effects, or whether this queries live carrier rates vs. static formulas.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single, dense sentence with no waste. Front-loaded with action verb, immediately communicates resource and key inputs. Every clause earns its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with 100% schema coverage, but given no output schema exists, the description should ideally indicate what the calculation returns (e.g., cost amount, currency, rate breakdown). Missing this leaves the agent uncertain about result handling.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 3 parameters fully documented), establishing baseline 3. The description adds context that dimensions are used for 'volumetric weight' and mentions 'destination zone', reinforcing the schema but not adding significant new semantic details beyond schema descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('international shipping cost'), and specifies the methodology ('volumetric weight'). However, given the existence of sibling tool 'calculate_shipping_volumetric', it fails to explicitly differentiate when to use this cost-calculation variant versus the volumetric-weight-only variant.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus sibling calculators like 'calculate_shipping_volumetric' or 'calculate_delivery_cost'. Does not mention prerequisites (e.g., needing accurate dimensional data) or when volumetric vs. actual weight pricing applies.
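The volumetric-weight convention the description references can be sketched as below. The divisor of 5000 cm³/kg is a common carrier convention but an assumption here; chargeable weight is the larger of actual and volumetric weight:

```python
def chargeable_weight(weight_kg: float, length_cm: float, width_cm: float,
                      height_cm: float, divisor: float = 5000) -> float:
    """Volumetric weight = L × W × H / divisor; charge the larger of the two."""
    volumetric = (length_cm * width_cm * height_cm) / divisor
    return max(weight_kg, volumetric)
```

A 40×30×20 cm parcel weighing 2 kg is charged at its 4.8 kg volumetric weight.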
calculate_inventory_eoq (C)
Economic Order Quantity
| Name | Required | Description | Default |
|---|---|---|---|
| order_cost | Yes | Cost per order | |
| holding_cost | Yes | Annual holding cost per unit | |
| annual_demand | Yes | Annual demand units |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but discloses nothing about the calculation method (Wilson formula), output format (units of currency), or whether results include Safety Stock. Only identifies the domain concept.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (three words), this reflects under-specification rather than efficient conciseness. There is, however, no verbosity or structural bloat to penalize further.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema disclosure (returns optimal quantity value?) and provides no business context despite being a specialized operations research formula. Relies entirely on parameter schema for completeness.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (annual_demand, order_cost, holding_cost all documented), establishing baseline 3. The description adds no semantic value beyond the schema (e.g., no guidance on cost units or currency consistency).
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Economic Order Quantity' is essentially a tautology that restates the acronym in the tool name without using a verb to describe the action (calculate, compute, determine). It fails to distinguish from sibling calculation tools like calculate_inventory_turnover or calculate_break_even.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus sibling calculators, no business context (e.g., optimal inventory ordering), and no prerequisites or assumptions (e.g., constant demand rate).
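The Wilson formula that the review notes the description never states:

```python
import math

def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Economic Order Quantity: sqrt(2 · D · S / H), in units of product.
    Assumes constant demand; costs must share one currency."""
    if holding_cost <= 0:
        raise ValueError("holding_cost must be positive")
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)
```

For example, 1,000 units/year demand, 50 per order, and 4 per unit-year holding cost give an EOQ of about 158 units.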
calculate_inventory_turnover (C)
Inventory turnover ratio
| Name | Required | Description | Default |
|---|---|---|---|
| cogs | Yes | Cost of goods sold | |
| avg_inventory | Yes | Average inventory value |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry full behavioral disclosure. It omits what the tool returns (numeric ratio? Interpretation guidelines?), error handling (division by zero protection), or that this is a read-only calculation. The schema indicates minimum values, but the description doesn't reference behavioral constraints.
Is the description appropriately sized, front-loaded, and free of redundancy?
The three-word fragment is under-specified rather than appropriately concise. No sentences are present to 'earn their place'; the extreme brevity forces the agent to infer all behavioral and contextual details from the schema alone.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter financial tool with no output schema and no annotations, the description fails to explain the output format, business significance of the ratio, or calculation formula. It meets the bare minimum of identifying the metric but leaves critical gaps in contextual completeness.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Cost of goods sold' and 'Average inventory value'), so the baseline score applies. The description adds no supplemental context about units (currency), time periods, or calculation methodology beyond what the schema already provides.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Inventory turnover ratio' is a tautology that restates the tool name without adding specificity. It fails to distinguish this from sibling financial calculators (e.g., calculate_cogs, calculate_profit_margin) or clarify what operation is performed.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other financial calculation siblings. With over 300 calculator tools available, the description offers no criteria for selection or prerequisites (e.g., accounting periods, inventory valuation methods).
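The one-line ratio, with the division-by-zero guard the description never mentions:

```python
def inventory_turnover(cogs: float, avg_inventory: float) -> float:
    """Turnover ratio = COGS / average inventory (same currency, same period)."""
    if avg_inventory <= 0:
        raise ValueError("avg_inventory must be positive")
    return cogs / avg_inventory
```

COGS of 500,000 against average inventory of 100,000 yields a turnover of 5 — inventory sold through five times in the period.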
calculate_jet_lag_recovery (B)
Estimate jet lag recovery time based on timezone difference and direction of travel
| Name | Required | Description | Default |
|---|---|---|---|
| timezone_diff_hours | Yes | Timezone difference in hours (positive = eastward, negative = westward) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. States 'estimate' implying approximation, but fails to disclose output format (JSON structure, units like days/hours), estimation methodology, or whether this is read-only/idempotent. Critical gap for a tool with zero annotation coverage.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Front-loaded with action verb and immediately identifies the core calculation.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for input requirements given 100% schema coverage, but incomplete regarding output expectations. Without an output schema, description should specify return value format/units and whether additional context (travel dates, individual factors) might be needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with detailed parameter description ('positive = eastward, negative = westward'). Description mentions 'direction of travel' which aligns with but doesn't substantially augment the schema's encoding explanation. Baseline 3 appropriate when schema does heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Estimate') and resource ('jet lag recovery time') with explicit inputs mentioned. Distinguishes from timezone conversion siblings by focusing on physiological recovery rather than time math. Minor gap: doesn't specify what 'recovery time' means (days? hours?).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use versus siblings like calculate_timezone_convert or calculate_sleep_cycles. No mention of prerequisites (e.g., travel completed vs planned) or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
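For context on what such a tool might compute: a common rule of thumb in circadian research is roughly one day of recovery per time zone crossed eastward and about two-thirds of a day per zone westward. The sketch below encodes that heuristic; it is an assumption for illustration, not the server's documented method:

```python
def jet_lag_recovery_days(timezone_diff_hours: int) -> float:
    # Rule-of-thumb estimate (assumed, not the server's documented formula):
    # eastward travel (positive diff) ~1 day per zone crossed,
    # westward travel (negative diff) ~2/3 day per zone.
    zones = abs(timezone_diff_hours)
    rate = 1.0 if timezone_diff_hours > 0 else 2 / 3
    return round(zones * rate, 1)
```

Note how the sketch forces the questions the review raises: the output unit (days) and the sign convention both have to be stated explicitly.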
calculate_kinetic_energy (Grade C)
Kinetic energy of a moving object
| Name | Required | Description | Default |
|---|---|---|---|
| mass_kg | Yes | Mass in kg | |
| velocity_ms | Yes | Velocity in m/s | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses no behavioral traits. It does not mention the calculation formula (½mv²), expected output units (Joules), precision, or idempotency. For a stateless mathematical operation, this minimal disclosure is insufficient when no annotations cover the safety/profile characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At six words, the description is brief but incomplete—it's a noun fragment lacking a predicate. While efficiently front-loaded, it sacrifices necessary operational context for the sake of brevity, landing at adequacy rather than excellence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter physics calculation with no nested objects, the description is minimally sufficient. However, without an output schema or annotations, it omits expected return value details (scalar value in Joules) that would complete the contract for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Mass in kg', 'Velocity in m/s'), the schema fully documents the parameters. The description adds no parameter semantics beyond the schema, which aligns with the baseline score of 3 for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Kinetic energy of a moving object' identifies the domain but omits the action verb (calculate/compute). It restates the concept from the tool name without clarifying what the tool actually does to that object. With siblings like calculate_energy_physics and calculate_work, it fails to differentiate when to use this specific formula.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives. Given the extensive list of physics and energy-related siblings (calculate_force, calculate_work, calculate_energy_physics, calculate_projectile_motion), there is no indication of which specific scenarios warrant this kinetic energy calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
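The formula the review mentions is the standard kinetic energy relation, which leaves no ambiguity when stated. A minimal sketch (the server almost certainly computes this, but its output format and precision remain undocumented):

```python
def kinetic_energy_joules(mass_kg: float, velocity_ms: float) -> float:
    # E_k = 1/2 * m * v^2, with m in kg and v in m/s, result in joules
    return 0.5 * mass_kg * velocity_ms ** 2
```

A description as short as "Calculate kinetic energy (E = ½mv²) in joules from mass in kg and velocity in m/s" would fix both the missing verb and the missing output unit flagged above.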
calculate_knitting_yarn (Grade B)
Calculate yarn needed for a knitting project (meters and number of 50g/100m balls)
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | | |
| project | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description must carry the full behavioral burden. It successfully discloses the output format (meters and 50g/100m balls) since no output schema exists. However, it omits calculation assumptions (standard gauge, yarn weight, ease) and whether the S/M/L sizing refers to wearer size or garment dimensions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence with no redundant words. The parenthetical output specification is appropriately placed. However, the lack of structured guidance (when-to-use, parameters) prevents a perfect score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter calculation tool without output schema, the description adequately covers the return value format. However, given the domain complexity (knitting gauge variations, yarn weights), the description lacks completeness regarding calculation assumptions and parameter semantics that would help an agent predict appropriate inputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate for undocumented parameters. While the description implies a 'project' parameter by mentioning 'knitting project,' it fails to explain the 'size' parameter (S/M/L enum) or clarify that specific project types (scarf, hat, etc.) are expected. The enum values in the schema provide raw data, but semantic context is missing.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action ('Calculate'), resource ('yarn'), and domain ('knitting project'), distinguishing it from sibling fabric-calculation tools. It also specifies the output units (meters and balls), which is helpful. However, it could explicitly differentiate from 'calculate_fabric_needed' for clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_fabric_needed' or 'calculate_fabric_yardage'. No prerequisites, assumptions (e.g., gauge/stitch type), or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_laundry_cost (Grade A)
Calculate weekly and annual laundry cost (electricity + water + detergent)
| Name | Required | Description | Default |
|---|---|---|---|
| loads_per_week | Yes | Loads per week | |
| water_liters_per_load | No | Liters per load (default 50) | |
| detergent_cost_per_load | No | Detergent EUR/load (default 0.30) | |
| electricity_kwh_per_load | No | kWh per load (default 1.2) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It adds valuable context by specifying that the output includes both weekly and annual figures and by listing the aggregated cost components, but lacks details on output format, currency (though EUR appears in the schema), or side-effect guarantees.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single, efficient sentence that front-loads the action verb and parenthetically lists cost components without redundancy. Every word serves a distinct purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculation tool with 100% schema coverage and four straightforward parameters, the description adequately covers the calculation scope and temporal outputs (weekly/annual). It could be improved by explicitly mentioning the output currency or precise return value structure given the lack of output schema, but it is sufficient for tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all four parameters already documented in the schema. The description maps the conceptual cost components (electricity, water, detergent) to the parameters but does not add syntax, unit explanations, or usage examples beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Calculate), resource (laundry cost), scope (weekly and annual), and cost components (electricity + water + detergent), effectively distinguishing it from sibling calculation tools like calculate_electricity_cost or calculate_water_bill by specifying the laundry domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to select this tool versus related siblings such as calculate_electricity_cost, calculate_cost_per_use, or calculate_water_bill, nor does it mention prerequisites or exclusions for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
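The likely arithmetic can be reconstructed from the schema defaults, with one gap the review does not call out: the schema prices electricity and detergent but not water, so any sketch has to assume a water tariff. Below, `price_kwh` and `water_price_per_liter` are hypothetical constants, not documented server parameters:

```python
def laundry_cost(loads_per_week: float,
                 electricity_kwh_per_load: float = 1.2,
                 water_liters_per_load: float = 50,
                 detergent_cost_per_load: float = 0.30,
                 price_kwh: float = 0.25,            # assumed EUR/kWh
                 water_price_per_liter: float = 0.004  # assumed EUR/L
                 ) -> dict:
    # per-load cost = electricity + water + detergent
    per_load = (electricity_kwh_per_load * price_kwh
                + water_liters_per_load * water_price_per_liter
                + detergent_cost_per_load)
    weekly = loads_per_week * per_load
    return {"weekly_eur": round(weekly, 2),
            "annual_eur": round(weekly * 52, 2)}
```

The undocumented water and electricity tariffs are exactly the kind of assumption the description should surface.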
calculate_lawn_mowing_frequency (Grade B)
Calculate recommended lawn mowing interval based on grass type, season and rainfall
| Name | Required | Description | Default |
|---|---|---|---|
| season | Yes | Current season | |
| grass_type | Yes | Type of grass: cool_season (fescue/rye), warm_season (bermuda/zoysia), or mixed | |
| weekly_rainfall_mm | No | Average weekly rainfall in mm (default 25mm) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It transparently states the inputs driving the calculation ('based on grass type, season and rainfall'), but fails to specify the output format, units (days? weeks?), or range constraints. It implies a read-only calculation but doesn't confirm no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. The information is front-loaded with the action ('Calculate') and subject ('recommended lawn mowing interval') immediately clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter tool with 100% schema coverage but no output schema or annotations, the description adequately covers inputs but should specify what the calculation returns (unit, format, range). Without this, agents cannot confidently interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents all parameters including enum values and the default for weekly_rainfall_mm. The description merely lists the parameter categories ('grass type, season and rainfall') without adding semantic depth, syntax guidance, or usage examples beyond the schema, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the domain ('lawn mowing interval') and the factors considered ('grass type, season and rainfall'). It effectively distinguishes this from sibling tools like calculate_lawn_seed or calculate_garden_water_needs. However, it does not specify what 'interval' means (e.g., days between mows vs. frequency per week); specifying that would make it a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states what the tool does but provides no guidance on when to use it versus alternatives. Given the presence of sibling garden tools (calculate_lawn_seed, calculate_garden_water_needs, calculate_garden_soil), explicit context like 'Use this for scheduling maintenance frequency' or differentiation from seed/soil calculations is absent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_lawn_seed (Grade D)
Lawn seed quantity
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | Yes | Lawn area m² | |
| rate_g_m2 | No | Seeding rate g/m² | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full responsibility for behavioral disclosure. It fails to indicate what value is returned (grams, kg, bags), the calculation formula used, or any side effects. The agent cannot determine if this is a simple multiplication or includes safety margins.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (3 words), it is under-specified rather than concisely informative. The single phrase lacks a verb and actionable front-loading, failing the 'every sentence earns its place' standard.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter calculation tool with no output schema, the description is insufficient. It omits the output unit (grams, kilograms), calculation methodology, and practical context (e.g., 'for new lawns vs overseeding').
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both area_m2 and rate_g_m2 are documented), establishing a baseline of 3. The description adds no semantic context about these parameters or their relationship (area × rate = total seed weight).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Lawn seed quantity' is a noun phrase that fails to specify the action performed (calculate/compute). While it identifies the domain (lawn seed), it functions as a label rather than a functional description and does not distinguish from sibling 'calculate_seed_quantity'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'calculate_seed_quantity' or 'calculate_garden_soil'. No mention of prerequisites or expected input format.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
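The relationship the review spells out (area × rate = total seed weight) takes one line to implement. A sketch under that assumption, with a default seeding rate chosen for illustration since the schema documents none:

```python
def lawn_seed_grams(area_m2: float, rate_g_m2: float = 35.0) -> float:
    # total seed = area * seeding rate
    # 35 g/m² default is an illustrative value (typical for new lawns);
    # the server's actual default is undocumented.
    return area_m2 * rate_g_m2
```

Whether the result is grams, kilograms, or a count of seed bags is exactly the disclosure the three-word description omits.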
calculate_leave_days (Grade A)
Calculate French paid leave (congés payés): 2.5 days/month, max 25 working days/year
| Name | Required | Description | Default |
|---|---|---|---|
| start_date | Yes | YYYY-MM-DD — Employment start date | |
| months_worked | Yes | Months worked in the reference period (1-12) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully discloses the calculation logic (2.5 days/month) and cap (25 working days), but lacks disclosure of output format, whether it returns working days vs calendar days, or error handling behavior for invalid date ranges.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Every clause earns its place: identifies domain (French), legal concept (congés payés), calculation method (2.5 days/month), and annual maximum (25 working days).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and simple calculation domain, description adequately covers the domain-specific rules. Minor gap: no output schema is present, and description does not specify return format (days count vs detailed object) or whether results are rounded.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage (baseline 3). Description adds value by contextualizing the 'months_worked' parameter via the '2.5 days/month' formula and implying the date calculation purpose of 'start_date', though it does not elaborate on date format beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with exact resource 'French paid leave (congés payés)' and includes the specific legal accrual formula '2.5 days/month, max 25 working days/year', which clearly distinguishes it from siblings like calculate_vacation_days_fr or calculate_maternity_leave_fr by referencing the specific French labor code calculation method.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description implies usage through specificity of French legal framework ('congés payés') and the accrual rate (2.5 days/month), but does not explicitly state when to use this vs alternatives like calculate_vacation_days_fr or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
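The accrual rule stated in the description (2.5 days per month, capped at 25 working days per year) is directly computable. A minimal sketch of that stated rule, noting that rounding behavior and return format remain undocumented:

```python
def conges_payes_days(months_worked: int) -> float:
    # French paid leave accrual: 2.5 working days per month worked,
    # capped at 25 working days per reference year.
    if not 1 <= months_worked <= 12:
        raise ValueError("months_worked must be between 1 and 12")
    return min(months_worked * 2.5, 25.0)
```

With months constrained to 1-12 the cap only binds at a full year (12 × 2.5 = 30, capped to 25), which is why the description's explicit maximum earns its place.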
calculate_led_savings (Grade C)
Savings from switching to LED bulbs
| Name | Required | Description | Default |
|---|---|---|---|
| led_w | Yes | LED replacement wattage | |
| old_w | Yes | Old bulb wattage | |
| hours_day | No | Hours per day | |
| num_bulbs | No | Number of bulbs | |
| price_kwh | No | EUR/kWh | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It fails to specify the time period for calculated savings (annual, monthly, lifetime), output format, or currency handling beyond the EUR/kWh hint in parameter descriptions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at six words with no redundant content. However, the lack of a verb makes it slightly less immediately actionable than a complete sentence would be.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 5 parameters, no output schema, and no annotations, the description should clarify what the savings calculation represents (e.g., annual savings, total cost difference). The absence of temporal scope or output value interpretation leaves significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description provides minimal additional context beyond implying a comparison between old and LED wattage, which is already evident from the parameter names and schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the resource (LED bulbs) and outcome (savings), distinguishing it from generic electricity calculators in the sibling list. However, it lacks an explicit verb (e.g., 'Calculate'), relying on the tool name to imply the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like calculate_electricity_cost or calculate_energy_physics. No mention of prerequisites or specific use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
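The ambiguity the review flags (annual vs. monthly vs. lifetime savings) is clear once the arithmetic is written down. A sketch assuming annual savings; the default hours, bulb count, and tariff are illustrative stand-ins for whatever defaults the server applies:

```python
def led_annual_savings_eur(old_w: float, led_w: float,
                           hours_day: float = 3.0,
                           num_bulbs: int = 1,
                           price_kwh: float = 0.25) -> float:
    # annual kWh saved = power delta in kW * hours/day * 365 days * bulbs
    kwh_saved = (old_w - led_w) / 1000 * hours_day * 365 * num_bulbs
    return round(kwh_saved * price_kwh, 2)
```

Replacing one 60 W bulb with a 10 W LED at 4 h/day and 0.25 EUR/kWh saves about 18 EUR per year under these assumptions.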
calculate_life_path_numerology (Grade B)
Calculate numerology life path number from birth date
| Name | Required | Description | Default |
|---|---|---|---|
| birth_date | Yes | Birth date in YYYY-MM-DD format | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It states the transformation intent but reveals nothing about output format (single digit? master numbers?), calculation methodology (Pythagorean, Chaldean), or whether results are cached.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero waste. Appropriate length for single-parameter tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple single-parameter calculation tool with high schema coverage, though lacking output description since no output schema exists. Does not explain what constitutes a 'life path number' or expected return value type.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Birth date in YYYY-MM-DD format'), establishing baseline 3. Description references 'birth date' but adds no syntax rules, validation constraints, or semantic elaboration beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate), resource (numerology life path number), and input (birth date). However, it does not explicitly differentiate from siblings like calculate_chinese_zodiac or calculate_age which also process birth dates, relying only on the domain-specific term 'life path number' for implicit distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus calculate_chinese_zodiac, calculate_biorhythm, or other birth-date-based calculation tools in the extensive sibling list. No prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
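The "master numbers" question the review raises is concrete: the common convention sums all digits of the birth date and reduces repeatedly to a single digit, except when the running total hits 11, 22, or 33. A sketch of that convention; whether the server preserves master numbers is undocumented, so this is an assumption:

```python
def life_path_number(birth_date: str) -> int:
    # Sum every digit of a YYYY-MM-DD date, then reduce repeatedly.
    # Stopping at master numbers 11/22/33 is a common convention but an
    # assumption here; the server's methodology is not disclosed.
    n = sum(int(c) for c in birth_date if c.isdigit())
    while n > 9 and n not in (11, 22, 33):
        n = sum(int(c) for c in str(n))
    return n
```

For 1990-12-25 the digits sum to 29, which reduces to 11 and stops there, so the choice of convention visibly changes the result (11 vs. 2).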
calculate_light_year (Grade C)
Light year conversions
| Name | Required | Description | Default |
|---|---|---|---|
| unit | Yes | Input unit | |
| value | Yes | Value | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden but reveals nothing about output format, precision/rounding behavior, conversion formula standards, or whether the operation is destructive/idempotent. Critical behavioral context is absent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (3 words), the description fails to earn its place—it communicates almost nothing beyond the tool name. Appropriate conciseness requires density of information, not mere brevity; this is under-specification masquerading as efficiency.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet the description does not explain return values, units, or response structure. Given the ambiguity around conversion directionality and the presence of similarly-named sibling tools, the description is materially incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. The description adds no parameter-specific context (e.g., that 'value' must be non-negative, or that 'unit' determines the input's astronomical scale), but meets minimum viable threshold since schema provides basic field labels.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The phrase 'Light year conversions' is vague and borders on tautology. It fails to specify conversion direction (to or from light years), doesn't mention the supported astronomical units (parsec, AU, km), and provides no distinction from the sibling tool 'calculate_light_year_distance'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description offers zero guidance on when to use this tool versus the numerous sibling conversion tools (e.g., calculate_light_year_distance, convert_distance) or conversion utilities. No prerequisites, limitations, or alternative selection criteria are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
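The conversion itself is a table of constants, which makes the missing details (supported units, conversion direction, output unit) easy to pin down. A sketch using rounded IAU-derived values, with the unit keys and the implicit km target both assumptions about the server's behavior:

```python
# km per unit of astronomical distance (IAU-derived values)
KM_PER = {
    "km": 1.0,
    "au": 1.495978707e8,          # astronomical unit
    "ly": 9.4607304725808e12,     # light-year
    "pc": 3.0856775814913673e13,  # parsec
}

def convert_distance(value: float, from_unit: str,
                     to_unit: str = "km") -> float:
    # Route every conversion through kilometres as the common base.
    return value * KM_PER[from_unit] / KM_PER[to_unit]
```

Stating even the unit list and the base unit in the description would separate this tool cleanly from calculate_light_year_distance, which is otherwise indistinguishable.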
calculate_light_year_distance (Grade C)
Convert astronomical distances between light-years, parsecs, AU, km
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Distance value | |
| from_unit | Yes | Source unit |
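The schema above exposes only `from_unit` with no target unit, so one plausible reading (an assumption, not documented behavior) is that the tool converts the input to every supported unit at once. A minimal Python sketch using standard astronomical conversion factors:

```python
# Hypothetical all-units converter: with no 'to_unit' in the schema, one
# plausible behavior is returning the distance in every supported unit.
# Factors are standard astronomical constants (km per unit).
KM_PER = {
    "ly": 9.4607e12,      # kilometres in one light-year
    "parsec": 3.0857e13,  # kilometres in one parsec
    "au": 1.495979e8,     # kilometres in one astronomical unit
    "km": 1.0,
}

def convert_light_year_distance(value: float, from_unit: str) -> dict:
    """Convert a distance to all supported units via a km pivot."""
    km = value * KM_PER[from_unit]
    return {unit: km / factor for unit, factor in KM_PER.items()}
```

The unit keys and the all-units return shape are assumptions; the server's actual output format is not documented in its description.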
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify what unit the conversion outputs to (the schema only includes 'from_unit'), whether it returns multiple conversions, or the output format. Critical behavioral traits are omitted.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single brief sentence (eight words), which is efficient, but it is overly minimal given the tool's behavioral ambiguity. The front-loaded content lacks the necessary detail to explain the conversion behavior and output format.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of 2 parameters with 100% schema coverage but no output schema or annotations, the description inadequately explains what the tool returns, what unit(s) it converts to, and how it differs from sibling conversion tools. The incomplete specification leaves critical gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description enumerates the specific supported units (ly, parsec, AU, km), which reinforces the enum values, but does not clarify why there is no 'to_unit' parameter or how the conversion direction works.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool converts astronomical distances and lists specific units (light-years, parsecs, AU, km), providing a clear verb and resource. However, it does not distinguish this specialized astronomical converter from the generic 'convert_distance' sibling tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus the generic 'convert_distance' or other conversion tools. No alternatives, prerequisites, or contextual recommendations are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_linear_regression (Grade A)
Calculate linear regression slope and intercept from summary statistics
| Name | Required | Description | Default |
|---|---|---|---|
| n | Yes | Number of data points | |
| sum_x2 | Yes | Sum of (xi-x_mean)² | |
| sum_xy | Yes | Sum of (xi-x_mean)*(yi-y_mean) | |
| x_mean | Yes | Mean of x values | |
| y_mean | Yes | Mean of y values |
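Because the inputs are centered summary statistics, the slope and intercept follow directly from the standard least-squares identities. A self-contained sketch (note that `n` is not needed for the point estimates themselves, only for inference statistics such as standard errors):

```python
def linear_regression(n, x_mean, y_mean, sum_x2, sum_xy):
    """Slope and intercept from centered summary statistics.

    sum_x2 = sum((xi - x_mean)**2), sum_xy = sum((xi - x_mean)*(yi - y_mean)).
    """
    if sum_x2 == 0:
        raise ValueError("sum_x2 must be non-zero (x values cannot all be equal)")
    slope = sum_xy / sum_x2            # least-squares slope
    intercept = y_mean - slope * x_mean  # line passes through (x_mean, y_mean)
    return slope, intercept
```

For the points (1, 2), (2, 4), (3, 6): n=3, x_mean=2, y_mean=4, sum_x2=2, sum_xy=4, giving slope 2 and intercept 0.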
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It specifies the mathematical operation (linear regression) and input form (summary statistics), but omits details like output format, error handling for invalid inputs (e.g., division by zero), or whether additional statistics (R²) are returned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, no redundancy. Every word earns its place by conveying the tool's specific function and input requirements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and clear parameter semantics, the description adequately covers the tool's purpose. Mention of output values ('slope and intercept') compensates partially for missing output schema, though explicit return structure would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear mathematical definitions for each parameter (e.g., 'Sum of (xi-x_mean)²'). Description provides baseline value by grouping them as 'summary statistics' but adds no additional syntax or constraint details beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description provides specific verb ('Calculate'), clear resource ('linear regression slope and intercept'), and distinguishes input method ('from summary statistics') from raw-data alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'from summary statistics' implies usage context (when pre-calculated means and sums of squares are available vs. raw data), but lacks explicit 'when-not' guidance or named alternatives like calculate_statistics.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_lmnp_amortization (Grade A)
Calculate LMNP rental property amortization and annual tax deduction (French tax regime)
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rent | Yes | Annual gross rental income in EUR | |
| property_value | Yes | Property purchase price excluding land in EUR | |
| furniture_value | Yes | Furniture and equipment value in EUR |
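How the server picks amortization periods is not documented. A hypothetical sketch assuming straight-line amortization over illustrative periods (30 years for the building, 7 for furniture, both assumptions) and the LMNP convention that amortization cannot create a deficit:

```python
# Assumed straight-line amortization periods (illustrative only; the server's
# actual periods are not stated in its description):
PROPERTY_YEARS = 30   # building, excluding land
FURNITURE_YEARS = 7   # furniture and equipment

def lmnp_amortization(property_value, furniture_value, annual_rent):
    annual_amortization = (property_value / PROPERTY_YEARS
                           + furniture_value / FURNITURE_YEARS)
    # Under LMNP, amortization cannot create a deficit: the deduction is
    # capped at the year's rental income (the excess carries forward).
    deductible = min(annual_amortization, annual_rent)
    return {"annual_amortization": annual_amortization,
            "deductible": deductible}
```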
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the critical jurisdictional constraint (French tax regime) but omits other expected behavioral details such as calculation methodology, whether results are estimates or official values, or confirmation that this is a pure calculation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with high information density. Every word serves a purpose: action verb, calculation targets, and jurisdictional scope. No redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriately complete for a calculation tool with rich input schema: specifies the tax regime (LMNP) and what is calculated. Minor gap: does not explicitly contrast with calculate_lmnp_deficit sibling or describe output format/structure, though the description implies monetary values are returned.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (all 3 parameters well-documented), the baseline is 3. The description adds minimal parameter-specific semantics beyond the schema, though the LMNP context implies the currency (EUR) and specific valuation methodology for rental properties.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent clarity: specific verb 'Calculate', specific resource 'LMNP rental property amortization and annual tax deduction', and specific scope 'French tax regime'. The LMNP specificity distinguishes it from generic rental calculators and siblings like calculate_lmnp_deficit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides domain context (French tax regime, LMNP status) that implies when to use it. However, lacks explicit guidance on when to use this versus calculate_lmnp_deficit or other French property tax siblings, which could confuse users unfamiliar with the distinction between amortization and deficit regimes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_lmnp_deficit (Grade C)
Calculate LMNP (non-professional furnished rental) tax deficit
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rent | Yes | Annual rental income in EUR | |
| annual_charges | Yes | Annual deductible charges in EUR | |
| depreciation_annual | Yes | Annual depreciation amount in EUR |
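A deficit here means deductible charges exceed rental income. A hedged sketch of one plausible computation, including the LMNP convention that depreciation only offsets a remaining profit (the exact rules the server applies are not documented):

```python
def lmnp_deficit(annual_rent, annual_charges, depreciation_annual):
    """Deficit and taxable result; depreciation cannot deepen a deficit."""
    result_before_depreciation = annual_rent - annual_charges
    deficit = max(0.0, -result_before_depreciation)
    # Depreciation only offsets a remaining profit; the unused part
    # is carried forward to later years.
    usable = min(depreciation_annual, max(0.0, result_before_depreciation))
    return {
        "deficit": deficit,
        "taxable_result": result_before_depreciation - usable,
        "deferred_depreciation": depreciation_annual - usable,
    }
```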
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides almost none. It does not state whether this is read-only (implied by 'Calculate'), what the output represents (deficit amount in EUR?), or whether negative values are expected/allowed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is appropriately front-loaded with the action verb, but borders on underspecified for a complex tax calculation tool. It wastes no words yet could benefit from additional structure or length to explain the tax concept.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the specialized domain (French LMNP tax law), lack of output schema, and presence of closely related siblings, the description is insufficient. It fails to explain what 'deficit' means in this context (excess of deductible expenses over income), the output format, or how results should be interpreted for tax filing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all three parameters. The description provides domain context (LMNP) but does not add parameter usage guidance beyond what the schema already documents, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb (Calculate) and clearly identifies the resource (LMNP tax deficit), including helpful expansion of the acronym. However, it fails to distinguish from sibling tool calculate_lmnp_amortization, which likely produces the depreciation values needed as input here.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, or how it relates to the related LMNP amortization calculator. No prerequisites or conditions are mentioned despite this being a specialized French tax regime tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_loan_early_repayment (Grade A)
Calculate interest savings from early partial loan repayment
| Name | Required | Description | Default |
|---|---|---|---|
| early_amount | Yes | Early repayment amount EUR | |
| monthly_payment | Yes | Current monthly payment EUR | |
| months_remaining | Yes | Months remaining | |
| remaining_capital | Yes | Remaining loan capital EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. While the term 'Calculate' implies a read-only, computational operation with no side effects, the description lacks details on output format, authentication requirements, or whether results are estimates versus exact figures.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, highly efficient sentence with zero waste. It is appropriately front-loaded with the action verb and contains no redundant or unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of annotations and output schema, the description adequately conveys the tool's purpose for its complexity level (4 well-documented parameters). However, it could be enhanced by mentioning the output currency/format or noting that results are projections.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with clear EUR denominations and definitions. The description adds no parameter-specific information, but with complete schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') with a clear resource ('interest savings') and scope ('early partial loan repayment'). It distinguishes itself from sibling tools like 'calculate_loan_payment' by explicitly mentioning 'early partial' repayment scenarios.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives (e.g., full repayment calculators), nor does it mention prerequisites or constraints beyond what is implied by the purpose statement.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_loan_payment (Grade B)
Calculate monthly loan payment for any generic loan
| Name | Required | Description | Default |
|---|---|---|---|
| months | Yes | Loan duration in months | |
| principal | Yes | Loan amount | |
| annual_rate | Yes | Annual interest rate in % |
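The tool presumably applies the standard fixed-rate annuity formula M = P·r / (1 − (1 + r)^−n) with monthly rate r; a sketch under that assumption:

```python
def monthly_loan_payment(principal: float, annual_rate: float, months: int) -> float:
    """Fixed-rate annuity payment; annual_rate is a percentage, e.g. 3.6 for 3.6%."""
    r = annual_rate / 100 / 12  # monthly interest rate
    if r == 0:
        return principal / months  # zero-interest edge case
    return principal * r / (1 - (1 + r) ** -months)
```

For a 200,000 EUR loan at 3.6% over 240 months this gives roughly 1,170 EUR per month. Whether the real tool also returns total interest or a full schedule is exactly what the review below notes is unspecified.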
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description carries minimal burden—disclosing nothing about output format (numeric value vs object), rounding behavior, formula methodology, or error conditions for invalid inputs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with verb ('Calculate'), no redundant words. Length is appropriate for the information density provided.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Input parameters are fully documented via schema (100% coverage), but without output schema, the description fails to specify what values are returned (monthly payment only? total interest? full amortization?).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (principal, annual_rate, months all described). The description adds no additional param context beyond the schema, earning the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate) and resource (monthly loan payment). The term 'generic loan' provides some differentiation from siblings like 'calculate_mortgage', though it doesn't explicitly clarify when to use this versus specific loan types or annuity calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus sibling alternatives (calculate_mortgage, calculate_annuity_payment, calculate_loan_early_repayment, calculate_housing_loan_comparison) or prerequisites for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_loan_to_value (Grade C)
Calculate LTV ratio and risk level
| Name | Required | Description | Default |
|---|---|---|---|
| loan_amount | Yes | Loan amount EUR | |
| property_value | Yes | Property value EUR |
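LTV itself is just the loan amount over the property value; the risk bands below are hypothetical placeholders, since the tool's actual thresholds are undocumented:

```python
def loan_to_value(loan_amount: float, property_value: float) -> dict:
    """LTV percentage with an illustrative risk label.

    The 80%/95% thresholds are assumptions, not the server's documented bands.
    """
    ltv = loan_amount / property_value * 100
    if ltv <= 80:
        risk = "low"
    elif ltv <= 95:
        risk = "moderate"
    else:
        risk = "high"
    return {"ltv_percent": round(ltv, 2), "risk_level": risk}
```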
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are absent, so the description carries full disclosure burden. It mentions the outputs ('LTV ratio and risk level'), which helps, but fails to explain what 'risk level' means (categorical labels? percentages?) or the return format, leaving significant behavioral gaps for a tool with no output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at only six words with no filler. Front-loaded with the core action and outputs. However, extreme brevity sacrifices necessary context given the lack of annotations or output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without an output schema, the description inadequately explains what 'risk level' entails or the return structure. Given this is a financial assessment tool among many siblings, the description lacks the necessary detail to be fully useful (e.g., risk thresholds, calculation methodology).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Loan amount EUR', 'Property value EUR'), establishing a baseline of 3. The description adds no additional parameter-specific semantics (e.g., valid ranges, relationship between parameters).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the verb (Calculate) and resource (LTV ratio and risk level), but 'LTV' is unexplained jargon (Loan-to-Value) and fails to distinguish from the sibling tool 'calculate_cac_ltv_ratio' (Customer Acquisition Cost vs Lifetime Value), which could cause selection errors given the parameter names are the only differentiator.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus alternatives like 'calculate_mortgage', 'calculate_debt_to_income', or 'calculate_cac_ltv_ratio'. No prerequisites or conditions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_logarithm (Grade C)
Calculate logarithm in any base (natural, common, binary)
| Name | Required | Description | Default |
|---|---|---|---|
| base | No | Log base: e=natural, 10=common, 2=binary | e |
| value | Yes | Value to take log of |
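Given the three enumerated bases, the computation maps directly onto the standard library; a sketch that also makes the positive-domain restriction explicit:

```python
import math

def calculate_logarithm(value: float, base: str = "e") -> float:
    """Log of a positive value in one of the three enumerated bases."""
    if value <= 0:
        raise ValueError("logarithm is only defined for positive values")
    if base == "e":
        return math.log(value)    # natural log
    if base == "10":
        return math.log10(value)  # common log
    if base == "2":
        return math.log2(value)   # binary log
    raise ValueError(f"unsupported base: {base!r}")
```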
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden of behavioral disclosure. Fails to mention mathematical domain restrictions (input must be >0, handled only by schema's minimum: 0.0001), return value format, or precision characteristics. 'Calculate' implies deterministic computation but lacks safety or side-effect context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no waste. Front-loads the core operation. However, brevity sacrifices opportunity to mention critical constraints (positive values only) that would aid agent invocation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter mathematical function with complete schema documentation. Lacks completeness regarding edge case handling and output structure, but acceptable given the low complexity and standard mathematical operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. The description maps technical terms ('natural', 'common', 'binary') to the enum values in the schema, confirming semantics, but does not add syntax details, valid ranges explanation, or examples beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific operation (calculate logarithm) and identifies the three supported bases (natural, common, binary). Distinguishes clearly from sibling calculation tools (e.g., calculate_age, calculate_vat) via the specific mathematical verb. Minor imprecision: 'any base' suggests arbitrary bases, but the schema restricts to only three enumerated options.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this specific calculator versus the numerous other mathematical siblings (e.g., calculate_exponent, calculate_power_unit_convert). No mention of prerequisite conditions (positive numbers only) or error scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_lottery_odds (Grade B)
Calculate lottery win odds for any number pool and pick count
| Name | Required | Description | Default |
|---|---|---|---|
| bonus_pool | No | Size of the bonus number pool (default 0, no bonus) | |
| bonus_numbers | No | Number of bonus/powerball numbers to match (default 0) | |
| total_numbers | Yes | Total numbers in the main pool | |
| numbers_to_pick | Yes | How many numbers you pick |
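Jackpot odds for this parameterization are a straightforward product of binomial coefficients. The sketch below returns the 1-in-N denominator; whether the real tool reports a ratio, fraction, or percentage is an ambiguity its description does not settle:

```python
from math import comb

def lottery_odds(total_numbers, numbers_to_pick, bonus_pool=0, bonus_numbers=0):
    """Odds denominator (1 in N) of matching all main picks, plus all
    bonus numbers when a bonus pool is used."""
    combinations = comb(total_numbers, numbers_to_pick)
    if bonus_pool and bonus_numbers:
        combinations *= comb(bonus_pool, bonus_numbers)
    return combinations
```

A classic 6-of-49 draw yields 13,983,816 combinations; adding a EuroMillions-style 2-of-12 bonus pool multiplies that by C(12, 2) = 66.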
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but omits critical behavioral details: whether this is read-only, what format odds are returned in (ratio, fraction, percentage), or how the calculation handles the optional bonus numbers.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Front-loaded with the action ('Calculate lottery win odds') followed immediately by scope ('for any number pool and pick count'). Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% schema coverage and simple flat parameters, the description is minimally adequate. However, it omits expected output format, doesn't explain the interaction between main and bonus pools, and leaves agents to infer the mathematical model from parameter names alone.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds slight value by mapping 'number pool' to total_numbers and 'pick count' to numbers_to_pick, connecting technical parameters to the lottery domain, but does not elaborate on syntax or bonus ball semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('lottery win odds') combined with scope ('any number pool and pick count'). It distinguishes from siblings like calculate_card_draw_probability by specifying 'lottery', though it could further clarify it handles combinatorial probability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While mentioning 'any number pool and pick count' implies flexibility, the description provides no explicit guidance on when to select this tool over sibling probability calculators such as calculate_poker_hand_probability or calculate_dice_probability.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_luggage_weight (Grade A)
Calculate total luggage weight and compare to airline limits (carry-on, economy checked, business checked)
| Name | Required | Description | Default |
|---|---|---|---|
| items | Yes | Array of luggage items with name and weight in kg |
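A plausible implementation sums the item weights and compares the total to per-category limits. The limit values below are illustrative assumptions (real allowances vary by airline):

```python
# Illustrative airline limits in kg (assumptions, not the server's values):
LIMITS = {"carry_on": 10, "economy_checked": 23, "business_checked": 32}

def check_luggage(items: list[dict]) -> dict:
    """Total the item weights and flag which limit categories the total fits."""
    total = sum(item["weight"] for item in items)
    return {
        "total_kg": total,
        "fits": {name: total <= limit for name, limit in LIMITS.items()},
    }
```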
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It discloses what limits are compared (carry-on, economy, business), adding behavioral context. However, it omits what the comparison returns (excess weight? boolean? totals) and whether there are side effects or prerequisites.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero redundancy. Front-loaded with the action verb. However, it could benefit from one additional sentence describing the output format to achieve maximum value per word.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter tool with complete schema coverage, but the lack of output schema means the description should ideally indicate what calculation results are returned (totals, comparisons, recommendations).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the items array fully documented. The description does not add parameter syntax, format details, or examples beyond the schema, which is appropriate baseline given the schema's completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific verb (Calculate), resource (luggage weight), and scope (compare to airline limits including carry-on, economy checked, and business checked). It effectively distinguishes this from the 300+ other calculate_* siblings by specifying the unique airline luggage domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit usage context through the airline/luggage domain mention, but lacks explicit when-to-use guidance or exclusion criteria. Given the unique domain among siblings, the use case is inferable but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
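The Epley formula named in the tool's description has a standard closed form, 1RM = w·(1 + r/30), which can be inverted to fill out a 1RM-to-12RM table. A minimal sketch — the rounding and table shape here are assumptions, not the server's actual output:

```python
def epley_1rm(weight: float, reps: int) -> float:
    """Estimate one-rep max from a submaximal lift (Epley formula)."""
    return weight * (1 + reps / 30)

def rep_table(weight: float, reps: int, max_reps: int = 12) -> dict[int, float]:
    """Invert Epley to estimate the weight liftable for 1..max_reps reps."""
    one_rm = epley_1rm(weight, reps)
    return {n: round(one_rm / (1 + n / 30), 1) for n in range(1, max_reps + 1)}
```

For example, 100 kg lifted for 10 reps implies a ~133.3 kg one-rep max, and the table maps that back to per-rep working weights.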
calculate_malus_ecologique (B)
French ecological malus 2026: CO2 g/km based tax on new vehicle registration
| Name | Required | Description | Default |
|---|---|---|---|
| co2_g_km | Yes | CO2 emissions in g/km | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the temporal scope (2026) and calculation basis (CO2 g/km), but omits critical behavioral details: the return format (tax amount in EUR), whether the malus uses tiered rates, minimum/maximum tax bounds, or that this is a pure calculation function (read-only).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficiently structured sentence that front-loads the key identifier ('French ecological malus 2026') followed by the mechanism and target. No words are wasted, and every element earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one numeric parameter, no output schema) and high schema coverage, the description adequately covers the domain context. It specifies the jurisdiction, year, and tax mechanism sufficiently for an agent to understand the tool's scope, though it could benefit from noting the output currency (EUR).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('CO2 emissions in g/km'), establishing a baseline of 3. The description adds contextual reinforcement by mentioning 'CO2 g/km based' taxation, confirming the parameter's role in the calculation without duplicating schema metadata.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the specific domain (French ecological malus), the year (2026), the basis (CO2 g/km), and the target (new vehicle registration tax). It effectively distinguishes from siblings like calculate_belgian_car_advantage or calculate_french_vat by specifying the exact tax type. However, it uses a noun phrase rather than an explicit action verb like 'Calculate'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives such as calculate_puissance_fiscale or calculate_car_depreciation. While the domain specificity implies usage, there are no explicit when/when-not statements or prerequisites (e.g., that this applies only to France, 2026 regulations).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
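The malus is a progressive schedule published each year in the French finance law, which suggests the bracket-lookup shape sketched below. The threshold and tax values here are placeholders for illustration only, not the real 2026 barème:

```python
# HYPOTHETICAL brackets: (CO2 ceiling in g/km, tax in EUR).
# The actual 2026 schedule is a per-gram table in the French finance law.
HYPOTHETICAL_BRACKETS = [
    (120, 0),
    (140, 500),
    (170, 5_000),
    (999, 50_000),
]

def malus_ecologique(co2_g_km: int) -> int:
    """Return the tax of the first bracket whose CO2 ceiling covers the input."""
    for ceiling, tax in HYPOTHETICAL_BRACKETS:
        if co2_g_km <= ceiling:
            return tax
    return HYPOTHETICAL_BRACKETS[-1][1]  # capped at the top bracket
```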
calculate_marathon_splits (B)
Generate even and negative-split pacing plans for a marathon target time
| Name | Required | Description | Default |
|---|---|---|---|
| target_time_minutes | Yes | Target marathon finish time in minutes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It discloses that the tool produces multiple plan types (even and negative split), but lacks details on output format (splits per km or per mile?), calculation methodology, and whether it returns multiple plans or a single recommendation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is front-loaded and efficient. No redundant words, though given the lack of annotations and output schema, slightly more detail would be warranted rather than being overly terse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter calculation tool, the description covers the core function but leaves gaps regarding output structure and behavioral details (e.g., split intervals, negative-split logic) that would help an agent understand return values without an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with clear description of 'target_time_minutes'. The description aligns with this by mentioning 'marathon target time', but adds no additional parameter semantics (e.g., valid ranges, format) beyond what the schema already provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Generate') and resource ('pacing plans'), with specific mention of 'even and negative-split' strategies that distinguish this from generic running calculators. However, it could more explicitly differentiate from sibling 'calculate_running_pace'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies marathon-specific usage through the description, but lacks explicit guidance on when to choose this over 'calculate_running_pace' or 'calculate_training_zones_running', or what constitutes a valid negative-split strategy.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
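An even split divides the target time uniformly over the 42.195 km marathon distance; a negative split shifts some of the time onto the first half. A sketch under those common assumptions — the 2% time shift is a popular heuristic, not the server's documented strategy:

```python
MARATHON_KM = 42.195

def even_split_pace(target_minutes: float) -> float:
    """Uniform pace in minutes per km for the target finish time."""
    return target_minutes / MARATHON_KM

def negative_split_halves(target_minutes: float, shift: float = 0.02) -> tuple[float, float]:
    """Split the target so the first half takes `shift` (fraction) more time.

    shift=0.02 means roughly 2% extra time on the first half — an assumed
    heuristic; the server's actual split strategy is not documented.
    """
    first = target_minutes * (0.5 + shift / 2)
    second = target_minutes - first
    return first, second
```

A 4-hour (240-minute) target works out to about 5.69 min/km even pace.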
calculate_markup_margin (C)
Markup vs margin calculator
| Name | Required | Description | Default |
|---|---|---|---|
| cost | Yes | Cost price | |
| selling_price | Yes | Selling price | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present to indicate safety or side effects, yet the description fails to disclose critical behavioral details such as what values are returned (markup percentage, margin percentage, or both), whether results are cached, or if there are any constraints beyond the schema minimums.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single four-word sentence with no structural waste or filler. It is appropriately front-loaded, though it sacrifices necessary detail for brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema provided, the description omits critical information about what values are returned. It also fails to differentiate from numerous sibling margin/profit calculators on the server (calculate_profit_margin, calculate_exchange_margin, etc.) despite the crowded namespace.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Cost price', 'Selling price'), establishing a baseline score of 3 per rubric. The description adds no additional parameter semantics, syntax details, or usage constraints beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
"Markup vs margin calculator" identifies the general domain (markup and margin calculations) but lacks specificity on what exactly is computed—for example, whether it returns both percentages, converts between them, or calculates their difference. It does not distinguish from siblings like calculate_profit_margin or calculate_exchange_rate_margin.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus alternatives like calculate_profit_margin or calculate_margin. No prerequisites, conditions, or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
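The ambiguity the review flags is easy to pin down: markup divides profit by cost, while margin divides it by selling price. A sketch returning both (the dict shape is an assumption, not the server's output format):

```python
def markup_and_margin(cost: float, selling_price: float) -> dict[str, float]:
    """Markup is profit relative to cost; margin is profit relative to price."""
    profit = selling_price - cost
    return {
        "markup_pct": 100 * profit / cost,
        "margin_pct": 100 * profit / selling_price,
    }
```

Buying at 50 and selling at 100 is a 100% markup but only a 50% margin — exactly the distinction an agent needs to pick the right sibling tool.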
calculate_maternity_leave_fr (C)
French maternity leave duration
| Name | Required | Description | Default |
|---|---|---|---|
| twins | No | Multiple birth | |
| existing_children | Yes | Existing children | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fails to disclose critical behavioral traits: return format (days? weeks? months?), calculation methodology (based on current French labor law?), or how parameters affect the result (e.g., that twins may extend duration). Only 'duration' hints at the output type.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief (4 words) but not efficiently informative. While it avoids redundancy, it undershoots appropriate length for a tool with zero annotations and no output schema, leaving essential behavioral and output questions unanswered.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description should explain the return value format and calculation basis. It fails to complete the picture for an agent needing to know what value is returned and how to interpret it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Multiple birth', 'Existing children'), so the schema adequately documents parameters. The description adds no additional semantic context for parameters, earning the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the domain (French maternity leave) and aspect (duration) but lacks a specific action verb (e.g., 'Calculates' or 'Returns'). It partially distinguishes from siblings by specifying 'French' but relies on the tool name to imply the calculation action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other French calculation tools (e.g., calculate_overtime_fr, calculate_french_salary) or what inputs are required. No alternatives or prerequisites mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
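For reference, the French statutory durations commonly cited are 16 weeks for a first or second child, 26 weeks from the third child onwards, and 34 weeks for twins. A sketch under those rules (verify against current law; triplet and pathological-leave extensions are omitted):

```python
def maternity_leave_weeks(existing_children: int, twins: bool = False) -> int:
    """French statutory maternity leave in weeks, as commonly summarized.

    Illustrative only — check the current Code du travail for edge cases.
    """
    if twins:
        return 34  # twin pregnancy: 12 prenatal + 22 postnatal weeks
    if existing_children >= 2:
        return 26  # third child onwards: 8 prenatal + 18 postnatal weeks
    return 16      # first or second child: 6 prenatal + 10 postnatal weeks
```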
calculate_max_heart_rate (C)
Estimate maximum heart rate using standard or age-adjusted formulas
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | Age in years | |
| formula | No | Formula: standard (220-age), tanaka (men), gulati (women) | standard |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description fails to disclose the output format (integer BPM?), calculation limitations, or the medical disclaimers appropriate for a health-estimation tool, leaving insufficient behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with no redundancy. However, extreme brevity leaves insufficient room for necessary behavioral and contextual disclosure given the tool's medical/fitness domain.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks mention of output format and omits crucial context about gender-specific formula selection. No output schema exists, yet description doesn't compensate by describing return values or relationship to training zone calculations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with detailed formula descriptions ('standard (220-age), tanaka (men), gulati (women)'). The description adds minimal semantic value beyond the schema, meeting the baseline expectation for well-documented schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Estimate' and resource 'maximum heart rate' with method 'standard or age-adjusted formulas.' Distinguishes sufficiently from siblings like calculate_heart_rate_zones by focusing specifically on the max HR value itself rather than training zones.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use specific formulas (standard vs tanaka vs gulati), nor mention that tanaka is designed for men and gulati for women—critical information for correct invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
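The three formulas the schema names are published estimates: the classic 220 − age, Tanaka's 208 − 0.7·age (labeled for men in this schema), and Gulati's 206 − 0.88·age (derived in a female cohort). A sketch:

```python
def max_heart_rate(age: int, formula: str = "standard") -> float:
    """Estimate maximum heart rate in beats per minute.

    Formula labels follow this tool's schema: 'tanaka' for men,
    'gulati' for women, 'standard' as the generic default.
    """
    formulas = {
        "standard": lambda a: 220 - a,
        "tanaka": lambda a: 208 - 0.7 * a,
        "gulati": lambda a: 206 - 0.88 * a,
    }
    return formulas[formula](age)
```

For a 40-year-old, all three land close together (180, 180, and ~170.8 BPM), but the gap widens with age, which is why formula choice matters.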
calculate_meat_cooking (C)
Meat cooking time and temperature
| Name | Required | Description | Default |
|---|---|---|---|
| meat | Yes | Meat type | |
| doneness | No | Doneness | medium |
| weight_kg | Yes | Meat weight kg | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet the description omits critical behavioral context: it doesn't specify the output format (time per side? total duration? internal temp?), calculation method (minutes per kg), or food safety warnings given that undercooked meat poses health risks.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at 5 words, it is not appropriately sized for the domain complexity. The description is a noun phrase rather than a sentence, lacking a verb and failing to front-load critical distinctions from siblings or safety notes.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Inadequate for a food preparation tool where safety is paramount. With no output schema provided, the description should explain what gets returned. The near-duplicate sibling name creates confusion that the description must resolve but doesn't.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Meat type', 'Doneness', 'Meat weight kg'), establishing baseline 3. The tool description adds no semantic context beyond the schema (e.g., no explanation that doneness is ignored for poultry, or weight precision requirements).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The phrase 'Meat cooking time and temperature' identifies the domain but lacks a verb (Calculate? Get? Retrieve?), making the action unclear. Critically, it fails to distinguish from sibling 'calculate_meat_cooking_time', leaving ambiguity about whether this returns additional temperature data or performs a different calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus 'calculate_meat_cooking_time' or 'calculate_cooking_time'. No mention of prerequisites (e.g., weight must be in kg) or safety considerations for proper meat preparation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_meat_cooking_time (C)
Calculate meat cooking time by weight and desired doneness
| Name | Required | Description | Default |
|---|---|---|---|
| doneness | Yes | Desired doneness | |
| meat_type | Yes | Type of meat | |
| weight_kg | Yes | Meat weight kg | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. However, it fails to state cooking method assumptions (roasting vs. grilling), whether resting time is included, the output format (minutes vs. hours), or that results are estimates. This leaves critical behavioral traits undocumented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no redundancy. Front-loaded with the core action and immediately identifies the key input factors. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet the description does not explain what values are returned (time in minutes? total duration? per-side timing?). Combined with missing cooking assumptions and zero annotation coverage, the description is insufficient for safe invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema adequately documents all three parameters (weight_kg, meat_type, doneness). The description mentions 'weight' and 'doneness' explicitly, meeting the baseline expectation when schema coverage is high, but adds no additional semantic context like 'bone-in vs. boneless' or weight validation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('meat cooking time') with specific modifiers ('by weight and desired doneness'). Distinguishes from generic cooking calculators, though does not explicitly differentiate from the sibling 'calculate_meat_cooking' or specify cooking method assumptions (e.g., oven roasting).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus siblings like 'calculate_cooking_time' or 'calculate_meat_cooking', nor does it mention prerequisites such as oven temperature requirements or meat thickness considerations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
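A minutes-per-kilogram lookup keyed on meat type and doneness is the usual shape of such a calculator. The guideline values below are placeholders for illustration only, not the server's data, and no food-safety claim is intended:

```python
# PLACEHOLDER guideline values (minutes per kg at a moderate oven
# temperature) — illustrative only, not the server's actual table.
MIN_PER_KG = {
    ("beef", "rare"): 40,
    ("beef", "medium"): 50,
    ("beef", "well_done"): 60,
    ("chicken", "well_done"): 45,
}

def cooking_minutes(meat_type: str, doneness: str, weight_kg: float) -> float:
    """Total cooking time in minutes for the given cut and target doneness."""
    return MIN_PER_KG[(meat_type, doneness)] * weight_kg
```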
calculate_menstrual_cycle (B)
Calculate next period, fertile window, and ovulation date
| Name | Required | Description | Default |
|---|---|---|---|
| cycle_length | No | Average cycle length days | |
| last_period_date | Yes | Last period start date YYYY-MM-DD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full behavioral disclosure burden. It lists what gets calculated (period, fertile window, ovulation) but lacks critical context: it doesn't disclose that these are estimates based on calendar/algorithmic methods, doesn't mention typical accuracy limitations, and omits output format details or medical disclaimer expectations for health tools.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded with the key outputs and contains no wasted words. However, it may be overly terse for a health-related calculation tool where additional context (estimates, assumptions) would be valuable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter tool with no output schema, the description covers the basic inputs and outputs. However, given this is a biological/health calculation, the description is incomplete—it should mention that results are estimates based on average cycle lengths and may not account for irregular cycles or medical conditions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both 'cycle_length' and 'last_period_date' fully documented in the JSON schema. The description adds no semantic meaning beyond what the schema already provides (average days, date format), warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates three specific outputs: next period, fertile window, and ovulation date, using specific verbs and resources. However, it fails to distinguish from sibling tool 'calculate_ovulation', which appears to overlap functionally with the ovulation calculation aspect of this tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, particularly the sibling 'calculate_ovulation'. There is no mention of prerequisites, assumptions about cycle regularity, or when a simpler ovulation-only calculation might be preferred.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
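The calendar method the review alludes to is simple date arithmetic: next period = last start + cycle length, ovulation ≈ 14 days before that, and the fertile window ≈ 5 days before ovulation through 1 day after. A sketch under those textbook assumptions (estimates only; irregular cycles are not handled):

```python
from datetime import date, timedelta

def cycle_estimates(last_period_date: str, cycle_length: int = 28) -> dict[str, str]:
    """Calendar-method estimates assuming a regular cycle and a
    14-day luteal phase — common textbook assumptions, not medical advice."""
    start = date.fromisoformat(last_period_date)
    next_period = start + timedelta(days=cycle_length)
    ovulation = next_period - timedelta(days=14)
    return {
        "next_period": next_period.isoformat(),
        "ovulation": ovulation.isoformat(),
        "fertile_window_start": (ovulation - timedelta(days=5)).isoformat(),
        "fertile_window_end": (ovulation + timedelta(days=1)).isoformat(),
    }
```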
calculate_mining_profitability (C)
Calculate daily and monthly cryptocurrency mining profitability
| Name | Required | Description | Default |
|---|---|---|---|
| power_watts | Yes | Mining hardware power consumption in watts | |
| block_reward | No | Block reward in coins (default 3.125 BTC post-halving) | |
| hashrate_mhs | Yes | Mining hashrate in MH/s | |
| coin_price_usd | Yes | Current coin price in USD | |
| network_difficulty | Yes | Current network difficulty | |
| electricity_cost_kwh | Yes | Electricity cost per kWh in fiat currency | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose whether this performs live lookups, uses cached data, or is purely deterministic based on inputs. It doesn't explain what 'profitability' entails (revenue minus electricity costs, hardware depreciation, etc.) or that results are estimates based on instantaneous network parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 7 words, front-loaded with the action verb. Efficient structure with no redundancy, though arguably too terse given the lack of annotations and behavioral context; the density is appropriate for the one sentence provided.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, no annotations, and no output schema, the description inadequately explains the calculation methodology, assumptions (like the implied Bitcoin-centric default block_reward of 3.125), or what the return value represents (net profit, gross revenue, break-even timeline).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for all 6 parameters including units (MH/s, watts, USD). The description adds no additional semantic context beyond the schema, but the schema does the heavy lifting adequately. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('cryptocurrency mining profitability') and distinguishes from siblings like 'calculate_crypto_profit_loss' and 'calculate_staking_rewards' by specifying the 'mining' domain. However, it doesn't explicitly clarify this is for Proof-of-Work mining (distinguishing from staking) or mention the specific output timeframe units.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'calculate_staking_rewards' (for PoS) or 'calculate_crypto_profit_loss' (for trading). No prerequisites mentioned, such as needing current network difficulty data or electricity rates.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
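For a SHA-256 (Bitcoin-style) network — which the 3.125 BTC default block reward implies — expected blocks per day follow from hashes per day divided by difficulty × 2³². A sketch under that assumption; whether the server uses this identity is not documented:

```python
def daily_mining_profit(hashrate_mhs: float, power_watts: float,
                        electricity_cost_kwh: float, network_difficulty: float,
                        coin_price_usd: float, block_reward: float = 3.125) -> float:
    """Expected daily profit in USD for a SHA-256 (Bitcoin-style) network.

    Uses the standard expected-value identity: difficulty * 2^32 is the
    expected number of hashes per block found.
    """
    hashes_per_day = hashrate_mhs * 1e6 * 86_400
    blocks_per_day = hashes_per_day / (network_difficulty * 2**32)
    revenue = blocks_per_day * block_reward * coin_price_usd
    electricity = power_watts / 1000 * 24 * electricity_cost_kwh  # kWh * rate
    return revenue - electricity
```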
calculate_moon_phase (B)
Calculate the current moon phase for a given date
| Name | Required | Description | Default |
|---|---|---|---|
| date | Yes | Date in YYYY-MM-DD format | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Calculate' implies a read-only operation, the description omits details about output format (string vs object), timezone handling, astronomical accuracy, or whether it returns phase names, illumination percentages, or degrees. For a tool with zero annotation coverage, this is insufficient behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is appropriately sized and front-loaded with the action verb. Every word earns its place—no redundancy, no tautology, and no unnecessary filler. The structure is optimal for quick comprehension by an agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (1 parameter, no nested objects) and high schema coverage, the description is minimally adequate. However, with no output schema provided, the description should ideally mention what the calculation returns (e.g., phase names like 'Full Moon' or illumination data) to help the agent determine if this tool meets its needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents the 'date' parameter with type, pattern, and description ('Date in YYYY-MM-DD format'). The description adds only 'for a given date,' which aligns with but does not extend beyond the schema documentation. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clear resource ('moon phase'), and the 'for a given date' clause clarifies the single parameter's purpose. It distinguishes from siblings like calculate_day_of_week and calculate_chinese_zodiac by being astronomy-specific. However, it lacks specificity about what the calculation returns (phase names, illumination percentages, etc.), preventing a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or constraints. While siblings cover different domains (taxes, cooking, etc.), the description does not explicitly differentiate this from other date-based calculations like calculate_day_of_week.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
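As the review above notes, the tool documents neither its algorithm nor its output format. A minimal sketch of what a date-to-phase calculation typically looks like, assuming a mean synodic month and phase-name output (both assumptions; the actual tool may use a more precise ephemeris and a different return shape):

```python
from datetime import date

SYNODIC_MONTH = 29.53058867  # mean length of a lunation, in days

def moon_phase(d: date) -> str:
    """Hypothetical reimplementation: map a date to one of eight phase names."""
    # Days elapsed since a reference new moon (2000-01-06, rounded to the day).
    days = (d - date(2000, 1, 6)).days
    frac = (days % SYNODIC_MONTH) / SYNODIC_MONTH  # 0.0 = new, 0.5 = full
    names = ["New Moon", "Waxing Crescent", "First Quarter", "Waxing Gibbous",
             "Full Moon", "Waning Gibbous", "Last Quarter", "Waning Crescent"]
    return names[round(frac * 8) % 8]
```

A description that named the eight phase labels (or stated that it returns illumination percentages instead) would resolve most of the ambiguity the review identifies.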
calculate_moroccan_cnss (Grade A)
Calculate Moroccan CNSS contributions (employee and employer shares)
| Name | Required | Description | Default |
|---|---|---|---|
| gross_monthly_mad | Yes | Gross monthly salary in MAD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully indicates that the calculation produces both employee and employer share amounts, which is valuable behavioral context. However, it omits critical details such as whether CNSS contribution caps (plafonds) are applied automatically or whether the results are gross or net figures.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a highly efficient eight-word sentence with zero waste. The core action and subject are front-loaded, while the parenthetical '(employee and employer shares)' provides essential scoping information without cluttering the main clause.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a single-parameter calculation tool with complete input schema coverage, the description provides adequate context for its intended use. The parenthetical mention of both shares partially compensates for the missing output schema, though explicit mention of CNSS-specific calculation rules or caps would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the parameter 'gross_monthly_mad' is fully documented in the schema itself as 'Gross monthly salary in MAD'. The description adds no additional parameter semantics (such as whether the gross amount should include bonuses or 13th-month pay), warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Calculate), the jurisdiction-specific resource (Moroccan CNSS contributions), and the scope covering both components (employee and employer shares). It effectively distinguishes from siblings like calculate_moroccan_income_tax or calculate_senegalese_css through specific domain terminology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the specificity of 'CNSS' (Caisse Nationale de Sécurité Sociale) provides implied context that this is for social security contributions rather than income tax or VAT, there is no explicit guidance on when to use this versus the available sibling tools for Moroccan tax calculations or other countries' social contribution systems.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_moroccan_income_tax (Grade A)
Calculate Moroccan income tax (IR) using DGI progressive brackets with family deductions
| Name | Required | Description | Default |
|---|---|---|---|
| dependents | No | Number of dependents (360 MAD deduction each, max 6) | |
| annual_income_mad | Yes | Annual gross income in Moroccan Dirhams (MAD) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses calculation methodology ('DGI progressive brackets', 'family deductions') but omits behavioral details like return format, error conditions, currency handling specifics, or whether results include marginal vs effective rates.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of appropriate length with zero waste. Front-loaded action verb ('Calculate') immediately establishes purpose, followed by specific jurisdiction and methodology details. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and only 2 parameters, the input requirements are well documented. However, lacking both an output schema and annotations, the description should ideally disclose what gets returned (tax amount, deductions total, brackets applied) to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds semantic context by linking 'family deductions' to the dependents parameter conceptually, but does not expand parameter syntax, validation rules, or input formats beyond what the schema already documents.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' with clear resource 'Moroccan income tax (IR)', jurisdiction-specific details 'DGI', and mechanism 'progressive brackets with family deductions' clearly distinguish this from sibling tax calculators like calculate_belgian_income_tax or calculate_french_income_tax.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage context through specific jurisdiction naming ('Moroccan'), allowing the agent to infer this is for Morocco-specific tax calculations. However, lacks explicit when-to-use guidance or comparisons against the numerous other national tax calculators in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_moroccan_profit_foncier (Grade B)
Calculate Moroccan property income tax (profit foncier / revenus fonciers)
| Name | Required | Description | Default |
|---|---|---|---|
| dependents | No | Number of dependents for family deduction | |
| expenses_pct | No | Deductible expenses as % of rent (default 40%) | |
| annual_rent_mad | Yes | Annual rental income in MAD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, requiring the description to carry the full burden of behavioral disclosure. While 'Calculate' implies a read-only operation, the description omits critical details such as applicable tax year, specific Moroccan tax brackets or deductions applied, whether the calculation includes social contributions, or the structure of the returned values.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single efficient sentence that front-loads the action verb, jurisdiction, tax type, and local legal terminology. There is no redundant text, tautologies, or extraneous information—every word serves the purpose of identifying the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's limited complexity (three primitive parameters with complete schema documentation) and straightforward calculation purpose, the description provides minimal but sufficient context for tool selection. However, the absence of an output schema and lack of behavioral details (tax rules applicability) prevent a higher score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With schema description coverage at 100%, all three parameters (annual_rent_mad, dependents, expenses_pct) are fully documented in the JSON schema with clear descriptions. The description adds no additional parameter semantics (syntax, format, or interdependencies) beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and identifies the exact tax domain ('Moroccan property income tax'). It implicitly distinguishes from sibling tools like calculate_moroccan_income_tax by specifying 'property' income and including the French terms 'profit foncier / revenus fonciers', though it does not explicitly reference alternative tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to select this tool versus related Moroccan tax calculators (e.g., calculate_moroccan_income_tax for general employment income). It lacks prerequisites, conditions for use, or contextual triggers that would help an agent choose between the multiple Moroccan tax tools available.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_moroccan_vat (Grade B)
Calculate Moroccan VAT (TVA) at standard or reduced rates
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Input mode: ht=hors taxe, ttc=toutes taxes comprises | ht |
| rate | No | VAT rate: 0%, 7%, 10%, 14%, or 20% (standard) | 20 |
| amount | Yes | Amount in MAD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden of disclosure. It mentions standard/reduced rates but fails to disclose output format, whether calculations include rounding rules, or if any external lookups occur. 'Calculate' implies pure computation but lacks confirmation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 9 words. Front-loaded with verb. Every word earns its place with zero redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculation tool with well-documented schema. However, lacks output description (what fields return: VAT amount, total, etc.) and specific guidance on which Moroccan goods/services qualify for reduced rates versus standard.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions (ht/ttc modes, specific rate percentages, MAD currency). Description adds 'standard or reduced rates' which aligns with the rate parameter options, but doesn't extend meaning beyond the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate), resource (Moroccan VAT/TVA), and scope (standard or reduced rates). Effectively distinguishes from sibling VAT calculators by specifying Moroccan jurisdiction and rate types.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this versus calculate_vat_generic, calculate_vat_reverse, or other country-specific VAT tools. User must infer from the name/description that this is specifically for Moroccan tax rates.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
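The HT/TTC arithmetic implied by the mode and rate parameters is standard; a sketch follows, with the caveat that the returned field names are assumptions, since the tool publishes no output schema:

```python
def moroccan_vat(amount: float, rate: float = 20.0, mode: str = "ht") -> dict:
    """Hypothetical reimplementation of the HT/TTC conversion."""
    r = rate / 100
    if mode == "ht":              # amount is hors taxe (excludes VAT)
        ht = amount
        vat = ht * r
    else:                         # "ttc": amount is toutes taxes comprises
        ht = amount / (1 + r)
        vat = amount - ht
    return {"ht": round(ht, 2), "vat": round(vat, 2), "ttc": round(ht + vat, 2)}
```

For example, 1000 MAD hors taxe at the default 20% rate yields 200 MAD of VAT and 1200 MAD TTC; passing 1200 MAD with mode="ttc" recovers the same breakdown.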
calculate_mortgage (Grade B)
Calculate mortgage/loan monthly payment, total cost, and optional amortization schedule
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Loan duration in years | |
| principal | Yes | Loan amount in currency units | |
| annual_rate | Yes | Annual interest rate in % | |
| with_schedule | No | Include first 12 months + last month amortization | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden of behavioral disclosure. It mentions the three outputs (monthly payment, total cost, amortization schedule) and hints at the optional nature of the schedule, but lacks information on currency handling, validation behavior, or whether results are stored/tracked.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no waste. It front-loads the core functionality (calculate mortgage/loan) followed by the specific outputs, making it immediately scannable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage, the description doesn't need to elaborate on inputs. It partially compensates for the missing output schema by listing the three calculation results. However, it lacks differentiation from related tools and doesn't specify the return format structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all four parameters. The description adds minimal semantic value beyond mapping 'optional amortization schedule' to the with_schedule parameter, which is sufficient given the comprehensive schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the calculation outputs (monthly payment, total cost, amortization schedule) using specific verbs and resources. However, given siblings like 'calculate_loan_payment' and 'calculate_us_mortgage', it fails to distinguish when to use this specific tool versus alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus the numerous sibling calculation tools (e.g., calculate_loan_payment, calculate_annuity_payment, calculate_us_mortgage). No prerequisites, exclusions, or alternative recommendations are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
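The monthly-payment figure is presumably the standard fixed-rate annuity formula; a hedged sketch of the two headline outputs (the tool does not confirm its method, rounding rules, or return structure):

```python
def monthly_payment(principal: float, annual_rate: float, years: int) -> float:
    """Standard annuity formula (assumed, not confirmed by the tool)."""
    n = years * 12              # total number of monthly payments
    r = annual_rate / 100 / 12  # monthly interest rate as a fraction
    if r == 0:
        return principal / n    # zero-interest edge case
    return principal * r / (1 - (1 + r) ** -n)

def total_cost(principal: float, annual_rate: float, years: int) -> float:
    """Total amount repaid over the life of the loan."""
    return monthly_payment(principal, annual_rate, years) * years * 12
```

On 200,000 at 4% over 25 years this gives roughly 1,055.67 per month; whether the tool rounds each payment or only the totals is exactly the kind of detail the review finds missing.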
calculate_mortgage_insurance (Grade C)
Calculate mortgage insurance (assurance emprunteur) cost
| Name | Required | Description | Default |
|---|---|---|---|
| rate_pct | No | Annual insurance rate in % of loan (default 0.36) | |
| loan_amount | Yes | Loan amount in EUR | |
| duration_years | Yes | Loan duration in years | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It fails to indicate what the calculation returns (total insurance cost? monthly premium? annual cost?) or whether this is a safe read-only operation, though 'Calculate' implies non-destructive behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at six words. Front-loaded with verb. However, the extreme brevity leaves gaps in contextual completeness; the single sentence could have included output specification without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter calculation tool with no output schema, the description inadequately explains what value is returned (total cost over duration? monthly?). The domain is identified but the calculation's output semantics are missing, which is critical for financial calculator selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. The description mentions 'assurance emprunteur' which contextualizes the domain, but the schema already documents parameters sufficiently ('Annual insurance rate', 'Loan amount in EUR', etc.) without needing additional description support.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and resource ('mortgage insurance cost'), with helpful French specificity ('assurance emprunteur') indicating French borrower insurance context. However, it does not explicitly differentiate from the sibling tool 'calculate_insurance_estimate' or 'calculate_mortgage'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus the 200+ other calculators, particularly 'calculate_insurance_estimate' or 'calculate_mortgage'. No prerequisites or constraints mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
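The review flags that the output semantics (monthly vs. total) are undocumented. Under the common French convention of a flat annual rate applied to the initial loan amount (an assumption; some insurers rate on the outstanding balance instead), the arithmetic is:

```python
def insurance_cost(loan_amount: float, duration_years: int,
                   rate_pct: float = 0.36) -> dict:
    """Hypothetical reimplementation; shows both the monthly and total
    figures since the tool does not say which it returns."""
    annual = loan_amount * rate_pct / 100   # flat rate on the initial capital
    return {
        "monthly_eur": round(annual / 12, 2),
        "total_eur": round(annual * duration_years, 2),
    }
```

A 200,000 EUR loan over 20 years at the 0.36% default comes to 60 EUR per month, or 14,400 EUR over the full term.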
calculate_motor_torque (Grade C)
Motor torque from power and RPM
| Name | Required | Description | Default |
|---|---|---|---|
| rpm | Yes | RPM | |
| power_w | Yes | Power watts | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to describe the output format, units (presumably Newton-meters), mathematical formula used, or any validation behavior beyond the schema's minimum values. It does not indicate whether this is a pure computation or has any side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief at only six words. While it wastes no space on filler, it is arguably too terse—lacking essential context such as output units or a complete sentence structure. Every word earns its place, but the overall brevity compromises informational completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool with primitive types and no output schema, the description minimally suffices by identifying the core formula relationship. However, it omits the output unit (critical for a calculation tool) and provides no hint about the returned data structure, which is necessary given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Power watts', 'RPM'), establishing a baseline of 3. The description adds minimal semantic value beyond the schema, merely stating that these parameters are used to derive torque. It does not clarify why minimum values exist (e.g., RPM ≥ 1 to avoid division by zero) or provide usage examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the specific calculation (motor torque) and inputs (power, RPM), functioning as a shorthand formula reference. However, it lacks an explicit verb ('Calculate') and fails to differentiate from physics-related siblings like calculate_force or calculate_energy_physics, leaving ambiguity about when to select this specific tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides absolutely no guidance on when to use this tool versus alternatives. There is no mention of prerequisites, no exclusion criteria, and no pointer to related tools like calculate_cycling_power or calculate_pump_power that might be confused with this motor-specific calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
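The underlying physics is unambiguous even if the tool's output units are not: torque T = P / ω, with ω the angular velocity. A sketch assuming newton-metre output (the unit the tool most likely uses, though it never says so):

```python
import math

def motor_torque_nm(power_w: float, rpm: float) -> float:
    """T = P / omega, with omega = 2*pi*rpm/60. Output assumed to be N*m."""
    omega = 2 * math.pi * rpm / 60  # angular velocity in rad/s
    return power_w / omega
```

A 1 kW motor at 3000 rpm produces about 3.18 N·m; the rpm >= 1 schema minimum the review mentions guards the division here.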
calculate_moving_cost_detailed (Grade C)
Estimate detailed moving cost based on volume, distance and floor
| Name | Required | Description | Default |
|---|---|---|---|
| floor | No | Floor number (default 0 = ground floor) | |
| elevator | No | Whether elevator is available (default true) | |
| volume_m3 | Yes | Volume of goods to move in m3 | |
| distance_km | Yes | Moving distance in km | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify what 'detailed' entails (itemized breakdown vs. total), the return format, currency used, or calculation methodology. It does not indicate whether this is a pure computation or requires external service calls.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is efficiently structured with the action front-loaded. However, given the lack of annotations and output schema, the extreme brevity becomes a liability rather than a virtue, leaving critical behavioral and output questions unanswered.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 4-parameter tool with no annotations and no output schema, the description is insufficient. It fails to describe the output structure (currency amount? cost breakdown object?), does not acknowledge the 'elevator' parameter in the prose, and leaves the meaning of 'detailed' unexplained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions three parameters (volume, distance, floor) but omits 'elevator' entirely, despite it being a significant cost factor. No additional semantic context (e.g., expected precision, measurement standards) is provided beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Estimate' with clear resource 'moving cost' and identifies key inputs (volume, distance, floor). However, it could better differentiate from sibling 'calculate_moving_volume' (which calculates volume, not cost) and does not explain what makes this calculation 'detailed' versus a simple estimate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided on when to use this tool versus alternatives, prerequisites for use (e.g., needing exact measurements), or when not to use it. The agent must infer applicability solely from the parameter list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_moving_volume (Grade B)
Estimate moving volume by home type
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Home type | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. While it indicates the result is an 'Estimate' (implying approximation), it fails to disclose the output units (cubic meters/feet), calculation methodology, accuracy level, or what constitutes standard furniture for each home type.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is exactly six words, front-loaded with the action and object, and contains no redundancy. Every word earns its place for this simple single-parameter tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a simple calculator with one enum parameter and no output schema, the description is minimally adequate. However, it lacks critical context for a calculation tool: the units of volume returned and the basis of the estimation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 100% description coverage ('Home type') and the parameter enum values are self-explanatory. The description adds minimal semantic value beyond the schema, merely reinforcing that the input is home type. Baseline score appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Estimate') and resource ('moving volume') and implies the input method ('by home type'). It implicitly distinguishes from sibling calculate_moving_cost_detailed by focusing on volume rather than cost, though it doesn't explicitly name the alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus calculate_moving_cost_detailed or other moving-related calculators. There are no prerequisites, conditions, or exclusion criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_net_worth (grade C)
Calculate net worth and debt ratio from assets and liabilities
| Name | Required | Description | Default |
|---|---|---|---|
| assets_total | Yes | Total assets EUR | |
| liabilities_total | Yes | Total liabilities EUR | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description must carry the full disclosure burden. While it names the outputs (net worth, debt ratio), it does not indicate calculation methodology, currency handling beyond the schema's EUR mention, idempotency, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single ten-word sentence efficiently conveys core purpose. Front-loaded with action verb 'Calculate'. Appropriate length for a simple tool, but leaves room for additional behavioral context given the lack of annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for low-complexity tool with complete input schema. Mentions output metrics (debt ratio, net worth) compensating partially for missing output schema, but lacks behavioral safety disclosures expected when annotations are absent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions 'Total assets EUR' and 'Total liabilities EUR'. Description maps parameters to 'assets and liabilities' but adds no syntax details, validation explanations, or format examples beyond schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculations performed (net worth and debt ratio) and input data required (assets and liabilities). Distinguishes from sibling debt calculators by specifying dual output, though lacks explicit comparison to calculate_debt_to_income or similar tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus other financial calculators like calculate_debt_to_income or calculate_loan_to_value. No prerequisites or constraints mentioned beyond parameter schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
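The assessments above note that the description names its two outputs without defining them. A minimal sketch of the implied arithmetic, assuming "debt ratio" means liabilities divided by assets (the server's actual formula is undisclosed):

```python
def calculate_net_worth(assets_total: float, liabilities_total: float) -> dict:
    """Net worth and debt ratio from EUR totals (assumed formulas)."""
    net_worth = assets_total - liabilities_total
    # Debt ratio assumed to be liabilities over assets; guard division by zero.
    debt_ratio = liabilities_total / assets_total if assets_total else float("inf")
    return {"net_worth": net_worth, "debt_ratio": round(debt_ratio, 4)}
```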
calculate_night_shift_pay (grade A)
Calculate night shift pay (21h-6h) with configurable premium percentage
| Name | Required | Description | Default |
|---|---|---|---|
| night_hours | Yes | Number of night hours worked (21h-6h) | |
| premium_pct | No | Night shift premium percentage (default 25%) | |
| base_hourly_rate | Yes | Normal hourly rate in euros | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It successfully discloses the temporal domain constraint (21h-6h) which defines what constitutes 'night'. However, lacks disclosure of output format, currency precision, or calculation methodology (e.g., whether result is gross/taxable). No indication of side effects or persistence, though 'Calculate' implies read-only.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Exceptionally tight: 9 words total. Every segment earns its place—action verb, target resource, temporal constraint, and key parameter feature. Front-loaded structure with no redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a simple 3-parameter calculation tool with flat schema. Since no output schema exists, could benefit from explicitly stating the return value represents total night shift pay amount. However, the tool name and schema make the output reasonably inferable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. Description adds value by framing 'premium percentage' as 'configurable' (highlighting optionality) and reinforcing the '21h-6h' scope for night_hours. Provides conceptual context that ties the three parameters together into a cohesive pay calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specific verb ('Calculate') + resource ('night shift pay') + precise scope ('21h-6h') that distinguishes it from generic overtime calculators. The mention of 'configurable premium percentage' clarifies the specific calculation method used.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implied usage is clear through the specific time range (21h-6h), but no explicit guidance on when to use versus siblings like 'calculate_overtime_pay_fr' or 'calculate_overtime_fr'. No prerequisites or exclusion criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
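The description's 'configurable premium percentage' suggests a straightforward markup on the base rate. A hedged sketch, assuming the premium applies uniformly to every night hour (the server may differ, e.g. on rounding):

```python
def calculate_night_shift_pay(night_hours: float, base_hourly_rate: float,
                              premium_pct: float = 25.0) -> float:
    # Hours worked between 21h and 6h, paid at base rate plus the premium.
    return night_hours * base_hourly_rate * (1 + premium_pct / 100)
```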
calculate_notary_fees (grade C)
Calculate French notary fees (frais de notaire) for a real estate purchase
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Property type: ancien (old) or neuf (new) | ancien |
| price | Yes | Purchase price in euros | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It fails to disclose whether this is a read-only calculation, what components the fee calculation includes (taxes vs. emoluments), or the structure/format of returned values.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with no filler. However, given the lack of annotations and output schema, the extreme brevity leaves critical gaps in behavioral disclosure that a slightly longer description could address.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Identifies the specific French legal/financial domain adequately, but lacks completeness given the complexity: no explanation of calculation methodology, output format, or how it differs from the 'detailed' variant. Minimum viable for a 2-parameter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Purchase price in euros', 'Property type: ancien (old) or neuf (new)'). Description adds minimal semantic value beyond confirming the real estate domain context, maintaining the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Calculate' with specific resource 'French notary fees (frais de notaire)' and context 'real estate purchase'. However, it fails to distinguish from sibling tool 'calculate_notary_fees_detailed', which is likely a more comprehensive alternative.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus 'calculate_notary_fees_detailed' or other real estate calculators. No mention of prerequisites, estimation disclaimers, or French regional applicability.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
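Since the description discloses no methodology, the best an agent can do is a rule of thumb. The rates below are commonly cited approximations (roughly 8% for existing property, 3% for new builds), not the server's actual schedule:

```python
# Rule-of-thumb rates; the server's real breakdown (DMTO, emoluments,
# disbursements) is undisclosed, so this is only an approximation.
NOTARY_RATE = {"ancien": 0.08, "neuf": 0.03}

def calculate_notary_fees(price: float, property_type: str = "ancien") -> float:
    # The server calls this parameter 'type'; renamed here to avoid the builtin.
    return round(price * NOTARY_RATE[property_type], 2)
```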
calculate_notary_fees_detailed (grade B)
Estimate French notary fees breakdown for property purchase
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Property type | |
| department | No | French department code (optional, affects DMTO rate) | |
| property_price | Yes | Property price EUR | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses 'Estimate' (approximate nature) and 'breakdown' (detailed component output), but lacks details on what specific fee components are included, calculation methodology, or limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of eight words with zero waste. Key information (estimate, French notary fees, breakdown, property purchase) is front-loaded efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of French notary fees (DMTO, emoluments, disbursements) and lack of output schema, mentioning 'breakdown' provides minimal output guidance. Adequate but missing explanation of specific cost components covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear enum values ('new'/'old') and optional flags. Description adds no parameter-specific semantics beyond schema, which is acceptable given the high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action ('Estimate'), subject ('French notary fees breakdown'), and context ('property purchase'). However, it fails to distinguish from sibling tool 'calculate_notary_fees', leaving ambiguity about which to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this detailed version versus the sibling 'calculate_notary_fees'. No prerequisites or conditions mentioned (e.g., valid French department codes required).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_number_base_convert (grade C)
Convert a number between bases 2, 8, 10, and 16
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Number to convert as string | |
| to_base | Yes | Target base | |
| from_base | Yes | Source base | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully specifies valid input bases (2, 8, 10, 16) but omits error handling behavior, output format, and validation rules (e.g., what happens if base 3 is requested).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded with the verb 'Convert' and wastes no words. However, given the presence of a similarly-named sibling tool and lack of annotations, the extreme brevity leaves critical gaps rather than earning full marks for efficient communication.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having well-documented parameters in the schema, the description is incomplete due to the unresolved conflict with sibling 'calculate_base_converter'. Without distinguishing these tools or describing output format, the agent lacks sufficient context to invoke the tool confidently.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage, establishing a baseline of 3. The description mentions the four valid bases which implicitly constrains the semantic meaning of 'from_base' and 'to_base', but does not explicitly elaborate on parameter syntax or formats beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the specific action (convert) and resource (number bases) and enumerates supported bases (2, 8, 10, 16). However, it fails to differentiate from sibling tool 'calculate_base_converter' which has nearly identical functionality, creating ambiguity about which tool to select.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, particularly the confusingly similar 'calculate_base_converter'. No prerequisites, constraints, or selection criteria are mentioned despite the crowded sibling namespace.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
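The conversion itself is unambiguous even if the tool's error handling is not. A self-contained sketch of the documented behavior, with the base-3 question from the assessment answered by explicit validation (the server's actual error behavior is unknown):

```python
def calculate_number_base_convert(value: str, from_base: int, to_base: int) -> str:
    if from_base not in (2, 8, 10, 16) or to_base not in (2, 8, 10, 16):
        raise ValueError("supported bases are 2, 8, 10 and 16")
    n = int(value, from_base)  # parse the digit string in its source base
    if n == 0:
        return "0"
    digits = "0123456789abcdef"
    out = []
    while n:
        n, r = divmod(n, to_base)
        out.append(digits[r])
    return "".join(reversed(out))
```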
calculate_ohms_law (grade B)
Ohm's law: V=IR, P=VI. Solve for missing value
| Name | Required | Description | Default |
|---|---|---|---|
| current | No | Amps | |
| voltage | No | Volts | |
| resistance | No | Ohms | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the key algorithmic behavior ('Solve for missing value'), indicating it infers which variable to calculate. However, it fails to specify the return format, which parameter is returned, or validation constraints (exactly 2 inputs required).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at two sentences. No wasted words, but the mention of P=VI without context slightly detracts from efficiency. Front-loads the concept (Ohm's law) before the action, which is appropriate.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and the specific constraint that exactly two inputs are required to solve for the third, the description is incomplete. It does not explain what the tool returns (voltage, current, or resistance?), nor does it clarify the P=VI reference or error conditions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (Amps, Volts, Ohms), establishing a baseline of 3. The description references V, I, and R in the formula, which maps to the parameters, but adds no additional semantic detail about valid ranges, units, or the requirement to provide pairs of values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the specific physics law (V=IR) and core operation ('Solve for missing value'), which distinguishes it from generic calculators. However, confusingly mentions power formula (P=VI) without clarifying if power is calculated or how it relates to the three input parameters, slightly muddling the scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus siblings like `calculate_electrical_power` or `calculate_cable_section`. Does not mention prerequisites (e.g., needing exactly two of three values) or when the tool cannot solve (e.g., all three values provided, or zero provided).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
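The assessments flag two gaps: the exactly-two-inputs constraint and the unexplained P=VI mention. One plausible reading, with the missing value solved and power reported alongside (whether the server actually returns power is unconfirmed):

```python
def calculate_ohms_law(voltage=None, current=None, resistance=None) -> dict:
    """Solve V = I * R for the missing value; also report power P = V * I."""
    known = sum(v is not None for v in (voltage, current, resistance))
    if known != 2:
        raise ValueError("provide exactly two of voltage, current, resistance")
    if voltage is None:
        voltage = current * resistance
    elif current is None:
        current = voltage / resistance
    else:
        resistance = voltage / current
    return {"voltage": voltage, "current": current,
            "resistance": resistance, "power": voltage * current}
```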
calculate_one_rep_max (grade A)
Estimate 1 repetition maximum from submaximal lift using Epley, Brzycki and Lombardi formulas
| Name | Required | Description | Default |
|---|---|---|---|
| reps | Yes | Number of repetitions performed | |
| weight_lifted | Yes | Weight lifted in kg or lbs | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It transparently discloses the calculation methodology via named formulas, but omits output format details, mutability, or rate limits that would help an agent understand execution behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single dense sentence with zero waste. Front-loaded with the action ('Estimate'), followed by method and domain context. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter tool with complete schema coverage, but lacks return value description (no output schema exists) and misses opportunity to contrast with related 'calculate_1rm_table' sibling tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear descriptions for 'weight_lifted' and 'reps'. Description adds domain context ('submaximal lift') but doesn't extend parameter semantics beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Estimate' with specific resource '1 repetition maximum'. Naming the specific formulas (Epley, Brzycki, Lombardi) adds methodology distinction, though it doesn't explicitly differentiate from sibling 'calculate_1rm_table'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage context via 'from submaximal lift' (suggesting when max testing isn't possible), but provides no explicit when-to-use/when-not-to-use guidance or comparison to the table-generating sibling.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
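The three named estimators are standard published formulas, so the calculation is reproducible even though the server's output shape is not documented. A sketch returning all three estimates:

```python
def calculate_one_rep_max(weight_lifted: float, reps: int) -> dict:
    """Standard 1RM estimators; results are in the same unit as the input."""
    return {
        "epley": weight_lifted * (1 + reps / 30),        # w * (1 + r/30)
        "brzycki": weight_lifted * 36 / (37 - reps),     # w * 36 / (37 - r)
        "lombardi": weight_lifted * reps ** 0.10,        # w * r^0.10
    }
```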
calculate_overtime_fr (grade C)
French overtime pay calculation
| Name | Required | Description | Default |
|---|---|---|---|
| base_hours | No | Base weekly hours | |
| hourly_rate | Yes | Hourly rate EUR | |
| actual_hours | Yes | Actual weekly hours | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides minimal information. It does not specify the calculation methodology (e.g., 25% increase for first 8 hours), return value format, or whether the operation has side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At four words, the description is extremely concise, but this brevity results in underspecification rather than efficient communication. It functions as a label rather than a descriptive guideline, failing to front-load critical behavioral details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the specialized domain (French labor law), lack of output schema, and absence of annotations, the description should explain the specific overtime calculation rules and return structure. It provides none of this behavioral context, leaving significant gaps for an agent invoking the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no parameter-specific semantics, but the schema adequately documents 'base_hours', 'hourly_rate', and 'actual_hours' without requiring additional elaboration.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'French overtime pay calculation' clearly identifies the domain (French labor law) and operation (overtime pay calculation), but fails to distinguish from the nearly identical sibling tool 'calculate_overtime_pay_fr', leaving ambiguity about which tool to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidelines provided. The description does not indicate when to use this tool versus 'calculate_overtime_pay_fr', prerequisites for inputs, or any constraints on the calculation context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_overtime_pay_fr (grade A)
Calculate French overtime pay: first 8h at +25%, beyond 8h at +50% (weekly threshold 35h)
| Name | Required | Description | Default |
|---|---|---|---|
| overtime_hours | Yes | Total overtime hours worked beyond 35h/week | |
| base_hourly_rate | Yes | Normal hourly rate in euros | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It successfully reveals the core behavioral logic by specifying the exact calculation algorithm (25% premium for first 8 hours, 50% beyond). However, it omits output format details, potential rounding behavior, or validation edge cases that would be helpful for a payroll calculation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, dense sentence with zero filler. It front-loads the action ('Calculate French overtime pay'), follows with the specific calculation rules, and ends with the threshold context. Every clause serves a distinct purpose—defining scope, algorithm, and constraints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 2-parameter input structure with complete schema coverage and no output schema, the description sufficiently covers the business logic needed to use the tool. However, it omits mention of the return value (presumably total overtime pay in euros), which would complete the contract for an agent determining how to use the result.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (both 'base_hourly_rate' and 'overtime_hours' are well-documented in the schema), the description appropriately does not redundantly repeat parameter details. It adds jurisdictional context ('French') that helps interpret the parameters, meeting the baseline expectation for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Calculate French overtime pay'), the jurisdiction-specific rules ('first 8h at +25%, beyond 8h at +50%'), and the threshold context ('weekly threshold 35h'). It distinguishes itself from generic overtime calculators and siblings like 'calculate_overtime_fr' by explicitly focusing on monetary pay calculations with specific French legal rates.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context about what the tool calculates (tiered overtime pay), but lacks explicit guidance on when to use this versus siblings like 'calculate_overtime_fr' or 'calculate_french_salary'. The specific business logic implies usage for French payroll calculations, but no explicit when/when-not instructions are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
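Unusually for this server, the description states the full algorithm, so it can be mirrored directly: overtime beyond the 35h week is paid at +25% for the first 8 hours and +50% thereafter (rounding behavior is still an assumption):

```python
def calculate_overtime_pay_fr(overtime_hours: float, base_hourly_rate: float) -> float:
    """Hours beyond the 35h week: first 8 at +25%, the rest at +50%."""
    first_tier = min(overtime_hours, 8)
    second_tier = max(overtime_hours - 8, 0)
    return first_tier * base_hourly_rate * 1.25 + second_tier * base_hourly_rate * 1.50
```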
calculate_ovulation (grade B)
Calculate ovulation date and fertile window from last period and cycle length
| Name | Required | Description | Default |
|---|---|---|---|
| cycle_length | No | Menstrual cycle length in days | |
| last_period_date | Yes | YYYY-MM-DD — First day of last menstrual period | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. While 'Calculate' implies a read-only mathematical operation, the description lacks critical behavioral details for a health tool: output format/structure, medical disclaimers about estimation accuracy, or assumptions like standard luteal phase lengths.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently structured with the verb first ('Calculate'), followed by outputs and inputs. No redundant words or tautologies; every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter calculation, but gaps remain: no output schema is provided, and the description fails to specify return value format (dates, ranges) or acknowledge the estimation nature of ovulation calculation expected for health-related tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions 'last period and cycle length' which aligns with the schema parameters, but adds no additional semantic context (e.g., validation rules, default handling) beyond what the JSON schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate') and clear resources ('ovulation date and fertile window'), distinguishing it from pregnancy or general menstrual tools in the sibling list. However, it does not explicitly differentiate from close siblings like 'calculate_menstrual_cycle' or 'calculate_due_date'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., when trying to conceive vs. tracking periods) or prerequisites (e.g., regular cycles). It fails to mention that 'calculate_due_date' should be used for pregnancy estimation instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_paint_neededCInspect
Paint quantity for walls
| Name | Required | Description | Default |
|---|---|---|---|
| coats | No | Number of coats | |
| area_m2 | Yes | Wall area m² | |
| coverage | No | Coverage m²/liter |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet the description offers no behavioral context: it is silent on output format (liters, cans), the calculation formula, side effects, and whether results account for doors or windows.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at 4 words. Efficiently front-loaded with the core concept, though extreme brevity sacrifices necessary context about tool behavior.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Insufficient for a calculation tool with no output schema. Omits critical context: relationship to similar tools, output units/format, and calculation methodology (e.g., simple division vs. waste factor).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
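The "simple division vs. waste factor" question above is exactly what a one-line formula in the description would settle. A minimal sketch, assuming plain division with no waste factor and rounding up to the next half litre (the defaults for coats and coverage are guesses, since the schema documents neither):

```python
import math

def paint_litres(area_m2: float, coats: int = 2, coverage: float = 10.0) -> float:
    """Litres of paint for a wall area.

    Assumed formula: area * coats / coverage, no waste factor,
    rounded up to the next 0.5 L. None of this is confirmed by
    the tool description.
    """
    litres = area_m2 * coats / coverage
    return math.ceil(litres * 2) / 2  # round up to the next half litre
```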
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (area_m2, coats, coverage fully documented), establishing baseline 3. Description adds no additional parameter context, syntax examples, or unit clarifications.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the domain (paint quantity) and scope (walls) but lacks a specific action verb and fails to differentiate from sibling tool 'calculate_paint_quantity'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus 'calculate_paint_quantity' or other alternatives. No mention of prerequisites like needing wall measurements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_paint_quantityCInspect
Calculate paint needed for a surface
| Name | Required | Description | Default |
|---|---|---|---|
| coats | No | Coats | |
| area_m2 | Yes | Area in m² | |
| coverage | No | m²/liter |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full disclosure burden. It omits output units (liters? gallons?), calculation methodology, and assumptions (e.g., a standard waste factor). While 'Calculate' implies a read-only computation, the description gives no other behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero redundancy. However, given the missing annotations and output schema and the unresolved sibling conflict, the extreme brevity harms completeness rather than demonstrating effective prioritization.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Missing critical information given no output schema: the return value format/units are undefined. Also lacks sibling differentiation and behavioral details needed for a calculation tool with no annotation coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('Area in m²', 'm²/liter'). The description adds no parameter-specific guidance beyond what the schema already provides, warranting baseline score 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb+resource ('Calculate paint needed for a surface') but fails to distinguish from sibling 'calculate_paint_needed', which appears to be a functional duplicate. Without differentiation, an agent cannot select the correct tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. Critically missing given the existence of 'calculate_paint_needed' as a sibling tool with seemingly identical purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_paper_size_convertCInspect
Get dimensions of standard paper formats
| Name | Required | Description | Default |
|---|---|---|---|
| format | Yes | Paper format name |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It fails to disclose the return format (an object with width/height? units in mm or inches?), whether the operation is read-only (though implied by 'Get'), or error-handling behavior. A significant gap for a lookup tool with no output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at six words with zero redundancy. Front-loaded and efficient, though arguably underspecified given the lack of output schema and annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimum viable for a single-parameter lookup tool. Lacks explanation of return values which would be crucial given no output schema exists, but the simple enum-based input makes the tool's scope clear.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
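The "return format" gap flagged above could be closed with a single example in the description. A hypothetical sketch of what the tool likely does, assuming ISO 216 portrait dimensions in millimetres (the actual supported format names and output units are undocumented):

```python
# ISO 216 portrait dimensions in millimetres -- the tool's actual
# format list and units are assumptions, not documented behavior.
ISO_216_MM = {
    "A3": (297, 420),
    "A4": (210, 297),
    "A5": (148, 210),
}

def paper_dimensions(fmt: str) -> tuple[int, int]:
    """Return (width, height) in mm for a standard paper format."""
    return ISO_216_MM[fmt.upper()]
```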
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a well-documented enum parameter. The description adds no specific syntax guidance beyond the schema, meeting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Get' and resource 'dimensions of standard paper formats', clearly distinguishing from sibling size converters (clothing, shoe, ring) by specifying the paper domain. However, it doesn't reconcile the 'convert' implication in the tool name with the 'Get' action in the description.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus general area/distance converters (convert_area, convert_distance) or other size calculation tools. No mention of prerequisites or expected use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_parcoursup_pointsCInspect
Estimate Parcoursup admission score
| Name | Required | Description | Default |
|---|---|---|---|
| bac_average | Yes | Expected/actual bac average (/20) | |
| option_bonus | No | Bonus points from options |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden but offers minimal behavioral context. While 'Estimate' suggests a projection rather than exact calculation, it omits calculation methodology, output format, domain limitations (French higher education), and whether results are advisory or official.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at four words with no redundancy. However, the brevity sacrifices necessary context; while efficient, it under-delivers for a specialized tool that requires domain clarification.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks essential context for a specialized French university admissions tool: no explanation of Parcoursup system, no output schema description (crucial given no output_schema exists), no sibling differentiation, and no indication that this is France-specific. The 100% schema coverage for inputs does not compensate for missing domain and output context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both 'bac_average' and 'option_bonus' are well-documented in the schema), establishing a baseline of 3. The description adds no additional parameter context beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the tool estimates a Parcoursup admission score with a clear verb and resource. However, it fails to distinguish from the sibling tool 'calculate_parcoursup_score' (nearly identical name) or clarify whether this calculates 'points' vs 'score', leaving ambiguity about which tool to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided on when to use this tool versus the similar 'calculate_parcoursup_score' or other education calculators like 'calculate_bac_points'. Users cannot determine selection criteria or prerequisites from the description alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_parcoursup_scoreBInspect
Estimate Parcoursup weighted score from French baccalaureate component grades
| Name | Required | Description | Default |
|---|---|---|---|
| grand_oral_note | Yes | Grand Oral examination grade out of 20 | |
| bac_general_average | Yes | General baccalauréat average out of 20 | |
| specialite_1_average | Yes | First speciality subject average out of 20 | |
| specialite_2_average | Yes | Second speciality subject average out of 20 | |
| controle_continu_average | Yes | Continuous assessment (contrôle continu) average out of 20 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It qualifies the output as an 'estimate' and specifies 'weighted', indicating a calculation algorithm is applied. However, it lacks details on the specific weighting formula, precision limits, or whether results are cached/retained.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundant words. Key terms (Parcoursup, weighted score, baccalaureate) are front-loaded, making it immediately scannable for relevance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a domain-specific calculation tool (French university admissions), the description adequately identifies the system but lacks methodological context (e.g., how weights are calculated) or output format details. Given the 100% schema coverage, it meets baseline needs but doesn't provide comprehensive domain guidance.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear individual parameter descriptions. The description adds valuable domain context by grouping these as 'French baccalaureate component grades' and specifying the Parcoursup system, helping agents understand the semantic relationship between the five grade parameters that raw schema metadata doesn't convey.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the verb (estimate/calculate), resource (Parcoursup weighted score), and input source (French baccalaureate component grades). However, it does not distinguish this tool from the sibling 'calculate_parcoursup_points', leaving ambiguity about which Parcoursup calculation method to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'calculate_parcoursup_points' or 'calculate_bac_points'. There is no mention of prerequisites (e.g., requiring completed baccalaureate grades) or specific scenarios where this estimation is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_part_timeBInspect
Calculate part-time work percentage and optional pro-rata salary
| Name | Required | Description | Default |
|---|---|---|---|
| full_salary | No | Full-time salary to pro-rate (optional) | |
| full_time_hours | No | Full-time weekly hours (FR default 35h) | |
| part_time_hours | Yes | Part-time weekly hours |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It specifies the two outputs (percentage and pro-rata salary), implying a pure calculation without side effects. However, it lacks details on output format, validation behavior (e.g., handling part_time_hours > full_time_hours), or the French 35h default context mentioned in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
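The behavior the description does imply (percentage plus optional pro-rata salary, with the French 35-hour default from the schema) can be sketched in a few lines; the output structure shown is a guess, since the tool publishes no output schema:

```python
def part_time(part_time_hours, full_time_hours=35.0, full_salary=None):
    """Part-time percentage and optional pro-rata salary.

    full_time_hours defaults to the French legal 35-hour week,
    matching the schema's 'FR default 35h' note. The returned
    dict shape is an assumption; the real format is undocumented.
    """
    pct = part_time_hours / full_time_hours * 100
    result = {"percentage": round(pct, 2)}
    if full_salary is not None:
        result["pro_rata_salary"] = round(full_salary * pct / 100, 2)
    return result
```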
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence (9 words) with no redundancy. Front-loaded with the action verb and clearly structured, wasting no tokens while conveying the core calculation purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 3-parameter calculation tool with complete schema documentation, the description sufficiently covers the tool's purpose. It appropriately delegates parameter details to the schema and mentions the primary outputs, though an explicit note about output structure would further help given the lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing detailed descriptions for all three parameters including units and optionality. The description adds no additional parameter semantics beyond the schema, meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb (Calculate) and resources (part-time work percentage, pro-rata salary) are stated. However, it does not differentiate from sibling payroll calculators like calculate_french_salary or calculate_salary_hourly_to_annual, leaving ambiguity about when to choose this generic version over country-specific alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Mentions 'optional pro-rata salary' implying the salary parameter is optional, but provides no explicit guidance on when to use this tool versus alternative salary calculators (e.g., country-specific tax calculators) or prerequisites like requiring both hour parameters for meaningful results.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_passport_validityAInspect
Check if passport is valid for travel (6-month rule)
| Name | Required | Description | Default |
|---|---|---|---|
| expiry_date | Yes | Passport expiry date YYYY-MM-DD | |
| travel_date | Yes | Planned travel date YYYY-MM-DD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses the '6-month rule' logic (key behavioral trait) but omits return format (boolean vs. detailed message), error handling, or side effects. Adequate but not rich behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Nine words with zero waste. Front-loaded with action verb 'Check'. Parenthetical '(6-month rule)' efficiently conveys business logic without redundancy. Every element earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for tool complexity (2 simple date parameters). The 6-month rule explanation covers the core calculation logic. Minor gap: lacks description of return value structure given no output schema exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with clear date descriptions. The description adds crucial semantic context: the '6-month rule' explains the relationship between expiry_date and travel_date (validity window calculation), which the schema alone does not convey.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
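The expiry/travel relationship described above reduces to a single comparison. A minimal sketch, approximating "six months" as 183 days (the tool's actual month arithmetic and return format are unknown):

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # rough approximation of six months

def passport_valid(expiry_date: str, travel_date: str) -> bool:
    """True if the passport stays valid at least ~6 months past the
    planned travel date (the common '6-month rule')."""
    exp = date.fromisoformat(expiry_date)
    trv = date.fromisoformat(travel_date)
    return exp >= trv + SIX_MONTHS
```

Whether the real tool returns a bare boolean or a detailed message is exactly the disclosure gap noted above.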
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Check', clear resource 'passport', and scope '6-month rule' precisely defines the function. It clearly distinguishes from mathematical siblings (calculate_bmi, calculate_area, etc.) by specifying the travel document validation domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied context ('for travel', '6-month rule') indicating use for international travel planning, but lacks explicit 'when to use' guidance, prerequisites (e.g., needing a passport), or exclusions (e.g., does not check visa requirements).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pasta_portionsCInspect
Calculate dry pasta, water and salt for a given number of people
| Name | Required | Description | Default |
|---|---|---|---|
| appetite | Yes | Appetite level | |
| num_people | Yes | Number of people | |
| pasta_type | Yes | Pasta shape |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It fails to indicate that this is a read-only operation with no side effects, and omits critical details about the return format (units, whether values are weights or volumes, or the calculation methodology).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence (12 words) that front-loads the action verb. No words are wasted, though the phrase 'Calculate dry pasta' is slightly ambiguous grammatically (could be clearer as 'Calculate amount of dry pasta').
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacking an output schema, the description should specify what the tool returns (e.g., grams of pasta, liters of water, teaspoons of salt). It provides no information about units, return structure, or whether the calculation accounts for cooking absorption.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
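The units the description omits (grams of pasta, litres of water, grams of salt) follow common kitchen heuristics. A sketch under loud assumptions: the appetite levels, per-person gram counts, and the 1 L water / 10 g salt per 100 g pasta ratios are all conventional guesses, not the tool's confirmed formula:

```python
# Hypothetical appetite levels and portions -- not from the tool's schema.
GRAMS_PER_PERSON = {"light": 70, "normal": 100, "hearty": 130}

def pasta_portions(num_people: int, appetite: str = "normal"):
    """Kitchen-heuristic portions: ~100 g dry pasta per adult,
    1 L water per 100 g pasta, 10 g salt per litre of water."""
    pasta_g = num_people * GRAMS_PER_PERSON[appetite]
    water_l = pasta_g / 100
    salt_g = water_l * 10
    return {"pasta_g": pasta_g, "water_l": water_l, "salt_g": salt_g}
```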
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'given number of people' which aligns with the 'num_people' parameter, but adds no additional semantic context for 'appetite' or 'pasta_type' that isn't already clear from the schema enums and descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the specific resources (dry pasta, water, salt) and the target context (given number of people), distinguishing it from sibling tools like 'calculate_recipe_scaling' or 'calculate_cooking_conversion'. However, it slightly lacks precision by not explicitly stating 'quantities' or 'amounts'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool versus alternatives (e.g., 'calculate_recipe_scaling' for existing recipes), nor does it mention prerequisites or constraints. Usage must be inferred entirely from the tool name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pendulum_periodDInspect
Simple pendulum period
| Name | Required | Description | Default |
|---|---|---|---|
| gravity | No | Gravity m/s² | |
| length_m | Yes | Pendulum length meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but provides zero behavioral context: it does not mention the small-angle approximation assumption, output units (seconds), or that this is a deterministic mathematical calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (3 words), this represents under-specification rather than effective conciseness. No information is front-loaded because no actionable information is present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a physics calculation tool with no output schema, the description omits critical context: the formula used (T=2π√(L/g)), return value semantics (time in seconds), and valid input ranges (small angles).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
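The formula named above, T = 2π√(L/g), is a one-liner; including it in the description would resolve most of the gaps flagged here. A sketch (the 9.81 m/s² default mirrors the schema's optional gravity parameter):

```python
import math

def pendulum_period(length_m: float, gravity: float = 9.81) -> float:
    """Small-angle simple pendulum period in seconds: T = 2*pi*sqrt(L/g).
    Valid only under the small-angle approximation."""
    return 2 * math.pi * math.sqrt(length_m / gravity)
```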
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('Gravity m/s²', 'Pendulum length meters'), establishing the baseline. The description adds no additional parameter context, but meets the minimum given the schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Simple pendulum period' is tautological—it restates the tool's subject using noun phrases without an action verb (e.g., 'Calculate the period...'). It fails to distinguish from sibling calculation tools beyond identifying the physics domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific physics calculator versus other calculation tools, or prerequisites like required units (meters) vs other length units.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_percentageCInspect
Calculate percentages: value of total, percentage change, what percent
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | First value | |
| b | Yes | Second value | |
| operation | Yes | of: X% of Y; change: from A to B; what_pct: X is what % of Y |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, and the description carries the full burden. It fails to disclose the return type, rounding behavior, division-by-zero handling, or its read-only nature despite being a pure calculation function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact single sentence using colon-separated list for the three modes. Slightly telegraphic ('value of total' lacks article), but efficiently front-loaded with zero repetition of schema details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and explicit enum semantics, the description suffices for tool selection. However, it omits return value specification and error cases, which would help given no output schema exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with detailed enum descriptions mapping operations to formulas (e.g., 'of: X% of Y'). Description adds no parameter context beyond listing operation types, warranting baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
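The three modes mapped in the enum descriptions ('of: X% of Y; change: from A to B; what_pct: X is what % of Y') can be sketched directly; the division-by-zero and rounding behavior shown are guesses, since the description leaves both unspecified:

```python
def calculate_percentage(a: float, b: float, operation: str) -> float:
    """Three modes implied by the enum descriptions:
    'of'       -> a% of b
    'change'   -> percentage change from a to b
    'what_pct' -> a is what percent of b
    Error handling here is an assumption, not documented behavior."""
    if operation == "of":
        return a / 100 * b
    if operation == "change":
        return (b - a) / a * 100  # raises ZeroDivisionError if a == 0
    if operation == "what_pct":
        return a / b * 100
    raise ValueError(f"unknown operation: {operation}")
```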
Does the description clearly state what the tool does and how it differs from similar tools?
States specific mathematical operations (value of total, percentage change, what percent) that map clearly to the enum options. Lacks explicit differentiation from sibling 'calculate_percentage_change', but the three-mode scope is clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this generic tool versus the specific 'calculate_percentage_change' sibling or other calculation tools. No mention of prerequisites or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
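The three modes quoted from the enum descriptions ('of: X% of Y', percentage change, and 'what percent') can be illustrated with a minimal sketch. This is a hypothetical reimplementation for clarity; the operation names and formulas are inferred from the quoted enum text, not taken from the server's code:

```python
def calculate_percentage(operation: str, x: float, y: float) -> float:
    """Hypothetical sketch of the three documented modes."""
    if operation == "of":            # X% of Y
        return x / 100 * y
    if operation == "change":        # percentage change from x to y
        return (y - x) / x * 100
    if operation == "what_percent":  # x is what percent of y
        return x / y * 100
    raise ValueError(f"unknown operation: {operation}")
```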
calculate_percentage_change (Grade: B)
Calculate percentage change between two values
| Name | Required | Description | Default |
|---|---|---|---|
| new_value | Yes | New value | |
| old_value | Yes | Original value | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to mention the formula used ((new-old)/old), handling of division-by-zero edge cases, whether results are returned as decimals or percentages, or rounding behavior. The agent must infer this is a safe read-only operation from context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is an efficient six-word sentence front-loaded with the action verb. It contains no redundant boilerplate or tautological restatements of the tool name, making it appropriately sized for the tool's simplicity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 primitive parameters, 100% schema coverage) and absence of output schema, the description is minimally adequate but has clear gaps. It omits return value format, does not explain the 'percentage change' concept, and lacks safety notes about division by zero that would help an agent invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Original value' and 'New value'), establishing a baseline of 3. The description mentions 'two values' generically but does not add semantics beyond the schema, such as clarifying parameter order matters (old vs new) or that old_value cannot be zero.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and resource ('percentage change'), clearly indicating it performs the mathematical operation of finding percent change between values. However, it does not explicitly differentiate from sibling tool 'calculate_percentage' (which likely computes simple percentages), leaving potential ambiguity on which calculator to use.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, prerequisites for the calculation (e.g., non-zero original value), or expected use cases. No explicit when/when-not guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
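The formula and the division-by-zero edge case the evaluation flags can be sketched as follows. This is an illustrative reimplementation under the standard definition of percentage change, not the server's actual code:

```python
def percentage_change(old_value: float, new_value: float) -> float:
    """(new - old) / old * 100. The old_value == 0 case the tool's
    description leaves unspecified is raised explicitly here."""
    if old_value == 0:
        raise ValueError("old_value must be non-zero")
    return (new_value - old_value) / old_value * 100
```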
calculate_percentile_rank (Grade: D)
Percentile rank of a value
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Value to rank | |
| total_values | Yes | Total number of values | |
| values_below | Yes | Number of values below | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure, yet it reveals nothing about side effects, return format, mathematical methodology, or computational constraints. Zero behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at four words, this is under-specification masquerading as conciseness. The single phrase fails to front-load critical information about the tool's function or behavior.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a 3-parameter statistical tool with no output schema and no annotations, the description is critically incomplete. It omits the calculation formula, expected value ranges, and return value semantics necessary for proper tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage (value, values_below, total_values are all documented). The description adds no semantic information beyond what's in the schema, warranting the baseline score of 3 for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Percentile rank of a value' is essentially tautological—it restates the tool name without clarifying what percentile rank means mathematically or what the tool actually computes. It fails to distinguish this from 200+ sibling calculate_* tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (e.g., calculate_statistics, calculate_average) or what prerequisites exist. The description gives no context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
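Given the three parameters, the most likely calculation is the simplest percentile-rank convention. The sketch below is an assumption; as the evaluation notes, the tool does not say which convention it uses (some definitions add half of the tied observations):

```python
def percentile_rank(values_below: int, total_values: int) -> float:
    """Simplest convention: share of observations strictly below the value."""
    if total_values <= 0:
        raise ValueError("total_values must be positive")
    return values_below / total_values * 100
```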
calculate_perimeter (Grade: A)
Calculate perimeter/circumference for common shapes
| Name | Required | Description | Default |
|---|---|---|---|
| side | No | Side for square/hexagon | |
| shape | Yes | Shape | |
| width | No | Width/side b | |
| length | No | Length/side a | |
| radius | No | Radius | |
| side_c | No | Side c for triangle | |
| semi_major | No | Semi-major for ellipse | |
| semi_minor | No | Semi-minor for ellipse | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, placing full burden on the description. The description discloses only the mathematical operation performed, omitting details about return values, validation behavior for invalid shape/parameter combinations, units handling, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 6 words with zero redundancy. The purpose is front-loaded immediately, and every word earns its place in conveying the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for identifying the tool's function, but given 8 parameters with conditional requirements (e.g., circle needs 'radius' while rectangle needs 'width'/'length'), the description could better signal that parameter requirements vary by shape selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema adequately documents all 8 parameters (side, radius, semi_major, etc.). The description adds no parameter-specific semantics beyond the schema, meeting the baseline expectation when structured documentation is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and resources ('perimeter/circumference for common shapes'), clearly distinguishing it from siblings like `calculate_area` and `calculate_volume` through precise mathematical terminology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The mathematical distinction between 'perimeter/circumference' and area/volume provides implied usage context, but there is no explicit guidance on when to select this tool versus other geometry tools or which parameter combinations are required for specific shapes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
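The shape-dependent parameter requirements the evaluation highlights can be made concrete with a sketch. The shape names, the dispatch logic, and the choice of Ramanujan's approximation for the ellipse are all assumptions; the server documents none of this:

```python
import math

def perimeter(shape: str, **dims: float) -> float:
    """Hypothetical per-shape dispatch showing which dimensions each shape needs."""
    if shape == "circle":
        return 2 * math.pi * dims["radius"]
    if shape == "square":
        return 4 * dims["side"]
    if shape == "hexagon":
        return 6 * dims["side"]
    if shape == "rectangle":
        return 2 * (dims["length"] + dims["width"])
    if shape == "triangle":  # sides a, b, c
        return dims["length"] + dims["width"] + dims["side_c"]
    if shape == "ellipse":   # Ramanujan's approximation
        a, b = dims["semi_major"], dims["semi_minor"]
        h = (a - b) ** 2 / (a + b) ** 2
        return math.pi * (a + b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))
    raise ValueError(f"unsupported shape: {shape}")
```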
calculate_pet_age (Grade: C)
Convert pet age to human equivalent years
| Name | Required | Description | Default |
|---|---|---|---|
| size | No | | |
| animal | Yes | | |
| age_years | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but omits critical details: it does not disclose the conversion formula used (which varies by species and size), whether the result is rounded, or the output format/structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded and free of fluff, efficiently conveying the basic purpose. However, appropriate conciseness for this complexity would require additional sentences to cover parameters and behavior.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given three parameters with zero schema descriptions, no annotations, and no output schema, the description is grossly incomplete. It fails to explain the 'size' parameter's relevance, the calculation methodology, or what the tool returns.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate, yet it fails to document any of the three parameters (animal, age_years, size). It does not explain that animal is restricted to dog/cat, that size is optional, or how these parameters interact.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the core transformation (convert pet age to human years) with specific verbs, but fails to distinguish from siblings calculate_dog_age and calculate_cat_age. It also uses the broad term 'pet' while the schema restricts inputs to only 'dog' and 'cat', creating a scope mismatch.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus the specific calculate_dog_age or calculate_cat_age alternatives. There is no mention of when the optional 'size' parameter should be provided or how it affects calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
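For orientation, a commonly cited rule of thumb looks like the sketch below: roughly 15 human years for the first year, 9 for the second, then a flat rate per year. This is only a rough convention; the server's actual tables (and how the 'size' parameter shifts them) are undocumented and may well differ:

```python
def pet_to_human_years(animal: str, age_years: float) -> float:
    """Rule-of-thumb conversion; per-year rates are assumptions."""
    if age_years <= 1:
        return 15 * age_years
    if age_years <= 2:
        return 15 + 9 * (age_years - 1)
    per_year = 5 if animal == "dog" else 4  # assumed flat rates
    return 24 + per_year * (age_years - 2)
```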
calculate_pet_bmi (Grade: C)
Estimate body condition score proxy (BMI) for dogs and cats
| Name | Required | Description | Default |
|---|---|---|---|
| animal | Yes | | |
| weight_kg | Yes | | |
| body_length_cm | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. While it notes the calculation is an 'estimate' and 'proxy', it fails to explain the output format (numeric score? category?), the calculation methodology, or whether results differ between dogs and cats.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is not verbose, but given the lack of schema descriptions and annotations, it is underspecified rather than appropriately concise. Critical information is omitted that would aid invocation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 3 required parameters, 0% schema coverage, no output schema, and no annotations, the tool needs a richer description. The current text omits parameter semantics, output interpretation, and behavioral traits necessary for correct agent usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate by explaining the parameters. It implies weight and size measurements through 'BMI', but doesn't explicitly map to weight_kg and body_length_cm, nor explain how to measure 'body length' for quadrupeds (chest to tail? nose to tail?).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it estimates a body condition score proxy (BMI) for dogs and cats, distinguishing it from human BMI (sibling tool calculate_bmi) and horse-specific tools. However, it doesn't clarify when to use this versus other pet health tools like calculate_pet_food_portion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites such as requiring specific measurements or knowledge of the pet's species.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
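One plausible reading of 'BMI' applied to the two measurements is a human-style weight-over-length-squared ratio. Treat the sketch below as a guess: the tool documents neither its formula nor, as the evaluation notes, how body length should be measured:

```python
def pet_bmi(weight_kg: float, body_length_cm: float) -> float:
    """Assumed human-style formula: weight / length^2, length in metres."""
    length_m = body_length_cm / 100
    return weight_kg / (length_m ** 2)
```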
calculate_pet_food_portion (Grade: C)
Calculate daily food portion for dogs and cats
| Name | Required | Description | Default |
|---|---|---|---|
| activity | Yes | Activity level | |
| pet_type | Yes | Type of pet | |
| age_years | Yes | Pet age years | |
| weight_kg | Yes | Pet weight kg | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. States 'Calculate' implying a read-only operation, but provides no information on output format (grams, cups, percentage?), calculation methodology, or nutritional standards used. Fails to disclose idempotency or safety characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of six words with no redundancy. However, extreme brevity results in under-specification rather than elegant conciseness given the lack of supporting annotations or output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage the input parameters are documented, but the tool lacks annotations, output schema, and any description of calculation methodology or result units. Sibling tools calculate_dog_food and calculate_cat_food exist, creating ambiguity about selection criteria that remains unaddressed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions (weight_kg, age_years, activity, pet_type). The description adds no parameter-specific guidance (e.g., valid ranges implications, unit requirements beyond schema) beyond stating the subject matter. Baseline 3 appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resource (daily food portion) for scope (dogs and cats). However, fails to distinguish from sibling tools calculate_dog_food and calculate_cat_food, leaving ambiguous when to use this unified tool versus species-specific alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (calculate_dog_food/calculate_cat_food), no prerequisites mentioned, no exclusion criteria. Description only states what the tool does, not when to select it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
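A plausible methodology, sketched below, starts from the standard veterinary resting energy requirement, RER = 70 × kg^0.75 kcal/day, scaled by an activity factor. The activity multipliers and the default food energy density (350 kcal per 100 g) are illustrative assumptions; the tool states neither its formula nor its output unit:

```python
def daily_food_grams(weight_kg: float, activity_factor: float,
                     kcal_per_100g: float = 350) -> float:
    """RER = 70 * kg^0.75 kcal/day, scaled by activity
    (roughly 1.2 sedentary to 2.0 very active), then converted
    to grams of food at an assumed energy density."""
    rer = 70 * weight_kg ** 0.75
    daily_kcal = rer * activity_factor
    return daily_kcal / kcal_per_100g * 100
```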
calculate_pet_medication_dose (Grade: C)
Calculate veterinary medication dose by weight
| Name | Required | Description | Default |
|---|---|---|---|
| weight_kg | Yes | | |
| dose_mg_per_kg | Yes | | |
| concentration_mg_per_ml | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of disclosure. It fails to indicate this is a read-only calculation (no side effects), doesn't describe the return format, and omits safety-critical context that results are informational-only.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently worded without redundancy, but given the tool's complexity (3 parameters, 0% schema coverage, safety-critical domain), the description is insufficiently detailed rather than appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Highly incomplete for a medication calculation tool with zero schema descriptions. Fails to mention the optional concentration parameter's purpose, expected output values, units of results, or critical safety disclaimers appropriate for veterinary medication calculations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate by explaining all parameters. It only implicitly references weight_kg via 'by weight' and dose_mg_per_kg via 'medication dose', but completely omits the optional concentration_mg_per_ml parameter which determines liquid volume output.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the general function (calculate veterinary medication dose) and primary input method (by weight), but lacks specificity on what the tool returns (total mg, volume in ml, etc.) and doesn't fully distinguish from sibling tools like calculate_pet_food_portion beyond the word 'medication'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus other pet-related calculators (e.g., calculate_pet_food_portion), nor does it mention prerequisites like needing veterinary-prescribed dosage rates or that results should be verified by a professional.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pet_vaccination_schedule (Grade: C)
Generate upcoming vaccination schedule for a pet
| Name | Required | Description | Default |
|---|---|---|---|
| pet_type | Yes | Type of pet | |
| birth_date | Yes | Pet birth date YYYY-MM-DD | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, yet the description discloses almost no behavioral traits: it does not state what the schedule contains (dates, vaccine names, dosages), whether results vary by region, or that this is informational/veterinary guidance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence is efficient and front-loaded, though at seven words it borders on underspecified given the lack of annotations and output schema documentation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having only 2 simple parameters, this is a health-related tool with no output schema or annotations. The description fails to address what data the schedule returns or any medical disclaimers, leaving critical gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3. The description adds no explanatory context about why birth_date matters (puppy/kitten vs adult protocols) or how pet_type affects the schedule beyond what the enum already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Generate' and resource 'vaccination schedule', and adds temporal scope 'upcoming'. However, it does not explicitly differentiate from sibling tools like calculate_pet_medication_dose or calculate_pet_age.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other pet-related calculation tools, nor any prerequisites or limitations mentioned (e.g., assumes standard veterinary protocols).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_ph (Grade: B)
Calculate pH from H+ or vice versa
| Name | Required | Description | Default |
|---|---|---|---|
| ph_value | No | pH | |
| h_concentration | No | H+ mol/L | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses bidirectional calculation ('vice versa') which is crucial behavior. However, lacking annotations, it omits validation rules (H+ must be positive, pH range constraints), output format, and what happens if both/neither parameters are provided.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact at 7 words with zero redundancy. Front-loaded action verb. However, extreme brevity sacrifices necessary behavioral context for a calculation tool with zero required parameters.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Minimally sufficient for a 2-parameter conversion tool. Lacks output description (no output schema present) and doesn't clarify the mutual exclusivity constraint implied by 0 required fields in schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage ('pH', 'H+ mol/L'), establishing baseline of 3. Description adds relational context ('from...vice versa') implying input/output mapping, but doesn't expand on units, precision, or validation beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' with clear resource scope 'pH from H+' and bidirectional capability 'vice versa'. Distinct from 200+ financial/health/lifestyle calculators in sibling list—this is uniquely chemistry-focused.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use guidance or prerequisites. Doesn't explain that exactly one parameter must be provided (the schema shows 0 required fields), or when to prefer this tool over others in the calculate_* family.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pinel_tax_reduction (Grade: A)
Calculate French Pinel tax reduction (2026 rates)
| Name | Required | Description | Default |
|---|---|---|---|
| duration | Yes | Rental commitment duration in years: 6, 9 or 12 | |
| investment | Yes | Investment amount in EUR (max 300,000) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds valuable temporal context by specifying '2026 rates,' indicating which tax year rules apply. However, it lacks disclosure of other behavioral traits like read-only status, error handling, or output format specifics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Every word earns its place in a single, dense sentence. 'French' identifies jurisdiction, 'Pinel' identifies the specific scheme, 'tax reduction' identifies the output, and '2026 rates' identifies the applicable rule version. No waste or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple schema (2 primitive parameters, 100% coverage, no nested objects) and clear naming convention, the description provides sufficient context for an agent to understand the tool's purpose. However, it could be improved by briefly mentioning the expected return value (tax reduction amount).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (investment amount and duration are fully documented in the schema), the baseline score is 3. The description does not add additional parameter semantics (e.g., explaining Pinel-specific constraints), but the schema is self-sufficient.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the action (Calculate), domain (French Pinel tax reduction), and scope (2026 rates). The term 'Pinel' specifically distinguishes this tool from siblings like calculate_french_income_tax or calculate_property_tax_fr by identifying the specific real estate investment scheme.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied by the specificity of 'Pinel' (a niche French tax scheme), suggesting use for Pinel-specific calculations only. However, no explicit when-to-use guidance or alternatives are named, despite the presence of numerous other French tax calculation siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pipe_diameter (grade A)
Calculate the minimum pipe diameter required for a given flow rate and maximum velocity
| Name | Required | Description | Default |
|---|---|---|---|
| flow_rate_lpm | Yes | Required flow rate in liters per minute | |
| max_velocity_ms | No | Maximum water velocity in m/s (per DTU norms) | 1.5 m/s |
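The math behind this tool is a one-line continuity relation, D = √(4Q/πv). The sketch below is an illustrative reimplementation, not the server's actual code; it assumes the result is returned in millimetres and is not rounded up to standard nominal pipe sizes (which the description leaves unstated).

```python
from math import pi, sqrt

def min_pipe_diameter_mm(flow_rate_lpm: float, max_velocity_ms: float = 1.5) -> float:
    """Smallest internal diameter keeping water at or below the velocity cap.

    Continuity: Q = v * A, so A = Q / v and D = sqrt(4 * A / pi).
    """
    q_m3s = flow_rate_lpm / 60_000        # litres per minute -> m^3/s
    area_m2 = q_m3s / max_velocity_ms     # required cross-sectional area
    return sqrt(4 * area_m2 / pi) * 1000  # metres -> millimetres

# 60 L/min at the 1.5 m/s DTU default needs roughly a 29 mm bore
print(round(min_pipe_diameter_mm(60), 1))  # → 29.1
```

A real plumbing tool would likely round up to the next commercial diameter; this sketch returns the raw theoretical minimum.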
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that it calculates the 'minimum' pipe diameter (a key behavioral trait) and implies the domain (hydraulic/plumbing via 'flow rate' and velocity). However, it omits output units, whether results are rounded to standard nominal sizes, and material/roughness assumptions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of appropriate length. Front-loaded with the action and resource. Zero redundancy or filler. Every word serves to define the tool's specific computational purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter calculation tool with complete schema documentation. However, lacking an output schema, the description could have specified the return format/units (e.g., 'returns diameter in millimeters'). It does not mention calculation standards beyond the DTU hint in the parameter schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed parameter descriptions (including the 'DTU norms' reference for the default velocity). The description mentions the parameters conceptually but adds no semantic detail beyond what the schema already provides, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') with a clear resource ('pipe diameter') and scope ('minimum'). It implicitly distinguishes from sibling 'calculate_pipe_flow_rate' by specifying it takes flow rate as input to produce diameter, rather than vice versa.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or alternative guidance is provided. However, the description implies the context (sizing pipes for fluid systems) by mentioning the specific inputs required (flow rate and velocity). No exclusions or prerequisites are stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pipe_flow_rate (grade B)
Calculate water flow rate through a pipe using the Hazen-Williams formula
| Name | Required | Description | Default |
|---|---|---|---|
| length_m | Yes | Pipe length in meters | |
| material | Yes | Pipe material (affects Hazen-Williams C coefficient) | |
| diameter_mm | Yes | Pipe internal diameter in millimeters | |
| pressure_bar | No | Available water pressure in bar | 3 bar |
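The named Hazen-Williams formula can be sketched in its SI form, Q = 0.278 · C · D^2.63 · S^0.54. Everything else here is an assumption for illustration: the C coefficients are textbook values (the server's material enum is not published), and the hydraulic gradient is taken as the full pressure head consumed over the pipe length.

```python
# Typical Hazen-Williams roughness coefficients; the server's material
# enum and its C values are undocumented, so these are assumptions.
C_COEFF = {"pvc": 150, "copper": 140, "steel": 120}

def pipe_flow_lpm(material: str, diameter_mm: float, length_m: float,
                  pressure_bar: float = 3.0) -> float:
    """Hazen-Williams (SI): Q = 0.278 * C * D^2.63 * S^0.54, Q in m^3/s."""
    c = C_COEFF[material]
    d_m = diameter_mm / 1000
    head_m = pressure_bar * 10.197   # 1 bar is about 10.197 m of water column
    slope = head_m / length_m        # hydraulic gradient, m per m
    q_m3s = 0.278 * c * d_m ** 2.63 * slope ** 0.54
    return q_m3s * 60_000            # m^3/s -> L/min

# 25 mm PVC pipe, 20 m long, 3 bar available
print(round(pipe_flow_lpm("pvc", 25, 20)))
```

Note the formula is only valid for water near room temperature in turbulent flow, which is exactly the kind of assumption the evaluation above faults the description for omitting.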
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. While it names the Hazen-Williams formula, it fails to disclose critical behavioral context: output units, temperature assumptions for the formula, validity ranges, or whether this is a pure calculation versus a cached/storage operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single short sentence, front-loaded with the action verb. No redundancy or waste; every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema provided, the description should ideally indicate the return value format or units. It also omits formula assumptions (temperature limits) that would be necessary for an engineering calculation tool. Adequate for basic selection but incomplete for robust usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with clear descriptions (including the important note that material affects the Hazen-Williams C coefficient). The description adds no additional parameter semantics beyond what the schema already provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Calculate'), resource ('water flow rate through a pipe'), and method ('using the Hazen-Williams formula'), effectively distinguishing it from sibling tools like calculate_flow_rate_convert or calculate_hydraulic_pressure.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives such as calculate_flow_rate_convert (likely unit conversion) or calculate_pipe_diameter. No prerequisites, limitations, or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_planet_weight (grade B)
Calculate your weight on other planets
| Name | Required | Description | Default |
|---|---|---|---|
| planet | Yes | Target planet | |
| earth_weight_kg | Yes | Weight on Earth in kg | |
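Since weight scales linearly with surface gravity, the likely computation is just w_planet = w_earth × (g_planet / g_earth). A minimal sketch follows; the gravity ratios are rounded textbook approximations and the supported planet list is a guess, since neither appears in the tool definition.

```python
# Approximate surface gravity relative to Earth (rounded); the server's
# exact factors and supported planet enum are not documented.
GRAVITY_RATIO = {
    "mercury": 0.38, "venus": 0.91, "mars": 0.38,
    "jupiter": 2.53, "saturn": 1.07, "uranus": 0.89, "neptune": 1.14,
}

def weight_on_planet(planet: str, earth_weight_kg: float) -> float:
    """Weight scales with surface gravity: w = w_earth * (g_planet / g_earth)."""
    return round(earth_weight_kg * GRAVITY_RATIO[planet.lower()], 2)

print(weight_on_planet("Mars", 70))  # → 26.6
```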
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of disclosure. It fails to mention the calculation method (gravity ratio), the output format/units (presumably kg but unstated), or the specific planets supported. For a scientific calculation tool, this lack of behavioral context leaves the agent guessing about return values and precision.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero redundancy. Given the straightforward nature of the tool and comprehensive schema, this length is appropriate and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculator with complete schema documentation, the description is minimally sufficient. However, given the lack of output schema and annotations, the omission of return value details (units, precision) and supported celestial bodies leaves the definition with notable gaps despite the low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with 'Weight on Earth in kg' and 'Target planet' adequately describing the parameters. The description adds no semantic information beyond the schema (e.g., valid ranges, planet name format), warranting the baseline score of 3 for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Calculate your weight on other planets' provides a clear verb ('Calculate') and resource ('weight on other planets'). It effectively distinguishes this astronomy/physics tool from the hundreds of financial, health, and conversion calculators in the sibling list (e.g., calculate_mortgage, calculate_bmi). However, it doesn't specify which planets are supported or clarify that Earth is excluded (implied by 'other' and the earth_weight_kg parameter).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. While the planetary physics domain is unique among the siblings, there is no mention of prerequisites (e.g., needing Earth's weight first), when this calculation is appropriate, or what to do if someone wants lunar weight (Moon not in enum).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_plaster (grade A)
Calculate plaster volume and weight for a given surface and thickness
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | Yes | Surface area in m² | |
| thickness_mm | No | Thickness in mm | 13 |
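Volume here is pure geometry (area × thickness), but weight requires a density assumption the description never states. The sketch below uses 1200 kg/m³ purely for illustration; the server's actual density figure is unknown.

```python
def plaster_quantities(area_m2: float, thickness_mm: float = 13,
                       density_kg_m3: float = 1200) -> dict:
    """Volume = area * thickness; weight needs a plaster density,
    assumed at 1200 kg/m^3 here (the server's figure is undocumented)."""
    volume_m3 = area_m2 * thickness_mm / 1000  # mm -> m before multiplying
    return {"volume_m3": round(volume_m3, 3),
            "weight_kg": round(volume_m3 * density_kg_m3, 1)}

# 10 m^2 wall at the 13 mm default thickness
print(plaster_quantities(10))  # → {'volume_m3': 0.13, 'weight_kg': 156.0}
```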
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that the tool calculates volume and weight, but fails to specify output units (e.g., m³, kg), the default thickness behavior (13mm in schema), or density assumptions used for weight calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single concise sentence that efficiently captures the tool's purpose. Every element earns its place with no redundancy or wasted information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the description mentions the outputs (volume and weight) which is necessary given the lack of output schema, it omits critical contextual details like output units, the significance of the 13mm default value, or plaster type assumptions. Adequate but with clear gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing complete documentation for area_m2 and thickness_mm. The description maps 'surface' to area_m2 and 'thickness' to thickness_mm, meeting the baseline expectation when the schema already carries the semantic load.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (Calculate) and the resource (plaster volume and weight), along with the required inputs (surface and thickness). It effectively distinguishes itself from the numerous sibling calculation tools by being specific to plaster.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus related siblings like calculate_paint_needed or calculate_concrete_mix. No prerequisites, context, or alternative selection criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_poker_hand_probability (grade B)
Calculate exact probability and odds for any 5-card poker hand
| Name | Required | Description | Default |
|---|---|---|---|
| hand_type | Yes | Poker hand type to calculate probability for | |
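The 'exact' probabilities for a standard 52-card deck are fixed combinatorial counts over C(52, 5) = 2,598,960 possible hands. A sketch under that assumption follows; the enum keys are hypothetical, since the schema's actual enum values are not shown here.

```python
from math import comb

TOTAL_HANDS = comb(52, 5)  # 2,598,960 distinct 5-card hands

# Exact frequencies of each hand class in a standard 52-card deck.
# Key names are assumptions; the tool's real enum is not shown.
HAND_COUNTS = {
    "royal_flush": 4, "straight_flush": 36, "four_of_a_kind": 624,
    "full_house": 3_744, "flush": 5_108, "straight": 10_200,
    "three_of_a_kind": 54_912, "two_pair": 123_552,
    "one_pair": 1_098_240, "high_card": 1_302_540,
}

def hand_probability(hand_type: str) -> tuple[float, float]:
    """Return (probability, odds-against) for one dealt 5-card hand."""
    count = HAND_COUNTS[hand_type]
    return count / TOTAL_HANDS, (TOTAL_HANDS - count) / count

p, odds = hand_probability("four_of_a_kind")
print(f"{p:.6%}, {odds:.0f}:1 against")  # → 0.024010%, 4164:1 against
```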
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds 'exact', implying a precise combinatorial calculation, and mentions 'odds', suggesting the output format, but lacks disclosure about deck assumptions (standard 52-card?), return value structure, or whether both percentage and ratio are returned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with 10 words. Front-loaded active verb, zero redundancy, and immediately conveys the tool's function without extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has low complexity (1 param, no nested objects) and complete schema coverage. However, absence of output schema means description should ideally specify return format; it only broadly mentions 'probability and odds' without structure detail, leaving a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear enum values. Description implies the parameter by referencing '5-card poker hand' but does not add semantic explanation (e.g., explaining hand rankings) or usage constraints beyond what the schema already provides. Baseline 3 appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'probability and odds' and scope '5-card poker hand'. However, it does not explicitly distinguish from sibling 'calculate_card_draw_probability', relying only on specificity to imply the difference.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'calculate_card_draw_probability' or 'calculate_probability_binomial', nor any prerequisites or constraints mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pool_chlorine (grade D)
Pool chlorine dosage
| Name | Required | Description | Default |
|---|---|---|---|
| target_ppm | No | Target chlorine ppm | |
| current_ppm | No | Current chlorine ppm | |
| volume_liters | Yes | Pool volume liters | |
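The likely arithmetic rests on the identity 1 ppm = 1 mg/L, so raising the level by Δ ppm in V litres takes Δ × V milligrams of active chlorine. Everything beyond that identity in the sketch below is an assumption: the default target of 2 ppm, the clamp at zero, and the purity scaling for real products are all illustrative, since the tool documents none of them.

```python
def chlorine_dose_g(volume_liters: float, current_ppm: float = 0.0,
                    target_ppm: float = 2.0, purity: float = 1.0) -> float:
    """1 ppm == 1 mg/L, so the dose is delta_ppm * volume in milligrams.

    `purity` scales for real products (e.g. roughly 0.6 for calcium
    hypochlorite -- an assumption, not a documented server behavior).
    """
    delta_ppm = max(target_ppm - current_ppm, 0.0)  # never dose negatively
    return delta_ppm * volume_liters / 1000 / purity  # mg -> g

# 50,000 L pool, raising free chlorine from 1 to 3 ppm with pure chlorine
print(chlorine_dose_g(50_000, current_ppm=1, target_ppm=3))  # → 100.0
```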
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure, yet it reveals nothing about the calculation method, output format (grams? liters? tablets?), or whether the result is an estimate vs. precise chemical instruction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief, the three-word fragment is under-specified rather than appropriately concise. It lacks complete sentence structure and front-loaded key information, and provides too little content to evaluate for structural efficiency.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a calculation tool with 3 parameters and no output schema, the description fails to establish what is being calculated (amount of chlorine needed), expected return values, units, or safety considerations relevant to chemical handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all three parameters have descriptions), establishing a baseline of 3. The tool description adds no additional semantic context about parameters (e.g., explaining the relationship between current_ppm and target_ppm), but meets the minimum threshold.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Pool chlorine dosage' is a noun phrase without a clear verb stating what the tool actually does (calculate? recommend?). It vaguely identifies the domain but fails to specify the operation performed or distinguish from sibling tools like calculate_pool_volume.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives, or under what circumstances it should be invoked. There is no mention of prerequisites like knowing current ppm levels or pool volume.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pool_volume (grade C)
Swimming pool volume calculation
| Name | Required | Description | Default |
|---|---|---|---|
| shape | Yes | Shape | |
| depth_m | Yes | Avg depth m | |
| width_m | No | Width m | |
| length_m | No | Length m | |
| diameter_m | No | Diameter (round) | |
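The conditional parameter logic faulted in the evaluation can be made concrete: which optional dimensions are required depends entirely on the shape value. A sketch follows, assuming 'rectangular' and 'round' as enum values and cubic metres as the output unit, neither of which the tool documents.

```python
from math import pi

def pool_volume_m3(shape: str, depth_m: float, width_m: float = None,
                   length_m: float = None, diameter_m: float = None) -> float:
    """Required dimensions depend on shape -- the coupling the tool
    description never states. Enum values here are assumptions."""
    if shape == "rectangular":
        if width_m is None or length_m is None:
            raise ValueError("rectangular pools need width_m and length_m")
        return length_m * width_m * depth_m
    if shape == "round":
        if diameter_m is None:
            raise ValueError("round pools need diameter_m")
        return pi * (diameter_m / 2) ** 2 * depth_m
    raise ValueError(f"unsupported shape: {shape}")

# 8 m x 4 m pool with 1.5 m average depth
print(round(pool_volume_m3("rectangular", 1.5, width_m=4, length_m=8), 1))  # → 48.0
```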
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure, yet reveals only that it performs a calculation. It fails to specify output units (cubic meters vs liters), precision, or whether the tool performs validation on inputs beyond schema constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The four-word description is under-sized rather than concise. While it contains no wasted words, it fails to earn its brevity by omitting essential context about parameter dependencies and shape-specific requirements.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 5 parameters and conditional logic (rectangular pools require width/length, round pools require diameter), the description is grossly incomplete. It provides no hint that certain parameter combinations are required based on the shape enum value, leaving critical usage logic entirely undocumented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 5 parameters have descriptions), establishing a baseline of 3. The description adds no parameter-specific guidance, particularly missing the critical semantic relationship that 'shape' determines which optional dimension parameters (width_m/length_m vs diameter_m) are required for valid calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Swimming pool volume calculation' is tautological, restating the tool name with minimal variation. While it identifies the resource (swimming pool) and action (calculation), it fails to distinguish from siblings like calculate_aquarium_volume or calculate_volume, or specify the calculation methodology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specialized tool versus generic alternatives like calculate_volume or calculate_aquarium_volume. No prerequisites, conditions, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_portage_salarial (grade A)
Estimate net income from portage salarial (freelance via umbrella company)
| Name | Required | Description | Default |
|---|---|---|---|
| daily_rate | Yes | Daily billing rate (TJM) in euros | |
| days_per_month | No | Billable days per month | 20 |
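The typical portage salarial pipeline runs revenue → management fee → employer charges → employee charges. Every rate in the sketch below is a rough placeholder (fees and charge rates vary by umbrella company and year); it shows the shape of the estimate, not the server's actual figures.

```python
def portage_net_estimate(daily_rate: float, days_per_month: int = 20,
                         mgmt_fee: float = 0.07, employer_rate: float = 0.45,
                         net_ratio: float = 0.78) -> float:
    """Revenue minus the umbrella company's fee leaves the payroll envelope;
    backing out employer charges yields gross salary, and net is a fraction
    of gross. All three rates are illustrative assumptions."""
    revenue = daily_rate * days_per_month
    envelope = revenue * (1 - mgmt_fee)     # after management fee
    gross = envelope / (1 + employer_rate)  # strip employer social charges
    return round(gross * net_ratio, 2)

# TJM of 500 EUR over the default 20 billable days
print(portage_net_estimate(500))  # → 5002.76
```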
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to declare whether this is a pure calculation (read-only), what format the return takes, or any rate limiting. The term 'estimate' implies an approximate result but doesn't confirm idempotency or freedom from side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero waste. Front-loaded with the action verb and immediately qualified with the domain specificity. Every word earns its place—'portage salarial' is technical but essential, and the parenthetical clarification precisely defines the scope.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with simple schema (2 flat parameters) and 100% schema coverage, though missing output format disclosure since no output schema exists. Could be strengthened by noting this is specific to French labor law given the international sibling tools (calculate_belgian_, calculate_swiss_, calculate_uk_).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with both daily_rate and days_per_month fully documented. The description mentions neither parameter specifically, relying entirely on the schema. Baseline score of 3 is appropriate when schema documentation is complete and description adds no supplemental parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Estimate' with clear resource 'net income' and domain 'portage salarial (freelance via umbrella company)'. The parenthetical explanation distinguishes this from siblings like calculate_auto_entrepreneur, calculate_french_salary, and calculate_belgian_salary by specifying the unique umbrella company structure.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage through specificity of the French 'portage salarial' status, it lacks explicit guidance on when to choose this tool over calculate_auto_entrepreneur or calculate_french_salary for different freelancer structures. No prerequisites or alternative selection criteria are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_portfolio_allocation (grade B)
Calculate portfolio allocation amounts by percentage for major crypto asset classes
| Name | Required | Description | Default |
|---|---|---|---|
| btc_pct | No | Bitcoin allocation percentage | 40% |
| eth_pct | No | Ethereum allocation percentage | 30% |
| alts_pct | No | Altcoins allocation percentage | 20% |
| total_value | Yes | Total portfolio value in fiat currency | |
| stablecoins_pct | No | Stablecoins allocation percentage | 10% |
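The calculation itself is a per-asset percentage split of the fiat total. The sketch below adds an explicit sum-to-100 check, which is an assumption: whether the real tool validates or silently accepts inconsistent percentages is undocumented.

```python
def portfolio_allocation(total_value: float, btc_pct: float = 40,
                         eth_pct: float = 30, alts_pct: float = 20,
                         stablecoins_pct: float = 10) -> dict:
    """Split a fiat total across four crypto asset classes using the
    documented 40/30/20/10 defaults. The sum-to-100 validation is an
    assumption, not confirmed server behavior."""
    pcts = {"btc": btc_pct, "eth": eth_pct,
            "alts": alts_pct, "stablecoins": stablecoins_pct}
    if abs(sum(pcts.values()) - 100) > 1e-9:
        raise ValueError("allocation percentages must sum to 100")
    return {asset: round(total_value * pct / 100, 2)
            for asset, pct in pcts.items()}

print(portfolio_allocation(10_000))
# → {'btc': 4000.0, 'eth': 3000.0, 'alts': 2000.0, 'stablecoins': 1000.0}
```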
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions calculating 'amounts' but does not specify the return format (structured breakdown vs single value), whether the tool validates inputs (e.g., that percentages sum to 100%), or whether it persists portfolio data. Missing safety/behavioral traits for a financial calculation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single concise sentence with zero waste. Front-loaded action verb ('Calculate') followed by object and scope. Every word serves a purpose; no redundancy or filler content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and 5 well-documented parameters, the description doesn't need to repeat parameter details. However, for a financial tool without output schema or annotations, it should explain that outputs are monetary allocations per asset and mention the constraint that percentages should typically sum to 100%. Currently complete enough to identify function but not constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (btc_pct, eth_pct, etc.), establishing baseline 3. The description adds valuable semantic context by framing parameters as 'major crypto asset classes,' helping users understand the domain grouping beyond the literal schema descriptions. However, it omits mention of the default 40/30/20/10 allocation split.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates portfolio allocation amounts using percentages for crypto assets, with specific verbs and resource identification. It distinguishes from siblings like calculate_crypto_profit_loss (profit/loss) and calculate_staking_rewards by focusing on asset allocation percentages, though it could more explicitly state it outputs monetary values per asset class.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this versus other crypto/finance calculators in the extensive sibling list (e.g., calculate_dollar_cost_average). Does not mention prerequisites like ensuring percentages sum to 100% or that total_value is required while allocation percentages are optional with defaults.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_power_unit_convert (grade A)
Convert power values between W, kW, HP, BTU/h, cal/s
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Power value to convert | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit | |
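A unit converter like this typically pivots through a base unit. The sketch below pivots through watts; the choice of mechanical horsepower (745.7 W) and the thermochemical calorie (4.184 J) are assumptions, since the tool does not state which HP or calorie variant it uses.

```python
# Conversion factors to watts. HP is taken as mechanical horsepower and
# cal/s as the thermochemical calorie -- the server's choices are not stated.
TO_WATTS = {"W": 1.0, "kW": 1000.0, "HP": 745.699872,
            "BTU/h": 0.29307107, "cal/s": 4.184}

def convert_power(value: float, from_unit: str, to_unit: str) -> float:
    """Pivot through watts: value * (from -> W) / (to -> W)."""
    return value * TO_WATTS[from_unit] / TO_WATTS[to_unit]

print(round(convert_power(1, "kW", "HP"), 4))  # → 1.341
```

Pivoting keeps the factor table linear in the number of units instead of quadratic in unit pairs, which is the usual reason converters are built this way.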
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses nothing about precision/rounding behavior, return value format (scalar vs object), error handling for edge cases (negative values), or whether conversions use standard ISO factors. Mutation risk appears minimal but is unverified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero redundancy. Every word earns its place by defining operation, resource, and valid value range.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for low-complexity conversion tool with complete input schema. Lacks description of return value format (critical given missing output schema) and doesn't clarify precision standards, but sufficient for basic tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds value by explicitly listing the specific power units supported (W, kW, HP, BTU/h, cal/s), reinforcing the domain context beyond generic 'Source unit'/'Target unit' labels in schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Convert' with clear resource 'power values' and explicitly enumerates all supported units (W, kW, HP, BTU/h, cal/s), distinguishing it from generic conversion siblings like convert_energy or calculate_electrical_power.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives such as calculate_electrical_power (which calculates power from other electrical values) or convert_energy. No mention of prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pregnancy_due_date (A)
Calculate due date and current gestational week from last period
| Name | Required | Description | Default |
|---|---|---|---|
| last_period_date | Yes | Last menstrual period date YYYY-MM-DD |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that it calculates both final due date and current gestational week (behavioral scope), but lacks disclosure of calculation method (Naegele's rule?), medical assumptions (28-day cycle), or whether results are estimates—critical gaps given no annotations exist to indicate read-only/destructive properties.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
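The Naegele-style rule the assessment asks about is simple to state: due date = LMP + 280 days (40 weeks), with gestational age counted in completed weeks since the LMP. Whether the tool uses exactly this rule, and what it returns, remains an assumption:

```python
# Naegele-style estimate -- an assumption about the tool's method, since the
# description does not state its calculation standard.
from datetime import date, timedelta

def due_date_from_lmp(last_period_date, today):
    lmp = date.fromisoformat(last_period_date)   # expects YYYY-MM-DD
    due = lmp + timedelta(days=280)              # 40-week gestation
    weeks = (today - lmp).days // 7              # completed gestational weeks
    return due.isoformat(), weeks
```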
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 10 words. Front-loaded with action verb, zero redundancy, every word earns its place. Clear subject-verb-object structure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculator, but given the medical domain and absence of output schema, it should disclose calculation standards (e.g., 40-week gestation) or output format to be complete. Siblings suggest this is part of a large calculation bundle, making the medical specificity (human vs animal) worth explicit mention.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter description. The description reinforces the parameter purpose by mentioning 'last period' but adds no additional semantic detail (date format implications, validity ranges) beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear specific verb (Calculate) and resources (due date, gestational week). Scope 'from last period' effectively distinguishes from sibling tools like calculate_due_date (generic), calculate_cat_pregnancy, calculate_dog_pregnancy, and calculate_breeding_due_date by specifying human LMP-based calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage context through 'from last period' (indicates use when last menstrual period date is available), but lacks explicit guidance on when to choose this over calculate_due_date or animal-specific calculators, and omits medical prerequisites like cycle regularity assumptions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_present_value (C)
Calculate present value of a future sum
| Name | Required | Description | Default |
|---|---|---|---|
| rate | Yes | Annual discount rate percent | |
| years | Yes | Number of years | |
| future_value | Yes | Future value EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full disclosure burden. It fails to state this is a read-only mathematical operation, doesn't reveal the discounting formula used (PV = FV / (1+r)^n), and omits output format details (returns EUR amount). No mention of whether rate expects decimal (0.05) or percentage (5).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
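The discounting formula the assessment references is standard; the open question is the rate convention. A sketch assuming the rate is entered as a percentage (5 for 5%), consistent with the schema's "Annual discount rate percent":

```python
# PV = FV / (1 + r)^n, assuming the rate is a percentage (e.g. 5 for 5%)
# per the schema wording -- the tool itself does not confirm this.
def present_value(future_value, rate, years):
    return future_value / (1 + rate / 100) ** years
```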
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 7 words with no filler. Front-loaded verb structure is appropriate. However, given zero annotations and no output schema, this brevity leaves critical behavioral gaps rather than earning efficiency points.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 3-parameter calculation with complete schema documentation, but lacks completeness for financial domain tools: no output schema means return value behavior is unspecified, and the inverse relationship to 'calculate_future_value' remains implicit rather than explicit.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'Future value EUR', 'Annual discount rate percent', and 'Number of years' adequately documenting each parameter. The description mentions 'future sum' which maps to the future_value parameter, but adds no semantic nuance beyond the schema (e.g., confirming rate is entered as percentage number, not decimal).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clear resource ('present value of a future sum'). However, with sibling 'calculate_future_value' performing the inverse operation, the description misses the opportunity to explicitly distinguish when to use each direction of time-value-of-money calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to prefer this tool over alternatives like 'calculate_future_value' or other financial tools. No mention of prerequisites (e.g., having a known future value) or formula constraints (e.g., rate conventions).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pressure_convert (B)
Convert pressure between Pa, bar, psi, atm, mmHg, mbar
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Pressure value | |
| to_unit | Yes | Target unit | |
| from_unit | Yes | Source unit |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden of behavioral disclosure. Description states only the conversion capability without mentioning precision, rounding behavior, error handling for invalid inputs, or whether the operation is read-only (implied by 'convert' but not explicit).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
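As with the power converter, the likely implementation is a pivot through a base unit. The factors below are standard reference values; that the tool uses these exact figures (or this precision) is an assumption:

```python
# Standard factors to pascals -- assumed, since the tool's own factors and
# rounding behavior are undocumented.
TO_PA = {
    "Pa": 1.0,
    "bar": 1e5,
    "psi": 6894.757293,
    "atm": 101325.0,
    "mmHg": 133.322387,
    "mbar": 100.0,
}

def convert_pressure(value, from_unit, to_unit):
    # Convert to pascals, then to the target unit.
    return value * TO_PA[from_unit] / TO_PA[to_unit]
```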
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero fluff. Immediately states functionality and supported units. Every word earns its place despite minimal length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 3-parameter conversion tool with 100% schema coverage and no output schema, the description is minimally adequate. However, lacks disclosure about return value structure, precision, or behavioral edge cases that would make it fully complete given the absence of annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (all 3 parameters documented). Description lists the valid units which reinforces the enum constraints in the schema, but adds no additional semantic context about the 'value' parameter format or valid ranges. Baseline 3 appropriate when schema is comprehensive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Convert') and resource ('pressure') with explicit enumeration of supported units (Pa, bar, psi, atm, mmHg, mbar). However, fails to distinguish from sibling tool 'convert_pressure', leaving ambiguity about which conversion tool to select.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this specific tool versus the generic 'convert_pressure' sibling or other calculation alternatives. No prerequisites, context, or exclusion criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_prime_activite (B)
Estimate French prime d'activité (in-work benefit) eligibility and amount
| Name | Required | Description | Default |
|---|---|---|---|
| salary | Yes | Net monthly salary in euros | |
| household_size | No | Number of people in household (1-6) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. While 'Estimate' suggests approximation, the description lacks critical behavioral details such as data sources, applicable tax year, whether results are official, or potential side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of nine words that front-loads the action verb 'Estimate' and contains no redundant or wasteful text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 100% schema coverage and lack of output schema, the description adequately covers the tool's purpose by mentioning both eligibility and amount. However, it lacks contextual details about calculation authority or approximation quality expected for a welfare benefit tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage (documenting net monthly salary and household size), the baseline score is 3. The description adds no additional parameter context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the specific resource (French prime d'activité) and actions (estimate eligibility and amount) using a specific verb. The parenthetical '(in-work benefit)' clarifies the domain, though it doesn't explicitly differentiate from sibling French benefit calculators like calculate_housing_aid.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states what the tool does but provides no guidance on when to use it versus numerous sibling tools (e.g., calculate_housing_aid, calculate_french_income_tax) or prerequisites for eligibility.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_print_resolution (C)
Calculate print DPI quality and maximum print size from image pixel dimensions
| Name | Required | Description | Default |
|---|---|---|---|
| image_width_px | Yes | Image width in pixels | |
| print_width_cm | Yes | Desired print width in centimeters | |
| image_height_px | Yes | Image height in pixels | |
| print_height_cm | Yes | Desired print height in centimeters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to deliver. It does not describe the output format (what fields are returned), does not explain the calculation methodology (e.g., whether it uses standard 300 DPI thresholds for 'quality'), and does not clarify the relationship between the required print dimensions inputs and the calculated maximum print size output.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
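The methodology the assessment finds missing is likely the textbook one: DPI = pixels / (print size in inches), with 1 inch = 2.54 cm. A sketch using conventional print-industry quality thresholds (300 DPI photo quality, 150 DPI acceptable), which are assumptions about the tool's grading:

```python
# DPI from pixel and print dimensions. The 300/150 DPI thresholds are
# conventional print-industry values, assumed here -- the tool does not
# document which thresholds it applies.
def print_resolution(image_width_px, image_height_px,
                     print_width_cm, print_height_cm):
    dpi_w = image_width_px / (print_width_cm / 2.54)
    dpi_h = image_height_px / (print_height_cm / 2.54)
    dpi = min(dpi_w, dpi_h)                     # worst axis limits quality
    quality = "photo" if dpi >= 300 else "good" if dpi >= 150 else "low"
    return round(dpi, 1), quality
```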
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single, efficient sentence without filler words. Information is front-loaded with the action verb. However, the sentence attempts to pack two different calculations (DPI for given dimensions AND maximum print size) into one phrase, which slightly muddles the clarity given the input schema constraints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 100% schema coverage and 4 simple parameters, the description adequately covers the basic purpose but lacks completeness regarding outputs (no output schema exists). It should ideally specify what values are returned (DPI number, max dimensions, quality rating) and clarify how the tool handles the print dimension parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description references 'image pixel dimensions' which semantically maps to image_width_px and image_height_px, adding context beyond the bare schema. However, it completely omits mention of the print_width_cm and print_height_cm parameters, which are required inputs, leaving a significant gap in parameter explanation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource (print DPI quality and maximum print size) and primary input (image pixel dimensions). It distinguishes this tool from siblings like calculate_depth_of_field or calculate_crop_factor by focusing on print output rather than capture or display. However, it imperfectly describes the actual inputs since the schema requires print dimensions as well, creating slight ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus other image-related calculators (e.g., calculate_crop_factor). It does not mention prerequisites, does not clarify whether this tool should be used before or after other image processing steps, and offers no alternative approaches.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_probability_binomial (A)
Calculate binomial probability P(X=k) and cumulative P(X<=k)
| Name | Required | Description | Default |
|---|---|---|---|
| k | Yes | Number of successes | |
| n | Yes | Number of trials | |
| p | Yes | Probability of success per trial |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses that the tool computes both point probability (P(X=k)) and cumulative probability (P(X<=k)), but lacks information about output format, validation behavior when k>n, or computational limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
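The two quantities named in the description have exact closed forms, presumably what the tool computes (the output packaging remains unknown):

```python
# Exact binomial point and cumulative probabilities.
from math import comb

def binomial_pmf(n, k, p):
    # P(X = k) for X ~ Binomial(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(n, k, p):
    # P(X <= k), summed over the point probabilities
    return sum(binomial_pmf(n, i, p) for i in range(k + 1))
```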
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 9 words with zero waste. Front-loaded with action ('Calculate') and specific mathematical targets. Every token provides essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and straightforward 3-parameter input, the description adequately supports tool selection. However, given no output schema exists, it could clarify whether results are returned as separate fields, an object, or array for proper invocation handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for n, k, and p. The description adds value by contextualizing parameters within standard statistical notation P(X=k), clarifying that k represents the specific target number of successes for the point probability calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with explicit resource 'binomial probability', and distinguishes from siblings (e.g., calculate_dice_probability, calculate_lottery_odds) by specifying 'binomial' distribution and mathematical notation P(X=k)/P(X<=k).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus sibling probability calculators like calculate_card_draw_probability, calculate_lottery_odds, or calculate_dice_probability. Does not mention prerequisites or use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_profit_margin (B)
Calculate gross margin, net margin, and markup percentage
| Name | Required | Description | Default |
|---|---|---|---|
| cost | Yes | Total cost | |
| revenue | Yes | Total revenue/selling price |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It adequately identifies the three metrics computed but lacks important context: calculation formulas, handling of edge cases (e.g., zero revenue), validation constraints (revenue >= cost?), or that this is a pure computation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
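The undisclosed formulas and zero-revenue edge case are easy to pin down for the margin and markup parts. With only cost and revenue as inputs, "net margin" can only collapse into the same profit-over-revenue ratio, so this sketch covers margin and markup and treats the net-margin behavior as unknown:

```python
# Margin vs. markup sketch. How the tool derives a distinct "net margin"
# from only two inputs is undocumented, so it is omitted here.
def profit_margin(revenue, cost):
    if revenue == 0:
        raise ValueError("revenue must be non-zero")  # undefined margin
    profit = revenue - cost
    margin_pct = profit / revenue * 100   # margin: profit over revenue
    markup_pct = profit / cost * 100      # markup: profit over cost
    return round(margin_pct, 2), round(markup_pct, 2)
```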
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is appropriately front-loaded and wastes no words. However, it errs on the side of being overly minimal given the tool ecosystem complexity, omitting critical differentiating context that would aid selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter calculation tool without output schema, listing the three computed metrics provides minimal viability. However, it remains incomplete due to failure to address sibling tool overlap (particularly 'calculate_markup_margin') and absence of behavioral constraints or formula disclosure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the baseline is 3. The description adds value by clarifying the relationship between inputs (revenue/cost) and the specific financial ratios computed (gross margin, net margin, markup), providing semantic context beyond the raw parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and resources ('gross margin', 'net margin', 'markup percentage'), clearly identifying what the tool computes. However, it fails to distinguish from the sibling tool 'calculate_markup_margin', which creates ambiguity about which tool to use for margin/markup calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. Given the extensive list of sibling calculation tools (400+), explicit differentiation from 'calculate_markup_margin' and 'calculate_exchange_margin' would be essential for correct selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_projectile_motion (C)
Projectile trajectory calculations
| Name | Required | Description | Default |
|---|---|---|---|
| height | No | Initial height m | |
| velocity | Yes | Initial velocity m/s | |
| angle_deg | Yes | Launch angle degrees |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify what the calculation returns (distance, max height, flight time, or trajectory points) or whether it assumes vacuum conditions. It implies read-only behavior via 'calculations' but lacks explicit safety confirmation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief at three words with no redundant or wasted text. However, this extreme brevity results in under-specification rather than earned conciseness, as critical information is omitted.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema and annotations, the description inadequately describes what the tool returns (specific computed values). For a physics calculation with three well-documented input parameters, the lack of output specification creates a significant documentation gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for velocity, angle_deg, and height. The description adds no additional parameter semantics (e.g., valid ranges, unit constraints beyond the schema), warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the general domain (projectile trajectory) but uses vague noun phrasing ('calculations') rather than specific verbs. It does not distinguish from siblings like calculate_kinetic_energy or calculate_speed_distance_time, leaving ambiguity about what specific physical values (range, height, time) are computed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus other physics calculation siblings (e.g., calculate_kinetic_energy) or what prerequisites (initial height, velocity units) are expected. No alternatives or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_property_capital_gains_fr (A)
Calculate French property capital gains tax after holding-period abatements
| Name | Required | Description | Default |
|---|---|---|---|
| sale_price | Yes | Sale price EUR | |
| holding_years | Yes | Years held | |
| purchase_price | Yes | Purchase price EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions 'holding-period abatements' clarifying calculation logic, but omits side effects (none expected), output format, or whether it returns net tax, taxable gain, or effective rate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
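The "holding-period abatements" the description hints at commonly follow this income-tax-side schedule for French property gains: 6% per year for years 6 through 21, 4% in year 22, full exemption beyond. Both the schedule and the 19% rate are assumptions here; the server does not state which year's rules it implements, and social levies follow a separate schedule not modelled:

```python
# Income-tax-side abatement schedule commonly applied to French property
# gains -- rates assumed, social levies (17.2%) deliberately omitted.
def fr_property_gains_tax(sale_price, purchase_price, holding_years):
    gain = max(sale_price - purchase_price, 0)
    if holding_years <= 5:
        abatement = 0.0
    else:
        # 6%/yr for years 6-21 (16 years max), plus 4% in year 22.
        abatement = min(6 * min(holding_years - 5, 16)
                        + (4 if holding_years >= 22 else 0), 100) / 100
    taxable = gain * (1 - abatement)
    return round(taxable * 0.19, 2)   # 19% income tax component
```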
Is the description appropriately sized, front-loaded, and free of redundancy?
Single 8-word sentence with zero waste. Front-loaded action verb, specific domain identifier ('French'), and key behavioral hint ('abatements') in compact form.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for tool selection given simple 3-parameter schema with 100% coverage, but lacks mention of return value/output format since no output schema exists. For a calculation tool, stating what numeric result represents (tax liability amount) would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all 3 params documented), establishing baseline 3. Description adds value by linking 'holding_years' to 'abatements', explaining the semantic relationship between parameters and calculation logic.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' + resource 'French property capital gains tax' + scope 'after holding-period abatements'. Clearly distinguishes from sibling 'calculate_capital_gains_property' via '_fr' suffix and explicit 'French' in description, and from 'calculate_french_income_tax' by specifying property vs personal income.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage context (French property transactions) through specificity, but lacks explicit 'when to use' guidance contrasting with siblings like 'calculate_capital_gains_property' or prerequisites like needing actual sale/purchase prices.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_property_tax_estimate_frBInspect
Estimate French taxe foncière from cadastral value and commune rate
| Name | Required | Description | Default |
|---|---|---|---|
| commune_rate | Yes | Commune tax rate percent | |
| cadastral_value | Yes | Valeur locative cadastrale EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The term 'estimate' provides useful behavioral context distinguishing it from exact calculation tools, but with no annotations provided, the description carries the full burden and lacks details on calculation methodology, limitations, or output format (e.g., currency, rounding).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 9 words in a single sentence. Every word earns its place with no filler content, and it is appropriately front-loaded with the key action and resource.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 primitive parameters, no output schema) and the specific domain (French property tax), the description is minimally adequate but lacks expected details such as the output format (estimated tax amount in EUR) or whether other fees (waste collection, etc.) are included in the estimate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the parameter descriptions in the schema are complete. The tool description aligns with the schema by mentioning 'cadastral value and commune rate' but adds no additional semantic context, examples, or syntax guidance beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (Estimate), resource (French taxe foncière), and required inputs (cadastral value, commune rate). However, it fails to distinguish from the sibling tool 'calculate_property_tax_fr', leaving ambiguity about when to use the estimate versus the full calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided on when to use this tool versus alternatives like 'calculate_property_tax_fr' or 'calculate_property_capital_gains_fr'. The description only implies usage by listing required inputs, offering no 'when-not' or prerequisite guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_property_tax_frCInspect
Calculate French taxe fonciere (property tax)
| Name | Required | Description | Default |
|---|---|---|---|
| cadastral_value | Yes | Cadastral rental value (valeur locative cadastrale) in EUR | |
| commune_rate_pct | No | Commune tax rate in % (default 25) |
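A plausible sketch of what this tool likely computes. The function name `taxe_fonciere` is hypothetical, and the 50% abatement on the valeur locative cadastrale (standard for French taxe foncière on built property) is an assumption; whether this server applies it is not documented:

```python
def taxe_fonciere(cadastral_value: float, commune_rate_pct: float = 25.0) -> float:
    # Assumed base: 50% of the valeur locative cadastrale, as used for
    # taxe fonciere on built property; the server may differ.
    base = cadastral_value * 0.5
    return base * commune_rate_pct / 100.0
```

If the server skips the abatement, results would be exactly double this sketch, which is one reason the description should state its methodology.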
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool performs a calculation but does not indicate whether results are exact or estimated, what currency/unit is returned, or any jurisdictional limitations (e.g., mainland France only).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without redundancy. However, extreme brevity contributes to gaps in behavioral and contextual disclosure.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple two-parameter schema with complete coverage and lack of output schema, the description minimally suffices for basic invocation. However, the absence of behavioral details, sibling differentiation, and usage constraints leaves significant gaps for an agent attempting to select the correct calculation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, documenting both 'cadastral_value' and 'commune_rate_pct'. The description adds no additional semantic context for parameters (e.g., explaining that cadastral value is typically lower than market value), warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States a specific action (Calculate) and precise resource (French taxe fonciere/property tax), using proper domain terminology. However, it fails to differentiate from the nearly identical sibling tool 'calculate_property_tax_estimate_fr', which could lead to incorrect selection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the available sibling 'calculate_property_tax_estimate_fr' or other property tax calculators. No prerequisites, constraints, or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_property_transfer_taxCInspect
Calculate property transfer/registration tax by country
| Name | Required | Description | Default |
|---|---|---|---|
| price | Yes | Property price in local currency | |
| country | Yes | Country code: FR/BE/US/UK/DE |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It fails to indicate whether calculations are estimates or exact, supported jurisdictions beyond the schema enum, tax year applicability, or any legal disclaimers required for tax tools.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief at six words. While it avoids verbosity, this conciseness crosses into under-specification for a complex tax domain where supported countries, calculation methodology, and output format would aid agent selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacking annotations and output schema, the description omits critical context: it does not disclose the limited country support (only 5 codes per schema), whether results include itemized tax breakdowns or totals, or how it handles regional variations within countries (e.g., US states).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, documenting both price (local currency) and country (codes). The description adds no parameter syntax, format examples, or semantics beyond the schema, warranting the baseline score for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a clear verb (Calculate) and resource (property transfer/registration tax) with scope (by country). However, it does not explicitly distinguish from similar sibling tools like calculate_capital_gains_property, calculate_property_tax_fr, or calculate_inheritance_tax, which could cause selection confusion given the large toolset.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. Given the presence of country-specific property tax calculators (e.g., calculate_property_tax_fr) and general capital gains tools, explicit disambiguation is needed but absent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_ptz_eligibilityBInspect
Check French PTZ (zero-rate loan) eligibility and maximum amount
| Name | Required | Description | Default |
|---|---|---|---|
| zone | Yes | Geographic zone of the property | |
| household_size | Yes | Number of people in household (1-5+) | |
| household_income | Yes | Annual household income (revenu fiscal de reference) in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the tool calculates 'eligibility and maximum amount', implying a computation with those outputs. However, it lacks details on error handling, return format structure, or side effects. No contradiction with annotations (none exist).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Front-loaded with verb and specific domain identifier. Every word earns its place - 'French' distinguishes from international tools, '(zero-rate loan)' clarifies acronym, 'eligibility and maximum amount' specifies dual outputs.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and no output schema, the description adequately covers the tool's purpose for selection. It mentions the French context and outputs (eligibility + amount). However, it lacks output format specification and prerequisites (e.g., property must be in France, buyer conditions) that would be helpful given the complexity of French housing schemes.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all three parameters have detailed descriptions including 'revenu fiscal de reference' for income). Since schema fully documents parameters, the baseline score applies. The description provides context that these relate to PTZ but does not add syntax or format details beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Check' and identifies the resource 'French PTZ (zero-rate loan)' clearly. It clarifies PTZ stands for zero-rate loan and specifies domain (French). However, it does not differentiate from siblings like 'calculate_housing_aid' or 'calculate_pinel_tax_reduction' which also relate to French housing/fiscal benefits.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives. Given the presence of many French housing/fiscal calculators (calculate_housing_aid, calculate_pinel_tax_reduction, calculate_property_capital_gains_fr), the description should specify when PTZ specifically applies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_puissance_fiscaleAInspect
French fiscal horsepower CV = (CO2/45) + (P_kW/40)^1.6
| Name | Required | Description | Default |
|---|---|---|---|
| co2_g_km | Yes | CO2 g/km | |
| power_kw | Yes | Engine power in kW |
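Because the description discloses the exact formula, the computation is reproducible. A minimal sketch (the function name `puissance_fiscale` is hypothetical, and the server's rounding of the result to whole CV is undocumented, so none is applied here):

```python
def puissance_fiscale(co2_g_km: float, power_kw: float) -> float:
    # CV = (CO2 / 45) + (P_kW / 40) ** 1.6, exactly as stated in the description.
    return (co2_g_km / 45) + (power_kw / 40) ** 1.6
```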
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. The formula reveals the deterministic calculation logic, but the description omits output format (integer vs float), rounding behavior, whether results are cached, and operational traits like idempotency or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact single sentence with zero redundancy. The formula serves dual purpose as both definition and specification. Information density is maximal with no filler words or unnecessary preamble.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for the low complexity (2 simple parameters) and disclosed formula. The 'CV' label implies the output unit. Minor gap: does not explicitly state the return type (number representing chevaux fiscaux) or that this applies specifically to French vehicle taxation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear param descriptions ('CO2 g/km', 'Engine power in kW'). The description maps formula variables (CO2, P_kW) to parameters but adds no additional semantic depth (e.g., valid ranges, measurement standards) beyond the schema baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description immediately identifies the resource (French fiscal horsepower/CV) and the specific calculation performed using the exact formula. The formula syntax distinguishes this from sibling tools like calculate_malus_ecologique by specifying the unique CV computation method.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives (e.g., calculate_malus_ecologique for ecological penalty calculations). No mention of prerequisites, eligibility, or specific use cases for French vehicle registration contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pump_powerDInspect
Pump power requirement
| Name | Required | Description | Default |
|---|---|---|---|
| head_m | Yes | Head m | |
| flow_m3h | Yes | Flow rate m³/h | |
| efficiency | No | Pump efficiency |
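A sketch of the standard hydraulic power formula this tool presumably implements. The function name `pump_power_kw`, the water density assumption, the default efficiency, and the kW output unit are all assumptions; the server documents none of them:

```python
def pump_power_kw(flow_m3h: float, head_m: float, efficiency: float = 0.7) -> float:
    # Hydraulic power P = rho * g * Q * H / eta, assuming water (rho = 1000 kg/m3).
    rho, g = 1000.0, 9.81            # kg/m3, m/s2
    q_m3s = flow_m3h / 3600.0        # convert m3/h to m3/s
    return rho * g * q_m3s * head_m / efficiency / 1000.0  # W -> kW
```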
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It fails to specify the calculation method (hydraulic power formula), assumed fluid density, physical units of the output, or whether the result is shaft power or hydraulic power.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at only three words, this represents under-specification rather than earned conciseness. No front-loaded key information beyond the tool name itself.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter calculation tool with no output schema and no annotations, the description provides insufficient context. It omits what the tool returns (power in kW or W), the underlying physics, and any validation behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with flow_m3h, head_m, and efficiency all documented in the schema. The description adds no additional semantic context (e.g., that efficiency represents the pump's mechanical efficiency), so baseline score 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Pump power requirement' is a noun phrase that restates the tool name without using a specific verb (e.g., 'calculates'). It fails to distinguish this from sibling tools like calculate_electrical_power or calculate_cycling_power.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives such as calculate_electrical_power, or what constitutes valid input ranges beyond the schema constraints.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_purchasing_powerCInspect
Compare purchasing power between two years
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount to compare | |
| to_year | Yes | Target year | |
| from_year | Yes | Starting year | |
| avg_inflation | No | Average annual inflation rate in % |
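A sketch of the compound-inflation adjustment this tool presumably performs. The function name `purchasing_power`, the 2% default rate, and the compounding method are assumptions; the server may instead use historical CPI data when `avg_inflation` is omitted:

```python
def purchasing_power(amount: float, from_year: int, to_year: int,
                     avg_inflation: float = 2.0) -> float:
    # Equivalent amount in to_year money, compounding avg_inflation annually.
    years = to_year - from_year
    return amount * (1 + avg_inflation / 100) ** years
```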
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden but fails to state what the tool returns (monetary value, index, or percentage), whether the calculation uses compound inflation, or that this is a read-only operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single six-word sentence with zero redundancy or filler. However, extreme brevity results in underspecification; while efficiently structured, it lacks the content density needed for tool selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete for a financial calculation tool with no output schema: lacks explanation of calculation methodology, return value structure, rate sources, and relationship to similar inflation tools. Relies entirely on the JSON schema for parameter meaning.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with adequate parameter descriptions present in the JSON schema. The description adds no semantic context beyond the schema (e.g., that amount represents currency, or years reference historical CPI data), meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the general domain (purchasing power comparison across years) but uses vague verb 'compare' without specifying output format (equivalent amount, ratio, or percentage change). Fails to distinguish from siblings like calculate_inflation_adjusted_value or calculate_inflation_adjustment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternative inflation calculators. Does not explain when to provide the optional avg_inflation parameter versus relying on historical data, nor provides prerequisites or constraints.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pyramidCInspect
Pyramid volume
| Name | Required | Description | Default |
|---|---|---|---|
| height | Yes | Pyramid height | |
| base_length | Yes | Base side length |
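The single `base_length` parameter implies a square base, which pins down the formula. A minimal sketch under that assumption (the function name `pyramid_volume` is hypothetical):

```python
def pyramid_volume(base_length: float, height: float) -> float:
    # V = (1/3) * base_area * height, assuming a square base of side base_length.
    return base_length ** 2 * height / 3.0
```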
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It fails to specify the pyramid type (square/rectangular implied by 'base_length' parameter), output format, units, or whether this is a pure computation versus a lookup. The minimal 'Pyramid volume' provides almost no behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At two words, the description is severely under-specified rather than appropriately concise: it lacks a verb and provides no information about inputs or outputs, giving insufficient context for tool selection.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description should explain what is returned (volume value, units implied) and clarify this calculates volume for a square-based pyramid specifically. It omits this critical context needed to understand the calculation scope and verify expected outputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Pyramid height', 'Base side length'), establishing baseline 3. The description adds no additional parameter semantics (e.g., units, that base_length implies a square base), but the schema adequately documents the two required parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Pyramid volume' essentially restates the tool name (calculate_pyramid) and fails to provide a specific verb or distinguish from geometric siblings like calculate_cone or calculate_cylinder. While it implies the calculated attribute (volume), it lacks the clarity needed to differentiate this tool from 300+ other calculate_* tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (e.g., calculate_cone for conical volumes, calculate_aquarium_volume for tank volumes). Given the extensive sibling list including multiple geometry calculators, the absence of selection criteria is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_pythagorasAInspect
Find missing side of right triangle using Pythagorean theorem
| Name | Required | Description | Default |
|---|---|---|---|
| a | No | Side a length | |
| b | No | Side b length | |
| c | No | Hypotenuse c length |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes the mathematical operation but omits operational details like error handling when insufficient data is provided, validation behavior, or output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single, front-loaded sentence of seven words with zero redundancy. Every word earns its place by conveying the operation, target, and mathematical method.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculation tool with 100% schema coverage and no output schema, the description covers the essential purpose. However, it fails to document the critical constraint that exactly two of three parameters must be provided, which is necessary given the schema marks all fields as optional.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the baseline is 3. The description adds value beyond the schema by clarifying the intent (finding a missing value), which implies the optional nature of parameters and that exactly two should be provided, compensating for the schema's lack of required field constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb (Find), clearly identifies the resource (missing side of right triangle), and specifies the method (Pythagorean theorem), which distinguishes it from siblings like calculate_trigonometry or calculate_distance_2d that handle different geometric calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'Find missing side' implies providing known sides and omitting the unknown, offering implied usage guidance. However, it lacks explicit prerequisites (e.g., 'provide exactly two sides'), alternatives, or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_quadratic_equationAInspect
Solve quadratic equation ax²+bx+c=0 and find vertex
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | Coefficient a | |
| b | Yes | Coefficient b | |
| c | Yes | Coefficient c |
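The dual output (roots plus vertex) named in the description can be sketched directly. The function name `solve_quadratic`, the return shape, and the complex-root handling for negative discriminants are assumptions, since the server provides no output schema:

```python
import cmath

def solve_quadratic(a: float, b: float, c: float):
    # Roots of ax^2 + bx + c = 0 via the quadratic formula (complex if
    # the discriminant is negative), plus the vertex of the parabola.
    d = cmath.sqrt(b * b - 4 * a * c)
    roots = ((-b + d) / (2 * a), (-b - d) / (2 * a))
    vertex = (-b / (2 * a), c - b * b / (4 * a))  # (x, y) of the extremum
    return roots, vertex
```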
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It adds valuable behavioral context by specifying vertex calculation in addition to root-finding, but omits critical details like whether it returns complex numbers for negative discriminants, handles degenerate cases (a=0), or provides discriminant value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, zero filler. Front-loaded with action verb 'Solve', immediately identifies the equation type and secondary output. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple calculation tool but lacks output specification (no output schema provided). Does not describe return structure (roots as array? Vertex as object? Real vs complex formatting?), which forces the agent to discover behavior through trial.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage, the schema only labels parameters as 'Coefficient a/b/c'. The description adds essential semantic mapping via the equation notation 'ax²+bx+c=0', clarifying the role of each parameter (quadratic, linear, constant) that the schema descriptions fail to specify.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: states the exact mathematical operation (solving ax²+bx+c=0) and secondary output (vertex), clearly distinguishing it from the generic sibling 'calculate_equation' and other specialized calculators via its explicit quadratic focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this versus the generic 'calculate_equation' sibling, or when analytical solutions fail (e.g., a=0). No indication of prerequisites or input validation requirements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
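The review above notes that the quadratic tool's return structure and edge-case handling (negative discriminants, a=0) are undocumented. A minimal sketch of one plausible contract, covering exactly the cases the review flags, might look like this; the return shape is an assumption, not the tool's actual format:

```python
import cmath

def solve_quadratic(a, b, c):
    """Illustrative sketch: roots and vertex of ax^2 + bx + c = 0.

    The real tool's output format is undisclosed; this shows one
    plausible contract, including the edge cases the review flags.
    """
    if a == 0:
        # Degenerate case: the equation is linear (bx + c = 0).
        return {"roots": [-c / b] if b != 0 else [], "vertex": None}
    disc = b * b - 4 * a * c
    sqrt_d = cmath.sqrt(disc)  # complex-safe for negative discriminants
    roots = [(-b + sqrt_d) / (2 * a), (-b - sqrt_d) / (2 * a)]
    # Vertex of the parabola: x = -b/2a, y = c - b^2/4a
    vertex = (-b / (2 * a), c - b * b / (4 * a))
    return {"roots": roots, "vertex": vertex, "discriminant": disc}

result = solve_quadratic(1, -3, 2)  # x^2 - 3x + 2 = 0 -> roots 1 and 2
```

A description that named this structure (array of possibly-complex roots, vertex as an (x, y) pair, discriminant included) would close most of the gaps identified above.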
calculate_quebec_income_tax (A)
Calculate Quebec provincial income tax (Revenu Québec) with basic personal amount deduction
| Name | Required | Description | Default |
|---|---|---|---|
| income_cad | Yes | Annual income in Canadian dollars (CAD) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It adds one behavioral detail (automatically applies 'basic personal amount deduction'), but lacks other critical context such as tax year applicability, output format, or whether result is an estimate vs official calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Front-loaded with action verb, followed by jurisdiction specificity, authority alias in parentheses, and calculation detail. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool with 100% schema coverage. Mentions key deduction behavior (basic personal amount). Would benefit from tax year specification, but sufficient for tool selection given the simple input/output contract implied.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'income_cad' fully documented. Description mentions no parameters, but with complete schema documentation, baseline 3 is appropriate—the description doesn't need to repeat what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' + resource 'Quebec provincial income tax' with authority '(Revenu Québec)'. The addition of 'with basic personal amount deduction' distinguishes it from generic tax calculators and siblings like calculate_canada_federal_tax or calculate_belgian_income_tax.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through specificity ('Quebec provincial'), but provides no explicit guidance regarding sibling relationships (e.g., when to use this vs calculate_canada_combined_tax or calculate_canada_federal_tax, or whether federal tax must be calculated separately).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
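The review notes the description omits the tax year and bracket details. A generic progressive-bracket sketch shows the kind of logic such a tool presumably applies; the thresholds, rates, and basic personal amount below are placeholders for illustration only, not actual Revenu Québec figures:

```python
def quebec_tax_sketch(income_cad):
    """Illustrative progressive-bracket calculation.

    All thresholds, rates, and the basic personal amount are
    PLACEHOLDERS, not actual Revenu Quebec values for any tax year.
    """
    basic_personal_amount = 18_000  # hypothetical deduction
    brackets = [  # (bracket upper bound, marginal rate) - hypothetical
        (50_000, 0.14),
        (100_000, 0.19),
        (float("inf"), 0.24),
    ]
    taxable = max(0.0, income_cad - basic_personal_amount)
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if taxable > lower:
            tax += (min(taxable, upper) - lower) * rate
        lower = upper
    return round(tax, 2)
```

A description stating the applicable tax year and that income below the basic personal amount yields zero tax would let an agent interpret results without trial calls.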
calculate_race_prediction (A)
Predict race time for a target distance using Riegel formula
| Name | Required | Description | Default |
|---|---|---|---|
| target_distance_km | Yes | Target race distance in km | |
| reference_distance_km | Yes | Reference race distance in km | |
| reference_time_minutes | Yes | Reference race time in minutes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses the specific algorithm (Riegel formula) but fails to explain the formula's limitations (valid range 3.5km-42km), accuracy expectations, or output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfectly concise single sentence (10 words) with zero waste. Information is front-loaded with the action 'Predict' followed by resource and method.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with 100% parameter coverage, but lacks output specification (returns minutes? formatted time?) and algorithm limitations given no output schema exists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds minimal semantic context beyond the schema, though 'Riegel formula' implies the reference parameters should be from actual race performances.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Predict'), resource ('race time'), and algorithm ('Riegel formula'), clearly distinguishing it from siblings like calculate_running_pace (current pace) or calculate_marathon_splits.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage (predicting race times from reference performances) through the Riegel formula reference and parameter names, but provides no explicit guidance on when to use this versus other running calculators or prerequisites for the reference values.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
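The Riegel formula the description names is well known: T2 = T1 × (D2 / D1)^1.06, where the exponent 1.06 is Riegel's empirical fatigue factor. A sketch, assuming the tool returns minutes (which it never actually states):

```python
def riegel_prediction(reference_distance_km, reference_time_minutes,
                      target_distance_km, exponent=1.06):
    """Riegel's race prediction: T2 = T1 * (D2 / D1) ** 1.06.

    The output unit (minutes) is an assumption; the tool's actual
    return format is undocumented.
    """
    ratio = target_distance_km / reference_distance_km
    return reference_time_minutes * ratio ** exponent

# e.g. a 25:00 5 km suggests roughly 52 minutes for 10 km
predicted = riegel_prediction(5, 25, 10)
```

Note the review's point about validity range: Riegel's model degrades outside typical race distances, which the description should disclose.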
calculate_radioactive_decay (C)
Radioactive decay: N=N0*(0.5)^(t/t_half)
| Name | Required | Description | Default |
|---|---|---|---|
| time | Yes | Time elapsed | |
| initial | Yes | Initial amount | |
| half_life | Yes | Half-life |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. The formula reveals the mathematical operation performed, but the description lacks critical behavioral details: expected units (do time and half-life need matching units?), precision/rounding behavior, whether the calculation is stateless (read-only), and the structure/format of the return value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief—just the formula. While efficient in mathematical terms, it arguably under-specifies for a tool with no output schema. Every character earns its place mathematically, but the lack of return value description or unit guidance suggests it is too minimal rather than appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should explicitly state what the tool returns (the calculated remaining amount N) and note that time units must be consistent between 'time' and 'half_life' parameters. The formula implies the result but never states it, leaving a significant gap in the contract.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage ('Time elapsed', 'Initial amount', 'Half-life'), establishing a baseline of 3. The description adds value by mapping these parameters to their mathematical notation in the formula (N0, t, t_half), clarifying their relationship in the calculation, but does not add information about expected units or valid value ranges beyond the schema minimums.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides the standard radioactive decay formula (N=N0*(0.5)^(t/t_half)), clearly indicating it performs exponential decay calculations on initial amounts using half-life. While the mathematical notation makes the operation clear, it lacks explicit mention that it returns the remaining quantity of substance, which would make it completely unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus other calculation tools (such as calculate_caffeine_half_life or calculate_exponential_decay if it existed). No mention of appropriate contexts (physics, chemistry, nuclear medicine) or prerequisites like consistent time units.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
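The formula in the tool's description, N = N0 × 0.5^(t / t_half), is self-contained enough to sketch directly. The one behavioral requirement the review flags, consistent units between the two time parameters, appears here as a comment:

```python
def remaining_amount(initial, time, half_life):
    """Radioactive decay: N = N0 * 0.5 ** (t / t_half).

    'time' and 'half_life' must use the same unit (years, days, ...),
    the constraint the tool's description never states. Returns the
    remaining quantity N.
    """
    return initial * 0.5 ** (time / half_life)

# 100 g of a substance after two half-lives -> 25 g remaining
```

Stating that the return value is the remaining quantity N, in the same unit as `initial`, would close the contract gap identified above.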
calculate_rainwater_collection (B)
Estimate annual rainwater collection volume from a roof
| Name | Required | Description | Default |
|---|---|---|---|
| roof_area_m2 | Yes | Roof catchment area in square metres | |
| efficiency_pct | No | Collection efficiency percentage (default 80%) | |
| annual_rainfall_mm | Yes | Average annual rainfall in millimetres |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. While the term 'Estimate' appropriately signals that the result is an approximation, the description fails to disclose the output format, units (likely liters given inputs in m²/mm), read-only nature, or any calculation methodology (e.g., accounting for efficiency losses).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no extraneous words. It is front-loaded with the action verb and immediately qualifies the calculation type, source, and temporal scope, earning its brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of annotations and output schema, the description is minimally adequate for this simple three-parameter tool but leaves gaps. It should ideally specify the return value's units and format (e.g., 'returns estimated annual volume in liters') to compensate for the missing structured output information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters (roof_area_m2, annual_rainfall_mm, efficiency_pct) well-documented in the schema itself. The description adds no explicit parameter guidance, but the baseline score of 3 applies since the schema adequately covers semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb (Estimate), resource (rainwater collection volume), and scope (from a roof), clearly identifying the tool's domain. However, it lacks explicit contrast with sibling water-related calculation tools (e.g., calculate_garden_water_needs, calculate_water_intake) that would definitively guide selection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites such as required measurement units or data sources. It simply states what the tool does, offering no contextual 'when-to-use' signals.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
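The likely arithmetic follows from the input units: 1 mm of rain falling on 1 m² of catchment is exactly 1 litre of water. A sketch under that assumption (the tool never confirms its output unit):

```python
def annual_rainwater_litres(roof_area_m2, annual_rainfall_mm, efficiency_pct=80):
    """Probable calculation: 1 mm of rain on 1 m^2 yields 1 litre.

    efficiency_pct defaults to 80 per the schema; litres as the output
    unit is an assumption, since the tool documents none.
    """
    return roof_area_m2 * annual_rainfall_mm * (efficiency_pct / 100)

# 100 m^2 roof, 800 mm/year, default 80% efficiency -> about 64,000 L
```

A description ending in "returns estimated annual volume in litres" would resolve the ambiguity the review identifies.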
calculate_raised_bed_soil (B)
Calculate soil mix volumes needed to fill a raised garden bed
| Name | Required | Description | Default |
|---|---|---|---|
| width_m | Yes | Raised bed width in meters | |
| depth_cm | No | Raised bed depth in centimeters (default 30cm) | |
| length_m | Yes | Raised bed length in meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but fails to specify output format/units (cubic meters? liters?), whether it returns total volume or component breakdowns, or confirmation that this is a pure calculation with no side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with no redundancy. Front-loaded with the action and subject. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculator, but lacking output specification (critical given no output schema exists) and any behavioral context that would help the agent interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already clearly documents all three parameters (width_m, length_m, depth_cm) including units. The description implies these are bed dimensions but adds no semantic detail beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a clear verb ('Calculate'), resource ('soil mix volumes'), and scope ('raised garden bed'). It implicitly distinguishes from sibling calculate_garden_soil by specifying 'raised' bed, though it doesn't explicitly contrast with it.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like calculate_garden_soil or calculate_compost_volume, or prerequisites for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
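The geometry implied by the three parameters is a rectangular prism, with depth converted from centimetres to metres. A sketch, assuming cubic metres as the output unit (the tool does not say):

```python
def raised_bed_volume_m3(width_m, length_m, depth_cm=30):
    """Likely calculation: width * length * depth, depth_cm / 100 to
    convert to metres. Cubic metres as output is an assumption; the
    schema's 30 cm default depth is applied here.
    """
    return width_m * length_m * (depth_cm / 100)

# 1.2 m x 2.4 m bed at the default 30 cm depth -> about 0.864 m^3
```

Whether the tool also splits the total into component volumes (topsoil, compost, etc.), as "soil mix volumes" hints, remains undocumented.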
calculate_ratio_simplify (C)
Simplify a ratio to its lowest terms
| Name | Required | Description | Default |
|---|---|---|---|
| numerator | Yes | Numerator | |
| denominator | Yes | Denominator |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the core mathematical operation (reduction to lowest terms) but fails to describe the output format (e.g., whether it returns a string, object, or integers) or how it handles edge cases.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of seven words with no redundancy. Efficiently front-loaded with the action. However, given the lack of output schema and annotations, the extreme brevity leaves gaps that could have been filled with one additional sentence without violating conciseness principles.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, and the description does not compensate by describing return values. No annotations provide hints about idempotency or safety. For a calculation tool with no side effects, the description is technically sufficient for invocation but leaves ambiguity about result formatting.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both 'numerator' and 'denominator' documented. The description provides context that these form a 'ratio' but does not add syntax details, valid ranges, or examples beyond the schema's 'exclusiveMinimum': 0 constraint.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (simplify), resource (ratio), and scope (lowest terms). While it distinguishes from generic calculators, it does not explicitly differentiate from siblings like 'calculate_fraction' or 'calculate_gcd_lcm' which could be confused with this operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'calculate_fraction' or 'calculate_gcd_lcm'. No prerequisites, constraints, or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
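Reduction to lowest terms is a one-line GCD operation; the open question the review raises is only the return format. A sketch assuming a simple pair as output:

```python
from math import gcd

def simplify_ratio(numerator, denominator):
    """Reduce a ratio to lowest terms by dividing out the GCD.

    Returning a (numerator, denominator) tuple is an assumption;
    the tool does not document its output format.
    """
    g = gcd(numerator, denominator)
    return numerator // g, denominator // g

# 8:12 reduces to 2:3
```

Given the schema's exclusiveMinimum of 0, negative and zero inputs are rejected upstream, so no sign handling is needed here.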
calculate_reading_time (B)
Estimate reading time for a text based on word count
| Name | Required | Description | Default |
|---|---|---|---|
| word_count | Yes | Number of words in text | |
| reading_speed_wpm | No | Reading speed words per minute |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden but offers minimal behavioral context. It mentions the calculation is 'based on word count' (input method) but fails to specify output units, return format, or default reading speed behavior (250 WPM) that the agent needs to interpret results.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste words. Front-loaded with the action ('Estimate') followed immediately by the resource and method. Every element earns its place in the description.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter tool with complete schema documentation, but lacking output specification given the absence of an output schema. For a calculation tool with no annotations, the description should ideally disclose the return value format/units to be considered complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage (word_count and reading_speed_wpm are both documented), establishing a baseline score of 3. The description adds no additional parameter semantics beyond 'based on word count,' which simply restates the schema without adding validation context, format hints, or usage examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Estimate') with clear resource ('reading time') and scope ('for a text'), effectively distinguishing this from sibling calculation tools like calculate_cooking_time or calculate_travel_time by specifying the 'reading' domain. However, it lacks output format specificity (minutes vs. formatted string) that would elevate it to a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_reading_time vs. other time estimation tools), nor does it mention prerequisites or appropriate contexts. It merely states what the tool does, not when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
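The calculation is a single division. The 250 wpm default mentioned in the review above, and minutes as the output unit, are both assumptions the tool itself never states:

```python
def reading_time_minutes(word_count, reading_speed_wpm=250):
    """Sketch: minutes = words / speed.

    The 250 wpm default (noted in the review) and the output unit
    are assumptions; the tool documents neither.
    """
    return word_count / reading_speed_wpm

# 1,000 words at the assumed 250 wpm default -> 4.0 minutes
```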
calculate_real_estate_agency_fees (B)
Calculate French real estate agency fees using sliding scale
| Name | Required | Description | Default |
|---|---|---|---|
| sale_price | Yes | Property sale price in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Mentions 'sliding scale' indicating a tiered calculation method, but lacks crucial behavioral details given no annotations: it does not define the scale brackets, specify whether it returns buyer/seller fee splits, or describe the output format. Assumes read-only calculation but does not explicitly state safety.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact at nine words with no redundancy. However, given the lack of annotations and output schema, it is perhaps overly terse: front-loading the French jurisdiction and sliding scale method is efficient, but additional behavioral context would improve utility.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Sufficient for a single-parameter calculation tool with complete schema coverage, but incomplete regarding the 'sliding scale' logic (rates, brackets) and return values. No output schema exists to compensate, and the description does not explain what the tool returns (fee amount, breakdown, etc.).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage and only one parameter (sale_price), the schema adequately documents inputs. The description does not add parameter-specific semantics beyond what the schema provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action (Calculate), domain (French real estate), and specific resource (agency fees), distinguishing it from generic calculators in the sibling list. It adds jurisdictional specificity ('French') and method ('sliding scale') beyond the tool name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus siblings like calculate_notary_fees, calculate_property_transfer_tax, or calculate_property_capital_gains_fr, which are also used in French property transactions. No prerequisites or exclusions stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
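A "sliding scale" for French agency fees typically means the fee percentage drops as the sale price rises. A sketch of that shape; every tier and rate below is a placeholder, not the undisclosed scale the tool actually implements:

```python
def agency_fee_sketch(sale_price):
    """Illustrative sliding-scale fee: the rate applied to the whole
    price decreases in higher price tiers.

    The tiers and rates are PLACEHOLDERS for illustration only,
    not the actual French scale the tool uses.
    """
    tiers = [  # (price ceiling in EUR, fee rate) - hypothetical values
        (100_000, 0.08),
        (300_000, 0.06),
        (float("inf"), 0.04),
    ]
    for ceiling, rate in tiers:
        if sale_price <= ceiling:
            return round(sale_price * rate, 2)
```

Disclosing the real brackets, or at least whether the fee is charged to buyer or seller, would address the gaps the review identifies.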
calculate_recipe_nutrition (C)
Sum macronutrients for a list of ingredients proportionally to quantities
| Name | Required | Description | Default |
|---|---|---|---|
| ingredients | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Explains the calculation method ('proportionally to quantities'), which is valuable behavioral context. However, lacks disclosure of return values, error conditions, or whether results are cached/returned as rounded values.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no waste. Front-loaded with action verb. However, given the complexity of the nested array schema with 6 required fields, the extreme brevity leaves significant gaps and could justify expansion.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complex nested input structure (array of objects with 6 required fields) with 0% schema coverage and no output schema. Description fails to document the per-100g input requirement, return value structure, or calculation methodology (e.g., weighted averaging). Incomplete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (no field descriptions). Description mentions 'ingredients' and 'quantities' corresponding to parameters, but fails to explain critical requirement that nutritional values must be per-100g (calories_per_100g, protein_per_100g, etc.). Insufficient compensation for zero schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Sum' and resources 'macronutrients' and 'ingredients'. Distinguishes from siblings like 'calculate_recipe_scale' (which adjusts yields) by focusing on nutritional aggregation. Does not specify which macronutrients (protein, carbs, fat, calories) or output format, preventing a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use versus siblings like 'calculate_daily_protein', 'calculate_glycemic_load', or 'calculate_calories_burned'. No mention of prerequisites (e.g., needing per-100g nutritional data) or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
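The proportional sum the review describes scales each ingredient's per-100g values by quantity / 100 and accumulates. The field names below are assumptions modeled on the per-100g fields mentioned above; the schema itself ships with no descriptions:

```python
def recipe_macros(ingredients):
    """Proportional macro sum: per-100g values scaled by quantity_g / 100.

    Field names (quantity_g, calories_per_100g, ...) are assumptions,
    since the tool's schema has 0% description coverage.
    """
    totals = {"calories": 0.0, "protein_g": 0.0, "carbs_g": 0.0, "fat_g": 0.0}
    for item in ingredients:
        factor = item["quantity_g"] / 100
        totals["calories"] += item["calories_per_100g"] * factor
        totals["protein_g"] += item["protein_per_100g"] * factor
        totals["carbs_g"] += item["carbs_per_100g"] * factor
        totals["fat_g"] += item["fat_per_100g"] * factor
    return totals

totals = recipe_macros([
    {"quantity_g": 200, "calories_per_100g": 100, "protein_per_100g": 10,
     "carbs_per_100g": 5, "fat_per_100g": 2},
])
```

This is exactly the per-100g contract the review says the description must spell out for an agent to build valid input on the first attempt.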
calculate_recipe_scale (D)
Scale recipe ingredients
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Original ingredient amount | |
| target_servings | Yes | Target servings | |
| original_servings | Yes | Original servings |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided and the description adds no behavioral context (output format, precision rules, validation behavior) beyond what is inferable from the tool name itself.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While the three-word description contains no wasted text, extreme brevity constitutes under-specification rather than effective conciseness for a calculation tool with no output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description should explain the calculation logic (target/original servings ratio multiplied by amount) and expected return format. It provides none of this necessary context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Original ingredient amount', 'Target servings', etc.), the schema is self-documenting. The description adds no parameter-specific semantics but meets the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the basic action (scale ingredients) but lacks specificity about the calculation method (servings-based ratio) and critically fails to distinguish from near-identical sibling 'calculate_recipe_scaling'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives, particularly the ambiguously similar 'calculate_recipe_scaling' sibling, creating selection uncertainty for the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
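The servings-ratio logic the assessment alludes to can be sketched as follows. This is a hypothetical reconstruction (the server documents neither formula nor return shape); the function name is illustrative:

```python
def scale_ingredient(amount: float, original_servings: float, target_servings: float) -> float:
    """Scale one ingredient amount by the target/original servings ratio."""
    if original_servings <= 0:
        raise ValueError("original_servings must be positive")
    return amount * target_servings / original_servings

print(scale_ingredient(200, 4, 6))  # 300.0 (e.g. grams of flour scaled from 4 to 6 servings)
```

A description that stated this one-line formula and the numeric return type would close most of the gaps the assessment identifies.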
calculate_recipe_scaling (Grade: C)
Scale recipe ingredients
| Name | Required | Description | Default |
|---|---|---|---|
| target | Yes | Target servings | |
| original | Yes | Original servings | |
| ingredients | Yes | Ingredients |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but discloses nothing about behavioral traits: rounding logic, whether units are converted or preserved, handling of indivisible ingredients, or the output format/structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At three words, this is under-specification masquerading as conciseness. Every tool definition needs front-loaded value; this provides none and forces users to infer functionality from parameter names alone.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given three required parameters including a complex nested array structure (ingredients with qty/unit/name), the description is inadequate. It omits expected return value documentation since no output schema exists, leaving the agent blind to success criteria.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% at the top level (original/target/ingredients), establishing a baseline of 3. The description adds no explanatory value about parameter semantics (e.g., that original/target represent serving counts) beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Scale recipe ingredients' is tautological—it merely restates the tool name (calculate_recipe_scaling) without adding specificity. It fails to distinguish from the sibling tool calculate_recipe_scale or explain what scaling entails mathematically.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like calculate_recipe_scale or calculate_cooking_conversion. No mention of prerequisites such as needing known original and target serving sizes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
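For the batch variant, the assessment's presumed ingredient shape (entries with qty/unit/name) suggests a sketch like this; the field names and pass-through behavior are assumptions, not documented server behavior:

```python
def scale_recipe(ingredients: list[dict], original: float, target: float) -> list[dict]:
    """Scale each ingredient's qty by target/original; names and units pass through."""
    ratio = target / original
    return [{**ing, "qty": ing["qty"] * ratio} for ing in ingredients]

scaled = scale_recipe([{"name": "flour", "qty": 500, "unit": "g"},
                       {"name": "milk", "qty": 250, "unit": "ml"}], original=4, target=2)
print(scaled[0]["qty"], scaled[1]["qty"])  # 250.0 125.0
```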
calculate_regular_polygon (Grade: C)
Regular polygon properties
| Name | Required | Description | Default |
|---|---|---|---|
| sides | Yes | Number of sides | |
| length | Yes | Side length |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but reveals nothing about what properties are calculated, the return format, or computational constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at three words, this constitutes under-specification rather than efficient communication. The single phrase fails to front-load critical information about the tool's specific function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of numerous geometric calculation siblings, the description inadequately specifies which polygon properties (area, perimeter, apothem, interior angles) are returned, leaving users uncertain about the tool's exact utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Number of sides', 'Side length'), establishing a baseline of 3. The description adds no additional parameter context regarding valid ranges or relationships between parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Regular polygon properties' is a noun phrase that restates the tool's subject without specifying the action (calculate/derive) or specific outputs (area, perimeter, angles). It fails to distinguish this tool from siblings like calculate_area or calculate_perimeter.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other geometric calculators, nor any prerequisites or constraints beyond the schema minimums.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
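The properties the assessment lists (area, perimeter, apothem, interior angles) all follow from standard closed-form n-gon formulas; whether the server returns exactly this set is an assumption:

```python
import math

def regular_polygon(sides: int, length: float) -> dict:
    """Closed-form properties of a regular n-gon from side count and side length."""
    perimeter = sides * length
    apothem = length / (2 * math.tan(math.pi / sides))  # center-to-edge distance
    return {
        "perimeter": perimeter,
        "apothem": apothem,
        "area": perimeter * apothem / 2,
        "interior_angle_deg": (sides - 2) * 180 / sides,
    }

hexagon = regular_polygon(6, 2)
print(round(hexagon["area"], 3))  # 10.392, i.e. 6 * sqrt(3)
```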
calculate_rental_profitability (Grade: C)
Calculate rental investment profitability and annual cash flow
| Name | Required | Description | Default |
|---|---|---|---|
| annual_tax | Yes | Annual property tax in EUR | |
| monthly_rent | Yes | Monthly rental income in EUR | |
| purchase_price | Yes | Purchase price in EUR | |
| monthly_charges | Yes | Monthly charges/expenses in EUR | |
| notary_fees_pct | No | Notary fees as % of price (default 8) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions output concepts ('profitability', 'cash flow') but does not specify the calculation methodology, return data structure, or whether this constitutes financial advice, leaving significant behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient eight-word sentence with no redundancy. However, its extreme brevity sacrifices necessary contextual detail, making it minimally sufficient rather than optimally informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (financial calculation with 5 parameters) and lack of output schema or annotations, the description is minimally adequate. It identifies the output domain (cash flow/profitability) but omits output format details, financial disclaimers, or distinctions from related rental calculators.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage across all 5 parameters (e.g., 'Notary fees as % of price'), so the baseline score applies. The description adds no supplemental semantic context about parameter interactions (e.g., how monthly_charges offset monthly_rent) or required vs optional fields.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Calculate') and resources ('rental investment profitability', 'annual cash flow') to clearly define the tool's function. While it effectively signals that this is a financial analysis tool, it does not explicitly differentiate from sibling tools like `calculate_rental_yield` or `calculate_rental_yield_net` that cover similar domains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus the available sibling calculation tools (e.g., `calculate_rental_yield`, `calculate_capital_gains_property`). There are no stated prerequisites, exclusions, or conditions for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_rental_yield (Grade: C)
Calculate gross and net rental yield for a real estate investment
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rent | Yes | Annual rental income in EUR | |
| annual_charges | No | Annual charges/expenses in EUR (default 0) | |
| purchase_price | Yes | Purchase price in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It mentions calculating yields but does not disclose the output format, whether it returns both values as an object, percentage formatting, or calculation methodology. Does not mention if this is a pure computation or has side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb 'Calculate', zero redundancy. Appropriate length for the tool's scope.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema and no annotations present; description fails to compensate by describing return values. Given the complex sibling landscape (specific gross/net calculators exist), the description should explain what this tool returns that the others don't and when to prefer it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear EUR units. Description adds semantic value by mapping 'gross' to annual_rent and implying 'net' involves annual_charges, which connects the business logic to the parameter names. Baseline 3 is appropriate given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific calculation (gross and net rental yield) and domain (real estate investment) clearly. However, fails to distinguish from siblings calculate_rental_yield_gross and calculate_rental_yield_net, which exist as separate tools. An agent cannot determine whether to use this combined tool or the specific variants.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the sibling calculate_rental_yield_gross, calculate_rental_yield_net, or calculate_rental_profitability tools. No prerequisites, conditions, or selection criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
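The conventional gross/net definitions the description implies can be written in a few lines; treat this as a sketch of the likely formulas, not the server's confirmed behavior:

```python
def rental_yield(annual_rent: float, purchase_price: float,
                 annual_charges: float = 0.0) -> dict:
    """Gross and net annual yield as percentages of the purchase price."""
    return {
        "gross_pct": 100 * annual_rent / purchase_price,
        "net_pct": 100 * (annual_rent - annual_charges) / purchase_price,
    }

print(rental_yield(12_000, 200_000, annual_charges=2_000))  # gross 6.0, net 5.0
```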
calculate_rental_yield_gross (Grade: B)
Calculate gross rental yield from property price and monthly rent
| Name | Required | Description | Default |
|---|---|---|---|
| monthly_rent | Yes | Monthly rent in EUR | |
| purchase_price | Yes | Property purchase price in EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It fails to disclose the output format (percentage vs decimal), the calculation formula (annual rent / purchase price), or whether 'gross' implies before expenses. No behavioral traits (idempotency, side effects) are mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundancy. It front-loads the action verb and keeps the scope focused without unnecessary verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 primitive parameters, no output schema), the description is minimally adequate. However, it should explain the output format or gross-vs-net distinction given the existence of sibling yield calculators, which would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents both parameters (EUR units, exclusiveMinimum constraints). The description merely references them by name without adding business logic, constraints, or usage guidance beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific calculation (gross rental yield) and identifies the two required inputs (property price and monthly rent). However, it does not explicitly differentiate from siblings like 'calculate_rental_yield' or 'calculate_rental_yield_net', which could confuse selection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to prefer this tool over 'calculate_rental_yield_net' or the base 'calculate_rental_yield', nor does it mention prerequisites or assumptions about the input data.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
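The formula the assessment cites (annualised rent over purchase price) is the standard gross-yield definition; a one-line sketch, assuming the server returns a percentage:

```python
def gross_rental_yield(monthly_rent: float, purchase_price: float) -> float:
    """Annualised rent as a percentage of the purchase price."""
    return 100 * 12 * monthly_rent / purchase_price

print(gross_rental_yield(1_000, 200_000))  # 6.0
```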
calculate_rental_yield_net (Grade: B)
Calculate net rental yield after charges and vacancy
| Name | Required | Description | Default |
|---|---|---|---|
| monthly_rent | Yes | Monthly rent EUR | |
| vacancy_rate | Yes | Vacancy rate percent | |
| annual_charges | Yes | Annual charges, taxes, insurance EUR | |
| purchase_price | Yes | Property price EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to state whether this is a safe read-only calculation, what return format to expect (percentage, decimal, object), or any computational constraints. Only the action verb indicates behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single front-loaded sentence is efficient and free of waste, placing the core action first. However, given the lack of annotations and output schema, the description is arguably overly concise rather than appropriately sized, missing critical contextual sentences without becoming wordy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Missing critical context given tool complexity: no output schema is provided yet the description fails to document return values, and with multiple similar rental calculation siblings (calculate_rental_yield_gross, calculate_rental_profitability), the description offers insufficient selection guidance beyond the single scope phrase.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds minimal semantic context by grouping 'charges and vacancy' as deducted inputs, but does not elaborate on parameter relationships, format constraints, or calculation methodology beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'net rental yield' and explicitly scopes the calculation to 'after charges and vacancy', effectively distinguishing it from sibling tools like calculate_rental_yield_gross by defining the net accounting method.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage through 'after charges and vacancy' (suggesting when net calculation is needed), it lacks explicit guidance on when to select this tool versus calculate_rental_yield_gross or calculate_rental_profitability. The differentiation is present but implicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
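A sketch of the likely net-yield arithmetic, assuming vacancy is applied as a percentage haircut on annualised rent before charges are deducted (the server does not document this ordering):

```python
def net_rental_yield(monthly_rent: float, purchase_price: float,
                     annual_charges: float, vacancy_rate: float) -> float:
    """Net yield: rent net of vacancy and annual charges, over purchase price."""
    effective_rent = 12 * monthly_rent * (1 - vacancy_rate / 100)
    return 100 * (effective_rent - annual_charges) / purchase_price

print(round(net_rental_yield(1_000, 200_000, 2_400, 5), 2))  # 4.5
```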
calculate_rent_increase_irl (Grade: A)
Calculate rent increase allowed by French IRL index
| Name | Required | Description | Default |
|---|---|---|---|
| new_irl | Yes | Latest published IRL | |
| old_irl | Yes | IRL at lease start | |
| current_rent | Yes | Current rent EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, requiring description to carry full burden. 'Calculate' implies read-only computation, but description omits return format, error handling (e.g., invalid IRL values), whether this is an estimate or official legal calculation, and any rate limiting.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single eight-word sentence with zero redundancy. Front-loaded with action verb and specific domain context. Every word contributes to understanding the tool's scope.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with complete schema coverage. However, given the domain-specific acronym 'IRL' (Indice de Référence des Loyers), the description could benefit from brief expansion or context about output format. No output schema exists to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. Description adds domain context ('French IRL index') explaining what IRL represents, but does not elaborate on parameter relationships (e.g., that new_irl should be higher than old_irl) or provide usage examples beyond what schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Calculate' with clear resource 'rent increase' and precise scope 'allowed by French IRL index'. Distinctly differentiates from siblings like calculate_rental_yield or calculate_rent_ratio by specifying the French regulatory index context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit context through specificity ('French IRL index'), indicating when to use (French tenancy scenarios). However, lacks explicit when-to-use guidance, prerequisites (like needing valid IRL values), or named alternatives for non-French or non-index-based calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
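The standard IRL revision clause scales rent by the index ratio; a minimal sketch (the dict return shape and rounding are assumptions):

```python
def irl_rent_increase(current_rent: float, old_irl: float, new_irl: float) -> dict:
    """Revised rent under the standard French IRL clause: rent scales with the index ratio."""
    new_rent = current_rent * new_irl / old_irl
    return {"new_rent": round(new_rent, 2), "increase": round(new_rent - current_rent, 2)}

print(irl_rent_increase(1_000, 100.0, 102.0))  # {'new_rent': 1020.0, 'increase': 20.0}
```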
calculate_rent_ratio (Grade: D)
Rent-to-income ratio
| Name | Required | Description | Default |
|---|---|---|---|
| rent | Yes | Monthly rent | |
| income | Yes | Monthly income |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, yet the description provides zero behavioral context. It does not state what the calculation returns (percentage, decimal, grade), whether it validates affordability thresholds, or what formula is used (rent÷income vs income÷rent).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief (three words), this is severe under-specification rather than earned conciseness. The single fragment is too sparse to front-load anything useful, leaving the agent without the clarity it needs.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite being a simple 2-parameter tool with complete schema coverage, the description is inadequate. Without annotations or output schema, it should at minimum explain the calculation method (e.g., 'rent divided by income as a percentage') and intended use case (affordability assessment).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with 'Monthly rent' and 'Monthly income' documented. The description adds no semantic value beyond the schema, which warrants a baseline 3 when schema coverage is high.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Rent-to-income ratio' is a noun phrase without an action verb (calculate/evaluate). It vaguely indicates the domain but fails to specify what the tool actually computes or how, and does not distinguish from siblings like calculate_debt_to_income or calculate_rental_yield.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like calculate_rental_yield, calculate_debt_to_income, or calculate_housing_loan_comparison. No prerequisites or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
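The calculation the assessment says the description should state ("rent divided by income as a percentage") fits in one line; the percentage output is an assumption:

```python
def rent_to_income_ratio(rent: float, income: float) -> float:
    """Monthly rent as a percentage of monthly income (common affordability metric)."""
    return 100 * rent / income

print(rent_to_income_ratio(900, 3_000))  # 30.0
```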
calculate_retirement_date (Grade: B)
Estimate retirement date from birth date and country legal retirement age
| Name | Required | Description | Default |
|---|---|---|---|
| country | Yes | Country: FR=64 years, US=67 years, UK=66 years | |
| birth_date | Yes | YYYY-MM-DD — Date of birth |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided and no output schema, the description fails to disclose what the tool returns (date format? age at retirement? both?), that it only supports three specific countries, or that it uses current legal retirement ages without accounting for future regulatory changes. The agent must infer behavior solely from the input parameters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is maximally efficient, front-loading the verb and object immediately. It contains no redundant words or filler content, delivering the essential information in the most compact form possible.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (two scalar inputs, 100% schema coverage) and lack of output schema, the description is minimally adequate but incomplete. It omits the output format specification and supported country limitations that would help an agent confirm successful invocation, though the input requirements are fully covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the structured data already comprehensively documents parameter formats (YYYY-MM-DD) and enumerates supported countries with their corresponding retirement ages. The description adds minimal semantic value beyond the schema's documentation, meeting the baseline expectation for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action (Estimate) and resource (retirement date), and distinguishes itself from siblings like calculate_retirement_pension and calculate_retirement_savings_gap by focusing exclusively on date calculation rather than financial amounts. It mentions the key inputs (birth date and country) that define the scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus sibling calculators like calculate_retirement_pension or calculate_retirement_savings_gap. It lacks mention of specific use cases, prerequisites, or limitations (e.g., that it only calculates standard retirement age, not early retirement options).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
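Given the schema's country-to-age mapping, the likely computation is a simple year offset; the leap-day fallback shown here is one hypothetical handling the server may or may not implement:

```python
from datetime import date

LEGAL_RETIREMENT_AGE = {"FR": 64, "US": 67, "UK": 66}  # ages listed in the schema

def retirement_date(birth_date: str, country: str) -> date:
    """Add the country's legal retirement age to a YYYY-MM-DD birth date."""
    born = date.fromisoformat(birth_date)
    age = LEGAL_RETIREMENT_AGE[country]
    try:
        return born.replace(year=born.year + age)
    except ValueError:  # Feb 29 birthday landing in a non-leap target year
        return born.replace(year=born.year + age, day=28)

print(retirement_date("1980-05-12", "FR"))  # 2044-05-12
```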
calculate_retirement_pension (Grade: B)
Estimate French basic retirement pension (retraite de base Assurance Vieillesse)
| Name | Required | Description | Default |
|---|---|---|---|
| target_years | No | Target quarters for full pension (default 172 = 43 years) | |
| years_contributed | Yes | Total years of contribution | |
| average_salary_best25 | Yes | Average annual salary of best 25 years in euros |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description omits critical behavioral traits: read-only/calculation safety, output format (monthly/yearly amount), whether result is gross or net, or estimation caveats. Full burden falls on description which provides minimal disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence is efficiently front-loaded with no redundancy, though extreme brevity limits ability to convey behavioral or output expectations for this domain-specific financial tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema or annotations, the description does not compensate by describing return values (euros? monthly? gross?) or calculation methodology, leaving significant gaps for a specialized pension estimation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds domain context ('French', 'Assurance Vieillesse') helping interpret parameters, but does not elaborate on salary calculation methodology or quarter counting specifics beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Estimate' paired with exact resource 'French basic retirement pension' and French system name 'retraite de base Assurance Vieillesse' clearly distinguishes from siblings like calculate_belgian_pension and calculate_retirement_savings_gap.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'French' qualifier implicitly signals appropriate context versus calculate_belgian_pension, but lacks explicit when-to-use guidance, prerequisites, or comparison to calculate_retirement_savings_gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
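The review above flags that the calculation methodology is undocumented. A minimal sketch of the simplified French basic-pension formula (pension = reference salary × rate × quarters/target quarters, full rate capped at 50%) might look like the following; the function name and the assumption that the tool returns a gross monthly amount are hypothetical, not confirmed by the tool definition:

```python
def estimate_french_basic_pension(average_salary_best25, years_contributed,
                                  target_quarters=172):
    """Hypothetical sketch of the simplified retraite de base formula."""
    quarters = years_contributed * 4
    # Proration against the target quarter count; full pension once reached
    proration = min(quarters / target_quarters, 1.0)
    # Full rate is capped at 50% of the best-25-years average salary
    annual_pension = average_salary_best25 * 0.50 * proration
    return annual_pension / 12  # assumed: gross monthly amount in euros
```

A full career (43 years = 172 quarters) at a 30,000 EUR reference salary would yield 1,250 EUR/month under this sketch; the real scheme layers on minimums, surcharges, and discounts that are omitted here.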
calculate_retirement_savings_gap (B)
Project retirement savings and identify shortfall
| Name | Required | Description | Default |
|---|---|---|---|
| current_age | Yes | Current age | |
| savings_rate | Yes | Annual return rate percent | |
| monthly_income | Yes | Desired monthly retirement income EUR | |
| retirement_age | Yes | Target retirement age | |
| current_savings | Yes | Current savings EUR |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It indicates the tool performs projection and gap analysis ('identify shortfall'), but lacks details on output format, whether it provides recommendations, or that calculations assume EUR currency (evident only in schema).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at six words, with no redundant or wasted content. Front-loads the core function. However, the extreme brevity contributes to informational gaps regarding usage context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a calculation tool with complete schema coverage, but lacks explanation of return values (no output schema exists) and omits important context such as currency (EUR) or the specific retirement planning methodology used.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no semantic context beyond the schema (e.g., clarifying that `monthly_income` refers to desired retirement income rather than current earnings).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific actions (project savings, identify shortfall) and implies the calculation scope. However, it does not explicitly differentiate from sibling tools like `calculate_savings_goal` or `calculate_retirement_pension`, which also deal with retirement finances.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like `calculate_savings_goal` or `calculate_retirement_date`. No prerequisites, assumptions, or exclusion criteria are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
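Since the tool's projection methodology is not documented, here is one plausible sketch: compound the current savings to retirement age, size the required capital with the common 4% withdrawal rule, and report the shortfall. Both the function name and the 4% assumption are illustrative, not taken from the tool:

```python
def retirement_savings_gap(current_age, retirement_age, current_savings,
                           savings_rate, monthly_income, withdrawal_rate=4.0):
    """Hypothetical sketch: gap between needed and projected capital (EUR)."""
    years = retirement_age - current_age
    # Project current savings forward with annual compounding
    projected = current_savings * (1 + savings_rate / 100) ** years
    # Capital needed to sustain the desired income (assumed 4% rule)
    needed = monthly_income * 12 / (withdrawal_rate / 100)
    return max(needed - projected, 0.0)
```

For example, 100,000 EUR at a 0% return over 30 years against a 1,000 EUR/month target (300,000 EUR of required capital) leaves a 200,000 EUR gap under these assumptions.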
calculate_reverb_predelay (C)
Calculate optimal reverb pre-delay based on room size and musical tempo
| Name | Required | Description | Default |
|---|---|---|---|
| bpm | No | Tempo in BPM (used to snap pre-delay to musical grid) | |
| room_length_m | Yes | Room length in meters |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It fails to disclose the output format (likely milliseconds or seconds), what 'optimal' entails mathematically (RT60 based?), or safety traits (read-only). While likely a pure function, the agent cannot confirm this from the description alone.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb. No repetition of schema details. However, brevity slightly underserves the tool given lack of output schema—one additional sentence explaining the returned value would improve completeness without harming conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Acceptable for a low-complexity calculation tool with rich input schema. However, lacking both output schema and behavioral notes, the description should have indicated the expected return value (time duration) to be fully complete. Sibling context (300+ calculators) is handled sufficiently by the specific naming.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both 'bpm' and 'room_length_m'. The description maps 'musical tempo' to bpm and 'room size' to room_length_m, but adds no additional semantic context (e.g., expected room size ranges, musical genres) beyond the schema baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and specific resource ('reverb pre-delay') identified. Distinguishes itself from generic BPM calculators like 'calculate_bpm_to_ms' by specifying the acoustic domain (room size + tempo). However, it does not explicitly mention siblings or the audio engineering context that would make the distinction explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives. Does not mention that 'calculate_bpm_to_ms' could be used for manual calculation, or when pre-delay calculation is unnecessary (e.g., for non-musical reverb applications). No prerequisites or constraints mentioned in description text.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
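The review notes that what "optimal" means is undisclosed. One common heuristic derives pre-delay from the time sound takes to cross the room, then snaps to a 1/64-note grid when a tempo is supplied. Everything below (the heuristic, the 1/64-note grid, the function name) is an assumption about how such a tool might work:

```python
SPEED_OF_SOUND_M_S = 343.0

def reverb_predelay_ms(room_length_m, bpm=None):
    """Hypothetical sketch: pre-delay in milliseconds."""
    # Time for the direct sound to reach the far wall
    raw_ms = room_length_m / SPEED_OF_SOUND_M_S * 1000.0
    if bpm is None:
        return raw_ms
    # Snap to the nearest 1/64-note so the pre-delay sits on the musical grid
    grid_ms = 60000.0 / bpm / 16.0
    return max(round(raw_ms / grid_ms), 1) * grid_ms
```

A 3.43 m room gives a raw pre-delay of 10 ms; at 120 BPM the 1/64-note grid (31.25 ms) pulls that up to one grid step under this sketch.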
calculate_ring_size (B)
Convert ring circumference (mm) to FR, US, UK and JP sizing systems
| Name | Required | Description | Default |
|---|---|---|---|
| from_system | Yes | ||
| circumference_mm | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly states the conversion direction (circumference to ring sizes) but omits details about output format, precision/rounding behavior, and whether it returns all four systems or requires selecting one.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficiently structured sentence that front-loads the action verb. However, it is overly concise given the lack of schema documentation, leaving insufficient space to explain the ambiguous 'from_system' parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter conversion tool without output schema or annotations, the description covers the core conversion concept but remains incomplete due to the undocumented 'from_system' parameter and lack of output specification (e.g., numeric size values vs. formatted strings).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, requiring the description to compensate. While it implies 'circumference_mm' by mentioning 'ring circumference (mm)', it completely fails to explain 'from_system'—a critical parameter constituting 50% of the required inputs. The listed systems (FR, US, UK, JP) hint at possible values but do not clarify the parameter's purpose or selection logic.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Convert') and identifies the resource (ring circumference) and target formats (FR, US, UK, JP systems). However, it does not explicitly distinguish from the sibling tool 'calculate_ring_size_convert', which appears to offer related functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (like 'calculate_ring_size_convert') or prerequisites (e.g., needing circumference in mm). It does not mention when not to use it or what inputs are expected.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
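To make the undocumented conversion concrete: French ring sizes are conventionally the circumference in millimetres minus 40, and US sizes can be approximated linearly from circumference. The US constants below are approximations, and the UK/JP mappings (letter- and index-based) are omitted; the function name is hypothetical:

```python
def ring_size_from_circumference(circumference_mm):
    """Hypothetical sketch covering FR and US only; constants approximate."""
    fr = circumference_mm - 40.0              # FR size = circumference - 40
    us = (circumference_mm - 36.54) / 2.5535  # approximate US linear fit
    return {"FR": round(fr, 1), "US": round(us, 1)}
```

A 54 mm circumference maps to roughly FR 14 / US 6.8 under these constants.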
calculate_ring_size_convert (B)
Convert ring size between FR, US, UK, EU and JP systems
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | Ring size in source system | |
| from_system | Yes | Source sizing system |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. While it lists supported systems, it fails to disclose what the tool returns (e.g., conversions to all other systems or a specific target), error handling, or valid size ranges.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is appropriately sized and front-loaded. Every word earns its place by identifying the operation, resource, and scope. No redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple conversion tool with rich input schema (100% coverage) and no output schema, the description adequately covers the input side but leaves ambiguity about the return format. Given the low complexity and clear purpose, it meets minimum viability but lacks output specification.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters fully documented in the schema. The description adds minimal semantic value beyond the schema, merely listing the sizing systems already enumerated in the from_system enum. Baseline 3 applies for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the verb (Convert) and resource (ring size) and specifies the five supported sizing systems (FR, US, UK, EU, JP). However, it does not explicitly differentiate from sibling calculate_ring_size which likely calculates size from physical measurements rather than converting between systems.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives like calculate_ring_size, nor does it specify prerequisites or conditions for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_roi (C)
Calculate Return on Investment
| Name | Required | Description | Default |
|---|---|---|---|
| investment | Yes | Initial investment amount | |
| return_value | Yes | Final value or total returns |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose whether this is read-only (presumed), what format/value is returned (percentage? decimal?), or any precision/rounding behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief (four words) with zero redundancy or filler. However, it borders on insufficient specification rather than effective conciseness given the lack of behavioral details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter calculation tool with well-documented schema. Missing output specification but description plus schema allows basic usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions ('Initial investment amount', 'Final value or total returns'). Description adds no semantic clarification beyond the schema, earning baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the specific financial metric (ROI) clearly, but provides no differentiation from financial siblings like calculate_solar_roi, calculate_profit_margin, or calculate_break_even. Minimal expansion of the tool name acronym.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this versus other financial calculation tools (e.g., calculate_profit_margin for profit percentage vs ROI). No prerequisites or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
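The review asks whether the tool returns a percentage or a decimal; the standard ROI formula is a percentage of the initial investment. A minimal sketch (the function name mirrors the tool name, but the return format is an assumption):

```python
def calculate_roi(investment, return_value):
    """ROI as a percentage of the initial investment (assumed output format)."""
    return (return_value - investment) / investment * 100.0
```

An investment of 1,000 growing to 1,500 yields an ROI of 50% under this formula.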
calculate_roman_numeral (B)
Convert between Roman numerals and decimal (1-3999)
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Decimal number to convert to Roman numeral |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the valid input range (1-3999) which helps set expectations. However, it omits output format details (returns Roman numeral as string?), error behavior for out-of-range values, and whether the operation is read-only (implied by context but not stated).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with action front-loaded. Every clause earns its place: conversion direction implied, domain specified, and constraints included without verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter conversion tool with 100% schema coverage, the description is minimally adequate. However, lacking an output schema, it should ideally specify what constitutes successful output (e.g., 'returns Roman numeral string') to complete the contract.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Decimal number to convert to Roman numeral'), establishing baseline of 3. The description reinforces the range constraint but adds no further semantic context about parameter format or validation beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a clear verb ('Convert') and resources ('Roman numerals and decimal'), and specifies the valid range (1-3999). However, it uses 'between' suggesting bidirectional conversion, while the input schema only accepts decimal integers, creating slight ambiguity about whether Roman-to-decimal input is supported.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance provided on when to use this tool versus siblings (like calculate_number_base_convert or convert_* tools). The range constraint (1-3999) is mentioned but as a functional limit, not usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
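The decimal-to-Roman direction that the schema supports is a standard greedy subtraction over the subtractive-notation table, which also explains the 1-3999 limit (no symbol above M). A minimal sketch, assuming out-of-range input raises an error as the description does not say:

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]

def to_roman(value):
    """Greedy conversion of an integer in 1-3999 to a Roman numeral string."""
    if not 1 <= value <= 3999:
        raise ValueError("value must be in 1-3999")
    out = []
    for n, symbol in ROMAN:
        while value >= n:
            out.append(symbol)
            value -= n
    return "".join(out)
```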
calculate_roof_area (C)
Calculate roof surface area from building footprint and slope angle
| Name | Required | Description | Default |
|---|---|---|---|
| base_width_m | Yes | Building width in meters | |
| base_length_m | Yes | Building length in meters | |
| slope_degrees | Yes | Roof slope in degrees |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, yet description fails to disclose output units (presumably square meters), geometric assumptions (pitched roof over rectangle), or idempotent nature of the calculation. Carries full burden but provides minimal behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence (ten words) with no waste. Front-loaded action. Appropriate brevity for tool complexity, though slightly too minimal to cover output characteristics.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 100% input schema coverage but no output schema or annotations, description is minimally adequate but omits expected details: return value unit/format and geometric model assumptions (e.g., gabled roof over rectangular footprint).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds semantic value by mapping 'building footprint' to the length/width pair and 'slope angle' to slope_degrees, but offers no examples, validation rationale, or format details beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Calculate') and specific resource ('roof surface area'). Mentions key inputs ('building footprint and slope angle') that distinguish it from sibling calculate_area, though does not explicitly differentiate via text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific roof calculator versus generic area tools, nor any prerequisites or assumptions stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_roof_truss (A)
Calculate roof truss dimensions, rafter length and material quantities for a pitched roof
| Name | Required | Description | Default |
|---|---|---|---|
| span_m | Yes | Total roof span in meters (full width) | |
| load_kg_m2 | No | Total roof load in kg/m² including snow, wind and tiles (default 150) | |
| spacing_cm | No | Distance between trusses/rafters in cm (default 60cm) | |
| pitch_degrees | Yes | Roof pitch angle in degrees |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses what values get calculated (dimensions, rafter length, materials) but omits behavioral specifics: assumed lumber sizes, safety factors, output format/structure, or structural standards used. Marginally acceptable but lacking richness expected for unannotated tools.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single front-loaded sentence with zero waste. Verb leads, followed by three parallel noun phrases specifying outputs, achieving maximum information density.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 4-parameter calculation tool with complete input schema coverage, the description is functionally adequate. However, lacking an output schema, it should briefly indicate return structure (e.g., whether it returns a breakdown of lumber counts vs. just lengths). Minor gap in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description mentions 'pitched roof' contextually linking to the pitch_degrees parameter, but adds minimal semantic detail beyond schema definitions (units, defaults, ranges already documented in schema).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: verb 'Calculate' paired with concrete resources (truss dimensions, rafter length, material quantities) and scope (pitched roof). Clearly distinguishes from sibling 'calculate_roof_area' which handles surface coverage rather than structural members and lumber quantities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit guidance through specific scoping (pitched roof structural elements vs. simple area), but lacks explicit when-to-use guidance regarding other structural calculation tools or prerequisites like engineering standards.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_rule_of_72 (A)
Estimate years to double an investment using the Rule of 72
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rate | Yes | Annual return rate percent |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses that the result is an 'Estimate' (not exact), which is crucial behavioral context for a financial calculation. However, it omits other relevant details like the formula (72/rate) or that results assume annual compounding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single eleven-word sentence is perfectly front-loaded with the verb and contains zero redundancy. Every word earns its place by conveying the tool's specific purpose and method.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (1 parameter, no output schema, no annotations), the description adequately covers the tool's purpose. It identifies the domain (investment), operation, and methodology. A perfect score would require explicitly stating the output format or approximation nature.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'annual_rate' fully documented as 'Annual return rate percent'. The description adds no specific parameter guidance (e.g., whether to input 7 or 0.07 for 7%), but with complete schema documentation, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Estimate'), output ('years to double'), resource ('investment'), and method ('Rule of 72'). Naming the specific heuristic effectively distinguishes it from siblings like calculate_compound_interest and calculate_future_value.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
By specifying 'Rule of 72', the description implicitly guides the agent to use this for quick heuristic estimates rather than precise financial calculations. However, it lacks explicit guidance on when NOT to use it (e.g., 'for exact values use calculate_compound_interest').
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
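The formula the review says goes unstated is simply 72 divided by the rate; the one ambiguity it flags (7 vs 0.07 for 7%) is resolved here by assuming a percentage input, per the schema's "Annual return rate percent":

```python
def years_to_double(annual_rate_percent):
    """Rule of 72 heuristic: rate entered as a percentage, e.g. 6 for 6%."""
    return 72.0 / annual_rate_percent
```

At 6%, the heuristic gives 12 years, against an exact doubling time of ln(2)/ln(1.06), about 11.9 years, which is why the description's "Estimate" wording matters.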
calculate_rule_of_three (C)
Solve rule of three / cross multiplication
| Name | Required | Description | Default |
|---|---|---|---|
| a | Yes | Known value A | |
| b | Yes | Corresponding value B | |
| x | Yes | New value of A |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fails to disclose what the tool returns (the missing fourth proportional value), error handling for division by zero, or any other behavioral characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief and front-loaded, the extreme terseness (six words) renders it incomplete rather than efficiently concise. It wastes no words but fails to include necessary contextual information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema, the description omits crucial information about what value is returned (the calculated corresponding value Y in the proportion a:b = x:y), leaving the agent unaware of the tool's output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage ('Known value A', 'Corresponding value B', 'New value of A'), establishing clear semantics. The description adds no additional parameter context, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the mathematical operation (rule of three / cross multiplication) but does not explain what that calculation entails or how it differs from sibling tools like calculate_ratio_simplify or calculate_percentage. It assumes familiarity with the mathematical term.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this specific proportional calculation versus the 200+ other calculation tools available, particularly other ratio or percentage calculators that might be used for similar problems.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_running_pace (B)
Calculate running pace (min/km) and speed (km/h) from distance and time
| Name | Required | Description | Default |
|---|---|---|---|
| distance_km | Yes | Distance in kilometers | |
| time_minutes | Yes | Total time in minutes |
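The two advertised outputs follow directly from the inputs. A sketch of the likely arithmetic (the server's rounding behavior and return structure are undocumented):

```python
def running_pace(distance_km: float, time_minutes: float) -> dict:
    """Pace in min/km and speed in km/h from distance and total elapsed time."""
    return {
        "pace_min_per_km": time_minutes / distance_km,
        "speed_kmh": distance_km / (time_minutes / 60),
    }

# 10 km in 50 minutes: 5.0 min/km at 12.0 km/h
print(running_pace(10, 50))
```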
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden of behavioral disclosure. While it specifies the output metrics (pace and speed with units), it lacks details on error handling, whether calculations are rounded, idempotency, or what data structure is returned. For a mutation-free calculation tool, this minimal disclosure is insufficient given the absence of annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficiently structured sentence of eleven words. It front-loads the action ('Calculate'), specifies the outputs with units, and references the inputs without redundancy. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (two simple numeric inputs, straightforward calculation) and excellent schema coverage, the description is adequate. It compensates for the missing output schema by specifying the return values (pace in min/km and speed in km/h) and their units. However, it could have noted behavioral aspects like 'returns both metrics simultaneously' given the lack of annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters have clear descriptions). The description mentions 'distance and time' which correspond to the parameters, but does not add semantic nuance beyond the schema (e.g., clarifying that time_minutes is total elapsed time vs lap time, or that distance_km is the race/run distance). With high schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resources/outputs ('running pace (min/km) and speed (km/h)') and inputs ('distance and time'). However, it does not explicitly differentiate from siblings like calculate_swimming_pace or calculate_speed_distance_time, which also deal with speed/pace calculations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., calculate_training_zones_running for training metrics, calculate_race_prediction for performance forecasting, or calculate_speed_distance_time for general physics calculations). It merely states what the tool calculates.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_salary_comparison_ppp (A)
Compare salaries across countries using PPP (FR=0.79, US=1.0, UK=0.81, DE=0.77, CH=1.36, BE=0.80)
| Name | Required | Description | Default |
|---|---|---|---|
| salary | Yes | Salary in local currency | |
| to_country | Yes | Target country | |
| from_country | Yes | Source country |
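The description publishes PPP factors but not the formula. One plausible reading, sketched below, scales the salary by the ratio of the two countries' factors; the server's actual methodology is undocumented:

```python
# PPP factors quoted in the tool description (US = 1.0 baseline).
PPP = {"FR": 0.79, "US": 1.0, "UK": 0.81, "DE": 0.77, "CH": 1.36, "BE": 0.80}

def ppp_equivalent(salary: float, from_country: str, to_country: str) -> float:
    """Scale a salary by the ratio of the two countries' PPP factors."""
    return salary * PPP[to_country] / PPP[from_country]

# A 50,000 salary in FR, expressed in CH purchasing-power terms:
print(round(ppp_equivalent(50_000, "FR", "CH")))  # 86076
```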
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. It provides specific PPP conversion factors explaining the calculation methodology, but omits safety characteristics (read-only status), output format details, or rate limiting. The conversion factors add behavioral context but coverage is incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely dense single sentence with zero waste. PPP data is front-loaded and immediately actionable. Slightly penalized because the parenthetical data could benefit from structural separation (e.g., 'Supported PPP factors: FR=0.79...') for readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter calculation tool with complete schema coverage but no output schema or annotations, description should explain return values (adjusted salary amount? ratio?) and PPP methodology. The conversion factors provided are helpful context but insufficient for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage establishing baseline of 3. Description adds significant semantic value by providing actual PPP conversion factors (CH=1.36, etc.) that explain the economic meaning of country codes beyond the schema's simple 'Source country'/'Target country' labels.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific action (compare), resource (salaries), method (PPP), and scope (across countries). The PPP factor examples (FR=0.79, US=1.0, etc.) precisely distinguish this from sibling tools like calculate_belgian_salary or calculate_purchasing_power by specifying purchasing power parity normalization.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives (e.g., calculate_purchasing_power or calculate_currency_exchange). Missing prerequisites such as 'use when comparing job offers across countries' or warnings about PPP limitations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_salary_hourly_to_annual (B)
Convert hourly rate to annual, monthly, and daily salary
| Name | Required | Description | Default |
|---|---|---|---|
| hourly_rate | Yes | Hourly rate | |
| hours_per_week | No | Hours worked per week | |
| weeks_per_year | No | Weeks worked per year |
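The conversion is straightforward multiplication. A sketch using the 35 h/week and 52 weeks/year defaults mentioned later in the assessment; the 5-day week behind the daily figure is an additional assumption:

```python
def hourly_to_salary(hourly_rate: float,
                     hours_per_week: float = 35,
                     weeks_per_year: float = 52) -> dict:
    """Annual, monthly, and daily salary from an hourly rate.
    Defaults and the 5-day week are assumptions, not documented behavior."""
    annual = hourly_rate * hours_per_week * weeks_per_year
    return {"annual": annual,
            "monthly": annual / 12,
            "daily": annual / (weeks_per_year * 5)}

print(hourly_to_salary(20))  # annual 36,400 at the defaults
```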
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description carries full disclosure burden. It adds value by specifying three distinct outputs (annual, monthly, daily) beyond what the tool name suggests. However, it lacks assumptions about the calculation (gross vs. net, standard vs. actual working hours) and does not describe the return format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence (nine words) with no redundancy. Every word earns its place: verb (Convert), input (hourly rate), outputs (annual, monthly, daily salary). Front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 3-parameter calculator with 100% input schema coverage but no output schema, the description is minimally viable. It compensates slightly for the missing output schema by naming the three calculated values, but lacks information on calculation methodology, assumptions, or return structure that would make it fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Hourly rate', 'Hours worked per week', 'Weeks worked per year'), the baseline score applies. The description implies the hourly_rate parameter via 'Convert hourly rate' but adds no semantic context about the relationship between parameters or default values (35h/week, 52 weeks/year).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Convert') and resources ('hourly rate', 'annual/monthly/daily salary'), clearly stating the tool's function. It differentiates from siblings like calculate_salary_comparison_ppp by specifying the hourly-to-annual direction and multi-output nature, though it does not explicitly contrast with alternatives like calculate_part_time.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No when-to-use or when-not-to-use guidance is provided. The description does not indicate prerequisites (e.g., knowing gross hourly rates) or distinguish when this simple conversion is preferred over country-specific tax calculators (calculate_french_salary, calculate_belgian_salary) in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sample_size (C)
Required sample size for a survey
| Name | Required | Description | Default |
|---|---|---|---|
| confidence | No | Confidence level | 95 |
| population | No | Population size | |
| margin_error_pct | Yes | Margin of error % |
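The assessment notes the formula is undisclosed. The standard approach for this parameter set is Cochran's formula with a worst-case proportion of 0.5 and an optional finite population correction; the sketch below assumes that method rather than reflecting the server's documented behavior:

```python
import math

Z = {90: 1.645, 95: 1.96, 99: 2.576}  # normal-approximation z-scores

def sample_size(margin_error_pct: float, confidence: int = 95,
                population=None) -> int:
    """Cochran's formula with worst-case p = 0.5, plus finite population
    correction when a population size is given (an assumed method)."""
    e = margin_error_pct / 100
    n0 = Z[confidence] ** 2 * 0.25 / e ** 2
    if population:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(sample_size(5))  # the textbook 385 for 95% confidence, +/-5%
```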
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but fails to disclose behavioral details such as the statistical formula used (e.g., simple random sample), assumptions (e.g., worst-case proportion), or whether the result includes finite population correction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at six words with no filler. While not wasteful, the brevity underutilizes the space where behavioral and usage guidance could have been provided.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks essential context for a statistical tool: no output schema, no annotations, and no description of what the output represents (integer count? object with details?) or the underlying statistical assumptions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with clear descriptions for confidence level, population size, and margin of error. The description adds no additional parameter context, meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the tool calculates required sample size for surveys, identifying the domain and output. However, it lacks specificity regarding the statistical methodology and does not distinguish from statistical siblings like calculate_confidence_interval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives, prerequisites (e.g., knowing desired confidence level), or limitations of the calculation method.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_savings_goal (C)
Calculate time needed to reach a savings target
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rate | Yes | Annual return rate percent | |
| target_amount | Yes | Savings target EUR | |
| monthly_savings | Yes | Monthly savings EUR |
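The time-to-target calculation most likely inverts the future value of an annuity. The sketch below assumes monthly compounding of the annual rate and a zero-rate fallback; the server's compounding convention and return format are undocumented:

```python
import math

def months_to_goal(target_amount: float, monthly_savings: float,
                   annual_rate: float) -> float:
    """Months of saving until the target is reached, assuming monthly
    compounding of annual_rate percent (one plausible reading)."""
    i = annual_rate / 100 / 12
    if i == 0:
        return target_amount / monthly_savings
    # Future value of an ordinary annuity: target = m * ((1+i)^n - 1) / i
    return math.log(1 + target_amount * i / monthly_savings) / math.log(1 + i)

print(months_to_goal(12_000, 1_000, 0))  # 12.0 months at zero interest
```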
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to specify the return format (months? years? date?), compounding frequency assumptions, or handling of edge cases (zero interest rate). The only behavioral hint is 'Calculate time,' which is insufficient for a financial calculation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero redundancy and key information front-loaded. However, given the lack of annotations and output schema, the extreme brevity (eight words) under-provides context, preventing a score of 5.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 3-parameter financial calculation tool with no annotations and no output schema, the description is inadequate. It omits calculation methodology (simple vs. compound interest), return value structure, and currency context (EUR mentioned only in schema). A compound-interest tool requires more disclosure than 'Calculate time'.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters fully documented (units EUR and percent clearly specified). The description does not add parameter syntax details or usage examples beyond what the schema provides, which warrants the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific verb (Calculate), resource (time), and scope (to reach a savings target). It implicitly distinguishes from siblings like calculate_compound_interest or calculate_future_value by focusing on time-to-target rather than final amounts or interest earned. However, it lacks explicit differentiation from similar financial planning tools like calculate_retirement_savings_gap.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (e.g., calculate_compound_interest for determining final balance, or calculate_future_value). No prerequisites or conditions are specified, such as requiring positive interest rates or minimum savings amounts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_scholarship_comparison (C)
Compare net tuition costs after scholarships
| Name | Required | Description | Default |
|---|---|---|---|
| tuition | Yes | Annual tuition EUR | |
| scholarship_1 | No | Scholarship 1 amount EUR | |
| scholarship_2 | No | Scholarship 2 amount EUR | |
| scholarship_3 | No | Scholarship 3 amount EUR |
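The likely arithmetic is a sum-and-subtract. The sketch below floors the result at zero; as the assessment notes, the server's behavior when awards exceed tuition is undocumented, so the flooring is an assumption:

```python
def net_tuition(tuition: float, *scholarships: float) -> float:
    """Annual tuition minus all scholarships, floored at zero.
    The zero floor is an assumption, not documented server behavior."""
    return max(tuition - sum(scholarships), 0.0)

print(net_tuition(10_000, 2_000, 1_500))  # 6500.0
```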
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden but discloses almost nothing about behavioral traits. It does not specify whether scholarships are summed and subtracted, what happens if scholarships exceed tuition, or what the return format contains (just a number? a breakdown?).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at six words with no redundancy. While efficient, this brevity sacrifices necessary context about the comparison scope and return values. Appropriately front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacking an output schema, the description fails to indicate what values are returned (net cost only? itemized deductions?). Given the simplicity of the 4-parameter schema, the description is insufficiently complete regarding the actual calculation logic and output format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description adds minimal semantic value beyond the schema, though it implies the relationship that scholarships reduce tuition costs. No additional context on currency handling or validation beyond minimums.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the core action (comparing/calculating net tuition) and resource (tuition costs after scholarships), but the verb 'Compare' is ambiguous regarding what is being compared (multiple schools? net vs. gross?). It does not clearly distinguish this tool from siblings like calculate_net_worth or calculate_cost_price.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus other financial calculators in the extensive sibling list (e.g., calculate_student_loan_repayment, calculate_education_budget). No mention of prerequisites or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sci_is_vs_ir (A)
Compare SCI taxation under IR vs IS regime to find the most advantageous option
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rent | Yes | Annual gross rental income in EUR | |
| annual_charges | Yes | Annual deductible charges in EUR (management fees, interest, maintenance) | |
| property_value | Yes | Property value for amortization calculation under IS | |
| marginal_tax_rate_pct | Yes | Shareholder marginal income tax rate in percent (e.g. 30, 41, 45) |
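The shape of the comparison can be sketched, but every rate below is an illustrative assumption (17.2% social levies, 30-year straight-line amortization, 15%/25% corporate brackets with a 42,500 EUR threshold); the server's actual methodology is not documented:

```python
def sci_ir_vs_is(annual_rent: float, annual_charges: float,
                 property_value: float, marginal_tax_rate_pct: float) -> dict:
    """Rough IR-vs-IS comparison. All rates are illustrative assumptions."""
    base = annual_rent - annual_charges
    # IR: net rent taxed at the shareholder's marginal rate plus social levies
    tax_ir = base * (marginal_tax_rate_pct + 17.2) / 100
    # IS: building amortization reduces the base before corporate rates apply
    base_is = max(base - property_value / 30, 0)
    tax_is = 0.15 * min(base_is, 42_500) + 0.25 * max(base_is - 42_500, 0)
    return {"tax_ir": tax_ir, "tax_is": tax_is,
            "better": "IS" if tax_is < tax_ir else "IR"}
```

Under these assumptions, amortization makes IS win for high-marginal-rate shareholders; the tool presumably returns a similar comparative verdict.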
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. States the tool 'find[s] the most advantageous option' but does not specify return format (recommendation string, comparative figures, net yields), side effects, or computational characteristics. Lacks details expected for a mutation-free calculation tool without output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with zero redundancy. Front-loaded action verb. However, extreme brevity leaves gaps for a complex domain (French corporate taxation); a second sentence describing output format would improve utility without sacrificing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate given high schema coverage, but incomplete due to missing output schema and annotations. 'Find most advantageous option' hints at return value but lacks specifics. Complex tax domain warrants explicit statement of comparison methodology or return structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% providing baseline 3. Description adds valuable domain context: mentioning 'IR vs IS' explains why property_value is needed (IS amortization) and marginal_tax_rate_pct (IR bracket calculation), connecting parameters to the specific French tax regimes.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Compare', specific resource 'SCI taxation' (French real estate company), and specific scope 'under IR vs IS regime'. Clearly distinguishes from generic siblings like calculate_french_income_tax by targeting the specific SCI legal structure and its dual tax regime options.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage through specificity (SCI entities only) but lacks explicit when-to-use guidance. Does not differentiate from sibling tax tools like calculate_french_income_tax or calculate_lmnp_deficit, leaving the agent to infer applicability based on domain keywords alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_seed_quantity (B)
Calculate the number of seeds needed based on surface area, spacing and germination rate
| Name | Required | Description | Default |
|---|---|---|---|
| surface_m2 | Yes | Surface area in square meters | |
| row_spacing_cm | Yes | Distance between rows in centimeters | |
| plant_spacing_cm | Yes | Distance between plants in a row in centimeters | |
| germination_rate_pct | No | Germination rate in percent (default 85%) |
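The described calculation divides the plot area by the spacing grid, then inflates for germination losses. A sketch assuming the 85% default from the schema; rounding up to whole seeds is an additional assumption:

```python
import math

def seeds_needed(surface_m2: float, row_spacing_cm: float,
                 plant_spacing_cm: float,
                 germination_rate_pct: float = 85) -> int:
    """Plants that fit on the plot, inflated to cover germination losses.
    Rounding up is an assumption, not documented behavior."""
    area_cm2 = surface_m2 * 10_000  # 1 m^2 = 10,000 cm^2
    plants = area_cm2 / (row_spacing_cm * plant_spacing_cm)
    return math.ceil(plants / (germination_rate_pct / 100))

# 10 m^2 plot, 30 cm rows, 25 cm in-row spacing, default 85% germination:
print(seeds_needed(10, 30, 25))  # 157
```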
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the burden of behavioral disclosure. It adds useful context by explaining that the calculation considers area, spacing, and germination rate, implying a safety margin calculation. However, it lacks explicit information about safety (read-only), idempotency, or the return value format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no wasted words. It front-loads the action ('Calculate') and immediately explains the basis of the calculation. Every clause earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema (4 flat parameters, 100% documented) and lack of output schema, the description adequately explains the tool's purpose. However, it omits any description of the return value (e.g., seed count, units) which would be helpful given the missing output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage. The description adds semantic value by grouping 'row_spacing_cm' and 'plant_spacing_cm' under the concept of 'spacing' and confirming the purpose of 'germination_rate_pct'. It provides the conceptual framework that binds the parameters together, meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific calculation (number of seeds needed) and the input factors (surface area, spacing, germination rate). It uses a precise verb ('Calculate') and resource. However, it does not explicitly differentiate from sibling tool 'calculate_lawn_seed', which likely performs a similar function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_lawn_seed', or when it is appropriate to use (e.g., for agricultural planting vs. lawn seeding). No prerequisites or conditions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_senegalese_css (A)
Calculate Senegalese social contributions (CSS/IPRES) for employee and employer
| Name | Required | Description | Default |
|---|---|---|---|
| accident_rate_pct | No | Work accident insurance rate 1-5% (employer only, default 3%) | |
| gross_monthly_xof | Yes | Gross monthly salary in XOF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Adds valuable context that calculation covers both employee and employer portions, but omits output format, calculation methodology details, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single dense sentence (10 words) with no filler. Front-loaded verb action, immediately identifies jurisdiction and contribution type. Zero wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter calculation tool with complete schema. Description explains what is calculated and for whom, though could briefly mention return value structure (breakdown of employee vs employer amounts).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with complete parameter descriptions. Description provides high-level context about employee/employer scope but does not add specific parameter guidance beyond what schema already documents.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' with clear resource 'Senegalese social contributions (CSS/IPRES)' and scope 'for employee and employer'. Explicitly distinguishes from siblings calculate_senegalese_income_tax and calculate_senegalese_vat by specifying CSS/IPRES regime.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Domain specificity (Senegal + social contributions) implies an appropriate usage context, but the description lacks explicit when-to-use guidance, prerequisites, and references to sibling alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_senegalese_income_tax (A)
Calculate Senegalese income tax (IRPP) using DGI progressive brackets in XOF
| Name | Required | Description | Default |
|---|---|---|---|
| annual_income_xof | Yes | Annual gross income in CFA Francs (XOF) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds valuable methodological context ('DGI progressive brackets', 'XOF') indicating official tax brackets and currency, but it lacks disclosure of output format, deduction handling, and side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Perfect efficiency: a single, front-loaded sentence in which every element earns its place—tax type (IRPP), methodology (DGI brackets), and currency (XOF) are all specified without waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriately complete for low-complexity tool (1 param, no output schema). Scope is clear (Senegal IRPP calculation). Minor gap: could specify if output is annual tax liability or includes/deducts specific allowances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the clear description 'Annual gross income in CFA Francs (XOF)'. The tool description reinforces the currency ('in XOF') but adds minimal semantic value beyond the schema; the baseline score of 3 is appropriate given the schema's completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Excellent specificity: verb 'Calculate' + resource 'Senegalese income tax' + technical details 'IRPP' and 'DGI progressive brackets' + currency 'XOF'. Clearly distinguishes from siblings like calculate_belgian_income_tax and calculate_senegalese_css.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implied usage through specific geographic and tax-type identifiers ('Senegalese', 'IRPP'), but lacks explicit when-to-use guidance or mention of alternative tools (e.g., calculate_senegalese_css for social contributions).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
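The review notes the description names 'DGI progressive brackets' without detailing them. A minimal sketch of the progressive-bracket mechanics such a tool presumably implements, using illustrative placeholder brackets (not the actual DGI schedule, which is not given here):

```python
def progressive_tax(annual_income_xof: float, brackets: list) -> float:
    """Apply progressive brackets, given as (upper_bound, rate) pairs
    in ascending order; the last upper bound may be infinity."""
    tax, lower = 0.0, 0.0
    for upper, rate in brackets:
        if annual_income_xof > lower:
            # Tax only the slice of income falling inside this bracket.
            tax += (min(annual_income_xof, upper) - lower) * rate
        lower = upper
    return tax

# Hypothetical brackets for illustration only -- NOT the official DGI schedule.
ILLUSTRATIVE_BRACKETS = [
    (630_000, 0.0),
    (1_500_000, 0.20),
    (4_000_000, 0.30),
    (float("inf"), 0.40),
]
```

Income below the first placeholder threshold is untaxed; each higher slice is taxed at its own marginal rate.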
calculate_senegalese_vat (B)
Calculate Senegalese VAT (TVA) at standard 18% or specified rate
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Input mode: ht=hors taxe, ttc=toutes taxes comprises | ht |
| rate | No | TVA rate in % (standard 18%) | |
| amount | Yes | Amount in XOF | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully communicates the jurisdiction (Senegal) and standard rate (18%), but fails to explain the HT/TTC calculation modes, rounding behavior, or the structure of the returned result.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of a single, efficiently structured sentence with no redundant words. It front-loads the key information (country, tax type, rate). It could earn a 5 by incorporating the HT/TTC mode distinction without significantly increasing length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 parameters, no output schema) and complete schema documentation, the description is minimally adequate. It establishes the Senegalese context and standard rate, but by omitting the HT/TTC directional behavior it leaves users unsure whether the tool converts tax-exclusive (HT) amounts to tax-inclusive (TTC) ones or vice versa.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description implicitly references the 'rate' parameter by mentioning 'standard 18% or specified rate', reinforcing the schema's default value. However, it adds no context for the 'mode' parameter (ht/ttc), which is critical for correct usage, or the 'amount' parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate'), identifies the resource ('Senegalese VAT/TVA'), mentions the jurisdiction-specific standard rate (18%), and distinguishes itself from sibling VAT calculators (calculate_french_vat, calculate_vat_generic, etc.) by specifying 'Senegalese'. However, it does not clarify the dual-directional nature of the calculation (HT to TTC and vice versa), leaving some ambiguity about exactly what gets calculated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like calculate_vat_generic, nor does it clarify when to use the default 18% rate versus a custom rate. There is no mention of prerequisites or the 'mode' parameter's significance for choosing between gross and net calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
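The HT/TTC directionality the review flags as undocumented is standard VAT arithmetic. A sketch of what the tool's two modes presumably compute (function and return shape are assumptions, not the server's actual implementation):

```python
def senegal_vat(amount: float, mode: str = "ht", rate: float = 18.0) -> dict:
    """Dual-mode VAT math: 'ht' adds VAT to a tax-exclusive amount,
    'ttc' extracts VAT from a tax-inclusive amount."""
    r = rate / 100
    if mode == "ht":            # amount excludes tax
        vat = amount * r
        return {"ht": amount, "vat": vat, "ttc": amount + vat}
    if mode == "ttc":           # amount includes tax
        ht = amount / (1 + r)
        return {"ht": ht, "vat": amount - ht, "ttc": amount}
    raise ValueError("mode must be 'ht' or 'ttc'")
```

At the standard 18% rate, an HT amount of 100,000 XOF yields a TTC of 118,000 XOF, and feeding that TTC back through `mode="ttc"` recovers the original base.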
calculate_sequence (A)
Calculate nth term and sum of arithmetic or geometric sequence
| Name | Required | Description | Default |
|---|---|---|---|
| n | Yes | Number of terms | |
| type | Yes | Sequence type | |
| common | Yes | Common difference (arithmetic) or ratio (geometric) | |
| first_term | Yes | First term (a1) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses the scope of calculation (returns both nth term and sum) but omits details on return format, precision, or idempotency that would help an agent understand the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single compact sentence with zero waste, front-loaded with the action verb 'Calculate' and an immediate statement of scope, making it easy for agents to scan and select.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a straightforward 4-parameter calculation tool with complete schema coverage. However, lacking output schema and annotations, the description could benefit from mentioning the return value structure (e.g., that both values are returned together).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear parameter semantics already defined. The description adds minimal contextual reinforcement (mentioning both sequence types clarifies the 'common' parameter's dual purpose) but does not significantly augment the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the specific mathematical operations (Calculate nth term and sum) and the resource types (arithmetic or geometric sequence), effectively distinguishing it from the 100+ other calculator tools in the sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies usage context (arithmetic/geometric sequences), it provides no explicit when-to-use guidance or alternatives for other sequence types (e.g., Fibonacci) among the dense calculator ecosystem.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
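The nth-term and sum formulas the tool applies are standard. A sketch of the presumed computation (the combined return shape is an assumption, since the tool exposes no output schema):

```python
def sequence_stats(first_term: float, common: float, n: int, seq_type: str) -> dict:
    """nth term and partial sum of an arithmetic or geometric sequence."""
    if seq_type == "arithmetic":
        nth = first_term + (n - 1) * common            # a_n = a1 + (n-1)d
        total = n * (first_term + nth) / 2             # S_n = n(a1 + a_n)/2
    elif seq_type == "geometric":
        nth = first_term * common ** (n - 1)           # a_n = a1 * r^(n-1)
        if common == 1:
            total = first_term * n                     # degenerate ratio r = 1
        else:
            total = first_term * (common**n - 1) / (common - 1)
    else:
        raise ValueError("seq_type must be 'arithmetic' or 'geometric'")
    return {"nth_term": nth, "sum": total}
```

For example, the arithmetic sequence 2, 5, 8, ... has 5th term 14 and partial sum 40; the geometric sequence 3, 6, 12, ... has 4th term 24 and partial sum 45.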
calculate_severance_pay (A)
Calculate French severance pay for rupture conventionnelle or licenciement
| Name | Required | Description | Default |
|---|---|---|---|
| monthly_salary | Yes | Reference gross monthly salary in euros | |
| years_seniority | Yes | Years of seniority in the company | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to mention whether the two legal contexts use different calculation formulas, what the output format is, or any legal disclaimers about the calculation accuracy.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is perfectly front-loaded with zero waste: 'Calculate' establishes the action, 'French severance pay' the domain, and 'rupture conventionnelle or licenciement' the specific legal scopes. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While adequate for a 2-parameter tool, the description misses important contextual details for a complex legal calculation: it doesn't mention statutory caps on severance pay, differences between the two termination types' formulas, or that results are estimates subject to collective bargaining agreements.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents both parameters ('Reference gross monthly salary in euros' and 'Years of seniority'). The description adds domain context (French labor law) but no additional parameter semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate'), resource ('French severance pay'), and legal context ('rupture conventionnelle or licenciement'), clearly distinguishing it from generic calculators and other French financial tools like calculate_french_salary or calculate_employer_cost_fr.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by specifying the two French legal contexts (rupture conventionnelle and licenciement), but lacks explicit when-to-use guidance, exclusions (e.g., voluntary resignation), or references to alternative tools for other severance types.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
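The review's point about statutory caps and formulas can be made concrete. A sketch of the French statutory minimum (indemnité légale): a quarter of a month's salary per year of seniority up to ten years, then a third per year beyond. This is the legal floor only; collective bargaining agreements often grant more, and the server's actual formula is not documented here.

```python
def legal_severance_fr(monthly_salary: float, years_seniority: float) -> float:
    """Statutory minimum French severance: 1/4 month per year for the first
    10 years of seniority, 1/3 month per year thereafter."""
    first_ten = min(years_seniority, 10) * monthly_salary / 4
    beyond_ten = max(years_seniority - 10, 0) * monthly_salary / 3
    return first_ten + beyond_ten
```

For a 3,000 EUR reference salary and 12 years of seniority, the legal floor is 10 × 750 + 2 × 1,000 = 9,500 EUR.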
calculate_shipping_volumetric (C)
Volumetric weight for shipping
| Name | Required | Description | Default |
|---|---|---|---|
| width_cm | Yes | Width cm | |
| actual_kg | Yes | Actual weight kg | |
| height_cm | Yes | Height cm | |
| length_cm | Yes | Length cm | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden but discloses almost nothing. It doesn't explain the volumetric weight formula (L×W×H ÷ divisor), whether the result is compared against actual_kg to determine billable weight, or what the output contains.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at 4 words with zero redundancy. However, it may be excessively terse for a calculation tool that should explain its output format and comparison logic.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a calculation tool with no output schema, the description fails to explain what gets returned (volumetric weight in kg? which weight is billable?). Missing critical context for a 4-parameter shipping utility.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (all 4 parameters have basic descriptions). The description adds no parameter guidance beyond the schema, which meets the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Volumetric weight for shipping' identifies the domain (shipping calculations) and distinguishes from generic siblings like calculate_volume, but lacks an action verb. It reads as a noun phrase rather than stating what the tool does (calculates/computes/returns volumetric weight).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus siblings like calculate_international_shipping or calculate_delivery_cost. No mention of prerequisites (e.g., needing dimensions) or when volumetric vs. actual weight applies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
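The formula and comparison logic the review finds missing are conventional in shipping. A sketch assuming the common divisor of 5,000 cm³/kg (carriers vary; some use 6,000) and the usual "billable weight is the larger of the two" rule:

```python
def volumetric_weight(length_cm: float, width_cm: float, height_cm: float,
                      actual_kg: float, divisor: float = 5000) -> dict:
    """Volumetric (dimensional) weight in kg, and the billable weight
    carriers typically charge: max(volumetric, actual)."""
    vol_kg = length_cm * width_cm * height_cm / divisor
    return {"volumetric_kg": vol_kg, "billable_kg": max(vol_kg, actual_kg)}
```

A 50 × 40 × 30 cm parcel weighing 8 kg has a volumetric weight of 12 kg, so the carrier bills 12 kg, not 8.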
calculate_shoe_size (C)
Convert shoe sizes between systems
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | Shoe size | |
| from_system | Yes | From system | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full disclosure burden. It fails to explain what target system(s) the conversion outputs to (the schema only has 'from_system', not 'to_system'), nor does it mention output format, precision, or supported size ranges (e.g., half sizes, children's sizes).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of five words is appropriately brief, but suffers from under-specification rather than true conciseness. The content does not earn its place given the tool's behavioral complexity and ambiguous sibling relationships.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the multiple shoe size conversion siblings and the unusual schema asymmetry (the input lacks a target system), the description is insufficient. With no output schema, it also needs to describe the return values, and it does not.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with fields 'Shoe size' and 'From system' already documented. The description adds no specific semantics about parameter formats (e.g., numeric vs. string sizes) or valid value explanations, meriting the baseline score for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the core action (convert) and resource (shoe sizes) but is vague on scope ('between systems' is plural while the schema only accepts a single 'from_system'). It fails to distinguish from siblings calculate_shoe_size_convert and convert_shoe_size, which appear functionally identical.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to prefer this tool over the similar siblings (convert_shoe_size, calculate_shoe_size_convert) or other conversion tools. No prerequisites or constraints mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_shoe_size_convert (B)
Convert shoe size between EU, US (M/W) and UK systems
| Name | Required | Description | Default |
|---|---|---|---|
| size | Yes | Shoe size in source system | |
| to_system | Yes | Target sizing system | |
| from_system | Yes | Source sizing system | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose the output format (a number? an object?), rounding behavior, or whether conversions are approximate. No mention of validation behavior beyond schema constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no wasted words. Front-loaded with the core action. However, brevity comes at cost of completeness regarding output format and sibling differentiation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks output schema and fails to describe return values in description. Critical gap: does not clarify relationship with near-identical sibling 'convert_shoe_size', creating selection ambiguity for the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds value by clarifying that 'US_M' and 'US_W' enum values correspond to 'US (M/W)' systems, making the gender split explicit before viewing the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Convert) and resource (shoe size) and lists the supported systems (EU, US M/W, UK). However, fails to distinguish from sibling tool 'convert_shoe_size' which appears to have identical functionality based on naming.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'convert_shoe_size' or 'calculate_shoe_size'. No mention of prerequisites or conversion limitations (e.g., half sizes, children's vs adult sizes).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
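The "conversions are approximate" caveat the review raises is worth making explicit: shoe size charts vary by brand, so any offset table is a rough convention. A sketch using a UK pivot with illustrative placeholder offsets (not the server's actual table):

```python
# Rough offsets to a UK pivot -- illustrative placeholders; real charts
# differ by brand and between adult/children's ranges.
TO_UK = {"UK": 0, "US_M": -1, "US_W": -2, "EU": -33}

def convert_shoe_size(size: float, from_system: str, to_system: str) -> float:
    """Convert via a UK pivot: normalize to UK, then shift to the target."""
    uk = size + TO_UK[from_system]
    return uk - TO_UK[to_system]
```

Under these placeholder offsets, EU 42 maps to UK 9 and US men's 10; a production tool would need a validated, system-specific chart.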
calculate_simple_interest (A)
Calculate simple interest: I = Prt
| Name | Required | Description | Default |
|---|---|---|---|
| years | Yes | Duration in years | |
| principal | Yes | Initial amount | |
| annual_rate | Yes | Annual interest rate in % | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It provides the mathematical formula but omits behavioral details such as whether this is a pure calculation (read-only), what the output format looks like, or any computational limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence with zero waste. Every element earns its place: the action verb, the resource name, and the disambiguating formula.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 primitive parameters, 100% schema coverage, no output schema), the description combined with the schema provides adequate context. The mathematical formula compensates for the lack of output schema documentation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the baseline is 3. The formula I = P*r*t adds valuable semantic context by mapping the mathematical relationship between parameters (P=principal, r=annual_rate, t=years) and indicating the output variable (I).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb (Calculate) and resource (simple interest), and includes the formula I = P*r*t which implicitly distinguishes it from sibling compound interest tools. However, it lacks explicit contrast with alternatives like calculate_compound_interest.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or constraints beyond the schema validation (e.g., when simple interest vs. compound interest is appropriate).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
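The formula I = Prt that the description supplies maps directly onto the three parameters. A minimal sketch, assuming the percent rate is divided by 100 before applying the formula:

```python
def simple_interest(principal: float, annual_rate: float, years: float) -> float:
    """I = P * r * t, with annual_rate given in percent (e.g. 5 for 5%)."""
    return principal * (annual_rate / 100) * years
```

For example, 1,000 at 5% over 3 years yields 150 in interest; compare this with compound interest, where interest itself earns interest each period.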
calculate_sleep_cycles (C)
Analyze sleep quality from bedtime and wake time
| Name | Required | Description | Default |
|---|---|---|---|
| bedtime | Yes | Bedtime HH:MM | |
| wake_time | Yes | Wake time HH:MM | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the tool 'analyzes' but fails to specify what the analysis entails (e.g., 90-minute sleep cycles, REM phases), what data is returned, or whether calculations are performed locally. No information on side effects, data persistence, or output format is provided.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single short sentence that wastes no space. However, its extreme brevity becomes a liability given the lack of annotations and output schema, leaving critical information gaps rather than being appropriately concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool named 'calculate_sleep_cycles', the description omits the core calculation methodology (sleep cycle theory), expected return values (number of cycles, quality score), and distinction from simple duration calculation. With no output schema and 100% input schema coverage, the description should compensate for missing behavioral context but does not.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description references 'bedtime and wake time' which maps to the two parameters, but adds no semantic value beyond the schema's existing 'HH:MM' format descriptions. With 100% schema description coverage, the baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses 'Analyze sleep quality' while the tool name uses 'calculate_sleep_cycles', a verb/noun mismatch that obscures the exact output (cycle count vs. quality score). It lacks differentiation from siblings like calculate_biorhythm or calculate_jet_lag_recovery. While the inputs are clear, the specific resource and action remain ambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, no prerequisites (e.g., requiring same-day times or handling overnight periods), and no indication of typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
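The sleep-cycle methodology and overnight handling the review flags as undisclosed can be sketched. This assumes the common 90-minute cycle heuristic and wraps wake times past midnight; the server's actual logic and return shape are not documented:

```python
from datetime import datetime, timedelta

def sleep_cycles(bedtime: str, wake_time: str, cycle_min: int = 90) -> dict:
    """Duration and complete 90-minute cycles between HH:MM bedtime and wake time."""
    bed = datetime.strptime(bedtime, "%H:%M")
    wake = datetime.strptime(wake_time, "%H:%M")
    if wake <= bed:                     # overnight sleep crosses midnight
        wake += timedelta(days=1)
    minutes = (wake - bed).total_seconds() / 60
    return {"duration_min": minutes, "complete_cycles": int(minutes // cycle_min)}
```

Sleeping 23:00 to 07:00 gives 480 minutes, i.e. five complete 90-minute cycles with 30 minutes left over.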
calculate_slope (B)
Calculate slope in %, degrees, and ratio
| Name | Required | Description | Default |
|---|---|---|---|
| run_m | Yes | Horizontal run m | |
| rise_m | Yes | Vertical rise m | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It partially succeeds by specifying three output formats (percentage, degrees, ratio), but omits safety properties, side effects, caching behavior, and the exact return structure (object vs array). It implies a read-only operation but doesn't confirm it.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact, with the action verb front-loaded and no wasted text, though the brevity borders on insufficient given the extensive sibling tool list, where additional context could improve disambiguation.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple two-parameter calculation tool, the description mentions the key output variants but lacks detail on the return value structure since no output schema exists. It meets minimum needs but doesn't address error conditions or precision limits.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'Horizontal run m' and 'Vertical rise m' clearly documented in the properties. The description adds no supplemental parameter guidance (e.g., units clarification, validation rationale for run_m minimum), warranting the baseline score for complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the action (Calculate), resource (slope), and output formats (%, degrees, and ratio). However, it fails to differentiate from siblings like 'calculate_drain_slope' or 'calculate_pythagoras', leaving ambiguity about when to choose this generic implementation over more specialized alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or input constraints beyond the schema. With many geometric calculation siblings available, the absence of selection criteria is a significant gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
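The review above faults the slope tool for not stating which of its three outputs (%, degrees, ratio) is returned in what form. As context, here is a minimal sketch of such a slope calculation; the function name, parameter names, and output keys are assumptions, not the server's actual implementation:

```python
import math

def calculate_slope(run_m: float, rise_m: float) -> dict:
    """Express a slope as percent grade, angle in degrees, and a 1:n ratio."""
    if run_m <= 0:
        raise ValueError("run_m must be positive")
    percent = rise_m / run_m * 100
    degrees = math.degrees(math.atan2(rise_m, run_m))
    ratio = f"1:{run_m / rise_m:g}" if rise_m else "flat"
    return {"percent": round(percent, 2), "degrees": round(degrees, 2), "ratio": ratio}
```

For a 10 m rise over a 100 m run this yields a 10% grade at roughly 5.71°, illustrating why the review asks the description to spell out the output format.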
calculate_smoking_savings (Grade C)
Calculate money saved by quitting smoking
| Name | Required | Description | Default |
|---|---|---|---|
| pack_price | Yes | Price per pack | |
| cigarettes_per_day | Yes | Cigarettes smoked per day | |
| cigarettes_per_pack | No | Cigarettes per pack | |
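The review's central complaint is the missing time horizon. A hedged sketch of what such a calculator plausibly computes, reporting several horizons at once; the function name, 20-cigarette default, and output keys are assumptions, not the server's implementation:

```python
def smoking_savings(pack_price: float, cigarettes_per_day: float,
                    cigarettes_per_pack: int = 20) -> dict:
    """Money saved by quitting, over daily/monthly/yearly horizons."""
    # Fractional packs per day times pack price gives daily spend avoided.
    daily = cigarettes_per_day / cigarettes_per_pack * pack_price
    return {"daily": round(daily, 2),
            "monthly": round(daily * 30, 2),
            "yearly": round(daily * 365, 2)}
```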
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but fails to indicate what time period the savings cover (daily/monthly/yearly), what currency format is returned, or that this is a safe read-only operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded and contains zero redundancy. While extremely brief, every word earns its place in identifying the tool's function.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having well-documented parameters, the description lacks critical output context given the absence of annotations and output schema. It should specify the time horizon of calculated savings (e.g., daily, monthly, yearly) and currency handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for all three parameters. The description adds no additional semantic context beyond the schema, which is acceptable given the baseline but doesn't enhance understanding of parameter relationships.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a clear verb-object pattern stating exactly what the tool computes (money saved from quitting smoking), which effectively distinguishes it from the numerous other calculate_* siblings in the list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus other financial calculators, nor any mention of prerequisites like needing specific currency units or consumption data. It assumes the user knows they want this specific calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_soil_ph_amendment (Grade C)
Soil pH amendment calculator
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | Yes | Garden area m² | |
| target_ph | Yes | Target pH | |
| current_ph | Yes | Current soil pH | |
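The review asks what this tool actually returns (amount? product?). One plausible shape, sketched below, picks lime to raise pH and sulfur to lower it. The 0.5 kg/m² per pH unit coefficient is an illustrative placeholder only; real amendment rates vary widely with soil type, and nothing here is the server's actual method:

```python
def soil_ph_amendment(current_ph: float, target_ph: float, area_m2: float,
                      kg_per_m2_per_ph: float = 0.5) -> dict:
    """Rough amendment quantity to shift soil pH (placeholder coefficient)."""
    delta = target_ph - current_ph
    # Positive delta -> raise pH with lime; negative -> lower with sulfur.
    amendment = "lime" if delta > 0 else "sulfur" if delta < 0 else "none"
    kg = abs(delta) * kg_per_m2_per_ph * area_m2
    return {"amendment": amendment, "kg_needed": round(kg, 2)}
```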
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but reveals nothing about what the tool returns (amount? cost? product recommendations?), whether it supports specific amendment types, or any constraints. The term 'calculator' implies non-destructive read-only behavior but lacks explicit confirmation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At only four words, it is brief to the point of being under-specified. The noun phrase structure ('Soil pH amendment calculator') functions as a label rather than a front-loaded explanation of capabilities. Conciseness without information density is not valuable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of soil chemistry and the lack of an output schema, the description should explain what calculation is performed and what the output represents. It fails to address the return value or the specific amendment calculation methodology.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear parameter descriptions ('Current soil pH', 'Target pH', 'Garden area m²'). The description adds no additional semantic value regarding parameter relationships or calculation logic, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Soil pH amendment calculator' is essentially a tautology of the tool name (calculate_soil_ph_amendment). While it identifies the domain, it fails to specify what the tool actually calculates (e.g., quantity of lime/sulfur, cost, bags needed) and does not differentiate from siblings like calculate_ph, calculate_garden_soil, or calculate_fertilizer_npk.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus alternatives (e.g., calculate_ph for simple conversions), prerequisites (soil testing requirements), or specific scenarios where this calculator is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_solar_panel_output (Grade B)
Estimate daily and annual energy output of a solar panel installation
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | No | Panel surface area in m2 (optional, informational) | |
| panel_watt_peak | Yes | Total peak power of the installation in Watts (Wp) | |
| hours_sun_per_day | No | Average peak sun hours per day (default 4) | |
| efficiency_loss_pct | No | System efficiency loss percentage (default 15%) | |
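The standard back-of-envelope model for this kind of estimate is daily kWh = Wp × peak sun hours × (1 − loss) / 1000, scaled by 365 for the annual figure. A sketch under that assumption, using the schema's stated defaults; output keys are guesses since the tool publishes no output schema:

```python
def solar_panel_output(panel_watt_peak: float,
                       hours_sun_per_day: float = 4.0,
                       efficiency_loss_pct: float = 15.0) -> dict:
    """Estimate daily and annual production in kWh (simple peak-sun model)."""
    daily_kwh = (panel_watt_peak * hours_sun_per_day
                 * (1 - efficiency_loss_pct / 100) / 1000)
    return {"daily_kwh": round(daily_kwh, 2),
            "annual_kwh": round(daily_kwh * 365, 1)}
```

A 3 kWp system at the defaults comes out around 10.2 kWh/day, which is the kind of returned value the review says the description should name explicitly.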
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Estimate' implies a read-only calculation, the description fails to disclose the output format, units (kWh?), assumptions in the calculation model, or whether results vary by geography/season.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with no redundant words. It front-loads the action and target, making it immediately scannable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the well-documented schema (100% coverage) and simple flat structure, the description adequately covers intent. However, with no output schema provided, it should ideally describe what values are returned (daily kWh, annual kWh, etc.) to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the baseline is 3. The description mentions 'daily and annual' output, which broadly contextualizes the time-based parameters (hours_sun_per_day, efficiency_loss_pct), but does not add syntax details, parameter relationships, or explain that only panel_watt_peak is required.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Estimate') and resource ('energy output of a solar panel installation'), clearly indicating it's a calculation tool. It distinguishes from generic siblings by specifying solar panel context, though it could better differentiate from 'calculate_solar_roi' which also deals with solar panels.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'calculate_solar_roi' or 'calculate_energy_physics'. There are no prerequisites, conditions, or explicit exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_solar_roi (Grade C)
Solar panel return on investment
| Name | Required | Description | Default |
|---|---|---|---|
| price_kwh | No | Electricity price EUR/kWh | |
| annual_kwh | Yes | Annual production kWh | |
| system_cost | Yes | Total system cost EUR | |
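The review flags that the output could be a currency amount, breakeven years, or a percentage. The simplest consistent reading of the three inputs is a simple-payback calculation, sketched here; the 0.25 EUR/kWh fallback for the optional price is an assumed placeholder, and none of this reflects the server's real assumptions (no inflation or panel degradation):

```python
def solar_roi(system_cost: float, annual_kwh: float,
              price_kwh: float = 0.25) -> dict:
    """Simple payback: years for avoided electricity cost to repay the system."""
    annual_savings = annual_kwh * price_kwh
    return {"annual_savings_eur": round(annual_savings, 2),
            "payback_years": round(system_cost / annual_savings, 1)}
```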
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure but fails to explain what the tool returns (currency amount, years to breakeven, percentage) or financial assumptions (inflation, degradation rates).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely brief (five words), which prevents bloat but is likely underspecified for a financial calculation tool. However, it is front-loaded and wastes no space.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given this is a financial calculation tool with no output schema, the description should explain the return format and units. The current description leaves significant gaps in understanding what values the tool produces.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so all three parameters are adequately documented in the schema itself. The description adds no additional parameter guidance, meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Solar panel return on investment' essentially restates the tool name 'calculate_solar_roi' without adding specificity about what the calculation produces (e.g., payback period, percentage return, net savings).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'calculate_roi' (generic) or 'calculate_solar_panel_output' (energy focused), nor prerequisites for the calculation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_solution_dilution (Grade C)
Dilution calculation C1V1=C2V2
| Name | Required | Description | Default |
|---|---|---|---|
| c1 | Yes | Initial concentration mol/L | |
| c2 | Yes | Target concentration mol/L | |
| v1_ml | Yes | Initial volume mL | |
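Given the input schema (C1, C2, V1), the only unknown in C1V1 = C2V2 is the final volume V2, which supports the review's inference below. A sketch solving for V2 under that assumption; output keys and the guard conditions are guesses:

```python
def solution_dilution(c1: float, v1_ml: float, c2: float) -> dict:
    """Solve C1*V1 = C2*V2 for the final volume V2 and solvent to add."""
    if c2 <= 0 or c1 < c2:
        raise ValueError("target concentration must be positive and no greater than c1")
    v2_ml = c1 * v1_ml / c2
    return {"final_volume_ml": round(v2_ml, 2),
            "solvent_to_add_ml": round(v2_ml - v1_ml, 2)}
```

Diluting 100 mL of a 1 mol/L stock down to 0.25 mol/L, for instance, gives a 400 mL final volume.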
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description carries the full burden of behavioral disclosure. It fails to state what value is returned (V2, volume to add, dilution factor), what units the output uses, or any constraints on the calculation (e.g., non-zero concentrations).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
At only three words, the description is extremely terse. While it avoids redundancy with the schema, it is so brief that it fails to provide necessary behavioral context or usage guidance. It is front-loaded with the key concept but underspecified for practical use.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and annotations, the description should explain the return value (calculated final volume) and clarify the relationship to sibling dilution tools. It currently provides only the formula name without operational context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents each parameter's purpose and units. The description adds value by providing the mathematical relationship C1V1=C2V2, which contextualizes how the parameters interact in the dilution equation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states it performs dilution calculations using the formula C1V1=C2V2, which identifies the specific chemical calculation performed. However, it fails to specify what variable is being solved for (presumably final volume V2 given the input schema), and does not differentiate from the sibling tool 'calculate_dilution' despite being one of hundreds of calculation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this specific tool versus the generic 'calculate_dilution' sibling, nor are there any stated prerequisites, assumptions about units, or conditions regarding when this formula applies (e.g., ideal dilutions).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_speed_distance_time (Grade A)
Solve speed/distance/time — provide any 2 of 3 values to compute the missing one
| Name | Required | Description | Default |
|---|---|---|---|
| speed | No | Speed in km/h | |
| distance | No | Distance in kilometers | |
| time_minutes | No | Time in minutes | |
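The "any 2 of 3" contract the review praises is straightforward to sketch, including the edge case the review notes is undocumented (all three or only one value provided). Function name and output keys are assumptions:

```python
def speed_distance_time(speed=None, distance=None, time_minutes=None) -> dict:
    """Compute the missing one of speed (km/h), distance (km), time (min)."""
    provided = sum(v is not None for v in (speed, distance, time_minutes))
    if provided != 2:
        raise ValueError("provide exactly two of speed, distance, time_minutes")
    if speed is None:
        return {"speed_kmh": round(distance / (time_minutes / 60), 2)}
    if distance is None:
        return {"distance_km": round(speed * time_minutes / 60, 2)}
    return {"time_minutes": round(distance / speed * 60, 2)}
```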
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, description carries full burden. It discloses the core calculation logic (2 inputs yield 1 computed output) but omits error handling behavior, output format details, and validation constraints (e.g., what happens if all 3 values are provided).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with zero waste. Front-loaded with action verb, immediately followed by usage instruction. No redundant phrases or unnecessary preamble.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple deterministic calculator. Given no output schema, description could improve by briefly noting the return value represents the computed missing variable, though this is arguably implicit. Parameter relationships are sufficiently explained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage (units clearly defined), establishing baseline of 3. Description adds crucial semantic constraint that parameters operate as 'any 2 of 3' interdependent values, explaining the mutual exclusivity logic not present in isolated schema field descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'Solve' with clear resource 'speed/distance/time'. Uniquely identifies this tool among 200+ calculate_* siblings by specifying the classic physics tripartite relationship.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states the input constraint 'provide any 2 of 3 values to compute the missing one', clearly defining when to use the tool. Lacks explicit 'when not to use' (e.g., providing all 3 or only 1 value) or sibling alternatives for different unit systems.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_speed_of_sound (Grade B)
Speed of sound in air at given temperature
| Name | Required | Description | Default |
|---|---|---|---|
| temperature_c | Yes | Celsius | |
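The review notes the output units (m/s) go unstated. The common linear approximation for dry air is v ≈ 331.3 + 0.606·T m/s; a sketch assuming the tool uses something like it (the server's actual formula and precision are unknown):

```python
def speed_of_sound(temperature_c: float) -> float:
    """Speed of sound in dry air, m/s, via the linear approximation."""
    if temperature_c < -273.15:
        raise ValueError("temperature below absolute zero")
    return round(331.3 + 0.606 * temperature_c, 1)
```

At 20 °C this gives roughly 343.4 m/s, the textbook room-temperature value.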
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It adds valuable domain context ('in air') missing from the schema, but fails to describe the output format (units, numeric precision) or confirm the read-only/calculation nature of the operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at eight words with zero redundancy. The noun phrase structure is efficient, though it sacrifices verb clarity for brevity. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter calculation tool, but gaps remain: output units (m/s) are unspecified despite no output schema existing, and the -273°C minimum constraint in the schema lacks contextual explanation in the description.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'temperature_c' well-documented as 'Celsius'. The description aligns with the parameter ('at given temperature') but adds no additional syntax, example values, or constraint details beyond what the schema provides, warranting the baseline score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the resource (speed of sound) and scope (in air at given temperature), distinguishing it from kinematic siblings like 'calculate_speed_distance_time'. However, it lacks an explicit action verb (e.g., 'Calculate'), relying on the tool name to imply the operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus alternative physics calculators or unit converters (like 'convert_speed'). No prerequisites or explicit constraints mentioned beyond the schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sphere (Grade C)
Sphere volume and surface area
| Name | Required | Description | Default |
|---|---|---|---|
| radius | Yes | Radius | |
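The review asks whether both values come back at once. The standard formulas are V = 4/3·π·r³ and A = 4·π·r²; a sketch returning both together, which is one plausible answer (names and rounding are assumptions):

```python
import math

def sphere(radius: float) -> dict:
    """Volume and surface area of a sphere from its radius."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return {"volume": round(4 / 3 * math.pi * radius ** 3, 4),
            "surface_area": round(4 * math.pi * radius ** 2, 4)}
```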
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but fails to disclose whether both volume and surface area are always returned, the expected output format, units of result, or any precision/rounding behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at five words with zero redundancy, but potentially too terse for a tool lacking an output schema. The fragment structure lacks a verb and front-loaded impact.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description should explain what values are returned and in what format. It mentions two calculations but doesn't confirm if both are returned simultaneously or separately.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (radius parameter described in schema). The description adds no semantic context beyond the schema (e.g., expected units, validation rationale), warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the resource (sphere) and operations (volume and surface area) but lacks a verb (e.g., 'Calculate'), making it a fragment rather than a clear action statement. It distinguishes from siblings like calculate_cylinder by specifying 'sphere', but the missing verb weakens clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings (calculate_cylinder, calculate_cone, etc.) or prerequisites such as required units (meters vs. centimeters) for the radius input.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_spring_constant (Grade C)
Spring constant from Hooke's law
| Name | Required | Description | Default |
|---|---|---|---|
| force_n | Yes | Applied force N | |
| displacement_m | Yes | Displacement m | |
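Hooke's law F = k·x rearranges directly to k = F/x, which the review notes should yield a value in N/m even though the description never says so. A sketch (return shape assumed):

```python
def spring_constant(force_n: float, displacement_m: float) -> float:
    """Spring constant k = F / x in N/m, per Hooke's law."""
    if displacement_m <= 0:
        raise ValueError("displacement_m must be positive")
    return round(force_n / displacement_m, 3)
```

A 10 N force stretching a spring 5 cm implies k = 200 N/m.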
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description carries minimal behavioral context. It does not disclose the return format (numeric value), expected units (N/m), deterministic nature, or error conditions (e.g., handling near-zero displacement beyond the schema constraint).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely brief at five words, but every word earns its place by specifying the resource and method. It is front-loaded with the core concept. However, it verges on under-specification given the lack of annotations or output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a low-complexity tool with two well-documented parameters, but gaps remain regarding the output value (unit, format, precision) and any physics-specific caveats. Given 100% schema coverage and simple intent, it meets minimum viability but lacks richness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with clear unit suffixes (_n, _m) and descriptions in the properties. The description mentions Hooke's law, which implicitly contextualizes the force and displacement relationship, but adds no additional semantic detail beyond what the self-documenting parameter names and schema descriptions already provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description identifies the specific output (spring constant) and the governing physics principle (Hooke's law), making it clear this is a mechanics calculation distinct from financial or biological sibling tools. While it omits an explicit verb, the phrase '[Tool] Spring constant from Hooke's law' sufficiently communicates the function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this physics calculator versus other calculation tools, nor does it mention prerequisites like ensuring displacement is non-zero (though the schema enforces a minimum). No alternatives or exclusions are noted.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_staircase (Grade B)
Calculate staircase dimensions using Blondel formula
| Name | Required | Description | Default |
|---|---|---|---|
| total_height_cm | Yes | Total height cm | |
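The Blondel rule of thumb says 2 × riser + going ≈ 63 cm. One way a single height input can drive it, as the review wonders below, is to pick a step count from an ideal riser and derive the rest; the 17 cm ideal riser and the output keys here are assumptions, not the server's method:

```python
def staircase(total_height_cm: float, ideal_riser_cm: float = 17.0,
              blondel_cm: float = 63.0) -> dict:
    """Step count, riser, and going from total height via the Blondel rule."""
    steps = max(1, round(total_height_cm / ideal_riser_cm))
    riser = total_height_cm / steps
    going = blondel_cm - 2 * riser  # Blondel: 2*riser + going ~ 63 cm
    return {"steps": steps, "riser_cm": round(riser, 1), "going_cm": round(going, 1)}
```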
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full disclosure burden. It reveals the calculation method (Blondel formula) but omits what values are returned (step count, riser height, tread depth), what defaults are assumed for missing variables, and whether results follow standard building codes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is efficiently structured with the action front-loaded. Every word earns its place, though the extreme brevity arguably leaves room for one additional sentence describing output without becoming verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and the complexity of staircase calculations, the description fails to specify what calculated values are returned (e.g., optimal step count, dimensions) or how the single input parameter suffices for the Blondel calculation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter 'total_height_cm', which is adequately described in the schema. The description adds no parameter-specific semantics, earning the baseline score for well-documented schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates 'staircase dimensions' using the specific 'Blondel formula', providing a concrete verb, resource, and methodology that distinguishes it from generic calculation siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the Blondel formula mention implies a specific use case, there is no explicit guidance on when to choose this over siblings like 'calculate_concrete_stairs' or prerequisites for using the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
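The Blondel calculation referenced above can be sketched from `total_height_cm` alone. The ~17 cm target riser and the 63 cm Blondel constant below are conventional values assumed for illustration, not confirmed implementation details of the tool:

```python
def staircase_from_height(total_height_cm: float) -> dict:
    """Plausible Blondel-rule staircase sizing from total rise alone.

    Blondel's rule: going + 2 * riser ~= 63 cm. A comfortable riser is
    roughly 17 cm, which fixes the step count; the going then follows.
    """
    steps = max(1, round(total_height_cm / 17))  # target ~17 cm riser
    riser = total_height_cm / steps
    going = 63 - 2 * riser                       # Blondel: 2R + G = 63
    return {"steps": steps,
            "riser_cm": round(riser, 1),
            "going_cm": round(going, 1)}
```

This also shows why a single input parameter can suffice: the comfort constant supplies the second equation the geometry needs.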
calculate_staking_rewards (grade C)
Calculate staking rewards with optional compounding for a given APY and duration
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Initial staking amount in coins or fiat | |
| apy_pct | Yes | Annual Percentage Yield in percent | |
| compounding | Yes | Compounding frequency | |
| duration_days | Yes | Staking duration in days |
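The "optional compounding" behavior can be sketched as below. The frequency names other than 'none', and the rewards-only return value, are assumptions, since no output schema exists:

```python
def staking_rewards(amount: float, apy_pct: float, duration_days: int,
                    compounding: str = "none") -> float:
    """Sketch of a staking-rewards calculation (interest earned, not total).

    With no compounding, simple interest accrues pro rata; otherwise the
    stated APY is compounded at the chosen frequency.
    """
    years = duration_days / 365
    rate = apy_pct / 100
    periods = {"daily": 365, "weekly": 52, "monthly": 12, "none": 0}[compounding]
    if periods == 0:
        total = amount * (1 + rate * years)                 # simple interest
    else:
        total = amount * (1 + rate / periods) ** (periods * years)
    return round(total - amount, 2)                         # rewards only
```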
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Mentions 'optional compounding' behavior but fails to disclose output format (rewards only vs total value), calculation methodology (simple vs effective APY), or precision/rounding behavior expected from a financial calculator.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence front-loaded with verb. No redundant text. However, brevity sacrifices necessary behavioral details for a financial calculation tool with no output schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for parameter description given complete schema coverage, but incomplete regarding output specification. With no output schema and no annotations, description should indicate what value(s) are returned (principal + interest vs interest only).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description mentions 'optional compounding' which aligns with compounding parameter enum including 'none', and references APY and duration, but adds minimal syntax clarification beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' with clear domain resource 'staking rewards'. Mentions key inputs (APY, duration) and feature (optional compounding). Distinguishes from generic compound_interest siblings by domain-specific terminology.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this versus calculate_compound_interest or calculate_compound_interest_monthly siblings. No prerequisites or constraints mentioned despite financial domain complexity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_stamp_duty_uk (grade C)
Calculate UK Stamp Duty Land Tax (SDLT)
| Name | Required | Description | Default |
|---|---|---|---|
| price | Yes | Property purchase price in GBP | |
| first_time_buyer | No | Whether buyer is a first-time buyer (default false) |
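A banded SDLT computation would plausibly look like the sketch below. The thresholds and rates are illustrative only: HMRC bands change over time and the tool does not document which it uses, so check current GOV.UK guidance before relying on any figures:

```python
def sdlt(price: float, first_time_buyer: bool = False) -> float:
    """Illustrative banded SDLT calculation. Bands and rates are
    examples only, not the tool's confirmed values."""
    if first_time_buyer and price <= 500_000:       # relief withdrawn above cap
        bands = [(300_000, 0.0), (500_000, 0.05)]
    else:
        bands = [(125_000, 0.0), (250_000, 0.02), (925_000, 0.05),
                 (1_500_000, 0.10), (float("inf"), 0.12)]
    tax, lower = 0.0, 0.0
    for upper, rate in bands:                        # tax each slice of price
        if price > lower:
            tax += (min(price, upper) - lower) * rate
        lower = upper
    return round(tax, 2)
```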
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description fails to disclose behavioral traits (e.g., read-only calculation, no side effects, what the return value represents). The agent has no indication whether this performs a lookup, computation, or requires external API calls.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise single sentence with no waste, but errs toward under-specification. The brevity is appropriate for structure but insufficient for contextual richness given zero annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 100% schema coverage and simple 2-parameter input, the description is minimally viable. However, lacking output schema and annotations, it should disclose the calculation nature (bands, rates, reliefs) or return format to be complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% ('Property purchase price in GBP', 'Whether buyer is a first-time buyer'), establishing baseline 3. The description adds no semantic clarification beyond the schema (e.g., valid price ranges, first-time buyer definition).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States the specific action (Calculate) and resource (UK Stamp Duty Land Tax/SDLT), distinguishing it from generic calculators. However, it omits context that this applies to property/land purchases, which is only implied by the parameter schema.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus siblings like 'calculate_property_transfer_tax' or other UK tax calculators. No mention of prerequisites (e.g., property location) or calculation scope.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_star_magnitude_distance (grade B)
Calculate star distance from apparent and absolute magnitude
| Name | Required | Description | Default |
|---|---|---|---|
| absolute_magnitude | Yes | Absolute magnitude (M) | |
| apparent_magnitude | Yes | Apparent magnitude (m) |
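The underlying physics here is standard astronomy: the distance modulus m - M = 5 log10(d) - 5 gives distance in parsecs, which is presumably what the tool computes (the parsec output unit is an assumption the description does not confirm):

```python
def star_distance_parsecs(apparent_m: float, absolute_M: float) -> float:
    """Distance from the distance modulus: m - M = 5*log10(d) - 5,
    so d = 10 ** ((m - M + 5) / 5), in parsecs."""
    return 10 ** ((apparent_m - absolute_M + 5) / 5)
```

A star whose apparent and absolute magnitudes are equal sits at exactly 10 parsecs, by definition of absolute magnitude.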
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description carries full burden. It fails to disclose output units (parsecs, light-years?), the distance modulus formula relationship, input validation ranges, or that this is a safe pure function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single eight-word sentence with zero waste. The core action and domain appear immediately at the front.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter pure calculation function, the description is sufficient for tool selection despite lacking output units. No output schema exists to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters fully described. The description repeats the parameter names without adding semantic context, such as expected astronomical value ranges or a note that 'M' and 'm' are standard notation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a specific verb ('Calculate'), target resource ('star distance'), and exact inputs required ('apparent and absolute magnitude'), clearly distinguishing it from the hundreds of non-astronomy sibling calculation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to prefer this tool over alternatives (though none exist among the siblings), on prerequisites such as astronomical background, or on when the calculation is valid.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_statistics (grade C)
Calculate descriptive statistics: mean, median, mode, std dev, quartiles
| Name | Required | Description | Default |
|---|---|---|---|
| values | Yes | Array of numbers |
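One plausible shape for the unspecified output, sketched with Python's statistics module. The sample-vs-population and quartile conventions are assumptions the description leaves open:

```python
import statistics

def describe(values: list[float]) -> dict:
    """Guessed output structure: descriptive statistics of a dataset.
    Uses sample standard deviation (n-1) and exclusive quartiles, which
    are common but not the only conventions."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # default: exclusive
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std_dev": statistics.stdev(values),        # sample std dev
        "quartiles": (q1, q2, q3),
    }
```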
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. While it lists statistical operations, it omits critical behavioral context such as the return format (object structure), confirmation that this is a read-only operation, or handling of edge cases like single-item arrays.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a compact, front-loaded fragment that efficiently communicates the tool's function in nine words without repetition. While appropriately brief for a simple tool, the lack of complete sentence structure slightly reduces clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (single parameter, no nested objects) and absence of an output schema, the description adequately covers the core purpose but leaves gaps regarding the output structure and specific statistical calculation methodologies (e.g., sample vs. population standard deviation).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with the 'values' parameter documented as 'Array of numbers.' The description adds minimal semantic meaning beyond the schema, though it implicitly contextualizes the parameter as the dataset for analysis. Baseline score applies given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description employs the specific verb 'Calculate' and identifies the resource as 'descriptive statistics,' listing specific measures (mean, median, mode, std dev, quartiles). This effectively distinguishes it from sibling tools like `calculate_average` (likely singular) and domain-specific calculators by clarifying it provides comprehensive statistical summaries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description enumerates the statistical outputs but provides no guidance on when to select this tool over alternatives such as `calculate_average` or `calculate_linear_regression`. It lacks explicit prerequisites, usage constraints, or exclusion criteria for when this tool is inappropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_string_tension (grade A)
Calculate guitar or bass string tension in pounds, kilograms and Newtons
| Name | Required | Description | Default |
|---|---|---|---|
| frequency_hz | Yes | Target tuning frequency in Hz (e.g. 329.63 for E4) | |
| gauge_inches | Yes | String gauge in inches (e.g. 0.010 for a light gauge high E) | |
| scale_length_inches | Yes | Instrument scale length in inches (e.g. 25.5 for Fender Stratocaster) |
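The calculation likely follows the standard vibrating-string relation. The sketch below assumes a plain steel string, since the tool does not say how it derives unit weight from gauge; wound strings need measured unit weights:

```python
import math

STEEL_DENSITY_LB_IN3 = 0.283   # plain steel (assumption); wound strings differ
G_IN_S2 = 386.4                # standard gravity in inches per second squared

def string_tension_lb(frequency_hz: float, gauge_inches: float,
                      scale_length_inches: float) -> float:
    """Tension of a plain steel string, from f = (1/2L) * sqrt(T/mu),
    rearranged to T = UW * (2*L*f)^2 / g in the imperial luthier form."""
    unit_weight = math.pi * (gauge_inches / 2) ** 2 * STEEL_DENSITY_LB_IN3
    return unit_weight * (2 * scale_length_inches * frequency_hz) ** 2 / G_IN_S2

# The example values from the schema: high E at Fender scale.
lbs = string_tension_lb(329.63, 0.010, 25.5)
kg = lbs * 0.45359237
newtons = kg * 9.80665
```

This lands near the roughly 16 lb figure string-gauge charts list for a .010" plain steel high E at 25.5" scale, suggesting the physics above matches the tool's intent.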
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully specifies the output units/format (pounds, kg, Newtons) but omits mention of side effects, error conditions (e.g., invalid ranges), or whether this is a pure computational function.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single eleven-word sentence. Every element serves a purpose: action verb, target resource, and output specification. No redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Mentions output units but lacks description of the return structure (object with three fields? array?) since no output schema exists. Given the simple 3-parameter input schema and lack of annotations, the description is minimally adequate but incomplete regarding return value shape.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with excellent parameter descriptions (including examples like '25.5 for Fender Stratocaster'). Description provides domain context 'guitar or bass' but does not add syntactic or semantic details beyond what the schema already covers, warranting the baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'Calculate' with clear resource 'guitar or bass string tension' and explicit output units (pounds, kilograms, Newtons). Distinct from 150+ sibling calculate_* tools by specifying musical instrument domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus alternatives, or prerequisites for the calculation. While the domain is niche, there is no explicit 'when to use' or 'when not to use' guidance in the description text.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_student_loan_repayment (grade C)
Calculate student loan repayment schedule
| Name | Required | Description | Default |
|---|---|---|---|
| annual_rate | Yes | Annual interest rate percent | |
| loan_amount | Yes | Loan amount EUR | |
| monthly_payment | Yes | Monthly payment EUR |
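A repayment "schedule" from these three inputs is typically a month-by-month amortization. The return shape below (months to payoff plus total interest) is a guess, as no output schema exists:

```python
def repayment_schedule(loan_amount: float, annual_rate: float,
                       monthly_payment: float, max_months: int = 600):
    """Month-by-month amortization sketch. Returns (months, total_interest).
    Rejects payments that do not cover the first month's interest, since
    the balance would then grow without bound."""
    balance = loan_amount
    monthly_rate = annual_rate / 100 / 12
    if monthly_payment <= balance * monthly_rate:
        raise ValueError("payment never amortizes the loan")
    months, total_interest = 0, 0.0
    while balance > 0 and months < max_months:
        interest = balance * monthly_rate
        total_interest += interest
        balance = balance + interest - monthly_payment
        months += 1
    return months, round(total_interest, 2)
```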
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full disclosure burden. It fails to describe what the tool returns (amortization table? total interest? payoff date?), whether the calculation assumes fixed rates, or if there are any limitations on loan terms. The mention of 'schedule' implies recurring calculation behavior but lacks specifics needed for an agent to predict output utility.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single five-word sentence with no waste, but it is under-specified rather than efficiently informative. In the context of 200+ sibling calculation tools, this brevity fails to provide the discriminative detail an agent needs to select this tool correctly among similar options.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of annotations, missing output schema, and crowded namespace with semantically similar tools (8+ loan/mortgage calculators), the description is insufficiently complete. It should clarify the output format, country/currency applicability, and specific student loan features (e.g., grace periods) to be viable for correct agent selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 3 parameters have descriptions), establishing a baseline of 3. The description adds no additional parameter context (e.g., whether monthly_payment must cover at least interest, or typical rate ranges), but the schema adequately covers the semantics without requiring description supplementation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the verb (Calculate) and resource (student loan repayment schedule), but fails to distinguish from siblings like `calculate_loan_payment` (generic) or `calculate_us_student_loan`/`calculate_uk_student_loan` (country-specific). It does not clarify if this is country-agnostic, what constitutes a 'schedule' (amortization table vs summary), or why an agent should select this over the 5+ related loan calculators.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance provided. The description does not indicate when to use this tool versus alternative loan calculators (e.g., `calculate_loan_payment` for simple payments vs this for schedules), nor does it mention prerequisites like specific currency requirements (EUR implied by schema but not stated in description).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_study_schedule (grade C)
Generate a study schedule based on exam date and topics
| Name | Required | Description | Default |
|---|---|---|---|
| exam_date | Yes | Exam date YYYY-MM-DD | |
| topics_count | Yes | Number of topics to study | |
| hours_per_topic | Yes | Hours needed per topic |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It states the action (generate) but omits critical details: whether the schedule is persisted, what format it returns (string vs structured data), whether it accounts for weekends, or any calculation constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single ten-word sentence is efficient and front-loaded, with no redundancy. However, extreme brevity leaves gaps in behavioral and output documentation that additional structured sentences could have addressed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite 100% schema coverage for inputs, the tool lacks an output schema and annotations. The description does not compensate by describing the return value (e.g., daily breakdown, total study hours, calendar format), leaving the agent uninformed about tool outputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. The description mentions 'exam date and topics' which loosely maps to exam_date and topics_count, but fails to mention hours_per_topic, a required parameter. The description adds minimal semantic context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Generate' and resource 'study schedule', clearly identifying the tool's function. However, it does not explicitly differentiate from sibling calculation tools like calculate_reading_time or calculate_vacation_days_optimal.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus other scheduling or time-management calculators in the sibling list, nor does it mention prerequisites like needing specific date formats.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sun_exposure (grade C)
Calculate safe sun exposure time based on UV index and Fitzpatrick skin type
| Name | Required | Description | Default |
|---|---|---|---|
| uv_index | Yes | UV index at destination (1–11+) | |
| skin_type | Yes | Fitzpatrick skin type: 1=very fair, 6=very dark |
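A common rule of thumb behind such calculators divides a skin-type burn coefficient by the UV index. The coefficients below are illustrative, not the tool's confirmed values, and none of this is medical advice:

```python
# Illustrative minutes-to-burn coefficients per Fitzpatrick type at UV
# index 1; real dermatological tables vary. Not medical advice.
BURN_COEFF_MIN = {1: 67, 2: 100, 3: 200, 4: 300, 5: 400, 6: 500}

def safe_sun_minutes(uv_index: float, skin_type: int) -> float:
    """Rule of thumb: unprotected time to erythema scales inversely
    with the UV index."""
    return round(BURN_COEFF_MIN[skin_type] / uv_index, 1)
```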
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry full behavioral disclosure. It fails to specify the output format (minutes? hours?), what 'safe' means precisely (erythema threshold?), or include necessary health disclaimers for a medical-adjacent calculation. The agent cannot infer the risk profile or return value structure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of thirteen words with no redundancy. However, given the lack of annotations and output schema, this brevity may be insufficient rather than optimally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a health-related tool with no output schema and no annotations, the description is incomplete. It omits the output unit, the specific dermatological interpretation of 'safe exposure,' and any indication that this estimates time before sunburn (as implied by Fitzpatrick scale usage).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% description coverage with clear parameter semantics (uv_index range, Fitzpatrick scale definitions). The description mentions both parameters by name but adds no additional semantic context beyond what the schema already provides, meeting the baseline for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource (safe sun exposure time) and required inputs (UV index, Fitzpatrick skin type). However, it does not explicitly distinguish from the sibling tool 'calculate_sunscreen_reapply' or specify what 'safe' quantifies (time to burn vs. vitamin D synthesis).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives, prerequisites (e.g., understanding of Fitzpatrick scale), or whether this applies to protected vs. unprotected skin. The description lacks any 'when to use' or 'when not to use' context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sunrise_approx (grade C)
Approximate sunrise/sunset times
| Name | Required | Description | Default |
|---|---|---|---|
| latitude | Yes | Latitude | |
| day_of_year | Yes | Day of year (1-366) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description discloses nothing about approximation accuracy, calculation method, or the significant limitation that only latitude is required (no longitude or timezone), all of which are critical behavioral traits for a sunrise calculation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While not verbose, the three-word description is under-specified rather than concise. Critical information about approximation scope and parameter constraints is missing, making it insufficient for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and an unusual input schema (latitude-only sunrise calculation), the description should explain the approximation methodology and accuracy limits. It provides none of this necessary context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for latitude and day_of_year. The description adds no additional semantic context (such as explaining why longitude is omitted), but baseline 3 is appropriate given complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the basic function (approximating sunrise/sunset) but fails to differentiate from sibling tool 'calculate_sunrise_sunset'. With two tools offering similar functionality, the agent lacks guidance on which to select.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this approximate version versus the presumably more accurate 'calculate_sunrise_sunset', or what limitations trigger the need for this specific tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_sunrise_sunset (grade C)
Approximate sunrise and sunset times based on latitude and day of year
| Name | Required | Description | Default |
|---|---|---|---|
| latitude | Yes | Latitude in degrees | |
| day_of_year | Yes | Day of year (1-365) |
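The latitude-only input is consistent with the standard sunset-hour-angle approximation, cos(omega) = -tan(latitude) * tan(declination): latitude and day of year fix the day length, while longitude and timezone are only needed to pin sunrise to clock time. A sketch of that approximation (the exact declination formula the tool uses is an assumption):

```python
import math

def day_length_hours(latitude: float, day_of_year: int) -> float:
    """Approximate day length from the sunset hour angle,
    cos(omega) = -tan(phi) * tan(delta), with solar declination
    delta ~= -23.44 deg * cos(360/365 * (N + 10))."""
    decl = math.radians(-23.44) * math.cos(math.radians(360 / 365 * (day_of_year + 10)))
    cos_omega = -math.tan(math.radians(latitude)) * math.tan(decl)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # clamp for polar day/night
    omega = math.degrees(math.acos(cos_omega))  # sunset hour angle, degrees
    return 2 * omega / 15                       # 15 degrees of hour angle per hour

def sunrise_sunset(latitude: float, day_of_year: int) -> tuple[float, float]:
    """Solar-time sunrise and sunset, in hours from local solar midnight."""
    half = day_length_hours(latitude, day_of_year) / 2
    return 12 - half, 12 + half
```

The clamp line also answers the polar edge case the review raises: above the polar circles the formula degrades gracefully to 0 or 24 hours instead of failing.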
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
While the description discloses that results are 'Approximate,' it lacks critical behavioral details given no annotations exist. It omits timezone handling (UTC vs local), output format (ISO strings, minutes, hours?), and polar day/night edge cases despite accepting latitudes from -90 to 90 where sun may not rise/set for months.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is efficiently front-loaded with no wasted words. However, extreme brevity leaves no room for essential context like timezone assumptions or polar edge cases that would justify a second sentence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter astronomical calculation with 100% schema coverage but no output schema or annotations, the description covers the basic function but fails to address domain-specific complexities. It should specify how polar latitudes are handled and what format times are returned in.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already documents both parameters ('Latitude in degrees', 'Day of year (1-365)'). The description merely repeats these parameter names without adding semantic context (e.g., that day_of_year excludes leap days or that latitude implies spherical model).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific action ('Approximate') and resource ('sunrise and sunset times') with clear input dependencies ('latitude and day of year'). It implicitly distinguishes from sibling 'calculate_sunrise_approx' by mentioning both sunrise and sunset, though it could clarify why it uses 'Approximate' when a sibling has 'approx' in its name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus 'calculate_sunrise_approx' or other astronomical calculators. There are no prerequisites, exclusions, or alternative workflow suggestions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
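The polar edge case flagged above is easy to make concrete. A minimal sketch using the standard sunrise equation (hypothetical; the server's actual algorithm and output format are undocumented):

```python
import math

def approx_sun_times(latitude, day_of_year):
    """Approximate sunrise/sunset in local solar hours.

    Illustrative only, not the server's implementation. Returns
    ("polar_day",) or ("polar_night",) when the sun never sets or
    never rises, the exact edge case the description leaves unaddressed.
    """
    # Solar declination from a simple cosine model (degrees)
    decl = -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))
    # Hour-angle term of the sunrise equation
    cos_h = -math.tan(math.radians(latitude)) * math.tan(math.radians(decl))
    if cos_h < -1.0:
        return ("polar_day",)
    if cos_h > 1.0:
        return ("polar_night",)
    half_day = math.degrees(math.acos(cos_h)) / 15.0  # hours from solar noon
    return (12.0 - half_day, 12.0 + half_day)
```

At latitude 80 in late June this returns `("polar_day",)`, a result type an agent cannot anticipate from the current description.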
calculate_sunscreen_reapply (Grade: C)
Calculate sun protection duration and reapplication time
| Name | Required | Description | Default |
|---|---|---|---|
| spf | Yes | SPF factor | |
| uv_index | Yes | Current UV index | |
| skin_type | Yes | Fitzpatrick skin type 1-6 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It fails to disclose whether this is a pure calculation, what the return value represents (duration in hours?), or any caveats about sun exposure estimates. No mention of side effects or idempotency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 7 words with no redundancy. However, extreme brevity leaves gaps in contextual information that would help an agent understand the tool's specific utility.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a health-related calculation tool with 3 parameters and no output schema, the description is incomplete. It lacks medical disclaimers (estimates only), expected return value description, and guidance on interpreting results for different skin types.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (SPF factor, Fitzpatrick skin type, UV index), so baseline is 3. Description implies these inputs affect the calculation but doesn't add semantic meaning beyond the schema (e.g., doesn't explain how skin type affects duration).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resource (sun protection duration/reapplication time), distinguishing it from sibling calculation tools. However, it lacks specificity about output format (minutes, hours, timestamp?) and scope constraints.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus siblings like 'calculate_sun_exposure' or general timing tools. No mention of prerequisites (e.g., needing current UV index data) or medical disclaimers.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
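For context, a plausible model of what such a tool computes (every constant below is a placeholder invented for illustration, not the server's formula):

```python
# Placeholder baseline burn times per Fitzpatrick type at UV index 1,
# in minutes. These numbers are invented for illustration.
BASE_MINUTES = {1: 67, 2: 100, 3: 200, 4: 300, 5: 400, 6: 500}

def sunscreen_minutes(spf, uv_index, skin_type):
    """Protection window: baseline time scaled by SPF, divided by the
    current UV index, capped at the common 2-hour reapplication advice."""
    raw = BASE_MINUTES[skin_type] * spf / max(uv_index, 0.1)
    return min(raw, 120.0)
```

Even a hypothetical sketch like this shows why the missing medical disclaimer matters: the output is an estimate, not dermatological advice.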
calculate_surface_carrez (Grade: C)
Calculate Carrez law surface area (French legal measurement)
| Name | Required | Description | Default |
|---|---|---|---|
| rooms | Yes | List of rooms with area and ceiling height |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It fails to disclose the critical behavioral rule of Carrez law calculations: that only spaces with ceiling height ≥ 1.80m are fully counted (and how lower spaces are treated), which is the defining characteristic of this legal measurement.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise with no redundancy. However, the extreme brevity comes at the cost of omitting essential behavioral context, slightly reducing the score from a perfect 5.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a specialized legal calculation tool with no output schema and no annotations, the description is incomplete. It fails to explain the calculation methodology (height thresholds) or what the return value represents, leaving significant gaps for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description does not mention the 'rooms' parameter or explain why both area and ceiling height are required inputs for this specific calculation, relying entirely on the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Uses specific verb 'Calculate' and identifies the specific resource 'Carrez law surface area'. Distinguishes from generic siblings like 'calculate_area' by referencing 'French legal measurement', though it does not explain what distinguishes Carrez law from standard measurements.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus siblings like 'calculate_area' or 'calculate_floor_area', nor does it mention prerequisites such as needing room dimensions with ceiling heights.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
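The 1.80 m rule the review calls the defining characteristic can be stated in a few lines. A sketch of the statutory threshold, not the server's code (real Carrez measurements also exclude balconies, cellars, and parking, which this ignores):

```python
def carrez_surface(rooms):
    """Sum of room areas (in m^2) whose ceiling height is >= 1.80 m,
    the core Carrez-law threshold; rooms below it count for zero.
    Assumes each room dict has 'area' and 'ceiling_height' keys,
    matching the schema's description of the 'rooms' list."""
    return sum(r["area"] for r in rooms if r["ceiling_height"] >= 1.80)
```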
calculate_swimming_pace (Grade: B)
Calculate swimming pace per 100m and SWOLF efficiency estimate
| Name | Required | Description | Default |
|---|---|---|---|
| distance_m | Yes | Distance swum in meters | |
| time_minutes | Yes | Total swim time in minutes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the two outputs calculated (pace per 100m and SWOLF efficiency) which adds useful context. However, it fails to explain what SWOLF represents (strokes + seconds), expected precision, or whether outputs are rounded.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero redundancy. Every word earns its place; appropriate length for a straightforward calculation utility.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 primitive parameters, functional calculation), the description adequately covers the scope by naming both outputs (pace and SWOLF). Absence of output schema is partially mitigated by describing the return concepts, though an explanation of SWOLF would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions ('Distance swum in meters', 'Total swim time in minutes'). Description adds minimal semantic value beyond schema but implies the relationship between inputs and pace calculation. Baseline 3 appropriate given complete schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb (Calculate) and resources (swimming pace per 100m, SWOLF efficiency) identified. However, no explicit differentiation from sibling `calculate_running_pace` or other fitness calculators in the description text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives (e.g., `calculate_running_pace` for running), no prerequisites mentioned, and no input validation hints beyond the schema.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
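The pace half of the calculation is unambiguous; SWOLF (strokes plus seconds per length) normally requires a stroke count the schema does not accept, which is presumably why the description hedges with 'estimate'. A sketch of the pace arithmetic only:

```python
def swimming_pace(distance_m, time_minutes):
    """Pace per 100 m, returned as a (minutes, seconds) tuple."""
    pace_min = time_minutes * 100.0 / distance_m
    minutes = int(pace_min)
    seconds = round((pace_min - minutes) * 60)
    return minutes, seconds
```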
calculate_swiss_income_tax (Grade: A)
Calculate Swiss income tax — federal + estimated cantonal tax
| Name | Required | Description | Default |
|---|---|---|---|
| canton | No | Canton of residence | geneve |
| income | Yes | Annual taxable income in CHF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully indicates that cantonal tax is 'estimated' (crucial accuracy context), but omits other behavioral details like return format, whether values are cached, or that the calculation is read-only/destructive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is optimally concise—every word serves a purpose. It front-loads the action ('Calculate'), specifies the domain ('Swiss income tax'), and adds the critical qualifier ('federal + estimated cantonal') with zero redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 simple parameters, no nested objects) and lack of output schema, the description adequately covers the calculation's purpose and estimated nature. It would benefit from a brief note on output format, but remains sufficiently complete for tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (both parameters have descriptions), establishing a baseline of 3. The description adds semantic value by linking the 'canton' parameter to the 'cantonal tax' output, explaining why the canton selection matters for the calculation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') with a clear resource ('Swiss income tax'). The addition of 'federal + estimated cantonal tax' precisely distinguishes this from sibling tools like calculate_swiss_wealth_tax (cantonal only) and calculate_swiss_vat, clarifying the scope of the calculation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implicit usage context through the specificity of 'federal + estimated cantonal,' indicating when this tax calculation is appropriate. However, it lacks explicit when-to-use guidance or distinctions from related siblings like calculate_swiss_salary or calculate_swiss_wealth_tax.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
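The 'federal + estimated cantonal' split the description advertises implies a structure like the following. The bracket table and canton multipliers below are placeholders, not the real Swiss schedules:

```python
# PLACEHOLDER progressive brackets: (lower bound, upper bound, marginal rate).
# Not the actual Swiss federal schedule.
FEDERAL_BRACKETS = [(0, 30_000, 0.00), (30_000, 80_000, 0.04),
                    (80_000, 150_000, 0.08), (150_000, float("inf"), 0.115)]
CANTON_FACTOR = {"geneve": 0.16, "zurich": 0.13}  # illustrative only

def swiss_income_tax(income, canton="geneve"):
    # Marginal tax on the slice of income falling inside each bracket
    federal = sum((min(income, hi) - lo) * rate
                  for lo, hi, rate in FEDERAL_BRACKETS if income > lo)
    # Flat per-canton multiplier stands in for the "estimated" cantonal part
    cantonal = income * CANTON_FACTOR.get(canton, 0.15)
    return {"federal": round(federal, 2),
            "cantonal_estimated": round(cantonal, 2),
            "total": round(federal + cantonal, 2)}
```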
calculate_swiss_lpp (Grade: B)
Calculate Swiss occupational pension (LPP / 2e pilier) contributions by age bracket
| Name | Required | Description | Default |
|---|---|---|---|
| age | Yes | Age of employee | |
| gross_annual | Yes | Annual gross salary in CHF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses calculation logic ('by age bracket'), implying age-dependent contribution rates. However, it fails to describe output format (e.g., employee vs employer shares), side effects, or specific age brackets used.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with action verb, zero redundancy. Appropriately brief for a two-parameter calculator, though given the lack of annotations and output schema, one additional sentence describing return values would have improved completeness without harming conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for identifying the tool's domain (Swiss LPP) and calculation method. However, with no output schema and no annotations, the description should ideally specify what the calculation returns (contribution amounts, rates) to be complete. The 100% schema coverage mitigates this slightly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline 3. Description mentions 'by age bracket', adding semantic context to the 'age' parameter (that it determines rate brackets), but does not significantly expand on 'gross_annual' beyond the schema's 'Annual gross salary in CHF'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Calculate' and specific resource 'Swiss occupational pension (LPP / 2e pilier)'. Uses domain-specific terminology (2e pilier) that identifies the exact pension system. Does not explicitly differentiate from sibling tools like calculate_swiss_salary or calculate_swiss_pillar3a, preventing a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use versus other Swiss calculation tools (e.g., calculate_swiss_salary) or prerequisites (e.g., minimum coordination salary). Lacks explicit 'when-not-to-use' or alternative suggestions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
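The 'by age bracket' phrase maps to the statutory LPP savings-credit rates (7/10/15/18 %). A sketch using those rates; the coordination deduction and salary cap are the 2024 figures and change yearly, so treat them as illustrative:

```python
COORDINATION_DEDUCTION = 25_725.0   # 2024 value, adjusted annually
MAX_COORDINATED = 62_475.0          # 88,200 upper limit minus the deduction
RATES = [(25, 34, 0.07), (35, 44, 0.10), (45, 54, 0.15), (55, 65, 0.18)]

def lpp_contribution(age, gross_annual):
    """Annual LPP savings contribution on the coordinated salary.
    The employee/employer split is plan-specific and not modeled here."""
    coordinated = min(max(gross_annual - COORDINATION_DEDUCTION, 0.0),
                      MAX_COORDINATED)
    for lo, hi, rate in RATES:
        if lo <= age <= hi:
            return coordinated * rate
    return 0.0  # under 25 (risk cover only) or past retirement age
```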
calculate_swiss_pillar3a (Grade: C)
Calculate Swiss pillar 3a tax savings (3e pilier lié)
| Name | Required | Description | Default |
|---|---|---|---|
| marginal_tax_rate | Yes | Marginal income tax rate in % (federal + cantonal combined) | |
| annual_contribution | Yes | Annual contribution to pillar 3a in CHF (max 7056 for employees, 35280 for self-employed) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but only states the high-level function. It fails to disclose whether this performs a pure mathematical calculation, if it uses current-year tax rules, validation behavior, or what format the result takes.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded with the verb 'Calculate' and contains no redundant words. However, extreme brevity leaves gaps in behavioral and usage context that could have been addressed in a second sentence.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple two-parameter calculation tool with rich schema coverage, but lacking expected details given no output schema or annotations exist. It does not mention return value type, calculation methodology, or Swiss-specific tax year considerations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds no information about parameters, but the input schema has 100% coverage with detailed descriptions for both 'annual_contribution' (including CHF limits) and 'marginal_tax_rate' (including federal/cantonal scope), meeting the baseline expectation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the action ('Calculate') and specific domain ('Swiss pillar 3a tax savings'), including the French term '(3e pilier lié)' for disambiguation. However, it does not explicitly differentiate from siblings like 'calculate_swiss_income_tax' or clarify that this is specifically for voluntary pension contributions vs. mandatory deductions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., 'calculate_swiss_income_tax' for overall tax calculations), nor does it mention prerequisites or specific use cases such as annual tax filing planning.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
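The likely arithmetic is a one-liner, which is why the review focuses on missing context rather than missing mechanics. A sketch; the server may additionally enforce the CHF caps stated in the schema:

```python
def pillar3a_tax_saving(annual_contribution, marginal_tax_rate):
    """First-order estimate: the deducted contribution times the
    combined federal + cantonal marginal rate (given in percent)."""
    return annual_contribution * marginal_tax_rate / 100.0
```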
calculate_swiss_salary (Grade: B)
Convert Swiss gross monthly salary to estimated net salary
| Name | Required | Description | Default |
|---|---|---|---|
| gross_monthly | Yes | Gross monthly salary in CHF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses 'estimated' net salary, indicating approximation/uncertainty, which is valuable. However, it lacks details on calculation methodology, cantonal variations, or what specific deductions (AHV, IV, EO, unemployment insurance) are factored into the estimate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, front-loaded with the action verb. No redundancy or generic filler. Every word earns its place by conveying the transformation (gross→net), domain (Swiss), and nature (estimated).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter conversion tool with no output schema, the description is minimally adequate. However, given the complexity of Swiss salary structures (cantonal tax variations, pillar deductions), the description should clarify whether output is monthly or annual, and what the 'estimate' encompasses. Sibling tools exist for detailed breakdowns, but this isn't referenced.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter description ('Gross monthly salary in CHF'). The tool description implies the input ('gross monthly salary') but does not add syntax details, constraints, or examples beyond what the schema already provides. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear action ('Convert') and specific resource ('Swiss gross monthly salary' to 'estimated net salary'). Effectively distinguishes from siblings like calculate_swiss_income_tax (tax-only) and calculate_belgian_salary (different jurisdiction) by scope. Lacks specificity on which deductions are included (tax, social security, etc.) which would make it a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus the more specific Swiss calculators (calculate_swiss_income_tax, calculate_swiss_lpp, calculate_swiss_pillar3a) or other salary tools. No mention of prerequisites or limitations for the estimation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
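As a rough sketch of what the 'estimate' might cover: the flat employee-side social deductions (AHV/IV/EO 5.3 %, ALV 1.1 %, 2024 rates) are predictable, while tax, LPP, and accident insurance vary by canton, age, and plan. This sketch handles only the flat part:

```python
AHV_IV_EO = 0.053   # employee share, 2024
ALV = 0.011         # unemployment insurance, employee share

def swiss_net_monthly(gross_monthly):
    """Gross minus flat social deductions only; withholding tax and
    LPP are omitted, so this overstates real net pay."""
    return round(gross_monthly * (1.0 - AHV_IV_EO - ALV), 2)
```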
calculate_swiss_vat (Grade: C)
Calculate Swiss VAT — convert between HT and TTC
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Input mode: ht=before tax, ttc=after tax | ht |
| rate | No | VAT rate: 2.6% (reduced), 3.8% (hotel), 8.1% (standard) | 8.1 |
| amount | Yes | Amount in CHF |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose output format, whether results include both HT and TTC values, or any calculation methodology details. 'Convert between HT and TTC' implies bidirectional capability but lacks behavioral specifics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at eight words, front-loaded with purpose. No redundant text, though the minimalism limits value-add. The dash-separated subordinate clause efficiently conveys the dual functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple 3-parameter calculation tool with excellent schema documentation. Lacks an explanation of the HT/TTC terminology for non-French speakers, but the schema compensates. No output schema is present, and the description doesn't address the return structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with detailed descriptions for mode ('ht=before tax, ttc=after tax'), rate (with percentages), and amount (CHF). Description mentions 'HT and TTC' which aligns with mode parameter but doesn't add syntax details beyond schema. Baseline 3 appropriate given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific action (Calculate) and resource (Swiss VAT), with clear scope (convert between HT/TTC). Distinguishes from other country-specific VAT siblings by naming 'Swiss', though doesn't differentiate from calculate_vat_generic.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus calculate_vat_generic, calculate_vat_reverse, or other country-specific VAT calculators. No prerequisites or exclusion criteria mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
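The bidirectional HT/TTC arithmetic itself is standard; what is undocumented is the return shape. A sketch in which the output field names are guesses, not the server's actual structure:

```python
def swiss_vat(amount, rate=8.1, mode="ht"):
    """Convert between net (HT) and gross (TTC) at a Swiss VAT rate."""
    if mode == "ht":                      # amount is net (before tax)
        vat = round(amount * rate / 100.0, 2)
        return {"ht": amount, "vat": vat, "ttc": round(amount + vat, 2)}
    # amount is gross (TTC): back out the VAT share
    vat = round(amount * rate / (100.0 + rate), 2)
    return {"ht": round(amount - vat, 2), "vat": vat, "ttc": amount}
```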
calculate_swiss_wealth_tax (Grade: C)
Calculate Swiss wealth tax (impôt sur la fortune) by canton
| Name | Required | Description | Default |
|---|---|---|---|
| canton | No | Canton of residence | geneve |
| net_wealth | Yes | Net wealth in CHF (assets minus debts) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Fails to disclose whether this is an estimate, which tax year rates apply, if results include cantonal and municipal taxes, or any disclaimer about tax advice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise and front-loaded with the verb 'Calculate'. No redundant information, though brevity comes at the cost of completeness for a tax calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks an output schema and provides no indication of the return value structure (tax amount, effective rate, breakdown). Missing crucial tax context such as the applicable year, deduction rules, or a note that Switzerland has no federal wealth tax (it is levied only at the cantonal and municipal level).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage (net_wealth defined as 'assets minus debts', canton as 'Canton of residence'). Description mentions 'by canton' reinforcing the enum parameter, but adds no additional semantic detail beyond schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resource (Swiss wealth tax), with French term 'impôt sur la fortune' clarifying the specific tax type. However, lacks explicit distinction from sibling tools like calculate_swiss_income_tax.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus siblings (e.g., calculate_swiss_income_tax, calculate_swiss_salary) or prerequisites like requiring specific canton residency.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_tdee (Grade: B)
Calculate Total Daily Energy Expenditure from BMR and activity level
| Name | Required | Description | Default |
|---|---|---|---|
| bmr | Yes | Basal Metabolic Rate in kcal | |
| activity_level | Yes | Activity level |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full responsibility for behavioral disclosure. It states only the calculation purpose without mentioning side effects (none expected), return value format (the calculated TDEE number), or whether the operation is idempotent. Given the absence of annotation safety hints, this leaves gaps for the agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It front-loads the action verb and immediately specifies the resource and required inputs, making it appropriately sized for a simple two-parameter calculation tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple calculation tool with two parameters and 100% schema coverage, the description is minimally adequate. However, gaps remain: no output schema exists and the description does not indicate what value is returned (e.g., 'returns daily calorie expenditure as a number'), nor does it explain the relationship between activity_level enum values and multipliers.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (bmr: 'Basal Metabolic Rate in kcal', activity_level: 'Activity level'), so the baseline is 3. The description mentions 'from BMR and activity level' which confirms the parameter relationship but does not add syntax details, validation constraints, or format information beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool calculates Total Daily Energy Expenditure (expanding the acronym TDEE) and identifies the specific inputs required (BMR and activity level). However, it does not explicitly distinguish from the sibling calculate_bmr tool (which produces the BMR input this tool requires), though mentioning BMR as an input provides implicit differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like calculate_bmr or calculate_calories_burned. There is no mention that calculate_bmr should be called first to obtain the required bmr parameter, nor when TDEE calculation is appropriate versus other metabolic calculations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
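Since the description names neither the formula nor the return value, a minimal sketch of what this tool likely computes, assuming the standard TDEE activity multipliers (the server's actual enum values and factors are undocumented):

```python
# Hypothetical activity multipliers; the server's enum and factors are assumptions.
ACTIVITY_MULTIPLIERS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
    "very_active": 1.9,
}

def calculate_tdee(bmr: float, activity_level: str) -> float:
    """Return estimated daily calorie expenditure in kcal: BMR x multiplier."""
    try:
        multiplier = ACTIVITY_MULTIPLIERS[activity_level]
    except KeyError:
        raise ValueError(f"Unknown activity level: {activity_level!r}")
    return round(bmr * multiplier, 1)
```

A description-level note such as "returns daily calorie expenditure in kcal as a number" would close the gap the reviews identify.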
calculate_telescope_magnification (C)
Calculate telescope magnification and useful limit
| Name | Required | Description | Default |
|---|---|---|---|
| eyepiece_mm | Yes | Eyepiece focal length mm | |
| focal_length_mm | Yes | Telescope focal length mm | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It mentions calculating 'useful limit', which hints at behavioral scope beyond basic magnification, but fails to explain what this limit represents (typically maximum theoretical magnification based on aperture) or the calculation method (focal_length / eyepiece_focal_length).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 6 words. Every word earns its place, though 'useful limit' is technical jargon that would benefit from brief elaboration. Front-loaded with action verb. No redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter calculation tool, it identifies the primary output (magnification) and secondary output (useful limit), but leaves the latter unexplained. Missing formula disclosure (focal_length / eyepiece) and definition of 'useful limit'. Adequate but incomplete for astronomy domain specificity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both focal_length_mm and eyepiece_mm are described). The description does not add parameter-specific guidance (e.g., expected ranges, units clarification), but per the rubric the baseline is 3 when schema coverage is high. No value added beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb (Calculate) and resource (telescope magnification), and distinguishes scope by mentioning 'useful limit' which hints at additional computational output beyond simple magnification. However, it does not explicitly differentiate from other optical calculation siblings like calculate_depth_of_field or calculate_crop_factor.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no explicit when-to-use guidance, alternatives, or prerequisites. While the domain-specific name 'telescope' provides implicit filtering among the many calculate_* siblings, the description itself offers no usage context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
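The magnification formula the reviews reference can be sketched as below. The 'useful limit' output is not reproduced, because the usual rule of thumb (about 2x the aperture in mm) needs the aperture, which is absent from this tool's schema; how the server derives it is unknown.

```python
def telescope_magnification(focal_length_mm: float, eyepiece_mm: float) -> float:
    """Magnification = telescope focal length / eyepiece focal length."""
    if eyepiece_mm <= 0:
        raise ValueError("Eyepiece focal length must be positive")
    return focal_length_mm / eyepiece_mm
```

For example, a 1000 mm telescope with a 10 mm eyepiece yields 100x.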
calculate_tile_grout (C)
Grout quantity for tiling
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | Yes | Area m² | |
| tile_cm | Yes | Tile size cm | |
| depth_mm | No | Joint depth mm | |
| joint_mm | No | Joint width mm | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails entirely. It does not specify the output format (kilograms? liters? bags?), the calculation formula used, accuracy limitations, or whether it accounts for waste/spillage factors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While brief at only 4 words, this represents under-specification rather than efficient conciseness. The description is front-loaded with the topic but omits critical functional context that would help an agent understand what calculation is performed.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a calculation tool with 4 parameters and no output schema or annotations, the description is insufficient. It lacks formula explanation, unit specifications for results, and differentiation from the sibling tile calculator tool, leaving significant gaps in contextual understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage with brief but clear parameter definitions (e.g., 'Joint depth mm'). The description adds no additional parameter guidance, but baseline 3 is appropriate given the schema adequately documents the 4 parameters without requiring description supplementation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Grout quantity for tiling' identifies the domain (tiling) and subject (grout quantity) but lacks a specific action verb (e.g., 'Calculate', 'Estimate'). It functions as a noun phrase rather than a functional description, making the tool's exact operation ambiguous despite hinting at the general purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus close siblings like 'calculate_tile_quantity' or other construction calculators. There is no mention of prerequisites, required unit systems, or scenarios where this calculation would be needed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
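As the reviews note, the output unit and formula are undisclosed. A sketch using the common trade formula for square tiles, (L + W) / (L x W) x joint width x joint depth x grout density, with an assumed density of 1.6 kg per litre of joint volume and assumed defaults (the server's actual method may differ):

```python
GROUT_DENSITY = 1.6  # kg per litre of wet joint volume -- a common rule of thumb

def grout_kg(area_m2: float, tile_cm: float,
             joint_mm: float = 3.0, depth_mm: float = 8.0) -> float:
    """Estimate grout mass in kg for square tiles (dimensions converted to mm)."""
    tile_mm = tile_cm * 10
    kg_per_m2 = ((tile_mm + tile_mm) / (tile_mm * tile_mm)) * joint_mm * depth_mm * GROUT_DENSITY
    return round(kg_per_m2 * area_m2, 2)
```

Stating the output unit (kg) in the description would resolve the "kilograms? liters? bags?" ambiguity flagged above.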
calculate_tile_quantity (B)
Calculate tiles needed with waste margin
| Name | Required | Description | Default |
|---|---|---|---|
| area_m2 | Yes | Area m² | |
| tile_l_cm | Yes | Tile length cm | |
| tile_w_cm | Yes | Tile width cm | |
| waste_pct | No | Waste % | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It mentions 'waste margin' indicating the calculation includes waste adjustment, but lacks details on calculation methodology, unit conversions (cm to m), or what the return value represents (total tiles, boxes, etc.).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise at 6 words with no redundancy. Front-loaded with the action verb. However, given the lack of annotations and output schema, this brevity may be excessive; the single sentence earns its place but leaves significant gaps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete given no output schema and numerous similar siblings. Missing return value description, no differentiation from calculate_carpet_flooring or calculate_wallpaper_rolls, and no behavioral hints despite zero annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with basic param descriptions. The description mentions 'waste margin', connecting to waste_pct, but adds no semantic clarification about unit handling (m² vs cm) or how the waste percentage is applied. A baseline of 3 is appropriate given the schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Calculate' + specific resource 'tiles' and mentions 'waste margin' as key functionality. However, it fails to distinguish from similar flooring/material siblings like calculate_carpet_flooring or calculate_brick_count.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to select this tool versus the many sibling calculation tools (calculate_paint_needed, calculate_carpet_flooring, etc.). No prerequisites or exclusions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
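The likely calculation, including the cm-to-m conversion the reviews say is left implicit, can be sketched as follows; the default waste percentage is an assumption, since the schema does not state one:

```python
import math

def tiles_needed(area_m2: float, tile_l_cm: float, tile_w_cm: float,
                 waste_pct: float = 10.0) -> int:
    """Tiles required to cover an area, with a waste allowance, rounded up."""
    tile_area_m2 = (tile_l_cm / 100) * (tile_w_cm / 100)
    raw = area_m2 / tile_area_m2 * (1 + waste_pct / 100)
    # Round before ceil so float noise on exact multiples doesn't add a tile.
    return math.ceil(round(raw, 9))
```

A return-value note ("whole tiles, waste included") would disambiguate this from calculate_carpet_flooring and similar siblings.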
calculate_time_difference (C)
Calculate time difference between two major cities worldwide
| Name | Required | Description | Default |
|---|---|---|---|
| city1 | Yes | First city | |
| city2 | Yes | Second city | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It fails to disclose: what units/format are returned (hours? string? object?), whether DST is accounted for, that operations are read-only/safe, or that despite 'worldwide' in the description, only 10 enumerated cities are supported. The behavioral contract remains largely undocumented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely compact at 8 words in a single sentence. No redundant phrases or tautologies. However, brevity comes at the cost of omitting critical behavioral and differentiating information that would help distinguish this from 300+ sibling tools. Structure is efficient but under-loaded for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 2-parameter schema with 100% coverage and no output schema, the description provides minimal viable context for invocation but leaves significant gaps. It omits return value structure, fails to clarify the limited city enumeration (10 cities vs implied global coverage), and provides no read-only safety assurance despite being a calculation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters have descriptions), establishing a baseline of 3. The description adds 'major cities worldwide' as semantic context, which aligns with the enum values (Paris, Tokyo, etc.), but does not add syntax details, format requirements, or constraints (e.g., whether city1 and city2 may be identical) beyond what the schema already explicitly defines.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Calculate') and resource ('time difference') with scope ('between two major cities worldwide'), clearly indicating it computes temporal offsets between geographic locations. However, it fails to distinguish from siblings like `calculate_time_zone_difference` and `calculate_timezone_convert`, leaving ambiguity about whether this returns timezone offsets, current time differences, or travel durations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus the numerous sibling timezone tools (`calculate_timezone_convert`, `calculate_time_zone_difference`, etc.). No mention of prerequisites, exclusions, or selection criteria. The agent must guess whether this is for current time comparison, timezone offset lookup, or scheduling purposes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
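A sketch of what this tool likely does, using a hypothetical city-to-UTC-offset table (the server's 10-city enum and its offsets are undocumented, and this sketch ignores DST, which the description also leaves unaddressed):

```python
# Hypothetical standard-time UTC offsets; city names and values are assumptions.
CITY_UTC_OFFSET = {
    "Paris": 1, "London": 0, "New York": -5, "Tokyo": 9, "Sydney": 10,
}

def time_difference_hours(city1: str, city2: str) -> int:
    """Signed hour difference: positive when city2 is ahead of city1."""
    return CITY_UTC_OFFSET[city2] - CITY_UTC_OFFSET[city1]
```

Documenting the sign convention and the supported city list would remove the main ambiguities the reviews flag.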
calculate_time_signature_beats (B)
Calculate total beats and duration for a musical passage in bars
| Name | Required | Description | Default |
|---|---|---|---|
| bpm | Yes | Tempo in beats per minute | |
| bars | Yes | Number of bars | |
| beat_value | No | Note value of one beat (denominator of time signature, e.g. 4 for quarter note) | |
| beats_per_bar | No | Number of beats per bar (numerator of time signature) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden but fails to disclose output format (e.g., duration in seconds), whether the calculation is pure/idempotent, or any precision constraints. It only states what gets calculated, not how the tool behaves.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence with no redundant words. Front-loaded with the verb 'Calculate' and immediately identifies the outputs (beats and duration) and scope (musical passage).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While adequate for a simple calculation tool with well-documented parameters, the lack of output schema means the description should specify the duration unit (seconds, minutes?) and default time signature behavior (common time 4/4), which it omits.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, providing detailed descriptions for all parameters including `beat_value` (denominator) and `beats_per_bar` (numerator). The description adds minimal semantic value beyond the schema, only contextualizing that these relate to a 'musical passage'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific calculation (total beats and duration), the domain (musical), and the input context (bars). However, it does not explicitly differentiate from siblings like `calculate_bpm_to_ms` or `calculate_frequency_note`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this versus other music-related calculators, prerequisites for the inputs, or expected use cases (e.g., composition vs. analysis).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
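A sketch of the likely calculation, assuming the BPM specifies the quarter-note tempo and 4/4 defaults (the description states neither the duration unit nor the default time signature, as the reviews note):

```python
def passage_length(bpm: float, bars: int,
                   beats_per_bar: int = 4, beat_value: int = 4):
    """Total beats and duration in seconds for a passage of `bars` bars.

    Assumes bpm is the quarter-note tempo, so one beat of the signature
    lasts (60 / bpm) * (4 / beat_value) seconds.
    """
    total_beats = bars * beats_per_bar
    seconds_per_beat = (60.0 / bpm) * (4.0 / beat_value)
    return total_beats, round(total_beats * seconds_per_beat, 2)
```

Eight bars of 4/4 at 120 BPM give 32 beats over 16 seconds.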
calculate_timezone_convert (C)
Convert time between UTC offsets
| Name | Required | Description | Default |
|---|---|---|---|
| time | Yes | Time to convert HH:MM | |
| to_offset | Yes | Target UTC offset hours | |
| from_offset | Yes | Source UTC offset hours (e.g. 1 for UTC+1) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but fails to indicate output format, whether date rollover is returned when conversion crosses midnight, or error handling for edge cases. 'Convert' implies transformation but lacks implementation details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is terse and front-loaded, but underspecified given the tool's complexity and lack of output schema. The brevity creates information gaps rather than efficient precision.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacking both annotations and output schema, the description fails to disclose critical execution details like return value structure, date handling for midnight crossings, or validation constraints, leaving agents under-informed for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% description coverage with clear descriptions (e.g., 'HH:MM', 'UTC+1' example). The description adds no syntax details or usage examples beyond the schema, warranting the baseline score for high-coverage schemas.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the core action (convert) and specific resource (UTC offsets). However, it does not explicitly differentiate from siblings like 'convert_time' (likely for named timezones) or 'calculate_time_zone_difference' (likely for delta calculation), leaving potential ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides no guidance on when to use this tool versus the 300+ sibling calculation tools, including similar ones like 'convert_time' or 'calculate_timezone_offset'. No mention of prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
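The midnight-rollover question the reviews raise can be made concrete with a sketch of the likely conversion; whether the server reports the date change is unknown, so this version simply wraps the clock:

```python
def convert_time(time: str, from_offset: float, to_offset: float) -> str:
    """Convert an HH:MM time between UTC offsets, wrapping past midnight.

    Date rollover is silently discarded here; the server's behavior
    for conversions crossing midnight is undocumented.
    """
    hours, minutes = map(int, time.split(":"))
    total = hours * 60 + minutes + int((to_offset - from_offset) * 60)
    total %= 24 * 60
    return f"{total // 60:02d}:{total % 60:02d}"
```

For example, 23:30 at UTC+0 becomes 00:30 at UTC+1, with the implied next-day rollover lost.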
calculate_time_zone_difference (B)
Calculate hour difference between two major cities and current local time
| Name | Required | Description | Default |
|---|---|---|---|
| city1 | Yes | First city | |
| city2 | Yes | Second city | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It adds valuable context that the tool returns 'current local time' (not just static offsets), but critically omits DST handling, whether the difference is signed or absolute, and output format details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single efficient sentence with no filler words. However, the phrase 'and current local time' creates syntactic ambiguity (it could modify 'cities' or be a separate output clause), which slightly reduces clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 2-parameter enum-only input and no output schema, the description adequately covers the main function. However, for a timezone calculation tool, the omission of DST behavior and directional semantics (which city is reference) leaves operational gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('First city', 'Second city'), the baseline is met. The description adds 'major cities' which aligns with the enum constraints, but doesn't clarify if city1/city2 order affects results or add semantic differentiation between the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the core action ('Calculate hour difference') and domain ('between two major cities'), with 'and current local time' indicating additional output. However, it doesn't explicitly distinguish from siblings like calculate_timezone_convert or calculate_time_difference.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided on when to use this tool versus siblings (calculate_timezone_convert, calculate_time_difference) or when to use alternatives for non-enum cities. The 'major cities' constraint is mentioned but not the consequence of using this limited set.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_timezone_offset (B)
Calculate hour difference between two standard time zones
| Name | Required | Description | Default |
|---|---|---|---|
| to_zone | Yes | Target time zone | |
| from_zone | Yes | Source time zone | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure but provides minimal detail. It mentions 'standard time zones' implying potential DST limitations, but does not specify output format (signed integer? string?), behavior when zones are identical, or whether the result is directional (from_zone to to_zone).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence of eight words with no redundancy. Every word earns its place, delivering the core purpose immediately without extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (two enum parameters with 100% schema coverage) and lack of output schema, the description is minimally adequate. However, it could significantly improve by clarifying the output format and differentiating from the similar 'calculate_time_zone_difference' sibling tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters fully documented ('Source time zone', 'Target time zone'). The description adds the qualifier 'standard time zones,' which provides useful context about the constrained enum values, but this is marginal given the schema already defines the specific allowed values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description provides a clear verb ('Calculate') and specific resource ('hour difference between two standard time zones'), accurately conveying the tool's function. However, it fails to distinguish from similar siblings like 'calculate_time_zone_difference' or 'calculate_timezone_convert'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description offers no guidance on when to use this tool versus the numerous sibling calculation tools (e.g., 'calculate_time_zone_difference', 'calculate_timezone_convert'), or when not to use it. No prerequisites or alternatives are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
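The signed-versus-absolute question the reviews raise can be illustrated with a sketch using a hypothetical zone table (the server's enum of 'standard time zones' is undocumented, and DST is ignored, consistent with the 'standard' qualifier):

```python
# Hypothetical standard-time offsets; zone names and values are assumptions.
ZONE_UTC_OFFSET = {"UTC": 0, "CET": 1, "EST": -5, "PST": -8, "JST": 9, "IST": 5.5}

def timezone_offset(from_zone: str, to_zone: str) -> float:
    """Signed hours to add to a from_zone time to obtain the to_zone time."""
    return ZONE_UTC_OFFSET[to_zone] - ZONE_UTC_OFFSET[from_zone]
```

A one-line note on directionality ("positive when to_zone is ahead") would settle the ambiguity.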
calculate_tip (C)
Calculate tip amount and split between people
| Name | Required | Description | Default |
|---|---|---|---|
| bill | Yes | Bill amount | |
| split | No | Number of people splitting | |
| tip_pct | No | Tip percentage | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It mentions the calculations performed but fails to specify return format (object with total, tip amount, per-person shares?), currency handling, or rounding behavior. The missing output schema compounds this gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence front-loaded with action verb. No redundant or wasted text. However, brevity sacrifices completeness given the lack of output schema and sibling tool confusion.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete given no output schema exists. Description should specify return values (e.g., total with tip, individual shares) and clarify relationship to similar tools. With 100% input schema coverage but zero output documentation, the tool's complete behavior remains underspecified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing baseline of 3. The description mentions splitting, which aligns with the 'split' parameter, but adds no semantic detail beyond what schema already provides (e.g., no guidance on tip percentage conventions or bill currency expectations).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool calculates tip amounts and handles splitting between people. Specific verb (Calculate) and resources (tip amount, split) are present. However, it fails to differentiate from siblings 'calculate_tip_split' and 'calculate_tip_worldwide', which could cause selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance provided on when to use this tool versus 'calculate_tip_split' or 'calculate_tip_worldwide'. No prerequisites or conditions mentioned. The description implies usage but doesn't guide selection among alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
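A sketch of the return shape the reviews say is missing; the defaults for tip_pct and split are assumptions, since the schema marks them optional without stating values, and rounding to cents is likewise assumed:

```python
def tip_breakdown(bill: float, tip_pct: float = 15.0, split: int = 1) -> dict:
    """Return tip, total, and per-person share, rounded to cents.

    The 15% and single-payer defaults are assumptions; the server's
    actual defaults and rounding rules are undocumented.
    """
    if split < 1:
        raise ValueError("split must be at least 1")
    tip = round(bill * tip_pct / 100, 2)
    total = round(bill + tip, 2)
    return {"tip": tip, "total": total, "per_person": round(total / split, 2)}
```

Listing these three return fields in the description would also distinguish this tool from calculate_tip_split.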
calculate_tip_split (B)
Calculate tip and per-person amount for a restaurant bill
| Name | Required | Description | Default |
|---|---|---|---|
| tip_pct | Yes | Tip percentage | |
| num_people | Yes | Number of people splitting | |
| bill_amount | Yes | Total bill amount | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the calculation purpose but omits details about output format, rounding behavior, currency handling, or whether it returns individual amounts versus totals.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of nine words with clear front-loading. No redundant or wasted language. Appropriate brevity for a simple calculation utility.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (three primitive parameters, no output schema) and straightforward purpose, the description is minimally sufficient. However, it should disclose what values are returned (e.g., total tip amount and per-person share) given the absence of an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all three parameters (bill_amount, tip_pct, num_people) adequately documented in the schema. The description provides operational context ('per-person amount') but does not add syntax details, constraints, or examples beyond what the schema already provides. Baseline 3 is appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Calculate') and clearly identifies the resource/action (tip and per-person amounts for restaurant bills). It implicitly distinguishes from sibling 'calculate_tip' by mentioning 'per-person amount,' though it could explicitly reference bill-splitting to make the distinction clearer.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus siblings like 'calculate_tip' or 'calculate_tip_worldwide'. No prerequisites or contextual conditions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_tip_worldwide (C)
Tip by country customs
| Name | Required | Description | Default |
|---|---|---|---|
| bill | Yes | Bill | |
| country | Yes | Country | |
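The description leaves the country-to-rate mapping entirely unspecified, so the sketch below is only a guess at the likely mechanism: a lookup table of customary percentages. The rates shown are illustrative examples of common conventions, not the server's actual table, and the fallback rate is invented.

```python
# Illustrative country-customs lookup; rates are examples of common
# conventions, NOT the server's actual data.
CUSTOMARY_TIP_PCT = {
    "US": 18.0,   # tipping broadly expected
    "UK": 12.5,   # often applied as a service charge
    "FR": 0.0,    # service compris; extra tip optional
}

def calculate_tip_worldwide(bill: float, country: str) -> float:
    # Assumed fallback of 10% for countries not in the table.
    pct = CUSTOMARY_TIP_PCT.get(country.upper(), 10.0)
    return round(bill * pct / 100, 2)
```

Whether the real tool returns an amount, a percentage, or a range remains undisclosed, which is exactly the gap the assessment below identifies.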
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of disclosure but discloses no behavioral traits. It does not state what the tool returns (suggested amount? percentage? range?), whether calculations use standard or mandatory rates, or if the bill currency matters.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While physically short (4 words), this is under-specification rather than effective conciseness. The front-loaded ambiguity ('Tip' as noun) and lack of sentence structure hinder comprehension. Every sentence must earn its place; this phrase earns confusion.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter calculation tool with no output schema, the description is inadequate. It omits what calculation is performed (multiplication? lookup?), how the country enum maps to tip rates, and what the return value represents.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with basic descriptions ('Bill', 'Country'), establishing a baseline of 3. The description adds no additional parameter semantics (e.g., bill currency expectations, country code format) beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The telegraphic phrase 'Tip by country customs' is grammatically ambiguous (noun vs. verb) and fails to clearly state that the tool calculates tip amounts. It hints at country-specific logic but does not explicitly distinguish from the generic sibling `calculate_tip`.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus `calculate_tip` or `calculate_tip_split`. The agent cannot determine from the description whether this is for travel scenarios, requires specific currencies, or when the generic calculator is preferred.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
calculate_torus (C)
Torus volume and surface area
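The tool presumably applies the standard torus formulas, V = 2π²Rr² and A = 4π²Rr (R = major radius, r = tube radius); a sketch under that assumption, with an assumed dict output shape:

```python
import math

# Standard torus formulas this tool presumably applies:
#   volume       V = 2 * pi^2 * R * r^2
#   surface area A = 4 * pi^2 * R * r
# (R = major radius, r = minor/tube radius; output keys are assumptions.)
def calculate_torus(major_r: float, minor_r: float) -> dict:
    volume = 2 * math.pi**2 * major_r * minor_r**2
    surface_area = 4 * math.pi**2 * major_r * minor_r
    return {"volume": volume, "surface_area": surface_area}
```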
| Name | Required | Description | Default |
|---|---|---|---|
| major_r | Yes | Major radius (center to tube center) | |
| minor_r |