Server Details

Connect your espresso routine to your AI assistant. This MCP server allows LLMs to act as a personal coffee coach by accessing your shot logs, grinder settings, and active bean inventory. Use it to automatically calculate extraction adjustments, track remaining coffee supply, and lock in winning recipes directly from your chat session.

Status: Healthy
Transport: Streamable HTTP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.4/5 across 29 of 29 tools scored. Lowest: 2.4/5.

Server Coherence: A
Disambiguation: 5/5

Each tool targets a distinct operation on a specific entity (e.g., beans, grinders, shots) or a specific diagnostic function. Even similar-sounding tools like diagnose_shot and diagnose_preview are clearly differentiated by sandbox vs. live.

Naming Consistency: 5/5

Tool names consistently follow a verb_noun pattern (e.g., register_coffee, list_shots, compute_age). Minor exceptions like kb_changelog are still predictable and do not cause confusion.
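
As a rough illustration of the verb_noun convention, the sketch below checks a few of the tool names mentioned above against a simple pattern. The set of accepted verbs is an assumption made for this example; the server does not publish such a list.

```python
import re

# Hypothetical verb list for illustration; not published by the server.
KNOWN_VERBS = {"register", "list", "compute", "delete", "diagnose", "get", "log"}

# One lowercase word, an underscore, then the rest of the name.
NAME_RE = re.compile(r"^([a-z]+)_([a-z_]+)$")


def follows_verb_noun(tool_name: str) -> bool:
    """Return True if the name looks like <verb>_<noun> with a known verb."""
    match = NAME_RE.match(tool_name)
    return bool(match) and match.group(1) in KNOWN_VERBS


names = ["register_coffee", "list_shots", "compute_age", "kb_changelog"]
exceptions = [n for n in names if not follows_verb_noun(n)]
print(exceptions)  # ['kb_changelog']
```

Under these assumptions, kb_changelog is flagged because it leads with a noun, which matches the "minor exception" called out in the review.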

Tool Count: 4/5

With 29 tools, the server covers a broad domain of coffee logging and dialing. While slightly on the heavy side, the tools are well-scoped and each serves a clear purpose without redundancy.

Completeness: 4/5

The tool set covers most lifecycle operations: CRUD for beans, equipment, and shots, plus advanced diagnostics. Missing operations like update_shot or update_grinder are minor gaps that agents can work around.

Available Tools

29 tools
compute_age (grade C)
Read-only

Compute coffee age in days off roast and resting/staling verdict.

Parameters

Name | Required | Description | Default
bean_id | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description doesn't need to repeat that. However, it fails to disclose behavioral traits like output format (e.g., days as integer, verdict as string), error handling for invalid IDs, or reliance on bean roast dates. The description adds minimal value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no extraneous words. Every word contributes meaning, making it highly efficient for a simple tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite being a simple tool with one parameter and no output schema, the description omits crucial details: it doesn't specify the return structure (e.g., is the verdict a string or boolean?), any assumptions about the bean's registration, or how to interpret results. This leaves ambiguity for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, and the description does not mention the 'bean_id' parameter at all. It leaves the agent to infer what bean_id refers to and how to obtain it, which is insufficient for correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Compute coffee age in days off roast and resting/staling verdict' clearly states the tool's function with a specific verb and resource. Its focus on age computation sets it apart from siblings like 'get_stats' or 'list_beans', but it never differentiates itself from them explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, such as needing a valid bean_id from 'list_beans', nor does it exclude scenarios where this tool is inappropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
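
The review of compute_age notes that the return structure is undocumented. The sketch below shows one shape such a result could take. It is not the server's implementation: the thresholds, the verdict labels, and the `AgeResult` type are all assumptions, and the function takes a roast date directly, whereas the real tool takes a bean_id.

```python
from dataclasses import dataclass
from datetime import date

# Assumed thresholds for illustration only; the server's real cutoffs
# are not documented in the tool description.
REST_DAYS = 7
STALE_DAYS = 35


@dataclass
class AgeResult:
    days_off_roast: int
    verdict: str  # "resting", "ready", or "staling" (hypothetical labels)


def compute_age(roast_date: date, today: date) -> AgeResult:
    """Return days since roast plus a coarse resting/staling verdict."""
    days = (today - roast_date).days
    if days < 0:
        raise ValueError("roast date is in the future")
    if days < REST_DAYS:
        verdict = "resting"
    elif days <= STALE_DAYS:
        verdict = "ready"
    else:
        verdict = "staling"
    return AgeResult(days, verdict)


result = compute_age(date(2024, 5, 1), date(2024, 5, 15))
print(result.days_off_roast, result.verdict)  # 14 ready
```

A description that stated the integer day count and the set of possible verdict strings in this way would address the Behavior and Completeness gaps noted above.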

delete_shot (grade A)
Destructive, Idempotent

Delete a logged shot from history. Restores the bean's remaining weight (which is derived from logged doses). Hard delete — there is no undo.

Parameters

Name | Required | Description | Default
shot_id | Yes | ID of the shot to delete | -

Behavior: 4/5

Annotations provide destructiveHint and idempotentHint. The description adds important behavioral details: restores bean weight and states hard delete with no undo, which goes beyond annotations and helps the agent understand consequences.

Conciseness: 5/5

Three sentences, no fluff. Action and key consequences are front-loaded. Every sentence adds value.

Completeness: 5/5

For a simple tool with one parameter and no output schema, the description provides complete context: purpose, side effects (weight restoration), and irreversibility. No gaps.

Parameters: 3/5

Only one parameter (shot_id) with 100% schema coverage. Description does not add meaning beyond the schema's 'ID of the shot to delete'. No further elaboration on how to obtain ID or constraints, so baseline score applies.

Purpose: 5/5

The description clearly states the verb (delete), resource (shot), and adds important context about restoring bean weight and irreversibility. It distinguishes itself from sibling tools like log_shot, list_shots, diagnose_shot.

Usage Guidelines: 2/5

No explicit guidance on when to use this tool vs alternatives. Does not mention when not to use or reference related tools like archiving. The 'no undo' warning implies caution but is insufficient.
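
The delete_shot description says the bean's remaining weight is derived from logged doses, which is why deleting a shot restores it. A minimal sketch of that idea follows, assuming a hypothetical in-memory `Bean` model; the server's actual storage and field names are not documented.

```python
from dataclasses import dataclass, field


@dataclass
class Bean:
    bag_weight_g: float
    # Maps shot_id -> dose in grams (hypothetical structure).
    doses: dict[str, float] = field(default_factory=dict)

    @property
    def remaining_g(self) -> float:
        # Derived on every read rather than stored, so it cannot drift.
        return self.bag_weight_g - sum(self.doses.values())

    def log_shot(self, shot_id: str, dose_g: float) -> None:
        self.doses[shot_id] = dose_g

    def delete_shot(self, shot_id: str) -> None:
        # pop with a default makes a repeated delete a no-op (idempotent).
        self.doses.pop(shot_id, None)


bean = Bean(bag_weight_g=250.0)
bean.log_shot("s1", 18.0)
bean.log_shot("s2", 18.5)
print(bean.remaining_g)  # 213.5
bean.delete_shot("s1")
bean.delete_shot("s1")  # second call changes nothing
print(bean.remaining_g)  # 231.5
```

Because the remaining weight is computed rather than stored, no separate "undo" bookkeeping is needed for the weight itself, although the deleted shot record is still gone for good.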

diagnose_preview (grade A)
Read-only

Sandbox-only diagnosis: evaluates a hypothetical shot against the real rules engine without logging anything. Identical output shape to diagnose_shot.

Parameters

Name | Required | Description | Default
dose_g | Yes | Dose in grams | -
source | Yes | Grinder position source; affects G1 warning | -
time_s | Yes | Extraction time in seconds | -
bean_id | Yes | Coffee bean ID (must belong to this account) | -
yield_g | Yes | Yield in grams | -
grinder_id | Yes | Grinder ID (must belong to this account) | -
machine_id | Yes | Machine ID (must belong to this account) | -
grind_label | Yes | Grinder setting label, e.g. "1.1.3" | -
sensory_tags | No | Sensory observations, e.g. ["sour","bitter"] | -

Behavior: 4/5

Annotations already declare readOnlyHint=true and openWorldHint=false. The description adds valuable context: it 'evaluates a hypothetical shot against the real rules engine without logging anything', which transparently explains the non-destructive, sandbox behavior beyond annotations.

Conciseness: 5/5

The description is extremely concise with only two short sentences, yet it conveys the core purpose, behavior, and relationship to the sibling tool. Every sentence adds value without redundancy.

Completeness: 4/5

The description lacks an explicit output schema but references the output shape of diagnose_shot. For a sandbox tool with 9 parameters fully described in the schema, this is mostly adequate, though it assumes agent knowledge of the sibling's output.

Parameters: 3/5

Schema description coverage is 100%, so the schema itself provides adequate meaning for all 9 parameters. The description does not add any additional semantic information about the parameters beyond what is already in the schema.

Purpose: 5/5

Description clearly states it is a 'sandbox-only diagnosis' that evaluates a hypothetical shot without logging. It specifies 'identical output shape to diagnose_shot', which distinguishes it from the sibling tool 'diagnose_shot' that likely logs results.

Usage Guidelines: 4/5

The description explicitly says 'sandbox-only', implying it is for testing or hypothetical scenarios. By referencing 'identical output shape to diagnose_shot', it contrasts with the production sibling, providing context for when to use this tool instead of diagnose_shot.
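
To make the nine diagnose_preview parameters concrete, the sketch below wraps them in an MCP `tools/call` JSON-RPC request. The envelope follows the standard MCP shape. Every ID, the `source` value, and the ID types (strings here) are placeholders, since the listing does not show the schema's types or enum values.

```python
import json


def build_tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP tools/call JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# All IDs and the "source" value are placeholders, not real account data.
preview_args = {
    "bean_id": "bean-123",
    "grinder_id": "grinder-1",
    "machine_id": "machine-1",
    "grind_label": "1.1.3",
    "source": "manual",
    "dose_g": 18.0,
    "yield_g": 36.0,
    "time_s": 28,
    "sensory_tags": ["sour"],
}

request = build_tool_call(1, "diagnose_preview", preview_args)
print(json.dumps(request, indent=2))
```

Because the tool is sandbox-only, a request like this should leave shot history and bean weight untouched, which makes it a reasonable way to test a hypothetical recipe before logging a real shot.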

diagnose_shot (grade A)
Read-only

Evaluate shot metrics to provide ranked hypotheses (C6) and warn on flip-flops (G2) or fatigue (G6).

Parameters

Name | Required | Description | Default
shot_id | No | Optional shot ID; defaults to last pulled shot. | -

Behavior: 3/5

Annotations declare readOnlyHint true, and the description is consistent. It adds behavioral context by specifying the output types (ranked hypotheses, warnings) but does not disclose any side effects, authentication requirements, or data limits beyond what annotations provide.

Conciseness: 5/5

The description is a single sentence with no filler. It front-loads key information: the action (evaluate), inputs (shot metrics), and outputs. Every word earns its place.

Completeness: 4/5

Given the tool's simplicity (one optional parameter, no output schema), the description covers the main purpose and outputs. However, it lacks guidance on when to use this tool versus siblings, which would make it more complete.

Parameters: 3/5

Schema coverage is 100% with one parameter 'shot_id' fully described. The description adds no additional meaning beyond what the schema already provides, so a baseline score of 3 is appropriate.

Purpose: 5/5

The description clearly states the verb (evaluate), resource (shot metrics), and outputs (ranked hypotheses and warnings). It references specific diagnostic labels (C6, G2, G6) and distinguishes from sibling tools like 'diagnose_preview'.

Usage Guidelines: 3/5

The description implies usage for diagnosing shot metrics but provides no explicit guidance on when to use this tool versus alternatives like 'diagnose_preview' or 'suggest_next_step'. The context is implied but not clarified.

get_dial_state (grade A)
Read-only

Retrieve current dialing state, active context, recent shots, recommendations, and locked recipes.

Parameters

No parameters

Behavior: 3/5

Annotations already indicate readOnlyHint=true and openWorldHint=false. Description adds no additional behavioral context beyond listing the components returned. It does not contradict annotations.

Conciseness: 5/5

Single sentence of 12 words, front-loaded with the verb 'Retrieve'. Every word is informative, no filler.

Completeness: 4/5

Given no output schema and zero parameters, the description covers the purpose and output components adequately. Could optionally note that no arguments are needed, but that is implicit. Still sufficient.

Parameters: 4/5

No parameters; schema coverage is 100%. Description adds context by enumerating the state components, which is helpful for an agent to understand what the tool returns.

Purpose: 5/5

Clear verb 'Retrieve' and specific resource 'current dialing state' plus listed associated data. Distinct from sibling tools like list_shots or list_recipes by combining multiple pieces of information into one call.

Usage Guidelines: 3/5

Description implies use for obtaining a comprehensive snapshot of dial state, but does not explicitly state when to use this over alternatives like list_shots or get_stats. No exclusion criteria provided.

get_kb_version (grade A)
Read-only

Retrieve the current version of the knowledge base.

Parameters

No parameters

Behavior: 3/5

Annotations already indicate readOnlyHint=true, so the description is consistent but adds no additional behavioral context beyond stating the action. No contradiction.

Conciseness: 5/5

Description is a single, clear sentence with no unnecessary words. Perfectly concise.

Completeness: 3/5

While the description covers the basic purpose, it does not explain what format the version takes or any other return value details. Without an output schema, the description could be more complete.

Parameters: 4/5

No parameters exist, so the schema coverage is 100%. Baseline for 0 parameters is 4; description does not need to add parameter detail.

Purpose: 5/5

Description clearly states the tool retrieves the current version of the knowledge base, using a specific verb and resource. It distinguishes from sibling tools as none others target version retrieval.

Usage Guidelines: 2/5

No guidance on when to use this tool versus alternatives. While the tool is simple, the description does not provide any context for its appropriate usage.

get_rule (grade A)
Read-only

Retrieve the detailed text of a specific rule from the knowledge base by its rule ID.

Parameters

Name | Required | Description | Default
rule_id | Yes | The ID of the rule to fetch, e.g. CAT.DARK | -

Behavior: 3/5

Annotations already declare readOnlyHint=true and openWorldHint=false. The description adds that it retrieves 'detailed text', which is consistent but does not disclose any additional behavioral traits beyond what annotations provide.

Conciseness: 5/5

The description is a single sentence with no wasted words, efficiently conveying the tool's purpose.

Completeness: 5/5

Given the tool's simplicity (single parameter, no output schema), the description is complete. It specifies what is retrieved and how.

Parameters: 3/5

Schema coverage is 100% and the parameter description 'The ID of the rule to fetch, e.g. CAT.DARK' already provides clear meaning. The description's mention of 'rule ID' adds no value beyond the schema.

Purpose: 5/5

The description clearly states the verb 'retrieve', the resource 'detailed text of a specific rule', and the method 'by its rule ID'. It distinguishes from sibling tools like get_dial_state or get_kb_version by specifying 'rule'.

Usage Guidelines: 3/5

The description implies usage when you have a rule ID, but does not explicitly state when to use this tool versus alternatives or provide exclusions. It provides no guidance on when not to use it.

get_stats (grade B)
Read-only

Get compact flat usage stats for a grinder, machine, or bean.

Parameters

Name | Required | Description | Default
id | Yes | The entity ID | -
scope | Yes | The stats scope | -

Behavior: 3/5

Annotations already declare readOnlyHint=true, so safety is known. The description adds 'compact flat' but no further behavioral details like output structure or limitations. Minimal addition beyond annotations.

Conciseness: 5/5

Single sentence of 11 words, front-loaded with the core purpose. No superfluous text, highly efficient.

Completeness: 4/5

For a simple two-parameter tool with annotations, the description covers the purpose and scope. Slightly ambiguous about what 'compact flat usage stats' means, but overall adequate given no output schema.

Parameters: 3/5

Schema coverage is 100% with descriptions for both 'id' and 'scope'. The description merely restates the scope but adds no new semantic information beyond the schema, achieving the baseline.

Purpose: 4/5

The description clearly states the verb 'get' and the resource 'compact flat usage stats' for specific entities (grinder, machine, bean). It is specific but could better differentiate from sibling tools like 'list_grinders' that list entities rather than stats.

Usage Guidelines: 2/5

No guidance on when to use this tool versus alternatives such as 'list_*' tools or 'get_dial_state'. The description does not specify prerequisites or situations where this tool is appropriate.

grinder_math (grade C)
Read-only

Determine steps/clicks difference and directions between setting labels (C3).

Parameters

Name | Required | Description | Default
to_label | Yes | - | -
from_label | Yes | - | -
grinder_id | Yes | - | -

Behavior: 3/5

Annotations already declare readOnlyHint=true. The description adds that it computes differences and directions, which is consistent. However, it does not disclose any additional behavioral traits beyond what annotations provide.

Conciseness: 4/5

The description is a single sentence with no fluff. It is as concise as possible, though it sacrifices detail for brevity.

Completeness: 2/5

Given no output schema, 3 required parameters, and no enums, the description should provide more context. It does not explain what the output looks like, what 'C3' means, or how to use the result. The tool is not fully documented for an autonomous agent.

Parameters: 2/5

With 0% schema description coverage, the description must compensate but only vaguely hints at 'from_label' and 'to_label'. It does not explain parameter types, valid values, or the role of 'grinder_id'. This is insufficient for an agent to invoke correctly.

Purpose: 4/5

The description clearly states the tool determines steps/clicks difference and directions between setting labels, which is a specific verb and resource. The 'C3' reference is ambiguous but does not undermine overall clarity. It distinguishes from sibling tools as the only 'math' tool.

Usage Guidelines: 2/5

No explicit guidance on when to use this tool versus alternatives. Among many sibling tools, there is no mention of use cases or exclusions. The description does not help an agent decide when this tool is appropriate.
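
Since grinder_math documents neither its parameters nor its output, the sketch below shows one plausible reading of "steps/clicks difference and directions between setting labels". It rests on several assumptions: that a dotted label such as "1.1.3" is a mixed-radix position (rotation, number, click), that the radices (10 numbers per rotation, 4 clicks per number) stand in for whatever a given grinder_id would define, and that higher labels mean a coarser grind. None of this is confirmed by the listing.

```python
def label_to_steps(label: str, radices: tuple[int, ...]) -> int:
    """Convert a dotted label such as "1.1.3" to an absolute step count.

    radices gives the size of each position after the first, e.g. (10, 4)
    means 10 numbers per rotation and 4 clicks per number (assumed layout).
    """
    parts = [int(p) for p in label.split(".")]
    if len(parts) != len(radices) + 1:
        raise ValueError(f"expected {len(radices) + 1} fields in {label!r}")
    total = parts[0]
    for value, radix in zip(parts[1:], radices):
        if not 0 <= value < radix:
            raise ValueError(f"field {value} out of range in {label!r}")
        total = total * radix + value
    return total


def grinder_math(from_label: str, to_label: str, radices=(10, 4)) -> dict:
    """Return the step difference and direction between two labels."""
    delta = label_to_steps(to_label, radices) - label_to_steps(from_label, radices)
    # Assumption: higher labels mean a coarser grind.
    direction = "coarser" if delta > 0 else "finer" if delta < 0 else "none"
    return {"steps": abs(delta), "direction": direction}


print(grinder_math("1.1.3", "1.2.1"))  # {'steps': 2, 'direction': 'coarser'}
```

Whatever the real scheme is, a description that named the label format and the returned fields at this level of detail would close most of the gaps scored above.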

kb_changelog (grade A)
Read-only

Retrieve the changelog showing updates and version differences for the knowledge base.

Parameters

No parameters

Behavior: 3/5

Annotations already indicate readOnlyHint=true and openWorldHint=false, so safety is clear. The description adds that it shows 'updates and version differences', which is additional behavioral context, but does not elaborate on format, pagination, or limits. Some value added beyond annotations.

Conciseness: 5/5

The description is a single, clear sentence that is front-loaded and contains no extraneous words. Every part is useful.

Completeness: 4/5

For a simple no-parameter retrieval tool with no output schema, the description is mostly complete. It explains what is retrieved (changelog with updates and version differences). However, it could briefly mention the return format (e.g., list of changes) to be fully self-contained, especially given many sibling tools.

Parameters: 5/5

The tool has no parameters, so the schema coverage is 100% by default. The description provides meaning about what the changelog contains ('updates and version differences'), adding value beyond the empty schema.

Purpose: 5/5

The description clearly states the action ('Retrieve') and the resource ('changelog... for the knowledge base'), providing a specific verb and resource. It distinguishes itself from siblings like 'get_kb_version' which likely retrieves the current version, not history.

Usage Guidelines: 3/5

The description implies usage (when you want changelog details) but does not provide explicit guidance on when to use this tool versus alternatives, nor any exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_beans A
Read-only

List coffee beans registered for the account, with status filter.

Parameters
Name | Required | Description | Default
status | No | Filter by status: current, archived, or all. Defaults to current. |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and openWorldHint=false. The description adds the ability to filter by status, which is useful but does not disclose other behavioral aspects like pagination, ordering, or error conditions. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with no wasted words. It is concise and directly conveys the purpose and key feature (status filter).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple listing tool with one optional parameter and no output schema, the description is nearly complete. It does not state that the status filter defaults to 'current', but the schema already documents that default, so no critical information is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage: its single parameter, 'status', lists its enum values and documents its default. The description's mention of a 'status filter' adds minimal meaning beyond what the schema already provides, earning a baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
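
The status filter shared by the list tools can be illustrated with a minimal sketch. It assumes each record carries an `archived` flag (1 or 0), which is how the register tools on this page describe that field; the function name `filter_by_status` and the sample records are hypothetical, and the server's actual implementation is not shown in the source.

```python
def filter_by_status(items: list[dict], status: str = "current") -> list[dict]:
    """Filter records by the documented status values: current, archived, or all.

    Assumption: "current" means unarchived, i.e. a falsy `archived` flag.
    """
    if status == "all":
        return list(items)
    if status == "current":
        return [item for item in items if not item.get("archived")]
    if status == "archived":
        return [item for item in items if item.get("archived")]
    raise ValueError(f"unknown status: {status!r}")


# Hypothetical sample data, for illustration only.
beans = [
    {"name": "A", "archived": 0},
    {"name": "B", "archived": 1},
]

print([b["name"] for b in filter_by_status(beans)])              # default: current
print([b["name"] for b in filter_by_status(beans, "archived")])
print([b["name"] for b in filter_by_status(beans, "all")])
```

Omitting `status` behaves like passing `current`, matching the default stated in the schema.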

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'list' and the resource 'coffee beans' with a specific scope 'registered for the account' and a filter option. It effectively distinguishes from sibling tools like list_shots or list_grinders which target different resources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is used for listing beans with a status filter, but provides no explicit guidance on when to use it versus alternatives (e.g., search or get endpoints). No mention of prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_grinders A
Read-only

List grinders registered for the account, with status filter.

Parameters
Name | Required | Description | Default
status | No | Filter by status: current, archived, or all. Defaults to current. |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=false. The description adds minimal behavioral context beyond the purpose, consistent with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with the action, no redundancy or unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional enum parameter and no output schema, the description covers the necessary purpose and parameter adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the status parameter fully described. The description mentions the filter but adds no new semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists grinders and supports a status filter, distinguishing it from sibling tools like register_grinder or set_equipment_archived.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage through the name and filter but lacks explicit guidance on when to use this tool over alternatives or when to avoid it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_machines A
Read-only

List machines registered for the account, with status filter.

Parameters
Name | Required | Description | Default
status | No | Filter by status: current, archived, or all. Defaults to current. |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true. The description adds that it lists machines with a status filter, but does not disclose pagination, limits, sorting, or any side effects. With annotations covering safety, the description provides minimal additional behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence. It is front-loaded with the primary action and resource. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one optional parameter and no output schema. While the description covers the basic functionality, it lacks details about return format, pagination, or possible error conditions. Given the low complexity, it is somewhat complete but could be improved.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description mentions 'with status filter,' which matches the schema's 'status' parameter, but adds no new semantics beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (List), resource (machines), and scope (registered for the account). It distinguishes from sibling tools like list_beans and list_grinders by naming the resource explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, no when-not-to-use conditions, and no prerequisites mentioned. The description simply states what it does without contextual usage advice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_recipes C
Read-only

List recipes (all or filtered by equipment and status).

Parameters
Name | Required | Description | Default
status | No | Filter by status: current, archived, or all. Defaults to current. |
bean_id | No | |
grinder_id | No | |
machine_id | No | |

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the description adds minimal behavioral context. It does not mention default behavior (e.g., status defaults to 'current'), pagination, or ordering.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with no filler. It communicates the core purpose efficiently, though it could include a bit more detail without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lacks information about return format, pagination, and defaults for equipment filters. Given the absence of an output schema and low schema coverage, the description is insufficient for complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description groups bean_id, grinder_id, and machine_id as 'equipment' but does not differentiate them or explain their use. With 25% schema coverage, the description should provide more detail for the undocumented parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
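
The coverage percentages quoted in these evaluations (25% for list_recipes) appear to be the share of parameters that carry a schema description. The following is a minimal sketch of that calculation under that assumption. The function name `description_coverage` is hypothetical, and the schema dict is reconstructed from the parameter table above rather than taken from the server, so property types are left out.

```python
def description_coverage(schema: dict) -> float:
    """Return the fraction of top-level properties with a non-empty description."""
    props = schema.get("properties", {})
    if not props:
        # Assumption: a tool with no parameters counts as fully covered,
        # as the evaluation of the parameterless changelog tool suggests.
        return 1.0
    described = sum(
        1 for prop in props.values() if prop.get("description", "").strip()
    )
    return described / len(props)


# Reconstructed from the list_recipes parameter table (types unknown, so omitted).
list_recipes_schema = {
    "type": "object",
    "properties": {
        "status": {
            "description": "Filter by status: current, archived, or all. Defaults to current."
        },
        "bean_id": {},
        "grinder_id": {},
        "machine_id": {},
    },
}

print(f"{description_coverage(list_recipes_schema):.0%}")  # prints 25%
```

One described parameter out of four gives the 25% figure cited in the evaluation.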

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the verb 'list' and the resource 'recipes', and indicates filtering by equipment and status. It clearly states the tool's function, though it could be more specific about which equipment parameters (bean_id, grinder_id, machine_id) are involved.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like list_beans or list_shots. Given multiple sibling list tools, explicit context for when to choose list_recipes would be helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_shots A
Read-only

List shot history with support for pagination and filtering.

Parameters
Name | Required | Description | Default
limit | No | Number of records to return (1-500) |
offset | No | Offset for pagination |
filters | No | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint=true) already declare read-only behavior. Description adds minimal context beyond listing, lacking details on sorting, default ordering, or return format. Does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with the main action. No redundant information; every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking output schema, description does not explain return format, sorting behavior, or how multiple filters combine. For a tool with nested filters and pagination, more behavioral context is needed for complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
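
Because the return format and ordering of list_shots are undocumented, a client has to make some assumptions to page through results. The sketch below assumes the tool returns a plain list of records and that a page shorter than `limit` marks the end. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and `fake_call_tool` exists only to make the example runnable.

```python
from typing import Callable, Iterator


def iter_shots(
    call_tool: Callable[[str, dict], list], page_size: int = 100
) -> Iterator[dict]:
    """Yield every shot by walking limit/offset pages.

    Assumptions: the tool returns a list, and a short page means no more data.
    """
    if not 1 <= page_size <= 500:  # documented range for `limit`
        raise ValueError("limit must be between 1 and 500")
    offset = 0
    while True:
        page = call_tool("list_shots", {"limit": page_size, "offset": offset})
        yield from page
        if len(page) < page_size:
            break
        offset += page_size


def fake_call_tool(name: str, arguments: dict) -> list:
    """Stand-in for a real MCP client: serves 250 fabricated records."""
    data = [{"id": i} for i in range(250)]
    start = arguments["offset"]
    return data[start : start + arguments["limit"]]


shots = list(iter_shots(fake_call_tool, page_size=100))
print(len(shots))  # prints 250
```

If the total is an exact multiple of the page size, this loop makes one extra call that returns an empty page before stopping.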

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is high (67%), with descriptions for limit, offset, and filter sub-fields. Description merely repeats 'pagination and filtering' without adding new meaning or clarifying parameter relationships.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'List shot history', specifying the verb and resource. Among sibling tools, it distinguishes itself from list tools for other resources (list_beans, list_grinders) and from shot-specific create/delete tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage for listing shots with pagination/filtering but provides no explicit guidance on when to use vs. alternatives (e.g., other list tools) or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_waters B
Read-only

List registered waters for the account.

Parameters
Name | Required | Description | Default
status | No | Filter by status: current (unarchived), archived, or all. Defaults to current. |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, so the description doesn't need to restate that. It adds no behavioral details beyond listing waters, but does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence that efficiently conveys the purpose. It is front-loaded and avoids unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with one optional parameter and read-only annotations, the description is minimally adequate. However, it lacks context about what 'waters' are, the account scope, and return information, especially given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema fully describes the single parameter with enum and default. The description adds no additional parameter information, so baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (list) and the resource (registered waters) for the account. It effectively communicates the tool's purpose, though it does not distinguish from sibling list tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like list_beans or list_shots. The description lacks explicit usage context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lock_recipe B
Idempotent

Lock a successful shot as the reference dialing recipe for this equipment profile.

Parameters
Name | Required | Description | Default
drink_intent | Yes | |
from_shot_id | Yes | |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and destructiveHint=false. The description adds that it 'locks a successful shot' implying a state change, but does not explain what locking entails for future shots or if it overrides previous references. With annotations already covering idempotency and non-destructiveness, the description provides modest additional context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with a clear verb-object structure. No wasted words. Front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple mutation tool with 2 required params and no output schema, the description gives a basic purpose. However, it omits what happens to previous reference recipes, return behavior, or required conditions (e.g., shot must be successful). Adequate but not comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% with no parameter descriptions. The description mentions 'successful shot' hinting at from_shot_id, but does not explain drink_intent (enum straight/milk) at all. The agent must infer its purpose from context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
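
Since neither lock_recipe parameter is described, a concrete call may help. The sketch below builds an MCP `tools/call` JSON-RPC request for the tool. The enum values `straight` and `milk` come from the evaluation above; the helper name `build_lock_recipe_request` and the shot ID `"shot-123"` are hypothetical, and the real type of `from_shot_id` is not given in the source.

```python
import json

# Enum values as reported in the evaluation above.
ALLOWED_DRINK_INTENTS = {"straight", "milk"}


def build_lock_recipe_request(
    from_shot_id, drink_intent: str, request_id: int = 1
) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for lock_recipe."""
    if drink_intent not in ALLOWED_DRINK_INTENTS:
        raise ValueError(f"drink_intent must be one of {sorted(ALLOWED_DRINK_INTENTS)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "lock_recipe",
            "arguments": {
                "from_shot_id": from_shot_id,  # hypothetical ID format
                "drink_intent": drink_intent,
            },
        },
    }


print(json.dumps(build_lock_recipe_request("shot-123", "milk"), indent=2))
```

Validating `drink_intent` on the client side avoids a round trip for a value the schema would reject anyway.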

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'lock' and the resource 'a successful shot as the reference dialing recipe for this equipment profile', distinguishing it from siblings like get_dial_state or suggest_next_step.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description does not mention prerequisites, such as having a successful shot, or when locking is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

log_recommendation C

Log a recommended dialing change to allow contradiction/oscillation checking (G2).

Parameters
Name | Required | Description | Default
lever | Yes | |
direction | Yes | |
rationale | Yes | |
confidence | Yes | |
cited_rules | No | |
based_on_shot | No | |

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description only says 'log', which implies a write operation, but does not disclose any behavioral traits such as whether it modifies state, requires authentication, or has side effects. Annotations are all 'false' and provide no safety hints, so the description carries the full burden but fails to reveal important behaviors.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that is concise and front-loaded with the action and resource. However, it could be more informative without sacrificing brevity, and the cryptic 'G2' detracts from clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 6 parameters, no output schema, and minimal annotations, the description is woefully incomplete. It does not explain what 'contradiction/oscillation checking' is, how the recommendation is formed, or what the agent should expect after logging. The agent lacks context to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 6 parameters (4 required, 3 with enums) but the description provides zero information about them. It does not explain what 'lever', 'direction', 'rationale', 'confidence', etc., mean or how they relate to the recommended dialing change. Schema description coverage is 0%, so the description adds no value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the action ('log') and the resource ('recommended dialing change'), and specifies the purpose ('contradiction/oscillation checking (G2)'). However, the reference to 'G2' is cryptic and not explained, and there is no differentiation from similar tools like 'log_shot' or 'suggest_next_step', though the unique name helps.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not state prerequisites, when not to use it, or what other tools might be more appropriate. The agent must infer usage solely from the tool name and purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

log_shot B

Log an espresso or alternative shot/brew attempt under active context.

Parameters
Name | Required | Description | Default
tds | No | Total Dissolved Solids percentage (e.g. 9.1 or 1.35) |
dose_g | Yes | |
rating | No | |
time_s | Yes | |
verdict | No | |
yield_g | Yes | |
water_id | No | Optional references to waters formulation |
best_brew | No | 1 if marked as best brew, 0 otherwise |
favourite | No | 1 if favourite, 0 otherwise |
beverage_g | No | Yield mass excluding cup/vessel weight in grams |
temp_taste | No | |
vessel_name | No | Cup/vessel name |
bloom_time_s | No | Pre-wetting or bloom duration in seconds |
drink_intent | No | |
flavor_notes | No | |
flow_profile | No | Opaque flow profile data (JSON/TEXT) |
method_tools | No | JSON representation of method tools used (e.g., paper filters, screen) |
observations | No | |
sensory_tags | No | |
finish_action | Yes | |
temperature_c | No | Brew temperature in Celsius |
data_confidence | No | |
vessel_weight_g | No | Cup/vessel weight in grams |
pressure_profile | No | Pressure profile description or values |
first_drip_time_s | No | Time to first drip in seconds |

Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations show readOnlyHint=false and destructiveHint=false, indicating a non-destructive write. The description adds 'under active context' but does not elaborate on side effects, required permissions, or data handling beyond that.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence that is front-loaded and concise. However, it lacks structured sections or additional details that could improve scannability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (25 parameters, nested objects, no output schema), the description is too brief. It does not define 'active context' or explain return values, which are critical for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 52%, yet the description provides no additional parameter context. It does not mention any of the 25 parameters, leaving the agent to rely solely on the schema for meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
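
With 25 parameters and no guidance in the description, a caller can at least check the four fields the parameter table marks as required before invoking log_shot. This is a minimal sketch of that check. The helper name `missing_required` and the sample values are hypothetical, and `finish_action` is deliberately left out of the sample because its allowed values are not documented in the source.

```python
# Fields marked "Yes" in the Required column of the log_shot parameter table.
REQUIRED_LOG_SHOT_FIELDS = ("dose_g", "time_s", "yield_g", "finish_action")


def missing_required(arguments: dict) -> list[str]:
    """Return the required log_shot fields that are absent or None."""
    return [
        field for field in REQUIRED_LOG_SHOT_FIELDS if arguments.get(field) is None
    ]


# Hypothetical shot data, for illustration only.
shot = {"dose_g": 18.0, "time_s": 28, "yield_g": 36.0, "temperature_c": 93}

print(missing_required(shot))  # prints ['finish_action']
```

An empty list means the payload carries every required field; it says nothing about whether the values themselves are valid.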

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('log'), the resource ('espresso or alternative shot/brew attempt'), and the context ('under active context'). This distinguishes it from sibling tools like delete_shot, diagnose_shot, and list_shots.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for logging shots but does not provide explicit guidance on when to use this tool versus alternatives like log_recommendation. No exclusions or when-not-to-use are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_coffee C

Register a new bag of coffee beans.

Parameters
Name | Required | Description | Default
ean | No | |
url | No | |
cost | No | |
name | Yes | |
state | No | |
origin | No | |
rating | No | |
co2e_kg | No | |
origins | No | List of bean origin details |
qr_code | No | |
roaster | No | |
storage | No | |
variety | No | |
archived | No | 1 if archived, 0 otherwise |
bean_mix | No | E.g. blend details |
buy_date | No | Date in YYYY-MM-DD format |
currency | No | |
finished | No | 1 if finished, 0 otherwise |
aromatics | No | |
bag_notes | No | |
favourite | No | 1 if favourite, 0 otherwise |
frozen_at | No | Datetime in YYYY-MM-DD HH:MM:SS format |
roast_date | Yes | Date in YYYY-MM-DD format |
attachments | No | JSON or comma-separated list of attachments |
frozen_note | No | |
opened_date | No | |
roast_level | Yes | |
roast_range | No | |
unfrozen_at | No | Datetime in YYYY-MM-DD HH:MM:SS format |
bag_weight_g | No | Bag weight in grams |
best_by_date | No | Date in YYYY-MM-DD format |
process_type | Yes | |
roast_custom | No | |
decaffeinated | No | 1 if decaf, 0 if regular |
dial_category | No | |
roasting_type | No | |
cupping_points | No | |
frozen_storage | No | |
rest_window_days_max | No | |
rest_window_days_min | No | |

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description does not disclose behavioral traits beyond the annotations (destructiveHint=false, readOnlyHint=false). It fails to explain side effects, required permissions, or what happens post-registration, which is critical for a write operation with 40 parameters.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no fluff, but it is overly terse for a complex tool, sacrificing necessary detail for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (40 parameters, 4 required, no output schema), the description provides insufficient context. It omits return value, required fields beyond schema, and usage patterns, making it incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With only 33% schema description coverage and a single-line description that does not explain any parameters, the agent receives no meaningful guidance on the many fields, severely hampering correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'Register' and the resource 'a new bag of coffee beans', making the tool's purpose unambiguous and distinct from siblings like list_beans and update_bean.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as update_bean, nor are there any prerequisites or context for registration mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_grinderBInspect

Register a manual or electric grinder with its dial settings scheme.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| name | Yes | The name/model of the grinder, e.g. Kinu M47 | |
| notes | No | Additional notes | |
| photo | No | Photo path or URL | |
| archived | No | 1 if archived, 0 if active | |
| burr_type | Yes | | |
| max_value | No | Optional travel max bounds | |
| min_value | No | Optional travel min bounds | |
| motor_type | Yes | | |
| nominal_step | Yes | Smallest adjustment increment (e.g. 1 click or 0.1 collar units) | |
| setting_scheme | Yes | Collar mark style: single clicks or compound (rotation.number.clicks) | |
| components_spec | No | JSON mapping string representing compound adjustment math | |
| microns_per_step | No | Optional mechanical burr travel microns per nominal step | |
| direction_convention | Yes | | |
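
A minimal sketch of a client-side pre-flight check for register_grinder, built only from the Required column above. The helper name `missing_required` is hypothetical, and the example values for burr_type, motor_type, and setting_scheme are illustrative guesses, because the schema shown here does not list the allowed values for those fields.

```python
# Required fields, taken from the "Required" column of the register_grinder table.
REQUIRED_GRINDER_FIELDS = (
    "name",
    "burr_type",
    "motor_type",
    "nominal_step",
    "setting_scheme",
    "direction_convention",
)


def missing_required(arguments: dict) -> list[str]:
    """Return the required register_grinder fields that are absent or None."""
    return [f for f in REQUIRED_GRINDER_FIELDS if arguments.get(f) is None]


# Hypothetical arguments; the enum-like strings are assumptions, not documented values.
example_arguments = {
    "name": "Kinu M47",
    "burr_type": "conical",
    "motor_type": "manual",
    "nominal_step": 1,
    "setting_scheme": "compound",
}

# direction_convention was left out on purpose, so it is reported as missing.
print(missing_required(example_arguments))
```

A check like this only guards against omitted fields. It cannot tell whether the server would accept a given value, since the valid values are not published on this page.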

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations are neutral, but the description fails to disclose side effects, uniqueness constraints, or required permissions typical for a creation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is concise and front-loaded with the action, though it could include key details without becoming bloated.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema, and the description omits the expected return value, error conditions, and which of the 13 parameters are critical, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers most parameters with descriptions (77% coverage), and the description adds little beyond 'with its dial settings scheme'; baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
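
The coverage percentages quoted in these reviews can be reproduced with simple arithmetic. The sketch below assumes coverage means the share of parameters whose schema entry has a non-empty description, rounded to a whole percent. The page does not state the formula, so this is an inference that happens to match the quoted figures. The function name `description_coverage` is hypothetical, and the counts are tallied by hand from the parameter tables on this page.

```python
def description_coverage(described: int, total: int) -> int:
    """Percent of parameters that carry a schema description, rounded to an integer."""
    return round(100 * described / total)


# (parameters with a description, total parameters), counted from the tables on this page.
counts = {
    "register_grinder": (10, 13),
    "register_machine": (7, 11),
    "register_program": (5, 7),
    "register_water": (9, 9),
    "set_grinder_position": (2, 3),
}

for tool, (described, total) in counts.items():
    print(f"{tool}: {description_coverage(described, total)}%")
```

For register_grinder, three of the 13 parameters (burr_type, motor_type, direction_convention) have no description, which gives 10/13 and the 77% quoted above.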

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool registers a grinder with its dial settings scheme, distinguishing it from listing or math tools among siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like list_grinders or update functions; no mention of prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_machine (Grade C)

Register an espresso or alternative preparation machine.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| name | Yes | Machine name/model, e.g. Dedica EC685 | |
| notes | No | | |
| photo | No | Photo path or URL | |
| tools | No | JSON or comma-separated tools, e.g., paper filters, flow control, metal mesh | |
| archived | No | 1 if archived, 0 if active | |
| prep_type | No | Preparation equipment type, e.g., espresso_machine, v60_dripper, French_press | |
| prep_style | No | Preparation style category, e.g., espresso, filter, immersion | |
| boiler_type | No | | |
| control_type | Yes | | |
| connected_device | No | Metadata mapping to smart hardware APIs | |
| basket_size_grams | No | | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description only says 'Register', implying a write operation, but adds no behavioral details beyond the annotations. Annotations indicate it is not read-only, not idempotent, and not destructive, but the description does not elaborate, e.g., whether it creates a new record, updates an existing one, or requires authentication.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, which is concise but at the expense of essential details. It front-loads the purpose but omits usage, parameters, or behavior, making it insufficiently informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (11 parameters, no output schema), the description is incomplete. It does not explain the required fields (name, control_type), the meaning of 'alternative preparation machine', or any side effects. The agent lacks necessary context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With schema description coverage at 64%, the description adds no explanation of the 11 parameters. Users must rely solely on the schema, which lacks descriptions for some parameters (e.g., notes, boiler_type, basket_size_grams). The description should have clarified critical parameters like prep_type or control_type.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool registers a machine and specifies 'espresso or alternative preparation machine', distinguishing it from other registration tools like register_grinder or register_coffee. However, it could be more specific about the types of machines covered.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as register_grinder or register_coffee. The description does not mention any prerequisites, contraindications, or preferred scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_program (Grade A)

Register a programmed shot button on a machine. A program cuts the shot off either by weight (program_type "volumetric", set target_volume_g) or by time (program_type "timed", set target_time_s). One machine can have a mix of both.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| machine_id | Yes | | |
| description | No | | |
| program_type | No | Cutoff method: "volumetric" (weight) or "timed" (duration). Defaults to volumetric. | |
| target_time_s | No | Target shot duration cutoff in seconds (required for timed programs) | |
| volume_source | No | Confidence in the stored cutoff value, for either type | |
| program_number | Yes | Program number / button index (e.g. 1 or 2) | |
| target_volume_g | No | Target yield cutoff in grams (required for volumetric programs) | |
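
The conditional requirement between program_type and the two target fields can be sketched as a client-side check. This is an illustration under stated assumptions, not the server's validation logic: the function name `check_program_arguments` is hypothetical, integer IDs are assumed, and how the server actually responds to an invalid combination is not documented here.

```python
def check_program_arguments(arguments: dict) -> list[str]:
    """Return a list of problems with a register_program argument dict (empty if none)."""
    problems = []

    # machine_id and program_number are marked Required in the schema.
    for field in ("machine_id", "program_number"):
        if arguments.get(field) is None:
            problems.append(f"missing required field: {field}")

    # The schema says program_type defaults to volumetric when omitted.
    program_type = arguments.get("program_type", "volumetric")
    if program_type == "volumetric":
        if arguments.get("target_volume_g") is None:
            problems.append("volumetric programs need target_volume_g")
    elif program_type == "timed":
        if arguments.get("target_time_s") is None:
            problems.append("timed programs need target_time_s")
    else:
        problems.append(f"unknown program_type: {program_type!r}")
    return problems


# Hypothetical call: a timed program that forgot its time cutoff.
print(check_program_arguments(
    {"machine_id": 1, "program_number": 2, "program_type": "timed"}
))
```

Because program_type is optional and defaults to volumetric, a call that omits both program_type and target_volume_g is flagged by this sketch as well.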

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate the tool is not readOnly and not destructive. The description adds context about program types and parameter dependencies but does not disclose behavior like whether registering an existing program_number overwrites or errors. More detail on side effects would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loading the purpose and key distinctions. Every sentence is concise and meaningful, with no waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains the core logic of program types and their parameters, which is sufficient given the schema covers individual parameters. It could mention that program_number is a button index, but the schema already provides that. No output schema exists, so return values are not required.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 71%, and the description adds value by clarifying the default program_type ('volumetric') and the conditional requirement of target_volume_g or target_time_s. Parameters like volume_source are not elaborated, but the schema already describes them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool registers a programmed shot button on a machine, distinguishing between volumetric and timed program types. The verb 'register' is specific, and the mention of mixing both types on one machine differentiates from sibling tools like delete_shot or log_shot.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly explains when to use the tool and which parameters to set based on program_type (target_volume_g or target_time_s). However, it does not mention prerequisites (e.g., machine must exist) or explicitly state when not to use it, though the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

register_water (Grade B)

Register a custom or commercial water formulation.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| gh | No | General hardness (GH) in ppm | |
| kh | No | Carbonate hardness (KH) in ppm | |
| tds | No | TDS in ppm | |
| name | Yes | Water formulation name, e.g. Lotus Light & Bright, Third Wave Water | |
| type | No | Type, e.g., mineralized, tap, distilled, RO | |
| notes | No | Additional notes | |
| sodium | No | Sodium concentration in ppm | |
| calcium | No | Calcium concentration in ppm | |
| magnesium | No | Magnesium concentration in ppm | |
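
For orientation, here is a sketch of how a register_water call might be framed on the wire. MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method with a `name` and an `arguments` object. The helper name `tools_call_request` is hypothetical, and the mineral numbers are placeholders chosen for illustration, not a real water recipe.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload, indent=2)


# Only "name" is required by the schema; every value below is a placeholder.
water_arguments = {
    "name": "Example mineralized water",
    "type": "mineralized",
    "gh": 80,   # general hardness in ppm (placeholder)
    "kh": 40,   # carbonate hardness in ppm (placeholder)
}

print(tools_call_request(1, "register_water", water_arguments))
```

In practice an MCP client library builds this envelope for you. The sketch only shows where the table's parameter names end up in the request.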

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations show non-readonly and non-destructive, but the description adds no behavioral details (e.g., whether duplicate names are allowed, if it overwrites, or what happens on success). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, no wasted words. Could be slightly expanded without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 9 parameters and no output schema, the description is minimal. Does not explain return value, validation rules, or behavior for duplicates. Incomplete for a registration tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds nothing beyond the schema; it does not elaborate on parameter meaning or usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'register' and the resource 'water formulation', distinguishing it from siblings like register_coffee or register_grinder. It covers both custom and commercial types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., list_waters to view existing waters). No exclusion criteria or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_active (Grade C, Idempotent)

Set active dial profile context.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| bean_id | No | | |
| grinder_id | No | | |
| machine_id | No | | |
| program_id | No | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotency (idempotentHint=true) and non-destructiveness (destructiveHint=false). Description verb 'Set' aligns with readOnlyHint=false but adds no new behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence is concise but under-specified for a tool with 4 parameters and many siblings. Sacrifices completeness for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 optional parameters, no output schema, and a vague description, the tool is incomplete. Agent cannot determine how to invoke it correctly or what 'dial profile context' entails.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and description does not explain any of the 4 parameters (bean_id, grinder_id, machine_id, program_id). Their purpose and usage are completely unspecified.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the verb 'Set' and the resource 'active dial profile context,' but the term is vague and does not distinguish this tool from siblings like 'set_equipment_archived' or 'set_grinder_position.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. No exclusions, prerequisites, or examples provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_equipment_archived (Grade B, Idempotent)

Archive or restore a grinder, machine, or water.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| id | Yes | The ID of the equipment | |
| kind | Yes | The kind of equipment to archive/restore | |
| archived | Yes | 1 to archive, 0 to restore | |
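
A small sketch of building arguments for set_equipment_archived. The `kind` strings below are assumptions inferred from the description ("a grinder, machine, or water"); the schema's actual enum values are not shown on this page. The helper name `archive_arguments` and the integer ID are likewise hypothetical.

```python
# Assumed enum strings for "kind"; the real schema values are not shown on this page.
EQUIPMENT_KINDS = ("grinder", "machine", "water")


def archive_arguments(kind: str, equipment_id: int, archived: bool) -> dict:
    """Build the argument dict, mapping a boolean onto the 1/0 flag the schema expects."""
    if kind not in EQUIPMENT_KINDS:
        raise ValueError(f"kind must be one of {EQUIPMENT_KINDS}, got {kind!r}")
    return {"id": equipment_id, "kind": kind, "archived": 1 if archived else 0}


# Archive grinder 3, then restore it. The tool is marked idempotent, so
# repeating either call should leave the record in the same state.
print(archive_arguments("grinder", 3, True))
print(archive_arguments("grinder", 3, False))
```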

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses the archive/restore behavior, and annotations already provide idempotentHint=true. It adds minimal new information beyond annotations, so a 3 is appropriate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence with no wasted words. It is appropriately sized for a simple tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with 3 parameters and no output schema, the description adequately covers the action and object types. It does not explain return values, but that is acceptable given the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema documents all parameters fully. The description does not add extra meaning beyond the schema, so the baseline of 3 is correct.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (archive or restore) and the resource types (grinder, machine, or water). However, it could be more precise by explicitly stating it sets the archived flag. It is distinct from sibling tools like set_active or set_grinder_position.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool, when not to, or what alternatives exist. Given the sibling tools include list_grinders and register_grinder, such context would be helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

set_grinder_position (Grade A, Idempotent)

Set the current grinder collar position. source="measured" resets verification freshness.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| source | Yes | Whether setting is verified ("measured") or guess ("recalled"/"assumed") | |
| grinder_id | Yes | | |
| setting_label | Yes | Verbatim display label, e.g. "1.1.2" | |
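
A sketch of client-side checks before calling set_grinder_position. It assumes the compound label follows the rotation.number.clicks pattern named in the register_grinder schema and that each component is a non-negative integer. Neither the parsing nor the helper names (`parse_compound_label`, `position_arguments`) come from the server, and the label itself is still sent verbatim as the schema asks.

```python
# The three source values named in the schema description.
VALID_SOURCES = ("measured", "recalled", "assumed")


def parse_compound_label(label: str) -> tuple[int, int, int]:
    """Split a label like "1.1.2" into (rotation, number, clicks)."""
    parts = label.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected rotation.number.clicks, got {label!r}")
    rotation, number, clicks = (int(p) for p in parts)
    return rotation, number, clicks


def position_arguments(grinder_id: int, label: str, source: str) -> dict:
    """Validate locally, then return arguments with the label passed through unchanged."""
    if source not in VALID_SOURCES:
        raise ValueError(f"source must be one of {VALID_SOURCES}, got {source!r}")
    parse_compound_label(label)  # raises if the label is malformed
    return {"grinder_id": grinder_id, "setting_label": label, "source": source}


print(parse_compound_label("1.1.2"))
print(position_arguments(1, "1.1.2", "measured"))
```

Grinders registered with a single-click scheme would not use three-part labels, so this check applies only to the compound case.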

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide idempotentHint=true and destructiveHint=false; the description adds context about source='measured' resetting verification freshness, which is useful behavioral detail beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with main purpose, no redundancy. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple set operation with no output schema, the description covers core behavior but omits return values, prerequisites (e.g., grinder existence), and explanation of 'verification freshness'. Adequate but leaves some gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67%; two parameters have descriptions in schema. The description does not add new meaning for grinder_id (missing schema description) beyond stating the role of source='measured'. Adequate but does not fully compensate for the gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool sets the grinder collar position, adding a specific behavioral note about source='measured' resetting verification freshness. It is distinct from its siblings, as no other tool sets this position.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for setting grinder position but provides no explicit guidance on when to use vs alternatives or prerequisites. Siblings include get_dial_state for reading, but no exclusion criteria are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

suggest_next_step (Grade B, Read-only)

Retrieve the single next experiment step from the reasoning engine.

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| bean_id | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and openWorldHint=false. Description aligns with 'Retrieve' but adds no behavioral details beyond what annotations provide. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, concise sentence that front-loads the purpose. No extraneous words or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the single parameter with no documentation and lack of output schema, the description should provide more context on parameter usage, return behavior, and when to call this function. It fails to inform the agent adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% coverage of parameter descriptions. The description does not explain what 'bean_id' means or its role, leaving the agent to infer from the name alone. This is insufficient for unambiguous invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'Retrieve', the specific resource 'single next experiment step', and the source 'reasoning engine'. It distinguishes itself from sibling tools like diagnose_shot or get_stats by focusing on experiment step retrieval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings. The description does not indicate prerequisites, scenarios, or exclusions. A single sentence with no usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_bean (Grade C, Idempotent)

Update coffee bag fields with structured reasoning (G5, C10).

Parameters (JSON Schema)

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| fields | Yes | | |
| reason | Yes | A clear justification why this category/roast level is updated. | |
| bean_id | Yes | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotentHint=true and destructiveHint=false. The description adds 'structured reasoning' but does not elaborate on side effects, return behavior, or required permissions. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence) but includes the cryptic 'G5, C10', which adds noise. It is concise, but some clarity is sacrificed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the nested 'fields' object, lack of output schema, and sparse description, the tool definition is insufficiently complete. The agent is left without crucial context on success/error handling or expected behavior beyond basic update.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 33% for top-level parameters; sub-properties of 'fields' are well-documented. The description only hints at the 'reason' parameter ('structured reasoning') and adds no additional meaning for 'bean_id' or 'fields'. It does not compensate for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool updates coffee bag fields and requires structured reasoning. The verb 'Update' and resource 'coffee bag fields' are specific enough, and the name 'update_bean' reinforces the purpose. However, the cryptic 'G5, C10' may confuse agents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There is no mention of prerequisites, when to avoid it, or how it compares to sibling tools. The description is too minimal.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
