Japan Public Ledgers MCP

Server Details

Tamper-evident daily time-series ledgers across 10 Japanese public-data domains (subsidies-laws).

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: C

Average 3/5 across 72 of 72 tools scored. Lowest: 2.1/5.

Server Coherence: A

Disambiguation: 5/5

Each tool belongs to a clearly distinct domain (agent, bid_watch, grant_watch, etc.) and within each domain, suffixes like search, get, recent_changes, timeline, verify_ledger unambiguously define the operation. No two tools overlap in purpose.

Naming Consistency: 5/5

All tool names follow a consistent snake_case pattern with domain prefix (e.g., agent_, bid_watch_) and standardized verb suffixes. The naming schema is uniform across all 72 tools.
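
As a rough illustration of that naming scheme, the sketch below splits a tool name into its domain prefix and operation suffix. The suffix list is taken from the disambiguation note above; the function name `split_tool_name` and the composed example names are hypothetical, and the fallback for agent tools (split at the first underscore) is an assumption rather than a documented rule.

```python
# Operation suffixes named in the review; the real server may define more.
# Longer suffixes come first so "recent_changes" is matched before "get".
KNOWN_SUFFIXES = ("recent_changes", "verify_ledger", "timeline", "search", "get")


def split_tool_name(name: str) -> tuple[str, str]:
    """Split a snake_case tool name into (domain prefix, operation)."""
    for suffix in KNOWN_SUFFIXES:
        if name.endswith("_" + suffix):
            # Drop the suffix and the underscore that joins it to the domain.
            return name[: -len(suffix) - 1], suffix
    # Assumed fallback: treat everything before the first underscore as the domain.
    domain, _, operation = name.partition("_")
    return domain, operation


if __name__ == "__main__":
    # Hypothetical names composed from a domain and suffix mentioned in the review.
    for tool in ("bid_watch_recent_changes", "grant_watch_search", "agent_audit_query"):
        print(tool, "->", split_tool_name(tool))
```

Grouping tools this way can help a client decide which tools to enable per connector, for example all read-only operations of one watch domain.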

Tool Count: 4/5

72 tools is a high count, but it covers a wide range of Japanese public ledger domains plus agent utilities. Each domain has a compact and well-scoped set (5-6 tools), making the total appropriate for the aggregate scope.

Completeness: 5/5

Each watch domain provides all necessary operations: search, get detail, recent changes, timeline, and integrity verification. The agent tools cover identity, memory, proxy, captcha, temp mail, trust, and webhooks comprehensively. No obvious gaps.

Available Tools

77 tools
agent_audit_query (grade C)

Query agent actions with filters

Parameters
Name | Required | Description | Default
to | No | |
from | No | |
limit | No | |
agentId | No | |
riskMin | No | |
sessionId | No | |
actionType | No | |

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only says 'Query' which implies read-only but does not explicitly state behaviors like destructiveness, rate limits, or authorization needs. The description fails to add any behavioral context beyond the verb.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence, but it is under-specified. It omits critical information about parameters and behavior, so its brevity comes at the cost of usefulness rather than reflecting efficient wording.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, no output schema, and no annotations, the description provides almost no useful context. It does not explain return format, pagination, error conditions, or any constraints, making it incomplete for an AI agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the schema provides no explanations for the 7 parameters. The description only mentions 'with filters' without elaborating on what each filter ('to', 'from', 'limit', 'agentId', 'riskMin', 'sessionId', 'actionType') does or how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Query agent actions with filters' clearly specifies the verb (query) and resource (agent actions), indicating a read operation with filtering. It distinguishes from siblings like 'agent_audit_record' or 'agent_audit_report' by implying a broad query rather than a single record or report generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as 'agent_audit_record' or 'agent_audit_report'. There is no mention of prerequisites, context for use, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
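
For orientation, a call to this tool over MCP is a JSON-RPC `tools/call` request. The sketch below builds one in Python. The parameter names come from the parameter list above; every value, including the ISO 8601 timestamp format and the numeric `riskMin`, is an assumption, since the schema documents none of them.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Wrap tool arguments in an MCP JSON-RPC 2.0 'tools/call' request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical filter values: the value formats are not documented by the server.
query_args = {
    "agentId": "agent-123",
    "from": "2025-01-01T00:00:00Z",
    "to": "2025-01-31T23:59:59Z",
    "limit": 20,
    "riskMin": 3,
}

request = build_tool_call(1, "agent_audit_query", query_args)

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

Because all seven parameters are optional, an empty `arguments` object is also a valid shape; what the server returns in that case is not documented.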

agent_audit_record (grade C)

Record an agent action for audit and compliance

Parameters
Name | Required | Description | Default
input | No | |
output | No | |
agentId | Yes | |
metadata | No | |
sessionId | No | |
actionName | Yes | |
actionType | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description bears full responsibility for behavioral disclosure. It only states the purpose but omits important traits such as idempotency, side effects, required permissions, or error handling. This is insufficient for an audit recording tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, which is concise. However, it is too minimal and does not front-load critical information. A bit more detail would improve utility without sacrificing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 7 parameters, no output schema, and no annotations, the description is extremely sparse. It lacks necessary context about data constraints, usage patterns, or expected behavior, leaving the agent under-informed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 7 parameters with 0% description coverage. The description does not explain any parameter beyond their names and types. For example, 'input' and 'output' are generic objects with no clarification on structure requirements.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Record an agent action for audit and compliance', which clearly indicates the verb (record) and resource (agent action). However, it does not differentiate from sibling tools like agent_audit_query and agent_audit_report, which are related but distinct operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (e.g., agent_audit_query, agent_audit_report). There is no mention of prerequisites, context, or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
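
Since the schema marks three of the seven parameters as required, a client can catch malformed calls before sending them. The following is a minimal sketch of such a check; the helper name `check_record_args` and the example values are hypothetical, and nothing here reflects how the server itself validates input.

```python
# Required and optional parameter names as listed for agent_audit_record.
REQUIRED = {"agentId", "actionName", "actionType"}
OPTIONAL = {"input", "output", "metadata", "sessionId"}


def check_record_args(args: dict) -> list[str]:
    """Return a list of problems found in the arguments (empty if none)."""
    problems = []
    for name in sorted(REQUIRED - set(args)):
        problems.append(f"missing required parameter: {name}")
    for name in sorted(set(args) - REQUIRED - OPTIONAL):
        problems.append(f"unknown parameter: {name}")
    return problems


if __name__ == "__main__":
    # Hypothetical values; valid actionType values are not documented.
    good = {"agentId": "agent-123", "actionName": "fetch_page", "actionType": "http"}
    bad = {"agentId": "agent-123", "note": "oops"}
    print(check_record_args(good))
    print(check_record_args(bad))
```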

agent_audit_report (grade C)

Generate audit report (json/markdown/soc2 format)

Parameters
Name | Required | Description | Default
to | Yes | |
from | Yes | |
format | No | | markdown
agentId | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavior. It states 'generate' but does not clarify if this is a read-only operation, if it triggers side effects, or what permissions are required. The required parameters (agentId, from, to) hint at a time-bound query, but no behavioral details are provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single line, which is concise, but it sacrifices important details. It does not front-load critical usage constraints or parameter hints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters, no output schema, and the complexity of an audit report generator, the description is inadequate. It fails to mention return format, pagination, or any limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description adds no parameter explanations. Only the format enum is hinted by listing formats in parentheses. The meaning of agentId, from, to, and format (e.g., differences between markdown and soc2) is left ambiguous.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates an audit report and lists supported formats (json/markdown/soc2). The verb 'generate' implicitly separates it from siblings 'agent_audit_query' and 'agent_audit_record', but the description does not explicitly differentiate them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like agent_audit_query or agent_audit_record. The description implies report generation but lacks context for decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
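
To make the format option concrete, here is a small sketch that assembles arguments for this tool and falls back to the listed default of markdown. The three format names come from the tool description; the helper name `build_report_args` and the timestamp strings are hypothetical, and the expected date format is an assumption.

```python
# Formats named in the tool description; "markdown" is the listed default.
ALLOWED_FORMATS = ("json", "markdown", "soc2")


def build_report_args(agent_id: str, start: str, end: str, fmt: str = "markdown") -> dict:
    """Assemble arguments for agent_audit_report, rejecting unlisted formats."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    return {"agentId": agent_id, "from": start, "to": end, "format": fmt}


if __name__ == "__main__":
    # Hypothetical values; the expected timestamp format is not documented.
    print(build_report_args("agent-123", "2025-01-01", "2025-01-31"))
    print(build_report_args("agent-123", "2025-01-01", "2025-01-31", fmt="soc2"))
```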

agent_captcha_solve (grade C)

Solve a CAPTCHA for a domain you own or have explicit permission to access

Parameters
Name | Required | Description | Default
type | Yes | |
action | No | |
pageUrl | Yes | |
siteKey | No | |
question | No | |
imageBase64 | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It states 'solve a CAPTCHA' but omits critical details such as what the tool returns (e.g., a token), any side effects, authentication requirements, or rate limits. This is insufficient for an automated CAPTCHA-solving tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, technically concise, but lacks essential details. It sacrifices completeness for brevity, making it under-specified for effective agent use.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations, output schema, and 0% parameter description coverage, the description is grossly incomplete. It does not explain return values, authentication, or parameter usage, leaving agents without crucial context for successful invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description adds no information about the 6 parameters. Parameter roles like 'type', 'pageUrl', 'siteKey', 'imageBase64' remain unexplained, relying solely on property names. The description fails to clarify expected input formats or relationships.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The name 'agent_captcha_solve' and description 'Solve a CAPTCHA' clearly indicate the primary function. However, it does not differentiate from the sibling tool 'agent_captcha_verify_domain', leaving ambiguity about when to use solving versus verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies a condition: 'for a domain you own or have explicit permission to access', which implies appropriate usage contexts. But it provides no guidance on when not to use this tool or mention of alternatives like the verification tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_captcha_verify_domain (grade C)

Verify ownership of a domain before using CAPTCHA solving

Parameters
Name | Required | Description | Default
domain | Yes | |
method | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only states the basic action without disclosing behavioral traits such as whether it modifies data, requires authentication, or what happens on failure. It lacks transparency beyond the literal action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that front-loads the action. It is concise with no wasted words, though it could benefit from slightly more detail without losing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a verification tool with 2 parameters and no output schema, the description is incomplete. It does not explain expected outcomes, return values, or how to interpret results, which is critical for a verification step.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description should add meaning to parameters. It does not explain the enum values for 'method' or constraints on 'domain'. The parameter names are self-explanatory, but no additional context is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Verify ownership of a domain' and ties it to the context 'before using CAPTCHA solving'. It is specific enough to distinguish from sibling tool 'agent_captcha_solve'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The phrase 'before using CAPTCHA solving' implies a prerequisite context, but there is no explicit guidance on when not to use or alternatives. It provides a hint but not full directive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
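
The prerequisite implied here (verify the domain first, then solve) can be enforced on the client side. The sketch below is one possible guard, assuming the client keeps its own set of domains for which verification has already succeeded; the function name `can_request_solve` and the exact-hostname matching rule are assumptions, and the server's own enforcement is not documented.

```python
from urllib.parse import urlparse


def can_request_solve(page_url: str, verified_domains: set[str]) -> bool:
    """Allow a solve request only if the page's host was verified beforehand."""
    host = urlparse(page_url).hostname
    # Exact hostname match only; whether subdomains inherit verification is unknown.
    return host is not None and host.lower() in verified_domains


if __name__ == "__main__":
    verified = {"example.com"}  # hypothetical result of earlier verification calls
    print(can_request_solve("https://example.com/login", verified))
    print(can_request_solve("https://other.example/login", verified))
```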

agent_identity_activities (grade A)

List activity records for an identity, newest first (owner only)

Parameters
Name | Required | Description | Default
limit | No | |
identityId | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It reveals ordering (newest first) and access control (owner only). However, it does not disclose rate limits, pagination behavior, or any side effects. The provided context is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that conveys essential information without any unnecessary words. All parts are relevant and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and only two parameters, the description is minimal. It does not explain what constitutes an activity, what fields are returned, or if there is pagination. It is adequate for a simple list tool but could be more complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must compensate. It clarifies that identityId refers to the identity whose activities are listed, but it does not describe the limit parameter or its default behavior. Partial compensation for one parameter, missing detail for the other.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'List', the resource 'activity records for an identity', and the scope 'owner only'. This is specific and distinguishes the tool from siblings like agent_identity_record or agent_identity_lookup.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for listing activities owned by the caller (owner only). However, it does not explicitly state when to use this tool versus alternatives like agent_identity_record or the many watch tools, nor does it provide explicit when-not or alternative tool suggestions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_identity_badge (grade B)

Get the issuer-signed badge and signed fields for an identity

Parameters
Name | Required | Description | Default
identityId | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Description only states it's a get operation but lacks behavioral details like prerequisites, error handling, or what happens if identityId is invalid.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, 10 words, front-loaded with action. No fluff, but lacks any structural cues like bullet points or clarifications. Adequate for minimal description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has low complexity (1 param, no output schema, no annotations). Description names what it retrieves (the badge and signed fields) but does not describe their structure or any other behavior. Adequate, but could be more complete by describing the shape of the returned badge and signed fields.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and description adds minimal meaning beyond the schema. 'identityId' is stated as 'for an identity', but no format, constraints, or use context provided. With a single parameter, more detail expected.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Get' and resource 'issuer-signed badge and signed fields' for an identity. This distinguishes it from sibling tools like agent_identity_lookup or agent_identity_record, which focus on general identity info.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. For example, doesn't specify that agent_identity_lookup is for basic info, making it unclear when to choose this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_identity_lookup (grade B)

Look up an identity. Returns signatureValid (issuer+integrity only, NOT an authenticity/safety signal) and a disclaimer.

Parameters
Name | Required | Description | Default
identityId | Yes | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explicitly clarifies that signatureValid is only an issuer+integrity check and not an authenticity/safety signal, which is a critical behavioral warning. It also notes a disclaimer is returned. This adds transparency beyond the bare schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two short sentences, the second carrying a parenthetical clarification. Every word adds value, and the most important action ('Look up an identity') is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple lookup tool with no output schema, the description provides the return fields and a caveat, but it does not explain what an 'identity' is, how the identityId is used, or the nature of the disclaimer. It is adequate but incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, but the description does not explain the identityId parameter at all. It only mentions return values, leaving the parameter's meaning and format entirely unspecified.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Look up' and the resource 'an identity', distinguishing it from sibling tools like agent_identity_record or agent_identity_badge. It also specifies the return values (signatureValid and disclaimer), adding clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like agent_identity_activities or agent_identity_badge. The description does not mention prerequisites, context, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
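
The caveat in this tool's description (signatureValid covers issuer and integrity only) is easy to lose in client code, so a sketch of how a caller might keep the two notions apart follows. The field names `signatureValid` and `disclaimer` come from the description; the flat dictionary shape of the result and the helper name `summarize_lookup` are assumptions.

```python
def summarize_lookup(result: dict) -> dict:
    """Interpret a lookup result without over-reading signatureValid."""
    signature_ok = bool(result.get("signatureValid", False))
    return {
        # True only means the issuer signature and record integrity checked out.
        "integrity_verified": signature_ok,
        # Deliberately never derived from signatureValid: the description says
        # it is NOT an authenticity or safety signal.
        "treat_as_trusted": False,
        "disclaimer": result.get("disclaimer"),
    }


if __name__ == "__main__":
    # Hypothetical response shape, for illustration only.
    sample = {"signatureValid": True, "disclaimer": "Self-reported data."}
    print(summarize_lookup(sample))
```

Any trust decision would then need a separate signal; the listing does not say which tool, if any, provides one for a given identity.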

agent_identity_record (grade B)

Append a hash-chained activity record (owner only). Optional provenance (repo/version/config) is self-reported.

Parameters
Name | Required | Description | Default
content | No | |
identityId | Yes | |
provenance | No | Self-reported origin of the activity (NOT verified) |
activityType | Yes | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so description carries the burden. It discloses that the operation appends (mutation) and that provenance is self-reported and not verified. However, it omits details like whether the record is immutable, what authorization exactly is needed, and any rate limits or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, each carrying essential information: the core action and the optional nature of provenance. Word-efficient and front-loaded. No wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters including a nested object, no output schema, and no annotations, the description is insufficient. It does not explain valid activity types, content structure, or prerequisites like an existing identity. An agent would lack confidence in constructing a correct call.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 25% (provenance has a description). The description adds that provenance is self-reported but does not explain the content, identityId, or activityType fields. With low coverage, the description should provide more detail for those parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool appends a hash-chained activity record and that only the identity's owner may do so. The resource and action are specific, and it distinguishes from siblings like agent_identity_register (which creates identities) and agent_identity_activities (which lists activities).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'owner only' which implies a usage constraint but does not clarify when to use this tool versus alternatives like agent_memory_store or other identity tools. No explicit guidance on prerequisites or when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
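
For readers unfamiliar with the term, a hash-chained record stores the hash of its predecessor, so altering any earlier entry invalidates everything after it. The sketch below shows the general technique only. The server's actual record layout and hash function are not documented here; SHA-256 over sorted-key JSON, the field names `prevHash` and `hash`, and the all-zero genesis value are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous record"


def _digest(body: dict) -> str:
    """Hash a record body deterministically (sorted keys, UTF-8)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(chain: list, activity_type: str, content: str) -> dict:
    """Append a record whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"activityType": activity_type, "content": content, "prevHash": prev_hash}
    record = {**body, "hash": _digest(body)}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for record in chain:
        body = {key: record[key] for key in ("activityType", "content", "prevHash")}
        if record["prevHash"] != prev_hash or record["hash"] != _digest(body):
            return False
        prev_hash = record["hash"]
    return True


if __name__ == "__main__":
    chain = []
    append_record(chain, "deploy", "released build 1")  # hypothetical activity
    append_record(chain, "deploy", "released build 2")
    print(verify_chain(chain))
    chain[0]["content"] = "tampered"
    print(verify_chain(chain))
```

Note that chaining makes edits detectable but says nothing about whether the recorded content, or the self-reported provenance, was true in the first place.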

agent_identity_register (grade B)

Register an agent and get a unique identity ID + issuer-signed badge. agent_name/metadata are self-reported and unverified.

Parameters
Name | Required | Description | Default
metadata | No | |
agentName | Yes | |
publicKey | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that inputs are self-reported and unverified, but lacks details on destructive actions, authentication requirements, or rate limits. Without annotations, the description partially fulfills the transparency burden.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two efficient sentences with front-loaded purpose. Every word adds value. No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains return value (ID and badge) but no output schema exists. It doesn't cover prerequisites, permissions, or how to use the badge. Adequate but not comprehensive for a 3-parameter tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description must compensate. It explains agentName and metadata but omits publicKey entirely. The metadata object type is not elaborated. Only two of three parameters receive any semantic help.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool registers an agent and returns an identity ID and badge. It specifies that agent_name/metadata are self-reported and unverified, which helps differentiate from lookup or record tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like agent_identity_lookup or agent_identity_badge. No 'when not to use' information provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_memory_delete (grade: C)

Delete a memory or all memories in a namespace

Parameters (JSON Schema)
Name | Required | Description | Default
key | No | |
agentId | Yes | |
namespace | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations, so description must cover behavior. It only states 'delete' with no details on irreversibility, authorization needs, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence is concise and front-loaded, but lacks structure to clarify parameter usage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a deletion tool with 3 undocumented parameters and no output schema, the description is incomplete. It does not explain how key and namespace together determine whether a single memory or every memory in a namespace is deleted.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
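
The ambiguity between deleting one memory and deleting a whole namespace can be guarded against on the caller's side. The following is a minimal sketch, not the server's documented behavior: it assumes that omitting key deletes everything in the namespace, and the helper name build_delete_arguments, the allow_bulk flag, and the sample IDs are hypothetical.

```python
from typing import Any, Dict, Optional


def build_delete_arguments(
    agent_id: str,
    key: Optional[str] = None,
    namespace: Optional[str] = None,
    allow_bulk: bool = False,
) -> Dict[str, Any]:
    """Build the arguments object for an agent_memory_delete call.

    Assumption (not confirmed by the tool description): omitting `key`
    deletes every memory in `namespace`.
    """
    if key is None and not allow_bulk:
        # Refuse an accidental namespace-wide delete unless the caller opts in.
        raise ValueError("key omitted; pass allow_bulk=True to delete a whole namespace")
    arguments: Dict[str, Any] = {"agentId": agent_id}
    if key is not None:
        arguments["key"] = key
    if namespace is not None:
        arguments["namespace"] = namespace
    return arguments


# Hypothetical identifiers, for illustration only.
single = build_delete_arguments("agent-123", key="last_run", namespace="jobs")
bulk = build_delete_arguments("agent-123", namespace="jobs", allow_bulk=True)
```

Requiring an explicit opt-in for the bulk case keeps a missing key from silently widening the scope of a destructive call.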

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description adds no meaning beyond the parameter names. It is unclear what happens when the optional key is omitted and how namespace scopes the deletion.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action (delete) and resource (memory or all memories in a namespace), distinguishing it from sibling tools like get, search, store.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidance provided; does not specify when to use this tool vs alternatives, nor any prerequisites or caveats.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_memory_get (grade: C)

Retrieve a stored memory by key

Parameters (JSON Schema)
Name | Required | Description | Default
key | Yes | |
agentId | Yes | |
namespace | No | | default

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose any behavioral traits such as idempotency, authorization requirements, or side effects. For a simple retrieval, basic behavioral context is missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no waste, but it is too minimal to be fully useful. It could be expanded with parameter details while remaining concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema and param descriptions, the description is incomplete. It does not explain return values or parameter roles, leaving the agent with insufficient context to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage and no parameter descriptions in the input schema, the description adds no meaning to the three parameters. It only mentions 'by key' but fails to explain agentId or namespace.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Retrieve a stored memory by key' clearly states the verb 'retrieve', the resource 'memory', and the method 'by key'. It distinguishes this tool from siblings like agent_memory_store and agent_memory_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like agent_memory_search. There is no mention of prerequisites or context for invocation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_memory_store (grade: B)

Store a memory for an AI agent (key-value, with TTL and metadata)

Parameters (JSON Schema)
Name | Required | Description | Default
key | Yes | |
value | Yes | Any JSON value |
agentId | Yes | Agent identifier |
metadata | No | |
namespace | No | | default
ttlSeconds | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full behavioral burden. It states 'store' (indicating mutation) and mentions TTL/metadata features, but does not disclose potential side effects (e.g., overwriting existing keys), authorization needs, size limits, or return behavior. Significant gaps for a write operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence of 12 words, front-loading the core action and features. No extraneous information. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 6 parameters, no output schema, and no annotations, the description covers the basic functionality but omits key details about agentId and namespace. It also does not explain the return value or behavior on success/failure. Adequate but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
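
To make the six parameters concrete, here is a hedged sketch of what a call to agent_memory_store might look like as an MCP tools/call request. The envelope follows the JSON-RPC shape MCP uses for tool calls; the helper name tools_call_request and every argument value (agent ID, key, TTL) are illustrative assumptions, and the response format is not shown because the listing documents no output schema.

```python
import json
from typing import Any, Dict


def tools_call_request(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap tool arguments in a JSON-RPC 2.0 request using the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tools_call_request(
    1,
    "agent_memory_store",
    {
        "agentId": "agent-123",           # required; hypothetical identifier
        "key": "last_checked",            # required
        "value": {"date": "2024-01-01"},  # required; "Any JSON value" per the schema
        "namespace": "default",           # optional; "default" is the listed default
        "ttlSeconds": 3600,               # optional; units assumed from the name
    },
)

print(json.dumps(request, indent=2))
```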

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (33%). The description adds meaning by calling it 'key-value' and referencing TTL and metadata, which maps to key, value, ttlSeconds, and metadata. However, it does not explain the agentId or namespace parameters, nor does it clarify that key is required. Partial compensation but leaves gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
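
The 33% figure is consistent with counting the share of parameters that carry a description (2 of 6). The sketch below shows that reading; it is an assumption about how the metric is derived, not the scorer's published method, and store_schema is reconstructed from the parameter table with types omitted because the listing does not show them.

```python
from typing import Any, Dict


def description_coverage(schema: Dict[str, Any]) -> float:
    """Return the fraction of top-level properties with a non-empty description."""
    properties = schema.get("properties", {})
    if not properties:
        return 0.0
    described = sum(1 for spec in properties.values() if spec.get("description"))
    return described / len(properties)


# Reconstructed from the agent_memory_store parameter table (types unknown).
store_schema = {
    "type": "object",
    "properties": {
        "key": {},
        "value": {"description": "Any JSON value"},
        "agentId": {"description": "Agent identifier"},
        "metadata": {},
        "namespace": {"default": "default"},
        "ttlSeconds": {},
    },
    "required": ["key", "value", "agentId"],
}

coverage = description_coverage(store_schema)
print(f"{coverage:.0%}")  # prints 33%
```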

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'store', the resource 'memory for an AI agent', and the key-value nature with TTL and metadata. This distinguishes it from sibling tools like agent_memory_get, agent_memory_delete, and agent_memory_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. No mention of prerequisites or when not to use it. The description is purely functional without usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_proxy_fetch (grade: B)

Fetch a URL via a rotating proxy (region/type selectable). robots.txt enforced.

Parameters (JSON Schema)
Name | Required | Description | Default
url | Yes | |
body | No | |
type | No | |
method | No | |
region | No | |
headers | No | |
sessionId | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must fully disclose behavior. Only mentions robots.txt enforcement; lacks details on rate limits, error handling, or what happens on blocked requests. Insufficient for a 7-parameter tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loads core action, adds key details efficiently. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool has 7 parameters, nested objects, no output schema. Description is too brief—omits return format, error scenarios, and usage of headers/method. Incomplete for safe and effective invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%; description only adds meaning for 'region' and 'type' via 'region/type selectable'. Fails to explain 'body', 'method', 'headers', 'sessionId'. Parameters largely undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Fetch a URL via a rotating proxy' with specifics on region/type selectability and robots.txt enforcement. Distinguishes well from siblings like agent_proxy_session.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Mentions rotating proxy and robots.txt enforcement, implying usage for web scraping with proxy selection. However, no explicit when-to-use or when-not-to-use guidance compared to alternatives like direct fetch or captcha solving.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_proxy_session (grade: B)

Create a sticky proxy session (same IP for multiple requests)

Parameters (JSON Schema)
Name | Required | Description | Default
type | No | |
region | No | |
ttlSeconds | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It adds the key behavioral trait of 'stickiness' (same IP for multiple requests), but fails to disclose other behaviors such as session expiration (though hinted at by the ttlSeconds parameter), how to terminate the session, or how later requests attach to it.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence (10 words) that immediately conveys the core purpose. No superfluous information or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has three parameters, no output schema, and no annotations, the description is insufficient. It does not explain what the session returns (e.g., a session ID token), how to use the session in subsequent requests, or any configuration details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
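
The missing link between creating a session and using it can be sketched as a two-step flow. This is an assumption-heavy illustration: it supposes that agent_proxy_session returns an identifier that is then passed as the sessionId parameter of agent_proxy_fetch, which the listing does not confirm. The helper names, the "GET" method value, the example URLs, and the placeholder session ID are all hypothetical.

```python
from typing import Any, Dict, Optional


def session_arguments(
    ttl_seconds: Optional[int] = None,
    region: Optional[str] = None,
    proxy_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Arguments for agent_proxy_session; unset options are left out entirely."""
    candidates = {"ttlSeconds": ttl_seconds, "region": region, "type": proxy_type}
    return {name: value for name, value in candidates.items() if value is not None}


def fetch_arguments(url: str, session_id: Optional[str] = None, method: str = "GET") -> Dict[str, Any]:
    """Arguments for agent_proxy_fetch, optionally pinned to a sticky session."""
    arguments: Dict[str, Any] = {"url": url, "method": method}
    if session_id is not None:
        # Assumed linkage: the same sessionId keeps the same exit IP.
        arguments["sessionId"] = session_id
    return arguments


create_args = session_arguments(ttl_seconds=300)

# Placeholder for whatever identifier the session call actually returns.
session_id = "sess-example"

first = fetch_arguments("https://example.com/login", session_id)
second = fetch_arguments("https://example.com/account", session_id)
```

If the assumption holds, both fetches would leave through the same IP, while a fetch without sessionId would use the rotating pool.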

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the input schema provides no explanations. The description does not mention any of the three parameters (type, region, ttlSeconds), leaving the agent to guess their purpose and valid values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create' and the resource 'sticky proxy session', and parenthetically explains what 'sticky' means (same IP for multiple requests). This effectively distinguishes the tool from its sibling 'agent_proxy_fetch', which fetches a URL via a rotating proxy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'agent_proxy_fetch'. The description does not mention any prerequisites, exclusions, or context in which the tool is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_tempmail_create (grade: C)

Create a temporary email address (auto-expires)

Parameters (JSON Schema)
Name | Required | Description | Default
ttlSeconds | No | |
preferredPrefix | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It mentions 'auto-expires' but fails to specify the expiration duration (though the parameter ttlSeconds hints at it), return value, or any side effects. The description is too brief for a creation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, with only one sentence. While brevity is good, it sacrifices clarity on parameters and behavior. It could add a brief sentence about parameter usage without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has two optional parameters that are not described, and no output schema is provided. The description does not explain valid values for ttlSeconds or preferredPrefix, nor what the tool returns. For a creation tool, this is insufficient for an agent to use correctly without further inference.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no information about the parameters (ttlSeconds, preferredPrefix). Since schema coverage is 0%, the description fails to add meaning to the schema, leaving the agent guessing about parameter formats and usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb-resource combination: 'Create a temporary email address'. The '(auto-expires)' adds lifecycle context. This clearly distinguishes it from sibling tools like agent_tempmail_get (retrieve), agent_tempmail_list (list), agent_tempmail_wait (wait for email).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for creating temporary email addresses but does not provide explicit guidance on when to use this tool versus siblings, nor does it mention prerequisites or alternative approaches. Usage is inferred from the context of the sibling tool names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_tempmail_get (grade: C)

Get full message content with extracted verification links/codes

Parameters (JSON Schema)
Name | Required | Description | Default
mailboxId | Yes | |
messageId | Yes | |
includeRaw | No | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full responsibility for behavioral transparency. It does not disclose side effects, idempotence, rate limits, or any other behavioral traits beyond stating what it does.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (one sentence), but it is too minimal and lacks structure. It could be improved by adding parameter descriptions or usage context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no annotations, no output schema, and no nested objects, the description fails to provide sufficient context. It does not explain return values, the format of extracted links/codes, or any error handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate by explaining parameters. It does not mention any parameter (mailboxId, messageId, includeRaw) or their semantics, leaving the agent to infer from names alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action ('Get') and the resource ('full message content'), and adds value with 'extracted verification links/codes', distinguishing it from siblings like agent_tempmail_list which lists messages, and agent_tempmail_wait which waits for new messages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. Does not mention that it should be used after obtaining a message ID from agent_tempmail_list, nor does it specify prerequisites or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_tempmail_list (grade: C)

List received messages in a mailbox

Parameters (JSON Schema)
Name | Required | Description | Default
after | No | |
limit | No | |
mailboxId | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full behavioral disclosure burden. It only states the basic action, omitting important traits such as authentication requirements, idempotency, error behavior (e.g., mailbox not found), or whether it mutates state.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is front-loaded and concise, but may be too brief. Although it wastes no words, the description could be expanded slightly to add value without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no output schema, and no annotations, the description is incomplete. It fails to explain parameter roles (e.g., pagination with 'limit', date filtering with 'after') or return value structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no semantic meaning to the parameters ('after', 'limit', 'mailboxId'). The agent gets no additional context beyond the schema itself.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'List received messages in a mailbox' uses a specific verb ('List') and resource ('messages in a mailbox'), making the action and object clear. It distinguishes from sibling tools like 'agent_tempmail_get' (get one message) and 'agent_tempmail_create' (create mailbox).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like 'agent_tempmail_get' or 'agent_tempmail_wait'. The description does not provide context for appropriate usage or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_tempmail_wait (grade: B)

Wait for an incoming message (long polling, max 60s)

Parameters (JSON Schema)
Name | Required | Description | Default
mailboxId | Yes | |
fromContains | No | |
timeoutSeconds | No | |
subjectContains | No | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses 'long polling, max 60s' which is a key behavioral trait. However, no annotations are provided, and the description does not cover what happens on timeout (e.g., empty response or error), nor whether it is destructive. Additional context on behavior would improve this.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One short sentence, front-loaded with purpose. Concise but slightly underspecified. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters, no output schema, and no annotations, the description is insufficient. It does not explain filtering parameters (fromContains, subjectContains) or the timeout behavior in detail. A more complete description would include what the tool returns (e.g., message object) and how the parameters affect waiting.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain any of the 4 parameters. For example, 'fromContains' and 'subjectContains' are left entirely undefined. The description mentions a 60s maximum but not the 30s default for timeoutSeconds, a minor omission.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
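
The interaction between the 60s ceiling in the description and the 30s default noted in the review can be handled by clamping on the caller's side. This is a sketch under those two stated values only; the helper name wait_arguments is hypothetical, and treating fromContains and subjectContains as substring filters is an assumption based on their names.

```python
from typing import Any, Dict, Optional

MAX_TIMEOUT_SECONDS = 60      # from the tool description: "max 60s"
DEFAULT_TIMEOUT_SECONDS = 30  # default reported in the review, not in the description


def wait_arguments(
    mailbox_id: str,
    timeout_seconds: Optional[int] = None,
    from_contains: Optional[str] = None,
    subject_contains: Optional[str] = None,
) -> Dict[str, Any]:
    """Build agent_tempmail_wait arguments with the timeout clamped to the stated maximum."""
    timeout = DEFAULT_TIMEOUT_SECONDS if timeout_seconds is None else timeout_seconds
    if timeout <= 0:
        raise ValueError("timeout_seconds must be positive")
    arguments: Dict[str, Any] = {
        "mailboxId": mailbox_id,
        "timeoutSeconds": min(timeout, MAX_TIMEOUT_SECONDS),
    }
    # Assumed semantics: only messages matching these substrings end the wait.
    if from_contains is not None:
        arguments["fromContains"] = from_contains
    if subject_contains is not None:
        arguments["subjectContains"] = subject_contains
    return arguments


print(wait_arguments("mb-1", timeout_seconds=120))
# prints {'mailboxId': 'mb-1', 'timeoutSeconds': 60}
```

A caller that needs to wait longer than 60s would have to repeat the call, since what the tool returns on timeout is not documented.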

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'Wait' and resource 'incoming message', with method 'long polling, max 60s'. It clearly distinguishes from sibling tools like agent_tempmail_get or agent_tempmail_list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for waiting for new messages, but lacks explicit when-to-use or when-not-to-use compared to alternatives like agent_tempmail_get or agent_tempmail_list. No guidance on prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_trust_batch (grade: B)

Get trust scores for multiple subjects in one call (max 100)

Parameters (JSON Schema)
Name | Required | Description | Default
subjects | Yes | |

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the burden. It mentions the max batch size, but does not disclose behavior like idempotency, rate limits, error handling, or read-only nature. Minimal transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with no fluff, front-loaded with purpose and constraint. Every word is necessary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, no annotations, and no parameter detail. The description only provides a high-level purpose and a limit. Lacks essential information for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and description adds no explanation of the 'subjects' parameter's structure (the 'type' and 'value' fields). Without guidance, an agent may not know what values to provide.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
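
Because the description caps a call at 100 subjects, a caller with a longer list has to split it. The sketch below shows one way to do that; the 'type' and 'value' field names come from the review, while the "agent" type value, the generated IDs, and the helper name batch_arguments are illustrative assumptions.

```python
from typing import Any, Dict, Iterator, List

MAX_SUBJECTS_PER_CALL = 100  # from the tool description: "max 100"


def batch_arguments(subjects: List[Dict[str, str]]) -> Iterator[Dict[str, Any]]:
    """Yield one agent_trust_batch arguments object per chunk of at most 100 subjects."""
    for start in range(0, len(subjects), MAX_SUBJECTS_PER_CALL):
        yield {"subjects": subjects[start:start + MAX_SUBJECTS_PER_CALL]}


# Hypothetical subjects; valid values for "type" are not documented in the listing.
subjects = [{"type": "agent", "value": f"agent-{i}"} for i in range(250)]
calls = list(batch_arguments(subjects))

print([len(call["subjects"]) for call in calls])  # prints [100, 100, 50]
```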

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'Get' and the resource 'trust scores for multiple subjects', distinguishing it from sibling tools like agent_trust_score which likely handles single subjects. The batch nature and limit are explicit.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for batch queries with a maximum of 100 subjects, but does not explicitly state when to use this tool vs agent_trust_score or provide exclusions/alternatives. Context from sibling tools suggests differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_trust_feedback (grade: C)

Submit feedback about an agent/wallet (positive or negative)

Parameters (JSON Schema)
Name | Required | Description | Default
rating | Yes | |
category | Yes | |
evidence | No | |
subjectType | Yes | |
subjectValue | Yes | |

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose all behavioral traits. It states 'Submit feedback' but does not clarify whether this is a write operation, if it updates existing entries, requires authentication, or has rate limits. The effect on the trust system is undefined, making the behavior opaque.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence with no fluff, but it lacks crucial information about parameters and behavior. It is concise but at the expense of completeness. A slightly longer description with key details would be more useful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and 0% parameter description coverage, the description is severely incomplete. It does not explain return values, side effects, or parameter semantics. The tool's full context is missing, leaving the agent with insufficient information to invoke it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must add meaning. It hints that rating reflects positive/negative sentiment but does not explain the -2 to 2 scale, the meaning of category, evidence, or subjectType. The agent must infer from parameter names and enums, which is risky. More detail is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
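
To make the parameter gap concrete, here is a minimal Python sketch of client-side validation for agent_trust_feedback arguments. The parameter names and the -2 to 2 range come from the listing and review above; treating rating as an integer, and the example values "wallet" and "reliability", are assumptions, since the enums are not shown on this page.

```python
from typing import Optional


def build_feedback_args(subject_type: str, subject_value: str, rating: int,
                        category: str, evidence: Optional[str] = None) -> dict:
    """Assemble an arguments object, rejecting ratings outside -2..2."""
    # Assumption: rating is an integer; the review only says "-2 to 2 scale".
    if isinstance(rating, bool) or not isinstance(rating, int) or not -2 <= rating <= 2:
        raise ValueError("rating must be an integer from -2 to 2")
    args = {
        "subjectType": subject_type,
        "subjectValue": subject_value,
        "rating": rating,
        "category": category,
    }
    if evidence is not None:
        args["evidence"] = evidence  # optional per the parameter table
    return args
```

Checking the range before the call avoids spending a request on input the server would presumably reject.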

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Submit feedback' and the resource 'agent/wallet', and mentions positive/negative, which conveys the tool's core purpose. It does not explicitly distinguish the tool from sibling trust tools such as agent_trust_score or agent_trust_batch, although the action is distinct enough.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites or limitations, and it does not differentiate the tool from agent_trust_batch (batch trust-score lookups) or agent_trust_score (single trust-score lookup). The description is silent on usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_trust_score (grade C)

Get trust score for a wallet, agent card URL, or domain

Parameters
Name | Required | Description | Default
subjectType | Yes | |
subjectValue | Yes | |
includeDetails | No | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations present, so description must cover behavioral traits. It only states it 'gets trust score' without mentioning read-only nature, rate limits, error handling, or data freshness. Minimal information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is extremely concise (one sentence) and front-loaded, but at the cost of missing essential information. Appropriate length for a simple tool, but incomplete.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and no annotations, the description provides almost no context. It fails to explain return values, parameter details, or error conditions, making it insufficient for correct use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds almost no meaning to the parameters. It names the subject types (wallet, agent card URL, domain) but does not relate them to subjectType's enum values, subjectValue lacks format hints, and includeDetails is unexplained despite having a default value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
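
For orientation, the following Python sketch shows how an MCP client could frame a call to agent_trust_score as a JSON-RPC 2.0 `tools/call` request. The envelope follows the MCP protocol; the "domain" literal and the example subject value are assumptions, because the page does not list subjectType's enum values or the expected subjectValue format.

```python
import json


def tools_call_request(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Hypothetical argument values: the real enum literals are not documented here.
message = tools_call_request(1, "agent_trust_score", {
    "subjectType": "domain",
    "subjectValue": "example.com",
    "includeDetails": True,
})
```

Because nothing on the page documents the response, a caller would still have to inspect the returned content to learn its shape.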

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'Get' and resource 'trust score' for specific subject types (wallet, agent card URL, domain). It implies a single-subject query, which sets it apart from siblings like agent_trust_batch, but it does not state the difference explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives such as agent_trust_batch or agent_trust_feedback. No prerequisites or context for usage provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_webhook_create (grade C)

Create a webhook endpoint that relays requests to your agent

Parameters
Name | Required | Description | Default
agentId | No | |
pushUrl | No | |
ttlSeconds | No | |
description | No | |
deliveryMode | Yes | |
transformRules | No | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, and the description only says 'create' without disclosing side effects, idempotency, authentication, rate limits, or what happens on duplicate webhook. The behavioral implications of creation are not explained.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence is concise but fails to provide necessary detail. The description is under-specified and does not earn its place given the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 6 parameters (1 required), no output schema, and no annotations, the description is severely incomplete. It omits parameter explanations, return values, and usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description does not mention or explain any of the 6 parameters (e.g., deliveryMode, ttlSeconds). The description adds zero value beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Create a webhook endpoint that relays requests to your agent,' which is a specific verb+resource. Naming creation as the action distinguishes it from the sibling webhook tools (list_requests, poll, replay).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., poll vs create). No prerequisites or context are provided about the necessity of creation before other operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_webhook_list_requests (grade C)

List requests received by a webhook endpoint

Parameters
Name | Required | Description | Default
limit | No | |
offset | No | |
endpointId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must fully disclose behavior. Only states 'list requests' without indicating idempotency, authentication needs, or return format. The agent lacks essential behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Very concise single sentence, front-loaded with main purpose. No wasted words. However, for a tool with multiple parameters, slightly more detail could be warranted.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Incomplete given 3 parameters and no output schema/annotations. Does not describe what 'requests' are, ordering, or response structure. Lacks context for effective use compared to similar tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description should explain parameters. It does not define endpointId, limit, or offset. While names are somewhat self-explanatory, the absence of explanation for pagination (limit/offset) reduces usefulness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
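
The limit/offset pair suggests conventional offset pagination. The Python sketch below walks pages until a short page appears. It is only an illustration: `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the assumption that the tool returns a plain list of requests is not confirmed by the page, which gives no output schema.

```python
from typing import Callable, Iterator, List


def iter_webhook_requests(call_tool: Callable[[str, dict], List[dict]],
                          endpoint_id: str, limit: int = 50) -> Iterator[dict]:
    """Yield every stored request by advancing offset one page at a time."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    offset = 0
    while True:
        page = call_tool("agent_webhook_list_requests", {
            "endpointId": endpoint_id,
            "limit": limit,
            "offset": offset,
        })
        yield from page
        if len(page) < limit:  # a short (or empty) page means nothing is left
            break
        offset += limit
```

The endpointId would presumably come from an earlier agent_webhook_create call, as the usage review for this tool points out.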

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states that the tool lists requests for a webhook endpoint. It uses the specific verb 'list' and the resource 'requests', and the target 'webhook endpoint' is clear from the name and description. This distinguishes it from sibling tools like agent_webhook_create and agent_webhook_replay.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. Does not mention that endpointId must come from agent_webhook_create, nor does it explain pagination (limit/offset) or any prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_webhook_poll (grade B)

Poll for new webhook requests (long polling, max 60s)

Parameters
Name | Required | Description | Default
after | No | |
limit | No | |
timeout | No | |
endpointId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so description carries full burden. It mentions long polling and a max timeout, but fails to disclose blocking behavior, return on timeout, or any side effects. Minimal behavioral context beyond the basic operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently conveys the core action and key behavioral trait (long polling, max 60s). No filler words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters, no output schema, and no annotations, the description lacks crucial context such as what 'after' filters, what the response contains, and how to interpret results. It only hints at the polling mechanism.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage for its 4 parameters, and the tool description adds no meaning to 'after', 'limit', 'timeout', or 'endpointId'. The only hint is 'max 60s', which loosely relates to timeout but states neither the default nor whether the parameter is expressed in seconds.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
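
A hedged Python sketch of how a caller might drive this tool follows. The 60-second ceiling comes from the tool description; everything else is assumed: that timeout is in seconds, that `after` is a cursor for the last request seen, that a timed-out poll returns an empty list, and the hypothetical `call_tool` callable.

```python
from typing import Callable, List, Optional

MAX_POLL_SECONDS = 60  # ceiling stated in the tool description


def build_poll_args(endpoint_id: str, after: Optional[str] = None,
                    timeout: int = 30, limit: int = 10) -> dict:
    """Build agent_webhook_poll arguments, clamping timeout to 1..60."""
    args = {
        "endpointId": endpoint_id,
        "timeout": max(1, min(int(timeout), MAX_POLL_SECONDS)),
        "limit": limit,
    }
    if after is not None:
        args["after"] = after  # assumed cursor semantics, not documented
    return args


def poll_until_data(call_tool: Callable[[str, dict], List[dict]],
                    endpoint_id: str, max_rounds: int = 3) -> List[dict]:
    """Re-issue the long poll until a non-empty batch arrives or rounds run out."""
    for _ in range(max_rounds):
        batch = call_tool("agent_webhook_poll", build_poll_args(endpoint_id))
        if batch:
            return batch
    return []
```

Clamping on the client keeps a misconfigured timeout from being rejected or silently truncated, whichever the server actually does.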

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Poll for new webhook requests') and resource, and includes key characteristics ('long polling, max 60s'). This distinguishes it from the sibling webhook tools (create, list, replay).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for waiting on new webhook requests, but does not explicitly guide when to use this versus alternatives like list_requests. No when-not-to-use or alternative names given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

agent_webhook_replay (grade C)

Replay a stored webhook request

Parameters
Name | Required | Description | Default
toUrl | No | |
requestId | Yes | |
endpointId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry full behavioral disclosure. 'Replay' implies resending a request, but it does not explain side effects (e.g., duplicate submissions), idempotency, or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (one sentence), which is a positive for conciseness but sacrifices comprehensiveness. It could be expanded slightly without losing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and no annotations, the description is insufficient. It does not explain return values or the replay process, leaving an agent with inadequate information to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds zero meaning to parameters. Terms like toUrl, requestId, and endpointId are unexplained; the description does not clarify their roles or defaults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Replay a stored webhook request' clearly states the action (replay) and resource (stored webhook request). However, it does not differentiate from sibling tools like agent_webhook_create or agent_webhook_list_requests, limiting its clarity for distinguishing use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., agent_webhook_create, agent_webhook_poll). No context on prerequisites, such as needing a stored request first.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bid_watch_get (grade B)

Get a bid notice detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description bears full responsibility. It discloses two return fields (firstSeenAt and ledgerVerified) but omits any behavioral details such as authentication, rate limits, or side effects. The read-only nature is implied but not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, consisting of two short sentences that convey the core purpose and key return fields. No unnecessary words or information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description remains incomplete. It does not explain what a 'bid notice' is, the structure of the timeline, or how to use the returned fields. More context is needed for autonomous use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for the sole parameter 'itemId', and the description provides no clarification on its meaning, format, or how to obtain it. This leaves the agent with no guidance on what constitutes valid input.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states that the tool retrieves 'bid notice detail plus full event timeline', clearly identifying the resource and scope. Specifying the comprehensive nature of the output distinguishes it from siblings like bid_watch_recent_changes and bid_watch_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or alternative guidance is provided. The description implies usage for obtaining full details and timeline, but fails to clarify when one should use bid_watch_timeline or bid_watch_verify_ledger instead.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bid_watch_recent_changes (grade A)

Recent appearance / deadline-move / close / cancel / award events across all bid notices since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
limit | No | |
since | Yes | |
entity | No | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description must cover behavioral traits. It discloses that output includes firstSeenAt and ledgerVerified and lists event types. However, it does not mention ordering, pagination behavior, error conditions, or authentication requirements, which are important for a polling tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core functionality, and contains no redundant information. Every phrase adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a change-list tool with no output schema, the description provides essential event types and output fields. However, it lacks details on pagination, error handling, and the optional 'entity' parameter. This is adequate but not fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains the 'since' parameter but does not describe 'limit' (pagination) or 'entity' (filtering scope). This leaves two of three parameters unclear, which is a significant gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
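
Since `since` is the one parameter the description does explain, a short Python sketch of producing a suitable ISO8601 value may help. Whether the server expects UTC, accepts a numeric offset, or requires a trailing "Z" is not stated on this page, so the UTC-with-offset form below is an assumption, as are the helper names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def since_timestamp(hours_back: float, now: Optional[datetime] = None) -> str:
    """Return an ISO8601 UTC timestamp `hours_back` hours before `now`."""
    current = now if now is not None else datetime.now(timezone.utc)
    return (current - timedelta(hours=hours_back)).isoformat(timespec="seconds")


def recent_changes_args(hours_back: float, limit: Optional[int] = None) -> dict:
    """Arguments for bid_watch_recent_changes; `limit` is left out when unset."""
    args = {"since": since_timestamp(hours_back)}
    if limit is not None:
        args["limit"] = limit
    return args
```

For a fixed `now` of 2024-05-02T12:00:00 UTC and `hours_back=24`, the helper returns `2024-05-01T12:00:00+00:00`.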

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the tool's purpose: listing recent change events (appearance, deadline-move, close, cancel, award) for bid notices since a given timestamp. Its focus on events and time-based filtering distinguishes it from sibling tools like bid_watch_get and bid_watch_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for polling recent changes with a timestamp, but does not explicitly state when to use this tool versus alternatives like bid_watch_timeline or bid_watch_get. No exclusions or alternatives are mentioned, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bid_watch_timeline (grade B)

Time-ordered events only for a bid notice (the differentiator: when it appeared, deadline moved, closed, was cancelled or awarded). Includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits. It states the tool returns time-ordered events and mentions fields (firstSeenAt, ledgerVerified), but does not clarify read-only nature, authentication needs, pagination, or error behavior. For a simple read tool, the lack of explicit read-only indication is a gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (two sentences) and front-loaded with the core purpose. However, it omits parameter information; including it would have justified a perfect score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should fully describe what the tool returns. It mentions event types and fields but does not specify output structure (list vs. object), pagination, or that itemId is required. Completeness is inadequate for a tool with no annotations or output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has a single required string parameter 'itemId' with 0% description coverage. The tool description does not explain what itemId refers to, leaving the agent to infer it identifies the bid notice. This adds no value beyond the schema itself.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns time-ordered events for a bid notice, listing specific event types (appearance, deadline change, closure, cancellation, award). This distinguishes it from sibling tools like bid_watch_get (which likely returns full notice details) and bid_watch_recent_changes (which returns recent changes, not full timeline).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for obtaining a bid notice's event timeline but does not explicitly contrast with alternatives like bid_watch_get, bid_watch_search, or bid_watch_recent_changes. No when-not-to-use guidance is provided; usage is implied but not explicitly guided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bid_watch_verify_ledger (grade B)

Verify the hash-chain integrity of a bid notice (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description must bear the full burden. It discloses return fields (chainValid, brokenAt, etc.) and mentions tamper detection, but omits behavioral details such as read-only nature, side effects, authentication requirements, or potential costs. This leaves an information gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
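
To show what "hash-chain integrity" and "tamper detection" typically involve, here is a generic Python sketch of a SHA-256 hash chain and a read-only verifier. It is not the server's implementation: the hashing scheme, the event layout, and the `checked` key (standing in for the "checked event count" in the description) are all assumptions. Only the chainValid and brokenAt names come from the tool description.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # assumed starting value for the first event's prevHash


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: List[dict], payload: dict) -> None:
    """Append an event that commits to the hash of the event before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prevHash": prev,
                  "hash": event_hash(prev, payload)})


def verify_chain(chain: List[dict]) -> dict:
    """Recompute every link without modifying the chain."""
    prev = GENESIS
    for index, event in enumerate(chain):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(chain)}
```

In this sketch, editing the payload of any stored event changes its recomputed hash, so verification stops at that event and reports its index in brokenAt.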

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief and front-loaded: the first sentence states the core purpose, the second lists return fields. Every sentence contributes value; no redundancy exists. However, the structure could be slightly improved by grouping behavioral and parameter info.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple single-parameter schema and no output schema, the description partially covers return fields but lacks context on when to invoke this tool (e.g., after fetching a bid notice). The absence of output schema is partially mitigated by listing return fields, but depth is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The sole parameter 'itemId' is not described beyond its name and type. With 0% schema description coverage, the description adds no semantic context (e.g., format, source, or relationship to bid notices). This forces the agent to infer or guess the parameter's meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Verify the hash-chain integrity') and the specific resource ('a bid notice'). It distinguishes itself from sibling verify_ledger tools by specifying 'bid notice', ensuring proper differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance is provided on when to use this tool versus alternatives (e.g., other bid_watch tools or verify_ledger tools for different entity types). The description only implies tamper detection without clarifying prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
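
The usage guidance missing across the bid_watch tools could be summarized as a small routing table. The Python sketch below encodes one plausible reading of the descriptions quoted on this page; the need labels are invented for illustration, and the role given to bid_watch_search is inferred from its name alone.

```python
# Mapping from an agent's need to the bid_watch tool that appears to fit it.
BID_WATCH_ROUTES = {
    "search": "bid_watch_search",              # find notices (inferred from the name)
    "detail": "bid_watch_get",                 # notice detail plus full event timeline
    "timeline": "bid_watch_timeline",          # time-ordered events only, one notice
    "recent": "bid_watch_recent_changes",      # events across all notices since a timestamp
    "verify": "bid_watch_verify_ledger",       # hash-chain integrity check, one notice
}


def pick_bid_watch_tool(need: str) -> str:
    """Return the tool name for a need label, or raise for unknown labels."""
    try:
        return BID_WATCH_ROUTES[need]
    except KeyError:
        raise ValueError("unknown need: %r" % need) from None
```

A description that stated these distinctions directly, for example "use bid_watch_timeline when only the events are needed", would spare an agent from inferring this table on its own.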

grant_watch_get (grade C)

Get a grant call detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | |
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description bears full responsibility. It only mentions returned fields, omitting behavioral traits like authentication requirements, rate limits, destructive potential, or response structure. Minimal behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences, efficient but lacking crucial details. It is front-loaded with purpose but too brief to stand alone. Better structuring could include a parameter explanation and more context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite low complexity (single parameter, no output schema), the description omits essential details: what itemId is, full response structure, and any usage conditions. It is incomplete for effective tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage and only one parameter (itemId), the description should clarify what itemId represents (e.g., a grant identifier, format). It adds no meaning beyond the schema's type definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a grant call detail plus full event timeline, specifying two returned fields. This distinguishes it from sibling tools like grant_watch_search (search) and grant_watch_recent_changes (recent changes), and the resource-specific naming (grant_watch_get vs. bid_watch_get) makes the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives (e.g., grant_watch_search for filtering, grant_watch_timeline for timeline-only). The description fails to give usage context or prerequisites.
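One way to picture the missing "use X instead of Y when Z" guidance is a small routing table on the client side. The sketch below is an assumption-laden illustration: the tool names come from this listing, but the `need` labels and the `pick_grant_tool` helper are made up here.

```python
# Tool names are taken from this listing; the keys are hypothetical labels.
GRANT_TOOLS = {
    "filter": "grant_watch_search",                # find calls by criteria
    "detail": "grant_watch_get",                   # detail plus full timeline
    "events": "grant_watch_timeline",              # timeline only
    "changes_since": "grant_watch_recent_changes", # events across all calls
    "integrity": "grant_watch_verify_ledger",      # hash-chain check
}

def pick_grant_tool(need: str) -> str:
    """Map a coarse need to the grant_watch tool that appears to fit it."""
    try:
        return GRANT_TOOLS[need]
    except KeyError:
        raise ValueError(f"unknown need: {need!r}") from None

chosen = pick_grant_tool("events")
```

A description that stated these distinctions directly would spare each agent from reconstructing such a table.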

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

grant_watch_recent_changes (B)

Recent appearance / deadline-move / close / close-early events across all grant calls since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
field | No | - | -
limit | No | - | -
since | Yes | - | -
funder | No | - | -
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses event types and included fields (firstSeenAt, ledgerVerified) but omits details on pagination, ordering, rate limits, or whether it's a read-only operation. The limit parameter suggests pagination, but this is not mentioned.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences are concise and front-load key information (event types and time filter). However, the structure could be improved by listing parameters or output fields more systematically.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite 4 parameters and no output schema, the description omits essential context: what 'field' and 'funder' control, how limit affects results, and the full return structure. This makes the tool harder to invoke correctly without additional knowledge.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description explains only the 'since' parameter (an ISO8601 timestamp). It fails to clarify the purpose of 'field', 'limit' (although a default is shown), and 'funder', leaving the agent with an incomplete understanding of how to use these parameters.
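Since 'since' is the only parameter the description explains, a minimal sketch of building it may help. The helper name `recent_changes_args` is hypothetical, and whether the server prefers a `+00:00` offset or a `Z` suffix is not stated in the listing. The undocumented 'field', 'limit', and 'funder' parameters are deliberately left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def recent_changes_args(hours_back: int, now: Optional[datetime] = None) -> dict:
    """Build arguments containing an ISO8601 `since` timestamp in UTC."""
    now = now or datetime.now(timezone.utc)
    since = (now - timedelta(hours=hours_back)).replace(microsecond=0)
    return {"since": since.isoformat()}

# A fixed reference time keeps the example deterministic.
fixed_now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
args = recent_changes_args(24, fixed_now)
```

With the fixed reference time above, `args["since"]` is `2024-01-01T12:00:00+00:00`.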

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'Recent appearance / deadline-move / close / close-early events across all grant calls' since a timestamp, and distinguishes it from siblings like grant_watch_get (single call) and grant_watch_search (search criteria) by the scope 'across all grant calls'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like grant_watch_get, grant_watch_search, or grant_watch_timeline. The description only implies use for recent changes across all calls without stating exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

grant_watch_timeline (A)

Time-ordered events only for a grant call (the differentiator: when it opened, deadline moved, closed, or closed early). Includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It mentions included fields (firstSeenAt, ledgerVerified) and time-ordering, but lacks details on sorting order, pagination, error conditions, or permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with core purpose, no extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers the main purpose and hints at output fields, but lacks details on the full output structure or event types, which is noticeable given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and the description does not explain the sole parameter 'itemId' (e.g., what it represents or how to specify it), failing to add meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'time-ordered events only for a grant call' and lists specific event types (opened, deadline moved, closed, closed early), which differentiates it from siblings like grant_watch_get or grant_watch_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage by highlighting the differentiator and included fields, but does not explicitly state when not to use it or name alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

grant_watch_verify_ledger (B)

Verify the hash-chain integrity of a grant call (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
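For readers unfamiliar with hash-chain verification, the following is a generic sketch of the technique, not the server's actual scheme. The SHA-256 choice, the all-zero genesis value, the `payload` / `prevHash` / `hash` / `checked` field names, and the idea that `brokenAt` is an event index are all assumptions; only `chainValid` and `brokenAt` appear in the tool description.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prevHash for the first event

def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def build_chain(payloads: list) -> list:
    """Link payloads into a chain where each event commits to its predecessor."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = event_hash(prev, payload)
        chain.append({"payload": payload, "prevHash": prev, "hash": digest})
        prev = digest
    return chain

def verify_chain(events: list) -> dict:
    """Recompute every hash and report the first event that fails to match."""
    prev = GENESIS
    for index, event in enumerate(events):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(events)}

chain = build_chain([{"type": "appearance"}, {"type": "deadline-move"}, {"type": "close"}])
intact = verify_chain(chain)
chain[1]["payload"]["type"] = "close-early"  # simulate tampering with event 1
tampered = verify_chain(chain)
```

In this sketch, editing any stored event changes its recomputed hash, so verification stops at the first altered event and reports its position.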
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses return fields (chainValid, brokenAt, etc.) but does not mention side effects, authentication needs, or whether it is read-only. With no annotations, more behavioral detail is needed, but the description provides basic output info.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences that convey the purpose and key return values. It is front-loaded and contains no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers purpose and return fields but lacks parameter explanation and usage context. More detail about when to use and what the parameter means would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not explain the itemId parameter. Schema coverage is 0%, and the description adds no meaning to the single required parameter (e.g., what a grant call identifier is or how to obtain it).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: verifying hash-chain integrity for a grant call for tamper detection. It lists return fields, distinguishing it from siblings like grant_watch_get or other verify_ledger tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., other grant_watch tools or other verify_ledger tools). The description only explains what it does, not the context of use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

license_watch_get (B)

Get a license registration detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description mentions return fields (firstSeenAt, ledgerVerified) but fails to disclose behavioral traits like idempotency, side effects, authorization requirements, or data freshness for a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two short sentences, front-loading the purpose and key return fields without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, yet the description only mentions two return fields and a timeline. It lacks details on structure, pagination, errors, and parameter semantics, leaving agents underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description does not explain the 'itemId' parameter, providing no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get') and the resource ('license registration detail plus full event timeline'), distinguishing it from sibling tools like license_watch_timeline (which only returns timeline) and license_watch_recent_changes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when needing both detail and timeline, but lacks explicit guidance on when to use this tool versus alternatives such as license_watch_timeline or license_watch_search.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

license_watch_recent_changes (B)

Recent appearance / revoked / suspended events across all license ledgers since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
limit | No | - | -
since | Yes | - | -
registry | No | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It notes changes are across all ledgers and items include 'firstSeenAt' and 'ledgerVerified', but omits critical traits like read-only behavior, pagination, rate limits, or what happens with invalid timestamps.
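Because the description does not say how invalid timestamps are handled, a client may want to validate `since` before calling. The sketch below is one hedged approach using only the standard library. `is_iso8601` is a hypothetical helper, and `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions, so the server may accept forms this check rejects.

```python
from datetime import datetime

def is_iso8601(value: str) -> bool:
    """Return True if `value` parses as an ISO 8601 date or datetime."""
    # Older Python versions do not accept a trailing "Z", so normalize it.
    candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
    try:
        datetime.fromisoformat(candidate)
    except ValueError:
        return False
    return True

ok = is_iso8601("2024-01-01T00:00:00Z")
bad_text = is_iso8601("yesterday")
bad_month = is_iso8601("2024-13-01T00:00:00Z")
```

Rejecting malformed input locally avoids depending on undocumented server-side error behavior.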

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no redundancy: first states purpose and key input, second adds return detail. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, no annotations, and 3 parameters, the description lacks necessary information on usage boundaries, error behavior, and full parameter semantics, making it incomplete for reliable tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description must explain all parameters. It explains 'since' as 'ISO8601 timestamp' but does not clarify 'limit' (presumably page size) or 'registry' (possibly a filter). The description adds minimal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists 'recent appearance / revoked / suspended events across all license ledgers' since a timestamp, distinguishing it from sibling tools like license_watch_get (single entity) and license_watch_search (query-based). The listed event types are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implicitly suggests usage when needing recent changes since a given timestamp, but lacks explicit guidance on when not to use it (e.g., for specific ledgers or historical queries) and does not mention alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

license_watch_timeline (B)

Time-ordered events only for a license registration (the differentiator: when it appeared, when it was revoked / expired / suspended). Includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It mentions included fields but does not describe read-only nature, error handling, performance, or authentication needs. The agent cannot assess side effects or return completeness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no extraneous words. Front-loaded with purpose and differentiator.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a single parameter and no output schema, the description is minimally adequate. It explains purpose and included fields but lacks details on output structure or potential errors.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description should explain the itemId parameter. It indirectly implies itemId is a license registration ID but adds no specifics about format or validation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'time-ordered events only for a license registration' and specifies the differentiator: events like when it appeared, revoked, expired, or suspended. It distinguishes from sibling tools like license_watch_get (single record) and license_watch_search (filtered list).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies this tool is for viewing timeline events, but does not explicitly state when to use it over alternatives like license_watch_recent_changes or license_watch_search. No exclusions or prerequisites are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

license_watch_verify_ledger (B)

Verify the hash-chain integrity of a license registration (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must cover behavioral traits. It lists return fields but does not disclose whether the operation is read-only, requires authentication, or has any side effects. The description lacks important transparency for a verification tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: one sentence of purpose followed by a list of return fields. It is well-structured and avoids unnecessary text, though it could be slightly more detailed without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has one parameter and no output schema, the description covers the basic purpose and return fields. However, it lacks context on error handling, prerequisites, or when to use the tool among many similar siblings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and the description does not explain the single parameter 'itemId'. It only hints at 'license registration', but does not specify what the identifier is or how it should be formatted.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: verifying hash-chain integrity of a license registration for tamper detection. It distinguishes from sibling tools like other verify_ledger tools (different domains) and other license_watch tools (different actions).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. While the description implies use for tamper detection, it does not specify exclusions or provide context for choosing among similar verify_ledger tools or other license_watch tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ordinance_watch_get (C)

Get an ordinance detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must convey all behavioral traits. It only mentions that the tool returns firstSeenAt and ledgerVerified, but does not disclose whether it is read-only, requires authentication, has rate limits, or any side effects. The behavioral profile is under-specified.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences, which is concise but omits critical details. It front-loads the purpose but sacrifices completeness; here, brevity leads to gaps.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one parameter, no output schema), the description is incomplete. It does not specify the return structure beyond two fields, possible error conditions, or expected input format. More context is needed for reliable use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It provides no explanation of the itemId parameter, its format, or its constraints. The description does not mention the parameter at all and adds no semantic value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets an ordinance detail plus full event timeline and returns specific fields (firstSeenAt, ledgerVerified). It distinguishes from sibling tools in other domains (e.g., bid_watch_get) but does not explicitly differentiate from the sibling ordinance_watch_timeline tool, which may cause confusion about which to use for timeline-related queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like ordinance_watch_timeline, ordinance_watch_recent_changes, or others. The description lacks any context about prerequisites, appropriate scenarios, or exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ordinance_watch_recent_changes (B)

Recent appearance / amendment / repeal events across all ordinances since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
limit | No | - | -
since | Yes | - | -
issuerCode | No | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should fully disclose behavioral traits. It mentions output fields but omits pagination, ordering, error handling, and whether the operation is read-only (though this is implied). It also does not clarify that the optional issuerCode parameter narrows the 'across all ordinances' scope.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded single sentence with clear action and scope, plus a second sentence adding output detail. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and 0% schema description coverage, the description lacks details on return format, pagination, ordering, and edge cases. It also fails to reconcile the 'all ordinances' phrasing with the optional issuerCode filter.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must explain all parameters. It explains 'since' (ISO8601 timestamp) but not 'limit' (default 100) or 'issuerCode' (filtering scope). Only partial compensation for the coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description names specific event types (appearance / amendment / repeal) and the resource (ordinances), clearly distinguishing the tool from siblings like ordinance_watch_get or ordinance_watch_timeline, and from the recent_changes tools for other domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives (e.g., ordinance_watch_timeline for full history, ordinance_watch_get for a single entity). The description implies use for recent changes but does not state exclusions or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ordinance_watch_timeline (B)

Time-ordered events only for an ordinance (the differentiator: when it appeared / was amended / was repealed). Includes firstSeenAt and ledgerVerified.

Parameters
Name | Required | Description | Default
itemId | Yes | - | -
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It mentions data fields (firstSeenAt and ledgerVerified) and implies read-only, but does not clarify idempotency, error handling, rate limits, or safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, front-loaded with the core purpose and differentiator. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers the purpose and key fields. However, it lacks details on pagination, ordering, or event types, leaving some gaps for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter (itemId) with no description, and the description does not explain what itemId represents or how to obtain it. With 0% schema coverage, the description should compensate but fails to add meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides time-ordered events for an ordinance, specifically when it appeared, was amended, or repealed. This distinguishes it from sibling tools like ordinance_watch_get (get details) and ordinance_watch_recent_changes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for timeline events but does not explicitly state when to use this versus other tools like ordinance_watch_search. The phrase 'the differentiator' gives context but lacks explicit when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ordinance_watch_verify_ledger (grade A)

Verify the hash-chain integrity of an ordinance record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
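A hash-chain check of this kind can be sketched as follows. This is a minimal illustration, not the server's implementation: the event layout (`payload`, `prevHash`, `hash`), the use of SHA-256 over canonical JSON, the genesis value, and the `build_chain` helper are all assumptions. Only the result names `chainValid` and `brokenAt` come from the description, and `checked` stands in for the "checked event count".

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for the first event's prevHash


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash an event together with its predecessor's hash (assumed scheme)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def verify_chain(events: list) -> dict:
    """Walk the chain in order and report the first broken link, if any."""
    prev = GENESIS
    for index, event in enumerate(events):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(events)}


def build_chain(payloads: list) -> list:
    """Hypothetical helper for the example: build a valid chain from payloads."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = event_hash(prev, payload)
        chain.append({"payload": payload, "prevHash": prev, "hash": digest})
        prev = digest
    return chain


chain = build_chain([{"type": "appeared"}, {"type": "amended"}, {"type": "repealed"}])
print(verify_chain(chain))

chain[1]["payload"]["type"] = "tampered"  # simulate an after-the-fact edit
print(verify_chain(chain))
```

Because each hash covers the previous one, editing any stored event invalidates its own link, and the check stops at the first mismatch.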

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It describes return values but does not state whether the operation is read-only/idempotent, any rate limits, or side effects. Listing the return values adds some context, but core behavioral traits are missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences that concisely state the purpose and list return fields. No wasted words, front-loaded with main verb.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 1 parameter and no output schema, the description covers purpose and output fields. However, it could mention the tool's safety profile (likely read-only) for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description should compensate. The description mentions 'ordinance record' and 'hash-chain integrity', which gives context for the itemId parameter, but does not add format or constraints. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool verifies hash-chain integrity for tamper detection, and lists specific return fields. It distinguishes itself from siblings (other verify_ledger tools) by specifying 'ordinance record'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives (e.g., other verify_ledger tools for different domains). Usage is implied by the tool's domain (ordinance) but no exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pharma_watch_get (grade C)

Get a pharmaceutical record detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries the full burden. It implies a read operation but does not explicitly state read-only behavior, permissions, rate limits, or side effects. The description is minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose and return fields. No redundancy, appropriate length for a simple get tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description mentions return fields but does not describe the full structure or timeline format. With no output schema and no annotations, the description is insufficient for complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain the itemId parameter at all. No information about expected format or value range is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and identifies the resource as 'pharmaceutical record detail plus full event timeline'. It also specifies return fields. However, it does not explicitly distinguish from other watch_get tools, though the name provides enough context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like pharma_watch_search or pharma_watch_timeline. The description only states what it does without any contextual recommendations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pharma_watch_recent_changes (grade B)

Recent approval / NHI-listed / price-revised events across all pharmaceutical records since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
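For orientation, a call to this tool over MCP is a JSON-RPC `tools/call` request whose `arguments` carry the tool's parameters (`since`, plus the optional `limit` and `category`). The sketch below only builds the request body. The helper name `build_recent_changes_request` and the sample `limit` value are hypothetical, valid `category` values are not documented, and transport and session handling are left out.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def build_recent_changes_request(
    since: datetime,
    limit: Optional[int] = None,
    category: Optional[str] = None,
    request_id: int = 1,
) -> dict:
    """Build a JSON-RPC 2.0 tools/call body for pharma_watch_recent_changes."""
    if since.tzinfo is None:
        # A naive datetime would yield an ISO8601 string with no UTC offset.
        raise ValueError("since must be timezone-aware")
    arguments: dict = {"since": since.isoformat()}
    if limit is not None:
        arguments["limit"] = limit
    if category is not None:
        arguments["category"] = category  # valid values are not documented
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "pharma_watch_recent_changes", "arguments": arguments},
    }


since = datetime(2025, 1, 1, tzinfo=timezone.utc)
request = build_recent_changes_request(since, limit=20)
print(json.dumps(request, indent=2))
```

Optional parameters are omitted rather than sent as null, so the server's own defaults apply when the caller has no preference.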

Parameters
Name      Required  Description  Default
limit     No
since     Yes
category  No

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must carry the full burden. It only mentions that items include firstSeenAt and ledgerVerified, but omits whether the tool is read-only, if pagination is needed, or any rate limits. Missing critical behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero filler. Front-loaded with purpose and output details. Efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and no annotations, the description should cover all necessary context. It fails to explain the output format beyond two fields, how to paginate (limit), or what category values are valid. Incomplete for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must explain all parameters. It only addresses 'since' implicitly via 'ISO8601 timestamp', but completely omits 'limit' (default 100) and 'category' (string). No additional meaning provided for any parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists recent approval/NHI-listed/price-revised events since a given ISO8601 timestamp, distinguishing it from sibling tools like pharma_watch_get (retrieve specific) or pharma_watch_search (search by criteria).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for recent events but provides no explicit when-not or alternative guidance. The sibling context exists but the description itself lacks direction for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pharma_watch_timeline (grade C)

Time-ordered events only for a pharma record (the differentiator: when it was approved / NHI-listed / price-revised). Includes firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must carry full behavioral disclosure. It mentions included fields (firstSeenAt, ledgerVerified) but does not state whether the operation is read-only, has side effects, requires authentication, or has rate limits. Lacks critical behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey purpose and key fields. Front-loaded with the main function. No wasted words, though the parameter info is missing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, so the agent lacks return structure info beyond mentioned fields. One parameter undocumented. The description is insufficient to fully use the tool correctly without prior knowledge. Sibling tools (e.g., pharma_watch_get) with more context suggest this is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has one required parameter (itemId) with 0% schema description coverage. The description does not explain what itemId represents or how to obtain it. No additional semantics provided beyond the parameter name.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it returns time-ordered events for a pharma record, specifying the differentiator (approval, NHI-listed, price-revised). This distinguishes it from other pharma_watch siblings like get, search, or verify_ledger. The description opens with a noun phrase ('Time-ordered events') rather than a verb, which is slightly vague, but the purpose is clear overall.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage: use when you need the timeline of a specific pharma record. However, no explicit guidance on when not to use it or alternatives among siblings like pharma_watch_get or pharma_watch_recent_changes. The 'differentiator' hint helps but is not fully explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pharma_watch_verify_ledger (grade A)

Verify the hash-chain integrity of a pharma record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the tool is for tamper detection and lists return fields (chainValid, brokenAt, etc.), but does not mention side effects, authentication needs, or whether the operation is read-only. The description is minimally adequate for a verification operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is one sentence stating a clear action, followed by a list of return fields. It is front-loaded with the verb and resource, and every word adds value. No extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description adequately covers the purpose and return values. However, it lacks usage guidelines and error conditions, which would improve completeness for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter (itemId) with no description coverage. The tool description does not elaborate on the parameter's format, constraints, or example values. However, the name 'itemId' is inherently clear, and the single parameter structure reduces ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'verify' and the specific resource 'hash-chain integrity of a pharma record' with tamper detection. It distinguishes from sibling tools (e.g., bid_watch_verify_ledger, grant_watch_verify_ledger) by specifying the pharma domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not explicitly state when to use this tool vs alternatives, nor does it provide exclusion criteria. While the name implies verification, there is no guidance on when verification is appropriate compared to other pharma watch tools like get, search, or timeline.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubcom_watch_get (grade C)

Get a public-comment notice detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must cover behavior. It states it returns data but does not clarify if the operation is read-only, if authentication is needed, or any side effects. The mention of specific return fields is helpful but incomplete.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short with two sentences. It is concise, but it sacrifices completeness. It front-loads the main purpose but lacks detail on parameters and behavioral context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple lookup tool with one parameter and no output schema, the description provides the essential purpose but omits details about the event timeline and the meaning of returned fields. It is adequate but not thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter (itemId) with 0% description coverage. The description does not explain what itemId is, its format, or how to obtain it. Since schema coverage is low, the description should compensate, but it fails to do so.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get', the resource 'public-comment notice detail plus full event timeline', and mentions specific fields returned. This distinguishes it from sibling tools like pubcom_watch_search or pubcom_watch_timeline.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. It does not mention scenarios like searching, listing recent changes, or verifying ledger, nor does it give conditions for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubcom_watch_recent_changes (grade B)

Recent appearance / deadline-move / close / result-published events across all notices since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
limit   No
since   Yes
agency  No

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It mentions that items include firstSeenAt and ledgerVerified, but does not disclose key behaviors such as pagination, ordering, rate limits, potential for empty results, or whether the events list is exhaustive. The read-only nature is implied but not explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, front-loading the core functionality and then adding detail about returned fields. It is concise and avoids redundancy, though it could be slightly more structured with parameter hints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and low parameter coverage, the description is insufficient. It does not explain the output format beyond two fields, lacks information on sorting, event types' representation, or whether results are limited by default. Additional context about typical use cases or limitations would be valuable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains the required 'since' parameter as an ISO8601 timestamp, but does not explain the optional 'limit' (default 100) or 'agency' parameters. Two out of three parameters lack any contextual explanation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns recent events (appearance, deadline-move, close, result-published) across all notices since a timestamp. Specifying the resource (pubcom notices) and the operation (recent changes) distinguishes it both from the recent_changes tools in other domains and from other pubcom tools like pubcom_watch_get or pubcom_watch_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies it is for polling recent changes after a given timestamp, but does not explicitly state when to use this tool versus alternatives like pubcom_watch_get, pubcom_watch_search, or pubcom_watch_timeline. No 'when not to use' guidance is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubcom_watch_timeline (grade C)

Time-ordered events only for a notice (the differentiator: when it opened, deadline moved, closed, or result was published). Includes firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must disclose behavioral traits. It states included fields but omits whether the tool is read-only, authentication needs, rate limits, or side effects. For a watch tool, read-only is implied but not confirmed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with front-loaded key information. No redundancy. Could be slightly more structured but remains efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple tool with one parameter and no output schema. Describes purpose and key fields but lacks output format details, ordering, or differentiation from sibling timeline tools. Additional context would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description adds minimal meaning above the schema. It mentions 'for a notice' but does not explicitly state that itemId is the notice identifier. The parameter definition is not enhanced.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides time-ordered events for a notice, listing specific event types (opened, deadline moved, closed, result published) and fields (firstSeenAt, ledgerVerified). The tool name and context differentiate it from sibling tools like pubcom_watch_get or pubcom_watch_recent_changes, though it could more explicitly state it returns a list of timeline events.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The description implies it is for a notice's timeline but does not distinguish from pubcom_watch_recent_changes or pubcom_watch_get. Lacks when-to-use, when-not-to-use, or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pubcom_watch_verify_ledger (grade B)

Verify the hash-chain integrity of a notice (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full burden. Lists return fields but does not disclose side effects, performance, or error behavior (e.g., what happens if integrity fails). Barely adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One sentence plus a list of return fields. Front-loaded and free of fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 1-parameter verification tool with no output schema, the description lists return fields but lacks explanation of how to interpret them (e.g., what 'brokenAt' means). Adequate but not thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description does not describe the single parameter 'itemId' at all. Agent receives no guidance on its meaning or expected format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Verify' and resource 'hash-chain integrity of a notice', with explicit return fields. While sibling differentiation is implicit via naming, the purpose is unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like pubcom_watch_get or other verify_ledger tools. No prerequisites or exclusions provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recall_watch_get (grade C)

Get a recall detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
itemId  Yes

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It only mentions return fields (firstSeenAt, ledgerVerified) but does not disclose any behavioral traits like read-only, authentication, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at two sentences, front-loading the main purpose and adding key return fields with no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple get tool, the description is incomplete: it lacks parameter semantics and usage context, and it does not cover potential error scenarios or prerequisites, with no output schema to compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage for the parameter 'itemId', and the description adds no meaning about its format, source, or constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'Get' and the resource 'recall detail plus full event timeline', distinguishing it from sibling tools like recall_watch_search or recall_watch_timeline by adding 'full event timeline' and specific return fields.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs. alternatives such as recall_watch_search or recall_watch_recent_changes. The description lacks any when-to-use or when-not-to-use information.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recall_watch_recent_changes (grade C)

Recent appearance / severity-escalated events across all recalls since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters
Name    Required  Description  Default
limit   No
since   Yes
agency  No

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description reveals that the tool returns events with firstSeenAt and ledgerVerified, but does not disclose read-only nature, rate limits, or any side effects. Basic behavioral context is missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, concise and front-loaded with key information. However, it could be improved by structuring parameter details separately. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema, so the description must provide sufficient context. It mentions two fields (firstSeenAt, ledgerVerified) but omits others. Parameter descriptions are missing. The description is too incomplete for an agent to use the tool correctly without additional knowledge.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must explain parameters. It mentions 'since ISO8601 timestamp' but does not explicitly link to the 'since' parameter. 'limit' and 'agency' are not described at all, leaving the agent with no semantic understanding of how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
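
As an illustration of the parameter handling discussed above, here is a minimal sketch of how a client might assemble arguments for recall_watch_recent_changes. Only `since` being a required ISO8601 timestamp comes from the listing; treating `limit` as an integer cap and `agency` as a string filter is an assumption, and `build_recent_changes_args` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def build_recent_changes_args(hours_back, limit=None, agency=None, now=None):
    """Build an arguments dict for recall_watch_recent_changes.

    Assumptions (not confirmed by the tool description): 'limit' is an
    integer cap on returned items and 'agency' is a string filter.
    """
    now = now or datetime.now(timezone.utc)
    # ISO8601 timestamp with an explicit UTC offset, truncated to seconds.
    since = (now - timedelta(hours=hours_back)).isoformat(timespec="seconds")
    args = {"since": since}  # the only required parameter
    if limit is not None:
        args["limit"] = int(limit)
    if agency is not None:
        args["agency"] = agency
    return args


# A fixed clock keeps the example reproducible.
fixed_now = datetime(2025, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
print(build_recent_changes_args(24, limit=20, now=fixed_now))
# {'since': '2025-01-01T12:00:00+00:00', 'limit': 20}
```

Optional parameters are left out entirely when unset, so the server's own defaults (which the listing does not document) still apply.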

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists recent appearance/severity-escalated events across all recalls since an ISO8601 timestamp, differentiating it from other recall watch tools by focusing on 'recent changes' and severity escalation. It mentions output fields (firstSeenAt, ledgerVerified), but does not fully distinguish from recall_watch_get or recall_watch_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like recall_watch_search or recall_watch_timeline. The description implies usage for recent changes, but fails to exclude other scenarios or contrast with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recall_watch_timeline (grade C)

Time-ordered events only for a recall (the differentiator: when it appeared, when severity escalated, when it was completed). Includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions output fields (firstSeenAt, ledgerVerified) but does not disclose behavioral traits such as whether it is read-only, destructive, requires authentication, or has rate limits. This is insufficient for a tool with no annotation support.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, consisting of two sentences. It front-loads the main purpose and lists key fields. It could be slightly more structured, but overall it is efficient and to the point.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema, lack of annotations, and no output schema, the description is incomplete. It does not explain the output format, whether results are paginated, or what ledgerVerified means. More context is needed for the agent to effectively use this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, itemId, is not explained in the description. Schema coverage is 0%, so the description should clarify what itemId represents (e.g., the recall ID). Without this, the agent may not know what value to provide.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states that the tool returns time-ordered events for a recall, specifying the types of events (appearance, severity escalation, completion). It distinguishes itself from sibling timeline tools by specifying 'for a recall' and from other recall_watch tools by focusing on the timeline. However, it could be more precise about the exact output.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides minimal guidance on when to use this tool. It mentions 'the differentiator' but does not explicitly state when to use it over alternatives like recall_watch_search or recall_watch_get. No exclusions or when-not-to-use are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recall_watch_verify_ledger (grade A)

Verify the hash-chain integrity of a recall record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It lists return fields (chainValid, brokenAt, etc.) which gives insight into behavior, but it does not disclose whether the tool is read-only, requires authentication, or has side effects. Adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence followed by a clear list of return fields. It is front-loaded with the purpose and wastes no words. Ideal for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (1 parameter, no output schema), the description covers the core function and return values. However, it lacks context on error conditions, prerequisites, or the meaning of the integrity check, which could impact an agent's ability to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
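
Since the meaning of the integrity check is left unexplained, a generic illustration may help. This is a minimal sketch of hash-chain verification in general, not the server's actual scheme: the SHA-256 construction, the `GENESIS` placeholder, the `prevHash`/`hash`/`checked` field names and the sample event payloads are all assumptions. Only `chainValid` and `brokenAt` are names taken from the tool description.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Append an event whose hash commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prevHash": prev, "hash": event_hash(prev, payload)}
    )


def verify_chain(chain: list) -> dict:
    """Recompute every link; report the first index that fails, if any."""
    prev = GENESIS
    for index, event in enumerate(chain):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(chain)}


chain = []
append_event(chain, {"type": "appeared", "at": "2025-01-01T00:00:00Z"})
append_event(chain, {"type": "severity_escalated", "at": "2025-01-05T00:00:00Z"})
print(verify_chain(chain))  # {'chainValid': True, 'brokenAt': None, 'checked': 2}

chain[0]["payload"]["type"] = "edited"  # simulate tampering with the first event
print(verify_chain(chain))  # {'chainValid': False, 'brokenAt': 0, 'checked': 1}
```

Because each hash covers the previous one, editing any stored event invalidates that link, which is what makes the record tamper-evident rather than tamper-proof.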

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, and the description does not explain the parameter 'itemId' beyond being a required string. The context of the tool name implies it is a recall record ID, but no format, source, or meaning is provided. The description fails to add value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'verify' and the resource 'hash-chain integrity of a recall record', with a parenthetical '(tamper detection)' that further clarifies. This distinguishes it from sibling *verify_ledger tools for other record types and other recall_watch tools that perform different operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for tamper detection but does not explicitly state when to use this tool versus alternatives like recall_watch_get or recall_watch_timeline. No direct guidance on prerequisites or exclusion criteria is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sanction_watch_get (grade C)

Get a sanction detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. It mentions returns but does not disclose prerequisites (e.g., authentication), error handling, rate limits, or whether the operation is read-only. The description is too minimal for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences, which is concise but lacks necessary details. It could be considered under-specified rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool (one parameter, no output schema), the description is incomplete. It mentions two return fields but not the full structure of the response. More context about the timeline format or error responses would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the parameter 'itemId' is undocumented in the schema. The description does not explain what 'itemId' is or how to obtain it, adding no semantic value beyond the field name.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get') and the resource ('sanction detail plus full event timeline'), and mentions specific return fields ('firstSeenAt and ledgerVerified'). This is specific and distinguishes it from sibling tools for different watch types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like sanction_watch_search or other watch_get tools. There are many sibling tools with similar patterns, but no when-to-use or when-not-to-use information.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sanction_watch_recent_changes (grade A)

Recent appearance / lift events across all sanctions since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
limit | No | - | -
since | Yes | - | -
regulator | No | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool returns 'appearance / lift events' and includes specific fields (firstSeenAt, ledgerVerified), indicating a read-only operation. However, it lacks details on auth requirements, rate limits, pagination, or side effects. The behavioral disclosure is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the key action and constraints. Every sentence provides value with no redundancy or filler. It is appropriately concise for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low schema coverage (0%) and no output schema, the description should be more complete. It covers the 'since' parameter and output fields but omits details on 'limit' behavior, 'regulator' filtering, pagination, and result structure. For a tool with 3 parameters and no annotations, this is insufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaning for 'since' (ISO8601 timestamp) and output fields, but does not explain 'limit' (default 100) or 'regulator' (purpose unclear). Two out of three parameters are underspecified, leaving the agent with insufficient semantic guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
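
For context on where these parameters end up, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends for a `tools/call` request. The envelope shape follows the MCP specification; the argument values, the `Z`-suffixed timestamp form and the `tools_call_request` helper name are assumptions for illustration.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tools_call_request(tool_name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in an MCP tools/call envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


request = tools_call_request(
    "sanction_watch_recent_changes",
    # 'since' is required; the 'limit' value is an arbitrary example.
    {"since": "2025-01-01T00:00:00Z", "limit": 50},
)
print(json.dumps(request, indent=2))
```

Everything under `arguments` is validated only against the input schema, which is why undocumented parameters such as `regulator` leave an agent guessing at acceptable values.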

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves 'Recent appearance / lift events across all sanctions since the given ISO8601 timestamp.' It specifies the output includes 'firstSeenAt and ledgerVerified.' This distinguishes it from sibling tools like sanction_watch_get (specific sanction), sanction_watch_search (search), and sanction_watch_timeline (timeline).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for recent changes but does not explicitly state when to use this tool versus alternatives. No when-not-to-use guidance or comparison with sibling tools is provided. While the name and context suggest it, explicit guidelines are missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sanction_watch_timeline (grade C)

Time-ordered events only for a sanction (the differentiator: when it appeared and when it was lifted). Includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided. The description mentions output fields but does not disclose whether the tool is read-only, requires authentication, or has any side effects. For a tool with zero annotation coverage, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loads the core purpose, and avoids any unnecessary words. It is appropriately sized for the tool's simplicity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description partially covers return values (time-ordered events with firstSeenAt and ledgerVerified). However, it lacks detail on the event structure and does not explain the input parameter. It is adequate but not thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter, itemId, is not explained in the description or schema (0% coverage). The description implies it identifies a sanction by the tool's context but does not clarify what itemId represents or how to obtain it.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns 'time-ordered events only for a sanction' and highlights the differentiator (when it appeared and when it was lifted) and the included fields. This distinguishes it from other timeline tools like bid_watch_timeline, though it doesn't explicitly contrast with them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for sanction timeline events but provides no guidance on when to use this tool over alternatives such as sanction_watch_get or sanction_watch_search. No when-not-to-use guidance or selection criteria are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
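
To show what the missing guidance could look like, here is a minimal sketch of a revised tool definition for sanction_watch_timeline. The `name`/`description`/`inputSchema`/`annotations` layout and the `readOnlyHint` annotation come from the MCP specification; the description wording, the claim that an itemId comes from search results, and the read-only status are assumptions rather than confirmed server behavior.

```python
import json

sanction_watch_timeline = {
    "name": "sanction_watch_timeline",
    "description": (
        "Return only the time-ordered events for one sanction (when it appeared, "
        "when it was lifted). Use sanction_watch_get instead when the sanction "
        "detail is also needed, and sanction_watch_search to find an itemId first."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "itemId": {
                "type": "string",
                # Assumed provenance: the listing does not say where IDs come from.
                "description": "Sanction identifier, assumed to come from search results.",
            }
        },
        "required": ["itemId"],
    },
    # Annotations are hints to clients; read-only status here is an assumption.
    "annotations": {"readOnlyHint": True},
}

print(json.dumps(sanction_watch_timeline, indent=2))
```

A sentence of the form "use X instead of Y when Z" addresses the Usage Guidelines score, while the parameter description and annotation target the Parameters and Behavior scores respectively.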

sanction_watch_verify_ledger (grade A)

Verify the hash-chain integrity of a sanction record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose all behavioral traits. It lists return fields (chainValid, brokenAt, etc.) but does not explicitly state it is read-only, mention auth requirements, or describe side effects. The term 'verify' implies a read operation, but more transparency on safety profile would improve the score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence followed by a list of return fields. It is concise, front-loaded with the core purpose, and every word adds value. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple verification tool with one parameter, the description covers the purpose and return fields adequately. However, it omits explanation of the itemId parameter and does not mention error handling, rate limits, or prerequisites. While functional, it leaves minor gaps that reduce completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'itemId' is not described in the tool description at all. Schema coverage is 0%, and the description fails to explain what itemId represents or its format. This is a significant gap, as the agent must rely solely on the parameter name and context, which may be insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'verify' and the resource 'hash-chain integrity of a sanction record', with explicit mention of 'tamper detection'. It distinguishes itself from sibling tools through domain specificity (sanction vs bid, grant, etc.), making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context that this tool is for sanction records, implying usage for ledger verification. However, it lacks explicit when-not-to-use or alternative tool references. The domain context is sufficient for an AI agent to infer appropriate usage among similar verify_ledger siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subsidy_watch_get (grade B)

Get a subsidy program detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
programId | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses return fields (firstSeenAt, ledgerVerified) which adds value beyond the name, but does not describe other behavioral aspects like rate limits, permissions, or whether the timeline is complete.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences that are efficient and front-loaded with the core purpose. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple get-by-ID tool, the description provides core purpose and return fields, but lacks any mention of prerequisites, error handling, or scope of the timeline (e.g., all events or recent?).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and the description does not explain the programId parameter beyond its name. The return field mention is helpful but does not compensate for lack of parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool retrieves a subsidy program detail and full event timeline, and mentions specific return fields (firstSeenAt, ledgerVerified). Distinguishes from sibling watch_get tools by specifying the resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like subsidy_watch_search or subsidy_watch_timeline. Does not specify that it retrieves by programId, which is the only parameter.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subsidy_watch_recent_changes (grade B)

Recent appearance / change / close events across all programs since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
limit | No | - | -
since | Yes | - | -
category | No | - | -
issuerCode | No | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full burden. It mentions firstSeenAt and ledgerVerified fields, but does not disclose pagination behavior, ordering, or other side effects. For a read-only tool, this is minimally adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two clear, front-loaded sentences with no redundant information. Every word serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the purpose is clear, the description is incomplete for a 4-parameter tool with no output schema and no annotations. It fails to explain parameters, ordering, or how to interpret results. Given many sibling tools, more detail is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%. The description only gives context for the 'since' parameter. It does not explain the 'category' or 'issuerCode' parameters, an omission that could mislead, since 'across all programs' suggests no filtering is available. The agent lacks crucial parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lists recent appearance/change/close events across all programs since a timestamp. It distinguishes itself from sibling tools like subsidy_watch_get (specific program) and subsidy_watch_search (search), but does not explicitly contrast with subsidy_watch_timeline.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies the required parameter (since) but offers no guidance on when to use this tool versus similar sibling tools (e.g., subsidy_watch_timeline). No exclusions or alternatives are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subsidy_watch_timeline (grade B)

Time-ordered events only for a program (the differentiator: when it appeared, changed, closed). Includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
programId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states the tool returns 'Time-ordered events' and includes fields, but does not explicitly label it as a read-only operation, mention authentication needs, or describe any side effects. The behavioral profile is partially implied but not fully disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two well-structured sentences: the first front-loads the core purpose ('Time-ordered events only for a program') and the differentiator, and the second lists key fields. No redundant words; every element earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description adequately states what it does and mentions two fields. However, it lacks details on output structure (e.g., the full list of fields, ordering, pagination) and on scope (it covers events for a specific program, but does not say whether all events or only a limited set are returned). Adequate but with gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one required parameter (programId) with 0% description coverage. The description only indirectly references it by saying 'for a program', which is minimal. It adds no format, example, or additional meaning beyond the schema, failing to compensate for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool provides 'Time-ordered events only for a program' and cites the differentiator: 'when it appeared, changed, closed'. This clearly identifies the verb (retrieve timeline) and resource (program events), distinguishing it from sibling tools like subsidy_watch_get (full record) or subsidy_watch_search (search).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'the differentiator', but does not explicitly state when to use this tool over alternatives (e.g., 'use subsidy_watch_get for full record details'). There is no mention of prerequisites or exclusions, leaving the agent to infer usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subsidy_watch_verify_ledger: C

Verify the hash-chain integrity of a program (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
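The listing does not document how the server builds or checks its hash chain. As a minimal sketch of the general technique, assuming SHA-256 over the previous hash plus a canonical JSON payload, verification could look like the following. The event shape, the `checked` key, and all function names are hypothetical; only `chainValid` and `brokenAt` come from the description above.

```python
import hashlib
import json


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash an event together with the hash of its predecessor."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Build a toy ledger: each entry stores its payload and chained hash."""
    chain, prev = [], ""
    for payload in payloads:
        prev = event_hash(prev, payload)
        chain.append({"payload": payload, "hash": prev})
    return chain


def verify_chain(chain: list) -> dict:
    """Recompute every hash and report the first index where the chain breaks."""
    prev = ""
    for index, entry in enumerate(chain):
        if event_hash(prev, entry["payload"]) != entry["hash"]:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = entry["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(chain)}


ledger = build_chain([
    {"type": "appeared", "at": "2024-01-01"},
    {"type": "changed", "at": "2024-02-01"},
    {"type": "closed", "at": "2024-03-01"},
])
intact = verify_chain(ledger)

# Tamper with the second event without recomputing its stored hash.
ledger[1]["payload"]["at"] = "2024-02-15"
tampered = verify_chain(ledger)
```

Because each hash depends on its predecessor, editing any stored event invalidates that entry, which is the property a tamper-detection tool relies on.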

Parameters (JSON Schema)
Name | Required | Description | Default
programId | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses the main behavior and return fields (chainValid, brokenAt, etc.) but does not mention side effects, authentication needs, or rate limits. It is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences: the first front-loads the action and the second lists return fields. It is concise without fluff, though additional structure for clarity could improve it slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema and no output schema, the description covers the primary purpose and return values. However, it lacks details on error handling, programId format, or usage constraints, making it incomplete for a fully seamless agent interaction.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not elaborate on the only parameter 'programId', despite 0% schema description coverage. It mentions 'program' in the purpose but does not connect it to the parameter, leaving the agent to infer its meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool verifies hash-chain integrity for tamper detection, with a specific verb and resource. The description itself differentiates the tool from its siblings only by naming 'program' as the resource; the domain specificity comes mainly from the tool name 'subsidy_watch_verify_ledger'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like other verify_ledger tools or other watch operations. The description lacks context for appropriate usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tos_watch_get: C

Get a ToS snapshot detail plus full event timeline. Returns firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It implies a read operation but does not state whether authentication is required, what rate limits apply, or how an invalid itemId is handled. It mentions only return fields and omits key behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences with a clear verb-object structure followed by the output fields. It is short but not wasteful. It could be slightly improved by removing redundancy or adding a parameter hint.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given one parameter and no output schema, description should provide complete context. It fails to explain itemId, what a 'ToS snapshot' entails, or any side effects. The tool is part of a family, but the description alone is insufficient for full understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and description provides no extra meaning for parameter 'itemId'. It does not explain what itemId represents (e.g., ToS snapshot ID) or constraints. This lack of semantics hinders correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Get a ToS snapshot detail plus full event timeline', clearly identifying the action and resource. It distinguishes from siblings like 'tos_watch_search' and 'tos_watch_timeline' by implying this retrieves a specific snapshot, but does not explicitly differentiate it from 'tos_watch_timeline' which might also return timelines.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. With many sibling tools (e.g., tos_watch_recent_changes, tos_watch_search, tos_watch_timeline), the description offers no usage context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tos_watch_recent_changes: B

Recent revised events across all SaaS ToS documents since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
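For illustration, a call to this tool over MCP is a JSON-RPC `tools/call` request whose `arguments` carry the `since` timestamp. The sketch below builds such a request in Python. The helper name is hypothetical, and treating `limit` as an integer cap and `vendor` as a string filter is an assumption, since the listing documents neither parameter.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def recent_changes_request(since: datetime,
                           limit: Optional[int] = None,
                           vendor: Optional[str] = None,
                           request_id: int = 1) -> dict:
    """Build a JSON-RPC tools/call request for tos_watch_recent_changes."""
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware to be unambiguous")
    # Normalize to UTC and serialize as ISO 8601.
    arguments = {"since": since.astimezone(timezone.utc).isoformat()}
    # Optional parameters: their meaning is assumed, not documented.
    if limit is not None:
        arguments["limit"] = limit
    if vendor is not None:
        arguments["vendor"] = vendor
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "tos_watch_recent_changes", "arguments": arguments},
    }


request = recent_changes_request(
    datetime(2024, 1, 1, tzinfo=timezone.utc), limit=20
)
print(json.dumps(request, indent=2))
```

Rejecting naive datetimes up front avoids sending a timestamp whose offset the server would have to guess.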

Parameters (JSON Schema)
Name | Required | Description | Default
limit | No | - | -
since | Yes | - | -
vendor | No | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry the full burden. It discloses that the tool returns events across all documents and includes firstSeenAt and ledgerVerified, but lacks details on pagination, rate limits, or behavior for invalid timestamps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences (19 words) and front-loads the core purpose. It contains no fluff and is efficiently structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 3 parameters and no output schema, the description covers the purpose and key output fields but omits usage of limit and vendor. It is adequate for a simple watch tool but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Description mentions 'since the given ISO8601 timestamp' which corresponds to the required 'since' parameter, but does not explain 'limit' or 'vendor' parameters. With 0% schema description coverage, the description should compensate, but it adds minimal value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns recent revision events across SaaS ToS documents since a timestamp, and mentions the output fields firstSeenAt and ledgerVerified. It does not explicitly differentiate itself from sibling tools like tos_watch_get or tos_watch_search, although the name 'recent_changes' provides context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as tos_watch_timeline or tos_watch_search. The description only states what the tool does, not when it is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tos_watch_timeline: B

Time-ordered events only for a ToS document (the differentiator: when it appeared and each revision since). Includes firstSeenAt and ledgerVerified.

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries full burden. It discloses that the tool returns timeline events with specific fields, but does not mention whether it is read-only, any side effects, pagination, or other behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short and to the point, with two sentences. It avoids unnecessary words but could be slightly more structured by explicitly linking the parameter.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description lacks essential details about the parameter and return format, making it incomplete for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no explanation for the sole parameter 'itemId', leaving its meaning entirely unclear. With 0% schema coverage, the description fails to compensate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns time-ordered events for a ToS document, specifically its first appearance and each revision since. It distinguishes itself from sibling timeline tools by focusing on 'ToS document' and mentions included fields.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a timeline of ToS document events is needed, but it does not explicitly contrast with other ToS tools like tos_watch_get or tos_watch_recent_changes, nor does it provide when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

tos_watch_verify_ledger: B

Verify the hash-chain integrity of a ToS document (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
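Since no output schema is published, a client has to read the result defensively. The sketch below shows one hedged way to turn a verification result into a short status line. `chainValid` and `brokenAt` are taken from the description above; the `checked` key for the checked event count and the function name are assumptions.

```python
def summarize_verification(result: dict) -> str:
    """Summarize a verify_ledger-style result without assuming every key exists."""
    checked = result.get("checked", "unknown")  # key name is an assumption
    if result.get("chainValid") is True:
        return f"ledger intact ({checked} events checked)"
    broken_at = result.get("brokenAt")
    if broken_at is None:
        return "ledger not verified (no break position reported)"
    return f"ledger broken at {broken_at} ({checked} events checked)"


ok_line = summarize_verification({"chainValid": True, "checked": 12})
broken_line = summarize_verification(
    {"chainValid": False, "brokenAt": 7, "checked": 8}
)
empty_line = summarize_verification({})
```

Treating a missing `chainValid` as a failure keeps the client on the safe side when the response shape differs from what the description suggests.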

Parameters (JSON Schema)
Name | Required | Description | Default
itemId | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses return fields but does not state if it is read-only, requires authentication, or has side effects. Minimal but no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

One sentence plus a list of return fields. Extremely concise with no filler, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Tool is simple with 1 parameter and no output schema. Description covers return fields but omits parameter explanation. Adequate but not fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'itemId' is not explained in the description or schema. With 0% schema description coverage, the description should clarify what this ID represents, but it does not, leaving the agent to guess.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies hash-chain integrity of a ToS document for tamper detection, using a specific verb and resource. It distinguishes from sibling tools like tos_watch_get or tos_watch_search by focusing on integrity verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. Does not specify prerequisites, typical scenarios, or contrast with other watch tools like tos_watch_timeline or tos_watch_recent_changes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
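Until the descriptions carry explicit guidance, a client can encode its own routing. The sketch below maps a coarse intent to one of the ToS tools, based only on what each description above says the tool returns. The intent labels and the function name are hypothetical, and tos_watch_search is left out because its description is not shown in this part of the listing.

```python
from typing import Optional

# Intent labels are illustrative; the mapping follows the tool descriptions.
TOS_TOOL_FOR_INTENT = {
    "detail": "tos_watch_get",                # snapshot detail plus full timeline
    "history": "tos_watch_timeline",          # time-ordered events only
    "integrity": "tos_watch_verify_ledger",   # hash-chain tamper check
    "recent": "tos_watch_recent_changes",     # revisions across all documents
}

# Tools whose only listed parameter is a required itemId.
NEEDS_ITEM_ID = {"tos_watch_get", "tos_watch_timeline", "tos_watch_verify_ledger"}


def pick_tos_tool(intent: str, item_id: Optional[str] = None) -> str:
    """Return the tool name for an intent, checking the itemId precondition."""
    tool = TOS_TOOL_FOR_INTENT.get(intent)
    if tool is None:
        raise ValueError(f"unknown intent: {intent!r}")
    if tool in NEEDS_ITEM_ID and item_id is None:
        raise ValueError(f"{tool} requires an itemId")
    return tool


chosen = pick_tos_tool("history", item_id="example-item")
```

A table like this is effectively the "use X instead of Y when Z" guidance written on the client side, which is why stating it in the descriptions themselves would be cheaper for every agent.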


Resources