Japan Public Ledgers MCP
Server Details
Tamper-evident daily time-series ledgers across 10 Japanese public-data domains (subsidies-laws).
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Tool Definition Quality
Average 3/5 across 72 of 72 tools scored. Lowest: 2.1/5.
Each tool belongs to a clearly distinct domain (agent, bid_watch, grant_watch, etc.) and within each domain, suffixes like search, get, recent_changes, timeline, verify_ledger unambiguously define the operation. No two tools overlap in purpose.
All tool names follow a consistent snake_case pattern with domain prefix (e.g., agent_, bid_watch_) and standardized verb suffixes. The naming schema is uniform across all 72 tools.
72 tools is a high count, but it covers a wide range of Japanese public ledger domains plus agent utilities. Each domain has a compact and well-scoped set (5-6 tools), making the total appropriate for the aggregate scope.
Each watch domain provides all necessary operations: search, get detail, recent changes, timeline, and integrity verification. The agent tools cover identity, memory, proxy, captcha, temp mail, trust, and webhooks comprehensively. No obvious gaps.
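The naming scheme described above can be checked mechanically. The following is a minimal sketch, assuming tool names are a domain prefix joined to one of the operation suffixes listed in the text; the full names used in the example (`bid_watch_search`, `grant_watch_recent_changes`) are assembled from those pieces for illustration and are not confirmed tool names, and `split_tool_name` is a hypothetical helper.

```python
import re
from typing import Optional, Tuple

# Operation suffixes named in the review text. Longer suffixes come first
# so that "recent_changes" is tried before any shorter candidate.
KNOWN_SUFFIXES = ("recent_changes", "verify_ledger", "timeline", "search", "get")

# Lowercase snake_case: letters and digits separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")

def split_tool_name(name: str) -> Optional[Tuple[str, str]]:
    """Return (domain_prefix, operation), or None if the name does not fit."""
    if not SNAKE_CASE.match(name):
        return None
    for suffix in KNOWN_SUFFIXES:
        if name.endswith("_" + suffix):
            return name[: -len(suffix) - 1], suffix
    return None

# Hypothetical names built from the prefixes and suffixes mentioned above.
print(split_tool_name("bid_watch_search"))
print(split_tool_name("grant_watch_recent_changes"))
print(split_tool_name("BidWatchSearch"))
```

Agent tools such as `agent_audit_query` use other verbs, so they return `None` here unless the suffix list is extended.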
Available Tools
77 tools
agent_audit_query (grade: C)
Query agent actions with filters
| Name | Required | Description | Default |
|---|---|---|---|
| to | No | | |
| from | No | | |
| limit | No | | |
| agentId | No | | |
| riskMin | No | | |
| sessionId | No | | |
| actionType | No | | |
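As a rough illustration of how these filters might be passed, the sketch below wraps them in an MCP `tools/call` JSON-RPC request. The filter names come from the table; the value formats (ISO 8601 timestamps for `from`, an integer `limit`), the example values, and the helper name `build_audit_query_call` are assumptions, since the schema documents none of them.

```python
import json
from typing import Any, Dict

# Filter names taken from the parameter table; all of them are optional.
ALLOWED_FILTERS = {"to", "from", "limit", "agentId", "riskMin", "sessionId", "actionType"}

def build_audit_query_call(request_id: int, filters: Dict[str, Any]) -> str:
    """Serialize a tools/call request for agent_audit_query (hypothetical helper)."""
    unknown = sorted(set(filters) - ALLOWED_FILTERS)
    if unknown:
        raise ValueError("unknown filter(s): " + ", ".join(unknown))
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "agent_audit_query", "arguments": filters},
    }
    return json.dumps(request)

# Example values are invented for illustration only.
payload = build_audit_query_call(
    1, {"agentId": "agent-123", "from": "2025-01-01T00:00:00Z", "limit": 20}
)
print(payload)
```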
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description only says 'Query' which implies read-only but does not explicitly state behaviors like destructiveness, rate limits, or authorization needs. The description fails to add any behavioral context beyond the verb.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence, but it is under-specified. It omits critical information about parameters and behavior, so its brevity comes at the expense of usefulness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters, no output schema, and no annotations, the description provides almost no useful context. It does not explain return format, pagination, error conditions, or any constraints, making it incomplete for an AI agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the schema provides no explanations for the 7 parameters. The description only mentions 'with filters' without elaborating on what each filter ('to', 'from', 'limit', 'agentId', 'riskMin', 'sessionId', 'actionType') does or how to use them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Query agent actions with filters' clearly specifies the verb (query) and resource (agent actions), indicating a read operation with filtering. It distinguishes from siblings like 'agent_audit_record' or 'agent_audit_report' by implying a broad query rather than a single record or report generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as 'agent_audit_record' or 'agent_audit_report'. There is no mention of prerequisites, context for use, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_audit_record (grade: C)
Record an agent action for audit and compliance
| Name | Required | Description | Default |
|---|---|---|---|
| input | No | | |
| output | No | | |
| agentId | Yes | | |
| metadata | No | | |
| sessionId | No | | |
| actionName | Yes | | |
| actionType | Yes | | |
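Because three of these parameters are required and none are described, a caller may want to check its arguments before sending them. The sketch below is one way to do that, assuming only the required/optional split shown in the table; the helper name `missing_required` and the example values are hypothetical, and the accepted values for `actionType` are not documented.

```python
from typing import Any, Dict, List

# Required and optional parameter names as listed in the table.
REQUIRED = ("agentId", "actionName", "actionType")
OPTIONAL = ("input", "output", "metadata", "sessionId")

def missing_required(arguments: Dict[str, Any]) -> List[str]:
    """Return the required parameter names that are absent, None, or empty strings."""
    return [name for name in REQUIRED if arguments.get(name) in (None, "")]

# Example values are invented; "tool_call" is not a documented actionType.
arguments = {"agentId": "agent-123", "actionType": "tool_call", "input": {"q": "x"}}
print(missing_required(arguments))
```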
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description bears full responsibility for behavioral disclosure. It only states the purpose but omits important traits such as idempotency, side effects, required permissions, or error handling. This is insufficient for an audit recording tool.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, which is concise. However, it is too minimal and does not front-load critical information. A bit more detail would improve utility without sacrificing conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 7 parameters, no output schema, and no annotations, the description is extremely sparse. It lacks necessary context about data constraints, usage patterns, or expected behavior, leaving the agent under-informed.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 7 parameters with 0% description coverage. The description does not explain any parameter beyond their names and types. For example, 'input' and 'output' are generic objects with no clarification on structure requirements.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Record an agent action for audit and compliance', which clearly indicates the verb (record) and resource (agent action). However, it does not differentiate from sibling tools like agent_audit_query and agent_audit_report, which are related but distinct operations.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., agent_audit_query, agent_audit_report). There is no mention of prerequisites, context, or exclusions.
agent_audit_report (grade: C)
Generate audit report (json/markdown/soc2 format)
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | | |
| from | Yes | | |
| format | No | | markdown |
| agentId | Yes | | |
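A small sketch of assembling arguments for this tool follows. It assumes the three format names in the description are the only accepted values and treats the `markdown` entry in the `format` row as the default; the timestamp format for `from` and `to` is a guess, and `build_report_arguments` is a hypothetical helper.

```python
from typing import Dict

# Format names taken from the tool description; assumed to be exhaustive.
ALLOWED_FORMATS = ("json", "markdown", "soc2")

def build_report_arguments(agent_id: str, start: str, end: str,
                           fmt: str = "markdown") -> Dict[str, str]:
    """Build the arguments object for agent_audit_report (hypothetical helper)."""
    if not (agent_id and start and end):
        raise ValueError("agentId, from and to are all required")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError("format must be one of: " + ", ".join(ALLOWED_FORMATS))
    # "from" is a Python keyword, so the range is passed as start/end
    # and mapped to the tool's parameter names here.
    return {"agentId": agent_id, "from": start, "to": end, "format": fmt}

# Example values are invented for illustration only.
print(build_report_arguments("agent-123", "2025-01-01T00:00:00Z", "2025-01-31T23:59:59Z"))
```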
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavior. It states 'generate' but does not clarify if this is a read-only operation, if it triggers side effects, or what permissions are required. The required parameters (agentId, from, to) hint at a time-bound query, but no behavioral details are provided.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single line, which is concise, but it sacrifices important details. It does not front-load critical usage constraints or parameter hints.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, no output schema, and the complexity of an audit report generator, the description is inadequate. It fails to mention return format, pagination, or any limitations.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, yet the description adds no parameter explanations. Only the format enum is hinted by listing formats in parentheses. The meaning of agentId, from, to, and format (e.g., differences between markdown and soc2) is left ambiguous.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool generates an audit report and lists the supported formats (json/markdown/soc2). The verb 'generate' implicitly separates it from the siblings 'agent_audit_query' and 'agent_audit_record', but the description never states the difference explicitly.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like agent_audit_query or agent_audit_record. The description implies report generation but lacks context for decision-making.
agent_captcha_solve (grade: C)
Solve a CAPTCHA for a domain you own or have explicit permission to access
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | | |
| action | No | | |
| pageUrl | Yes | | |
| siteKey | No | | |
| question | No | | |
| imageBase64 | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It states 'solve a CAPTCHA' but omits critical details such as what the tool returns (e.g., a token), any side effects, authentication requirements, or rate limits. This is insufficient for an automated CAPTCHA-solving tool.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, technically concise, but lacks essential details. It sacrifices completeness for brevity, making it under-specified for effective agent use.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of annotations, output schema, and 0% parameter description coverage, the description is grossly incomplete. It does not explain return values, authentication, or parameter usage, leaving agents without crucial context for successful invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, yet the description adds no information about the 6 parameters. Parameter roles like 'type', 'pageUrl', 'siteKey', 'imageBase64' remain unexplained, relying solely on property names. The description fails to clarify expected input formats or relationships.
Does the description clearly state what the tool does and how it differs from similar tools?
The name 'agent_captcha_solve' and description 'Solve a CAPTCHA' clearly indicate the primary function. However, it does not differentiate from the sibling tool 'agent_captcha_verify_domain', leaving ambiguity about when to use solving versus verification.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies a condition: 'for a domain you own or have explicit permission to access', which implies appropriate usage contexts. But it provides no guidance on when not to use this tool or mention of alternatives like the verification tool.
agent_captcha_verify_domain (grade: C)
Verify ownership of a domain before using CAPTCHA solving
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | | |
| method | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description only states the basic action without disclosing behavioral traits such as whether it modifies data, requires authentication, or what happens on failure. It lacks transparency beyond the literal action.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that front-loads the action. It is concise with no wasted words, though it could benefit from slightly more detail without losing brevity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a verification tool with 2 parameters and no output schema, the description is incomplete. It does not explain expected outcomes, return values, or how to interpret results, which is critical for a verification step.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description should add meaning to parameters. It does not explain the enum values for 'method' or constraints on 'domain'. The parameter names are self-explanatory, but no additional context is provided.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Verify ownership of a domain') and ties it to the context 'before using CAPTCHA solving'. It is specific enough to distinguish this tool from its sibling 'agent_captcha_solve'.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The phrase 'before using CAPTCHA solving' implies a prerequisite context, but there is no explicit guidance on when not to use or alternatives. It provides a hint but not full directive.
agent_identity_activities (grade: A)
List activity records for an identity, newest first (owner only)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| identityId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It reveals ordering (newest first) and access control (owner only). However, it does not disclose rate limits, pagination behavior, or any side effects. The provided context is adequate but not comprehensive.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that conveys essential information without any unnecessary words. All parts are relevant and front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of output schema and only two parameters, the description is minimal. It does not explain what constitutes an activity, what fields are returned, or if there is pagination. It is adequate for a simple list tool but could be more complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must compensate. It clarifies that identityId refers to the identity whose activities are listed, but it does not describe the limit parameter or its default behavior. Partial compensation for one parameter, missing detail for the other.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'List', the resource 'activity records for an identity', and the scope 'owner only'. This is specific and distinguishes the tool from siblings like agent_identity_record or agent_identity_lookup.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The 'owner only' qualifier implies the tool is meant for listing activities of identities the caller owns. However, the description does not say when to use this tool rather than alternatives such as agent_identity_record or the many watch tools, and it gives no guidance on when not to use it.
agent_identity_badge (grade: B)
Get the issuer-signed badge and signed fields for an identity
| Name | Required | Description | Default |
|---|---|---|---|
| identityId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Description only states it's a get operation but lacks behavioral details like prerequisites, error handling, or what happens if identityId is invalid.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, 10 words, front-loaded with action. No fluff, but lacks any structural cues like bullet points or clarifications. Adequate for minimal description.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has low complexity (1 parameter, no output schema, no annotations). The description names what is retrieved (the badge and signed fields) but says nothing about the structure of the return value or failure behavior. Adequate, but it could be more complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and description adds minimal meaning beyond the schema. 'identityId' is stated as 'for an identity', but no format, constraints, or use context provided. With a single parameter, more detail expected.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Get' and resource 'issuer-signed badge and signed fields' for an identity. This distinguishes it from sibling tools like agent_identity_lookup or agent_identity_record, which focus on general identity info.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. For example, doesn't specify that agent_identity_lookup is for basic info, making it unclear when to choose this tool.
agent_identity_lookup (grade: B)
Look up an identity. Returns signatureValid (issuer+integrity only, NOT an authenticity/safety signal) and a disclaimer.
| Name | Required | Description | Default |
|---|---|---|---|
| identityId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description explicitly clarifies that signatureValid is only an issuer+integrity check and not an authenticity/safety signal, which is a critical behavioral warning. It also notes a disclaimer is returned. This adds transparency beyond the bare schema.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two short sentences, one with a parenthetical clarification. Every word adds value, and the most important action ('Look up an identity') is front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple lookup tool with no output schema, the description provides the return fields and a caveat, but it does not explain what an 'identity' is, how the identityId is used, or the nature of the disclaimer. It is adequate but incomplete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 0%, but the description does not explain the identityId parameter at all. It only mentions return values, leaving the parameter's meaning and format entirely unspecified.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Look up' and the resource 'an identity', distinguishing it from sibling tools like agent_identity_record or agent_identity_badge. It also specifies the return values (signatureValid and disclaimer), adding clarity.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like agent_identity_activities or agent_identity_badge. The description does not mention prerequisites, context, or when not to use it.
agent_identity_record (grade: B)
Append a hash-chained activity record (owner only). Optional provenance (repo/version/config) is self-reported.
| Name | Required | Description | Default |
|---|---|---|---|
| content | No | | |
| identityId | Yes | | |
| provenance | No | Self-reported origin of the activity (NOT verified) | |
| activityType | Yes | | |
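The term "hash-chained" is not explained on this page. As general background, the sketch below shows one common way such a chain can work: each record stores the hash of its predecessor, so altering an earlier record invalidates every later link. This is an illustrative assumption, not the server's actual scheme; the SHA-256 choice, the JSON canonicalization, the field names `prev` and `hash`, and the helpers `append_record` and `verify_chain` are all hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first record

def record_hash(prev_hash: str, activity_type: str, content: Any) -> str:
    """Hash a record together with its predecessor's hash."""
    body = json.dumps(
        {"prev": prev_hash, "activityType": activity_type, "content": content},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_record(chain: List[Dict[str, Any]], activity_type: str, content: Any) -> None:
    """Append a record that links to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev,
        "activityType": activity_type,
        "content": content,
        "hash": record_hash(prev, activity_type, content),
    })

def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every hash and check that each record links to the one before it."""
    prev = GENESIS
    for rec in chain:
        expected = record_hash(rec["prev"], rec["activityType"], rec["content"])
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

# Example activity types and contents are invented for illustration.
chain: List[Dict[str, Any]] = []
append_record(chain, "deploy", {"version": "1.0"})
append_record(chain, "run", {"task": "index"})
print(verify_chain(chain))
```

Editing the content of the first record after the fact would make `verify_chain` return `False`, which is the tamper-evidence property the design aims for.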
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description carries the burden. It discloses that the operation appends (mutation) and that provenance is self-reported and not verified. However, it omits details like whether the record is immutable, what authorization exactly is needed, and any rate limits or side effects.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, each carrying essential information: the core action and the self-reported status of the optional provenance. Word-efficient and front-loaded. No wasted text.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 4 parameters including a nested object, no output schema, and no annotations, the description is insufficient. It does not explain valid activity types, content structure, or prerequisites like an existing identity. An agent would lack confidence in constructing a correct call.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is only 25% (provenance has a description). The description adds that provenance is self-reported but does not explain the content, identityId, or activityType fields. With low coverage, the description should provide more detail for those parameters.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool appends a hash-chained activity record and that only the identity's owner may do so. The resource and action are specific, and the description distinguishes the tool from siblings like agent_identity_register (which creates identities) and agent_identity_activities (which queries activities).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions 'owner only' which implies a usage constraint but does not clarify when to use this tool versus alternatives like agent_memory_store or other identity tools. No explicit guidance on prerequisites or when not to use.
agent_identity_register (grade: B)
Register an agent and get a unique identity ID + issuer-signed badge. agent_name/metadata are self-reported and unverified.
| Name | Required | Description | Default |
|---|---|---|---|
| metadata | No | | |
| agentName | Yes | | |
| publicKey | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that inputs are self-reported and unverified, but lacks details on destructive actions, authentication requirements, or rate limits. Without annotations, the description partially fulfills the transparency burden.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences with front-loaded purpose. Every word adds value. No redundancy or fluff.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains return value (ID and badge) but no output schema exists. It doesn't cover prerequisites, permissions, or how to use the badge. Adequate but not comprehensive for a 3-parameter tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must compensate. It explains agentName and metadata but omits publicKey entirely. The metadata object type is not elaborated. Only two of three parameters receive any semantic help.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool registers an agent and returns an identity ID and badge. It specifies that agent_name/metadata are self-reported and unverified, which helps differentiate from lookup or record tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like agent_identity_lookup or agent_identity_badge. No 'when not to use' information provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_memory_delete (grade: C)
Delete a memory or all memories in a namespace
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | ||
| agentId | Yes | ||
| namespace | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description must cover behavior. It only states 'delete' with no details on irreversibility, authorization needs, or side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence is concise and front-loaded, but lacks structure to clarify parameter usage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a deletion tool with 3 undocumented parameters and no output schema, the description is incomplete: it does not explain when the call deletes a single memory (presumably when key is supplied) versus all memories in a namespace.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and description adds no meaning beyond parameter names. Ambiguity about key optionality and namespace behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action (delete) and resource (memory or all memories in a namespace), distinguishing it from sibling tools like get, search, store.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance provided; does not specify when to use this tool vs alternatives, nor any prerequisites or caveats.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
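The key/namespace ambiguity raised in this review can be made concrete with a sketch of the two request shapes. The envelope follows the MCP `tools/call` JSON-RPC method; the helper name `delete_memory_call` and the sample identifiers are hypothetical, and the reading that omitting `key` deletes every memory in the namespace is an assumption drawn from the one-line description, not confirmed behavior.

```python
import json
from typing import Any, Optional


def delete_memory_call(agent_id: str, key: Optional[str] = None,
                       namespace: Optional[str] = None,
                       request_id: int = 1) -> dict:
    """Build an MCP tools/call request for agent_memory_delete (hypothetical helper)."""
    if not agent_id:
        raise ValueError("agentId is the only required parameter")
    arguments: dict = {"agentId": agent_id}
    # Optional parameters are sent only when the caller supplies them.
    if key is not None:
        arguments["key"] = key
    if namespace is not None:
        arguments["namespace"] = namespace
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "agent_memory_delete", "arguments": arguments},
    }


# Assumed to delete one memory.
single = delete_memory_call("agent-123", key="session/last-query", namespace="default")
# Assumed to delete every memory in the namespace (unverified; treat as destructive).
bulk = delete_memory_call("agent-123", namespace="scratch")
print(json.dumps(bulk, indent=2))
```

Because the description says nothing about irreversibility, a cautious client might require explicit confirmation before sending the second shape.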
agent_memory_get (grade: C)
Retrieve a stored memory by key
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | ||
| agentId | Yes | ||
| namespace | No | | default |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose any behavioral traits such as idempotency, authorization requirements, or side effects. For a simple retrieval, basic behavioral context is missing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence with no waste, but it is too minimal to be fully useful. It could be expanded with parameter details while remaining concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema and param descriptions, the description is incomplete. It does not explain return values or parameter roles, leaving the agent with insufficient context to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage and no parameter descriptions in the input schema, the description adds no meaning to the three parameters. It only mentions 'by key' but fails to explain agentId or namespace.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Retrieve a stored memory by key' clearly states the verb 'retrieve', the resource 'memory', and the method 'by key'. It distinguishes this tool from siblings like agent_memory_store and agent_memory_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like agent_memory_search. There is no mention of prerequisites or context for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_memory_search (grade: C)
Search memories by prefix, tags, or type
| Name | Required | Description | Default |
|---|---|---|---|
| tags | No | ||
| type | No | ||
| limit | No | ||
| agentId | Yes | ||
| keyPrefix | No | ||
| namespace | No | | default |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description does not disclose behavioral traits such as pagination, result limits (though limit parameter exists), or whether full memories or summaries are returned. It fails to compensate for the lack of annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, making it concise and front-loaded with the verb. However, it is too brief for a tool with 6 parameters and multiple sibling tools, and it does not earn its place by adding substantial guidance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (6 parameters, no output schema, multiple sibling tools), the description is incomplete. It lacks information on return values, pagination, sorting, or how search differs from agent_memory_get. The required agentId is not highlighted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the tool description adds minimal meaning beyond parameter names. It mentions 'prefix, tags, or type' but does not explain their format, usage, or the required 'agentId' parameter. The description does not clarify 'namespace' or 'limit' defaults.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Search' and the resource 'memories', and specifies filtering criteria (prefix, tags, or type). However, it does not explicitly distinguish itself from sibling tools such as agent_memory_get, which retrieves a single memory by key.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., agent_memory_get for individual lookups). The description only states what the tool does without any context on selection criteria or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
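One way to express the missing selection guidance is a small dispatcher: an exact key goes to agent_memory_get, anything else goes to agent_memory_search. This is a sketch only. The function name `memory_lookup` is hypothetical, the parameter names come from the tables on this page, and treating `tags` as a list of strings is an assumption since the schema documents no types.

```python
from typing import Optional


def memory_lookup(agent_id: str, key: Optional[str] = None,
                  key_prefix: Optional[str] = None,
                  tags: Optional[list] = None,
                  memory_type: Optional[str] = None,
                  limit: Optional[int] = None,
                  namespace: Optional[str] = None) -> tuple:
    """Return (tool_name, arguments) for a memory read (hypothetical helper)."""
    if not agent_id:
        raise ValueError("agentId is required by both tools")
    arguments: dict = {"agentId": agent_id}
    if namespace is not None:
        arguments["namespace"] = namespace
    if key is not None:
        # An exact key is known: use the single-item retrieval tool.
        arguments["key"] = key
        return "agent_memory_get", arguments
    # Otherwise fall back to filtered search; unset filters are omitted.
    filters = {"keyPrefix": key_prefix, "tags": tags, "type": memory_type, "limit": limit}
    arguments.update({name: value for name, value in filters.items() if value is not None})
    return "agent_memory_search", arguments


print(memory_lookup("agent-123", key="prefs/lang"))
print(memory_lookup("agent-123", key_prefix="notes/", limit=5))
```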
agent_memory_store (grade: B)
Store a memory for an AI agent (key-value, with TTL and metadata)
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | ||
| value | Yes | Any JSON value | |
| agentId | Yes | Agent identifier | |
| metadata | No | ||
| namespace | No | | default |
| ttlSeconds | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full behavioral burden. It states 'store' (indicating mutation) and mentions TTL/metadata features, but does not disclose potential side effects (e.g., overwriting existing keys), authorization needs, size limits, or return behavior. Significant gaps for a write operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence of 12 words, front-loading the core action and features. No extraneous information. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 6 parameters, no output schema, and no annotations, the description covers the basic functionality but omits key details about agentId and namespace. It also does not explain the return value or behavior on success/failure. Adequate but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is low (33%). The description adds meaning by calling it 'key-value' and referencing TTL and metadata, which maps to key, value, ttlSeconds, and metadata. However, it does not explain the agentId or namespace parameters, nor does it clarify that key is required. Partial compensation but leaves gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'store', the resource 'memory for an AI agent', and the key-value nature with TTL and metadata. This distinguishes it from sibling tools like agent_memory_get, agent_memory_delete, and agent_memory_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives. No mention of prerequisites or when not to use it. The description is purely functional without usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
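As an illustration of the parameters discussed in this review, the sketch below assembles an arguments object for agent_memory_store. The helper name `store_memory_arguments` is hypothetical. The JSON-serializability check reflects the schema note "Any JSON value"; the requirement that `ttlSeconds` be a positive integer is an assumption, since the listing documents no range.

```python
import json
from typing import Any, Optional


def store_memory_arguments(agent_id: str, key: str, value: Any,
                           namespace: Optional[str] = None,
                           ttl_seconds: Optional[int] = None,
                           metadata: Optional[dict] = None) -> dict:
    """Assemble arguments for agent_memory_store (hypothetical helper)."""
    if not agent_id or not key:
        raise ValueError("agentId and key are required")
    # The schema describes value as "Any JSON value"; fail early if it is not.
    json.dumps(value)  # raises TypeError for non-serializable values
    arguments: dict = {"agentId": agent_id, "key": key, "value": value}
    if namespace is not None:
        arguments["namespace"] = namespace
    if ttl_seconds is not None:
        # Assumption: a TTL must be a positive number of seconds.
        if ttl_seconds <= 0:
            raise ValueError("ttlSeconds must be positive")
        arguments["ttlSeconds"] = ttl_seconds
    if metadata is not None:
        arguments["metadata"] = metadata
    return arguments


print(store_memory_arguments("agent-123", "prefs/lang", {"lang": "ja"}, ttl_seconds=3600))
```

Whether storing under an existing key overwrites the previous value is not stated anywhere in the listing, so a caller that cares may want to read the key first.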
agent_proxy_fetch (grade: B)
Fetch a URL via a rotating proxy (region/type selectable). robots.txt enforced.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | ||
| body | No | ||
| type | No | ||
| method | No | ||
| region | No | ||
| headers | No | ||
| sessionId | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must fully disclose behavior. Only mentions robots.txt enforcement; lacks details on rate limits, error handling, or what happens on blocked requests. Insufficient for a 7-parameter tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loads core action, adds key details efficiently. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool has 7 parameters, nested objects, no output schema. Description is too brief—omits return format, error scenarios, and usage of headers/method. Incomplete for safe and effective invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%; description only adds meaning for 'region' and 'type' via 'region/type selectable'. Fails to explain 'body', 'method', 'headers', 'sessionId'. Parameters largely undocumented.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Fetch a URL via a rotating proxy' with specifics on region/type selectability and robots.txt enforcement. Distinguishes well from siblings like agent_proxy_session.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Mentions rotating proxy and robots.txt enforcement, implying usage for web scraping with proxy selection. However, no explicit when-to-use or when-not-to-use guidance compared to alternatives like direct fetch or captcha solving.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_proxy_session (grade: B)
Create a sticky proxy session (same IP for multiple requests)
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | ||
| region | No | ||
| ttlSeconds | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It adds the key behavioral trait of 'stickiness' (same IP for multiple requests), but fails to disclose other behaviors such as session expiration (though hinted by ttlSeconds parameter), how to terminate the session, or whether the session is reusable across different requests.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence (10 words) that immediately conveys the core purpose. No superfluous information or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has three parameters, no output schema, and no annotations, the description is insufficient. It does not explain what the session returns (e.g., a session ID token), how to use the session in subsequent requests, or any configuration details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the input schema provides no explanations. The description does not mention any of the three parameters (type, region, ttlSeconds), leaving the agent to guess their purpose and valid values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Create' and the resource 'sticky proxy session', and parenthetically explains what 'sticky' means (same IP for multiple requests). This effectively distinguishes the tool from sibling 'agent_proxy_fetch', which fetches a URL through a rotating proxy.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like 'agent_proxy_fetch'. The description does not mention any prerequisites, exclusions, or context in which the tool is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
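The likely relationship between the two proxy tools is sketched below: create a session first, then pass its identifier to agent_proxy_fetch through the `sessionId` parameter. The helper names, the sample URL, and the placeholder session identifier are hypothetical. The listing does not document what agent_proxy_session returns, so the step that extracts the identifier from the response is deliberately left out.

```python
import json
from typing import Optional


def tool_call(name: str, arguments: dict, request_id: int) -> dict:
    """Wrap tool arguments in an MCP tools/call JSON-RPC request."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": name, "arguments": arguments}}


def proxy_session_call(ttl_seconds: Optional[int] = None, region: Optional[str] = None,
                       proxy_type: Optional[str] = None, request_id: int = 1) -> dict:
    """All three parameters are optional per the table; valid values are undocumented."""
    optional = {"ttlSeconds": ttl_seconds, "region": region, "type": proxy_type}
    return tool_call("agent_proxy_session",
                     {k: v for k, v in optional.items() if v is not None}, request_id)


def proxy_fetch_call(url: str, session_id: Optional[str] = None,
                     method: Optional[str] = None, headers: Optional[dict] = None,
                     body: Optional[str] = None, request_id: int = 2) -> dict:
    if not url:
        raise ValueError("url is required")
    arguments: dict = {"url": url}
    optional = {"sessionId": session_id, "method": method, "headers": headers, "body": body}
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return tool_call("agent_proxy_fetch", arguments, request_id)


create = proxy_session_call(ttl_seconds=300)
# "sess-placeholder" stands in for whatever identifier the session response carries.
fetch = proxy_fetch_call("https://example.com/page", session_id="sess-placeholder")
print(json.dumps([create, fetch], indent=2))
```

Omitting `sessionId` would presumably fall back to the rotating behavior named in the agent_proxy_fetch description, though the listing does not confirm this.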
agent_tempmail_create (grade: C)
Create a temporary email address (auto-expires)
| Name | Required | Description | Default |
|---|---|---|---|
| ttlSeconds | No | ||
| preferredPrefix | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It mentions 'auto-expires' but fails to specify the expiration duration (though the parameter ttlSeconds hints at it), return value, or any side effects. The description is too brief for a creation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, with only one sentence. While brevity is good, it sacrifices clarity on parameters and behavior. It could add a brief sentence about parameter usage without becoming verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has two optional parameters that are not described, and no output schema is provided. The description does not explain valid values for ttlSeconds or preferredPrefix, nor what the tool returns. For a creation tool, this is insufficient for an agent to use correctly without further inference.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides no information about the parameters (ttlSeconds, preferredPrefix). Since schema coverage is 0%, the description fails to add meaning to the schema, leaving the agent guessing about parameter formats and usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb-resource combination: 'Create a temporary email address'. The '(auto-expires)' adds lifecycle context. This clearly distinguishes it from sibling tools like agent_tempmail_get (retrieve), agent_tempmail_list (list), agent_tempmail_wait (wait for email).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for creating temporary email addresses but does not provide explicit guidance on when to use this tool versus siblings, nor does it mention prerequisites or alternative approaches. Usage is inferred from the context of the sibling tool names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_tempmail_get (grade: C)
Get full message content with extracted verification links/codes
| Name | Required | Description | Default |
|---|---|---|---|
| mailboxId | Yes | ||
| messageId | Yes | ||
| includeRaw | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description bears full responsibility for behavioral transparency. It does not disclose side effects, idempotence, rate limits, or any other behavioral traits beyond stating what it does.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (one sentence), but it is too minimal and lacks structure. It could be improved by adding parameter descriptions or usage context without becoming verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 3 parameters, no annotations, no output schema, and no nested objects, the description fails to provide sufficient context. It does not explain return values, the format of extracted links/codes, or any error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate by explaining parameters. It does not mention any parameter (mailboxId, messageId, includeRaw) or their semantics, leaving the agent to infer from names alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the action ('Get') and the resource ('full message content'), and adds value with 'extracted verification links/codes', distinguishing it from siblings like agent_tempmail_list which lists messages, and agent_tempmail_wait which waits for new messages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. Does not mention that it should be used after obtaining a message ID from agent_tempmail_list, nor does it specify prerequisites or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_tempmail_list (grade: C)
List received messages in a mailbox
| Name | Required | Description | Default |
|---|---|---|---|
| after | No | ||
| limit | No | ||
| mailboxId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden. It only states the basic action, omitting important traits such as authentication requirements, idempotency, error behavior (e.g., mailbox not found), or whether it mutates state.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single sentence is front-loaded and concise, but may be too brief. There are no wasted words, yet the description could be expanded slightly to add value without losing conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 3 parameters, no output schema, and no annotations, the description is incomplete. It fails to explain parameter roles (e.g., pagination with 'limit', date filtering with 'after') or return value structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description adds no semantic meaning to the parameters ('after', 'limit', 'mailboxId'). The agent gets no additional context beyond the schema itself.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'List received messages in a mailbox' uses a specific verb ('List') and resource ('messages in a mailbox'), making the action and object clear. It distinguishes from sibling tools like 'agent_tempmail_get' (get one message) and 'agent_tempmail_create' (create mailbox).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus siblings like 'agent_tempmail_get' or 'agent_tempmail_wait'. The description does not provide context for appropriate usage or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_tempmail_wait (grade: B)
Wait for an incoming message (long polling, max 60s)
| Name | Required | Description | Default |
|---|---|---|---|
| mailboxId | Yes | ||
| fromContains | No | ||
| timeoutSeconds | No | ||
| subjectContains | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses 'long polling, max 60s' which is a key behavioral trait. However, no annotations are provided, and the description does not cover what happens on timeout (e.g., empty response or error), nor whether it is destructive. Additional context on behavior would improve this.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One short sentence, front-loaded with purpose. Concise but slightly underspecified. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, no output schema, and no annotations, the description is insufficient. It does not explain filtering parameters (fromContains, subjectContains) or the timeout behavior in detail. A more complete description would include what the tool returns (e.g., message object) and how the parameters affect waiting.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain any of the 4 parameters. For example, 'fromContains' and 'subjectContains' are left entirely undefined. The description mentions max 60s but the default timeoutSeconds is 30, creating a minor inconsistency.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Wait' and resource 'incoming message', with method 'long polling, max 60s'. It clearly distinguishes from sibling tools like agent_tempmail_get or agent_tempmail_list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage for waiting for new messages, but lacks explicit when-to-use or when-not-to-use compared to alternatives like agent_tempmail_get or agent_tempmail_list. No guidance on prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
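The 60-second ceiling means a client that wants to wait longer has to chain several calls. The sketch below shows one way to do that. The names `wait_arguments` and `wait_for_message` are hypothetical, `call_tool` is a caller-supplied function standing in for a real MCP client, and the assumption that a timeout yields an empty (falsy) result is unverified, since the listing does not say what the tool returns on timeout.

```python
from typing import Any, Callable, Optional

MAX_WAIT_SECONDS = 60  # from the description: "long polling, max 60s"


def wait_arguments(mailbox_id: str, timeout_seconds: int,
                   from_contains: Optional[str] = None,
                   subject_contains: Optional[str] = None) -> dict:
    """Build agent_tempmail_wait arguments, clamping the timeout to 1..60 seconds."""
    arguments: dict = {
        "mailboxId": mailbox_id,
        "timeoutSeconds": max(1, min(int(timeout_seconds), MAX_WAIT_SECONDS)),
    }
    if from_contains is not None:
        arguments["fromContains"] = from_contains
    if subject_contains is not None:
        arguments["subjectContains"] = subject_contains
    return arguments


def wait_for_message(call_tool: Callable[[str, dict], Any], mailbox_id: str,
                     total_seconds: int, from_contains: Optional[str] = None,
                     subject_contains: Optional[str] = None) -> Any:
    """Chain long-poll calls until a message arrives or the total budget is spent."""
    remaining = total_seconds
    while remaining > 0:
        window = min(remaining, MAX_WAIT_SECONDS)
        result = call_tool("agent_tempmail_wait",
                           wait_arguments(mailbox_id, window, from_contains, subject_contains))
        if result:  # assumption: a timeout produces an empty/falsy result
            return result
        remaining -= window
    return None


def never_delivers(name: str, arguments: dict) -> None:
    """Stub transport for demonstration: every poll times out."""
    print(name, arguments["timeoutSeconds"])
    return None


wait_for_message(never_delivers, "mbx-demo", 90)
```

If the server signals a timeout with an error instead of an empty result, the `if result` check would need to be replaced with exception handling.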
agent_trust_batch (grade: B)
Get trust scores for multiple subjects in one call (max 100)
| Name | Required | Description | Default |
|---|---|---|---|
| subjects | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must carry burden. It mentions the max batch size, but does not disclose behavior like idempotency, rate limits, error handling, or read-only nature. Minimal transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with no fluff, front-loaded with purpose and constraint. Every word is necessary.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, no annotations, and no parameter detail. The description only provides a high-level purpose and a limit. Lacks essential information for an agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and description adds no explanation of the 'subjects' parameter's structure (the 'type' and 'value' fields). Without guidance, an agent may not know what values to provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'Get' and the resource 'trust scores for multiple subjects', distinguishing it from sibling tools like agent_trust_score which likely handles single subjects. The batch nature and limit are explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for batch queries with a maximum of 100 subjects, but does not explicitly state when to use this tool vs agent_trust_score or provide exclusions/alternatives. Context from sibling tools suggests differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
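The 100-subject limit is the one constraint the description does state, so a caller with a longer list has to split it. Below is a minimal sketch of that splitting, assuming each subject is an object with 'type' and 'value' fields (as the parameter review above indicates) and that calls are sent as standard MCP JSON-RPC `tools/call` requests. The "domain" type string and the example.com hostnames are illustrative assumptions, not values confirmed by the server.

```python
from itertools import islice

MAX_BATCH = 100  # limit stated in the tool description


def chunked(items, size=MAX_BATCH):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def batch_calls(subjects):
    """Build one JSON-RPC tools/call request per batch of subjects."""
    return [
        {
            "jsonrpc": "2.0",
            "id": i,
            "method": "tools/call",
            "params": {
                "name": "agent_trust_batch",
                "arguments": {"subjects": batch},
            },
        }
        for i, batch in enumerate(chunked(subjects), start=1)
    ]


# Hypothetical subjects: the "domain" type string is an assumption.
subjects = [{"type": "domain", "value": f"host{n}.example.com"} for n in range(250)]
calls = batch_calls(subjects)
sizes = [len(c["params"]["arguments"]["subjects"]) for c in calls]
print(sizes)  # [100, 100, 50]
```

For a single subject, agent_trust_score is presumably the simpler choice, though the descriptions do not say so explicitly.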
agent_trust_feedback (grade C)
Submit feedback about an agent/wallet (positive or negative)
| Name | Required | Description | Default |
|---|---|---|---|
| rating | Yes | | |
| category | Yes | | |
| evidence | No | | |
| subjectType | Yes | | |
| subjectValue | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose all behavioral traits. It states 'Submit feedback' but does not clarify whether this is a write operation, if it updates existing entries, requires authentication, or has rate limits. The effect on the trust system is undefined, making the behavior opaque.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single short sentence with no fluff, but it lacks crucial information about parameters and behavior. It is concise but at the expense of completeness. A slightly longer description with key details would be more useful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations, no output schema, and 0% parameter description coverage, the description is severely incomplete. It does not explain return values, side effects, or parameter semantics. The tool's full context is missing, leaving the agent with insufficient information to invoke it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must add meaning. It hints that rating reflects positive/negative sentiment but does not explain the -2 to 2 scale, the meaning of category, evidence, or subjectType. The agent must infer from parameter names and enums, which is risky. More detail is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Submit feedback' and the resource 'agent/wallet', and mentions positive/negative, which conveys the tool's core purpose. However, it does not explicitly distinguish it from sibling trust tools like agent_trust_score or agent_trust_batch, but the action is distinct enough.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites or limitations, nor does it differentiate the tool from agent_trust_batch (batch score lookup) or agent_trust_score (single score lookup). The description is silent on usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
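Because the description leaves the rating scale implicit, a caller may want to validate arguments before submitting. The sketch below checks the -2 to 2 range mentioned in the parameter review above. It assumes the rating is an integer and that evidence is free text; the "wallet" type string, the "reliability" category, and the address are hypothetical placeholders, since the real enum values are not documented here.

```python
def feedback_arguments(subject_type, subject_value, rating, category, evidence=None):
    """Build arguments for agent_trust_feedback, rejecting out-of-range ratings."""
    # Assumption: rating is an integer on the -2..2 scale noted in the review.
    if isinstance(rating, bool) or not isinstance(rating, int) or not -2 <= rating <= 2:
        raise ValueError("rating must be an integer from -2 to 2")
    args = {
        "subjectType": subject_type,
        "subjectValue": subject_value,
        "rating": rating,
        "category": category,
    }
    if evidence is not None:  # evidence is optional, so omit it when absent
        args["evidence"] = evidence
    return args


# Hypothetical values: the type, category, and address are placeholders.
args = feedback_arguments("wallet", "0x0000000000000000000000000000000000000000", -1, "reliability")
print(args["rating"])  # -1
```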
agent_trust_score (grade C)
Get trust score for a wallet, agent card URL, or domain
| Name | Required | Description | Default |
|---|---|---|---|
| subjectType | Yes | | |
| subjectValue | Yes | | |
| includeDetails | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations present, so description must cover behavioral traits. It only states it 'gets trust score' without mentioning read-only nature, rate limits, error handling, or data freshness. Minimal information.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is extremely concise (one sentence) and front-loaded, but at the cost of missing essential information. Appropriate length for a simple tool, but incomplete.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 3 parameters, no output schema, and no annotations, the description provides almost no context. It fails to explain return values, parameter details, or error conditions, making it insufficient for correct use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and the description adds no meaning to the parameters. It lists the subject types (wallet, agent card URL, domain) without elaborating on them, gives no format hints for subjectValue, and leaves includeDetails unexplained even though it has a default value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'Get' and resource 'trust score' for specific subject types (wallet, agent card URL, domain). It distinguishes from siblings like agent_trust_batch by implying a single query, but lacks explicit differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives such as agent_trust_batch or agent_trust_feedback. No prerequisites or context for usage provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_webhook_create (grade C)
Create a webhook endpoint that relays requests to your agent
| Name | Required | Description | Default |
|---|---|---|---|
| agentId | No | | |
| pushUrl | No | | |
| ttlSeconds | No | | |
| description | No | | |
| deliveryMode | Yes | | |
| transformRules | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, and the description only says 'create' without disclosing side effects, idempotency, authentication, rate limits, or what happens on duplicate webhook. The behavioral implications of creation are not explained.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single sentence is concise but fails to provide necessary detail. The description is under-specified and does not earn its place given the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters (1 required), no output schema, and no annotations, the description is severely incomplete. It omits parameter explanations, return values, and usage context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and the description does not mention or explain any of the 6 parameters (e.g., deliveryMode, ttlSeconds). The description adds zero value beyond the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a webhook endpoint that relays requests to your agent,' which is a specific verb+resource. It distinguishes from sibling webhook tools (list_requests, poll, replay) by indicating creation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., poll vs create). No prerequisites or context are provided about the necessity of creation before other operations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
agent_webhook_list_requests (grade C)
List requests received by a webhook endpoint
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| offset | No | | |
| endpointId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must fully disclose behavior. Only states 'list requests' without indicating idempotency, authentication needs, or return format. The agent lacks essential behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Very concise single sentence, front-loaded with main purpose. No wasted words. However, for a tool with multiple parameters, slightly more detail could be warranted.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Incomplete given 3 parameters and no output schema/annotations. Does not describe what 'requests' are, ordering, or response structure. Lacks context for effective use compared to similar tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description should explain parameters. It does not define endpointId, limit, or offset. While names are somewhat self-explanatory, the absence of explanation for pagination (limit/offset) reduces usefulness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool lists requests for a webhook endpoint. It uses specific verb 'list' and resource 'requests', and the target 'webhook endpoint' is clear from name and description. Distinguishes from sibling tools like agent_webhook_create and agent_webhook_replay.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives. Does not mention that endpointId must come from agent_webhook_create, nor does it explain pagination (limit/offset) or any prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
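Since limit and offset are undocumented, the following is only a sketch of conventional offset pagination over this tool. It assumes the tool returns a plain list of stored requests and that a page shorter than the limit marks the end; neither is confirmed by the description. `call_tool` is a hypothetical stand-in for whatever MCP client function sends the call, and `fake_call_tool` simulates a server so the loop can be run locally.

```python
def iter_requests(call_tool, endpoint_id, page_size=50):
    """Yield every stored request for an endpoint, one page at a time."""
    offset = 0
    while True:
        page = call_tool(
            "agent_webhook_list_requests",
            {"endpointId": endpoint_id, "limit": page_size, "offset": offset},
        )
        yield from page
        if len(page) < page_size:  # assumed end-of-data signal
            return
        offset += page_size


# Simulated server holding 120 requests (illustration only).
STORED = [{"id": f"req-{n}"} for n in range(120)]


def fake_call_tool(name, arguments):
    start = arguments["offset"]
    return STORED[start:start + arguments["limit"]]


all_requests = list(iter_requests(fake_call_tool, "ep-1"))
print(len(all_requests))  # 120
```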
agent_webhook_poll (grade B)
Poll for new webhook requests (long polling, max 60s)
| Name | Required | Description | Default |
|---|---|---|---|
| after | No | | |
| limit | No | | |
| timeout | No | | |
| endpointId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so description carries full burden. It mentions long polling and a max timeout, but fails to disclose blocking behavior, return on timeout, or any side effects. Minimal behavioral context beyond the basic operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that efficiently conveys the core action and key behavioral trait (long polling, max 60s). No filler words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 4 parameters, no output schema, and no annotations, the description lacks crucial context such as what 'after' filters, what the response contains, and how to interpret results. It only hints at the polling mechanism.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 0% description coverage for its 4 parameters, and the tool description adds no meaning to 'after', 'limit', 'timeout', or 'endpointId'. The only hint is 'max 60s', which loosely relates to timeout but does not state the default or confirm that timeout is expressed in seconds.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Poll for new webhook requests') and resource, and includes key characteristics ('long polling, max 60s'). It distinguishes from sibling webhook tools (create, list, replay).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for waiting on new webhook requests, but does not explicitly guide when to use this versus alternatives like list_requests. No when-not-to-use or alternative names given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
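The "max 60s" hint suggests clamping the timeout on the client side before calling. A minimal sketch follows, assuming timeout is expressed in seconds and that `after` is an opaque cursor taken from a previous response; the description confirms neither, and the "req-41" cursor value is a placeholder.

```python
MAX_TIMEOUT_S = 60  # "max 60s" from the tool description; unit assumed to be seconds


def poll_arguments(endpoint_id, timeout_s=30, after=None, limit=None):
    """Build arguments for agent_webhook_poll with the timeout clamped to 0..60."""
    args = {
        "endpointId": endpoint_id,
        "timeout": max(0, min(timeout_s, MAX_TIMEOUT_S)),
    }
    if after is not None:  # assumed to be a cursor from an earlier poll
        args["after"] = after
    if limit is not None:
        args["limit"] = limit
    return args


print(poll_arguments("ep-1", timeout_s=300))
# {'endpointId': 'ep-1', 'timeout': 60}
print(poll_arguments("ep-1", timeout_s=10, after="req-41"))
# {'endpointId': 'ep-1', 'timeout': 10, 'after': 'req-41'}
```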
agent_webhook_replay (grade C)
Replay a stored webhook request
| Name | Required | Description | Default |
|---|---|---|---|
| toUrl | No | | |
| requestId | Yes | | |
| endpointId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry full behavioral disclosure. 'Replay' implies resending a request, but it does not explain side effects (e.g., duplicate submissions), idempotency, or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, which keeps it concise but sacrifices completeness. It could be expanded slightly without losing brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 3 parameters, no output schema, and no annotations, the description is insufficient. It does not explain return values or the replay process, leaving an agent with inadequate information to invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description adds zero meaning to parameters. Terms like toUrl, requestId, and endpointId are unexplained; the description does not clarify their roles or defaults.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Replay a stored webhook request' clearly states the action (replay) and resource (stored webhook request). However, it does not differentiate the tool from siblings such as agent_webhook_create or agent_webhook_list_requests, which makes it harder to tell their use cases apart.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., agent_webhook_create, agent_webhook_poll). No context on prerequisites, such as needing a stored request first.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bid_watch_get (grade B)
Get a bid notice detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description bears full responsibility. It discloses two return fields (firstSeenAt and ledgerVerified) but omits any behavioral details such as authentication, rate limits, or side effects. The read-only nature is implied but not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of two short sentences that convey the core purpose and key return fields. No unnecessary words or information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Even for a simple tool (one parameter, no output schema), the description remains incomplete. It does not explain what a 'bid notice' is, the structure of the timeline, or how to use the returned fields. More context is needed for autonomous use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage for the sole parameter 'itemId', and the description provides no clarification on its meaning, format, or how to obtain it. This leaves the agent with no guidance on what constitutes valid input.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool retrieves 'bid notice detail plus full event timeline', clearly identifying the resource and scope. It distinguishes from siblings like bid_watch_recent_changes and bid_watch_search by specifying the comprehensive nature of the output.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit when-to-use or alternative guidance is provided. The description implies usage for obtaining full details and timeline, but fails to clarify when one should use bid_watch_timeline or bid_watch_verify_ledger instead.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bid_watch_recent_changes (grade A)
Recent appearance / deadline-move / close / cancel / award events across all bid notices since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| entity | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description must cover behavioral traits. It discloses that output includes firstSeenAt and ledgerVerified and lists event types. However, it does not mention ordering, pagination behavior, error conditions, or authentication requirements, which are important for a polling tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core functionality, and contains no redundant information. Every phrase adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a change-list tool with no output schema, the description provides essential event types and output fields. However, it lacks details on pagination, error handling, and the optional 'entity' parameter. This is adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It explains the 'since' parameter but does not describe 'limit' (pagination) or 'entity' (filtering scope). This leaves two of three parameters unclear, which is a significant gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the tool's purpose: listing recent change events (appearance, deadline-move, close, cancel, award) for bid notices since a given timestamp. It distinguishes from sibling tools like bid_watch_get and bid_watch_search by focusing on events and time-based filtering.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for polling recent changes with a timestamp, but does not explicitly state when to use this tool versus alternatives like bid_watch_timeline or bid_watch_get. No exclusions or alternatives are mentioned, leaving some ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
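The required `since` parameter is described as an ISO8601 timestamp. As a sketch of how a polling client might produce one, the helper below formats a timezone-aware UTC time with Python's standard library. Whether the server also accepts other ISO8601 variants (for example a trailing "Z" or a date without a time) is not stated, so the explicit-offset form used here is an assumption.

```python
from datetime import datetime, timedelta, timezone


def recent_changes_arguments(hours_back, now=None, limit=None, entity=None):
    """Build arguments for bid_watch_recent_changes covering the last `hours_back` hours."""
    now = now or datetime.now(timezone.utc)
    since = (now - timedelta(hours=hours_back)).isoformat(timespec="seconds")
    args = {"since": since}
    if limit is not None:
        args["limit"] = limit
    if entity is not None:  # optional filter; its matching rules are undocumented
        args["entity"] = entity
    return args


fixed_now = datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc)
print(recent_changes_arguments(24, now=fixed_now))
# {'since': '2024-05-01T12:00:00+00:00'}
```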
bid_watch_search (grade C)
Search Japanese public-procurement bid notices (kkj.go.jp). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | No | | |
| since | No | | |
| entity | No | Procuring entity (調達機関); partial match | |
| status | No | | |
| bidType | No | Bid type, e.g. 一般競争入札 (open competitive bidding), 指名 (designated bidding), 随意 (negotiated contract) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full responsibility. It discloses the data source (kkj.go.jp) and that results include firstSeenAt and ledgerVerified, but it does not state whether the tool is read-only, any required permissions, rate limits, or side effects. For a search tool, this is insufficient transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and front-loaded with the core purpose. However, it could be more concise by omitting the second sentence about result fields if they are documented elsewhere, or it could expand slightly to add value. Overall, it is efficiently structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 6 parameters, no output schema, and no annotations, the description is notably incomplete. It does not explain how to construct effective queries, what the response format is beyond two fields, or how results are ordered or paginated. An agent would lack sufficient context to use the tool optimally.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is only 33% (descriptions provided for entity and bidType). The tool description adds no further explanation of the parameters beyond the schema. It mentions fields in results (firstSeenAt, ledgerVerified) but does not clarify how parameters like query, since, or status affect the search, leaving agents to guess.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches Japanese public-procurement bid notices from kkj.go.jp, and specifies that hits include firstSeenAt and ledgerVerified. It distinguishes itself from sibling tools by targeting a specific domain (bids on kkj.go.jp), but does not explicitly differentiate from other search tools like grant_watch_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for searching bid notices but provides no explicit guidance on when to use this tool versus alternative tools like bid_watch_recent_changes or bid_watch_timeline. No conditions for when not to use are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bid_watch_timeline (grade B)
Time-ordered events only for a bid notice (the differentiator: when it appeared, deadline moved, closed, was cancelled or awarded). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It states the tool returns time-ordered events and mentions fields (firstSeenAt, ledgerVerified), but does not clarify read-only nature, authentication needs, pagination, or error behavior. For a simple read tool, the lack of explicit read-only indication is a gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (two sentences) and front-loaded with the core purpose. However, it omits parameter information; including it would have justified a perfect score.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description should fully describe what the tool returns. It mentions event types and fields but does not specify output structure (list vs. object), pagination, or that itemId is required. Completeness is inadequate for a tool with no annotations or output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has a single required string parameter 'itemId' with 0% description coverage. The tool description does not explain what itemId refers to, leaving the agent to infer it identifies the bid notice. This adds no value beyond the schema itself.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns time-ordered events for a bid notice, listing specific event types (appearance, deadline change, closure, cancellation, award). This distinguishes it from sibling tools like bid_watch_get (which likely returns full notice details) and bid_watch_recent_changes (which returns recent changes, not full timeline).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for obtaining a bid notice's event timeline but does not explicitly contrast with alternatives like bid_watch_get, bid_watch_search, or bid_watch_recent_changes. No when-not-to-use guidance is provided; usage is implied but not explicitly guided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bid_watch_verify_ledger (Grade B)
Verify the hash-chain integrity of a bid notice (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
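The description names hash-chain verification but not the scheme behind it. As a minimal sketch of the general technique, assuming each event stores the previous event's hash and a SHA-256 digest over that hash plus its canonical JSON payload (the `prevHash`, `hash`, `payload`, and `checked` names and the all-zero genesis value are hypothetical, not taken from the server), a verifier producing `chainValid` and `brokenAt` could look like this:

```python
import hashlib
import json

# Assumed placeholder for the predecessor hash of the first event.
GENESIS = "0" * 64


def event_hash(prev_hash: str, payload: dict) -> str:
    """Chain the previous hash with the canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Build a hash-chained event list from plain payload dicts."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = event_hash(prev, payload)
        chain.append({"payload": payload, "prevHash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(events):
    """Recompute every link; report the index of the first broken event, if any."""
    prev = GENESIS
    for index, event in enumerate(events):
        link_ok = event["prevHash"] == prev
        hash_ok = event_hash(prev, event["payload"]) == event["hash"]
        if not (link_ok and hash_ok):
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(events)}
```

Editing any stored payload after the fact changes its recomputed digest, so the verifier stops at that event and reports its index. The real tool's return shape may differ from this sketch.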
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description must bear the full burden. It discloses return fields (chainValid, brokenAt, etc.) and mentions tamper detection, but omits behavioral details such as read-only nature, side effects, authentication requirements, or potential costs. This leaves an information gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is brief and front-loaded: the first sentence states the core purpose, the second lists return fields. Every sentence contributes value; no redundancy exists. However, the structure could be slightly improved by grouping behavioral and parameter info.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The schema has a single parameter and there is no output schema. Listing the return fields partially compensates for the missing output schema, but the description does not explain what each field means or when to invoke the tool (e.g., after fetching a bid notice).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The sole parameter 'itemId' is not described beyond its name and type. With 0% schema description coverage, the description adds no semantic context (e.g., format, source, or relationship to bid notices). This forces the agent to infer or guess the parameter's meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Verify the hash-chain integrity') and the specific resource ('a bid notice'). It distinguishes itself from sibling verify_ledger tools by specifying 'bid notice', ensuring proper differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance is provided on when to use this tool versus alternatives (e.g., other bid_watch tools or verify_ledger tools for different entity types). The description only implies tamper detection without clarifying prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grant_watch_get (Grade C)
Get a grant call detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description bears full responsibility. It only mentions returned fields, omitting behavioral traits like authentication requirements, rate limits, destructive potential, or response structure. Minimal behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences: efficient, but lacking crucial details. It front-loads the purpose but is too brief to stand alone; it could add a parameter explanation and more context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite low complexity (single parameter, no output schema), the description omits essential details: what itemId is, full response structure, and any usage conditions. It is incomplete for effective tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage and only one parameter (itemId), the description should clarify what itemId represents (e.g., a grant identifier, format). It adds no meaning beyond the schema's type definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves a grant call detail plus full event timeline, specifying two returned fields. This distinguishes it from sibling tools like grant_watch_search (search) and grant_watch_recent_changes (recent changes), and the resource-specific naming (grant_watch_get vs. bid_watch_get) makes the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., grant_watch_search for filtering, grant_watch_timeline for timeline-only). The description fails to give usage context or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grant_watch_recent_changes (Grade B)
Recent appearance / deadline-move / close / close-early events across all grant calls since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| field | No | | |
| limit | No | | |
| since | Yes | | |
| funder | No | | |
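Since `since` is the only required parameter and must be an ISO8601 timestamp, a caller might build the request as sketched below. The `tools/call` envelope follows the MCP JSON-RPC convention; the helper name, the integer type of `limit`, and the choice to normalise to UTC with a `+00:00` offset are assumptions, since the schema documents none of them.

```python
from datetime import datetime, timezone


def recent_changes_call(since: datetime, limit=None, request_id=1) -> dict:
    """Build a JSON-RPC `tools/call` request for grant_watch_recent_changes."""
    if since.tzinfo is None:
        # A naive datetime is ambiguous; refuse rather than guess the zone.
        raise ValueError("since must be timezone-aware")
    arguments = {
        # Normalise to UTC and serialise as ISO8601 with second precision.
        "since": since.astimezone(timezone.utc).isoformat(timespec="seconds"),
    }
    if limit is not None:
        arguments["limit"] = limit  # assumed to be an integer cap on results
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "grant_watch_recent_changes", "arguments": arguments},
    }
```

The returned dict can be serialised with `json.dumps` and sent over the server's Streamable HTTP transport. Whether the server also accepts a `Z` suffix or local offsets is not stated in the description.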
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses event types and included fields (firstSeenAt, ledgerVerified) but omits details on pagination, ordering, rate limits, or whether it's a read-only operation. The limit parameter suggests pagination, but this is not mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The two sentences are concise and front-load the key information (event types and time filter). However, the parameters and output fields could be listed more systematically.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite 4 parameters and no output schema, the description omits essential context: what 'field' and 'funder' control, how limit affects results, and the full return structure. This makes the tool harder to invoke correctly without additional knowledge.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description only explains the 'since' parameter (ISO8601 timestamp). It fails to clarify the purpose of 'field', 'limit', and 'funder', leaving the agent with an incomplete understanding of how to use these parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'Recent appearance / deadline-move / close / close-early events across all grant calls' since a timestamp, and distinguishes it from siblings like grant_watch_get (single call) and grant_watch_search (search criteria) by the scope 'across all grant calls'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like grant_watch_get, grant_watch_search, or grant_watch_timeline. The description only implies use for recent changes across all calls without stating exclusions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grant_watch_search (Grade B)
Search Japanese research-grant calls-for-proposals. Each hit includes firstSeenAt and ledgerVerified (hash-chain integrity).
| Name | Required | Description | Default |
|---|---|---|---|
| field | No | 研究分野 (research field) | |
| limit | No | | |
| query | No | | |
| since | No | | |
| funder | No | 配分機関 (JST/AMED/NEDO 等) (funding agency: JST, AMED, NEDO, etc.) | |
| status | No | | |
| amountMin | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden for behavioral disclosure. It adds transparency by stating that each hit includes firstSeenAt and ledgerVerified (hash-chain integrity), hinting at immutability verification. However, it does not disclose other behavioral traits like rate limits, authentication needs, or whether the search is exhaustive. The description is adequate but not thorough.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences that front-load the action and include key output details. Every sentence conveys essential information without extraneous content. It is well-structured for efficient parsing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters, no output schema, and no annotations, the description provides sufficient domain context and output constraints (firstSeenAt, ledgerVerified) but lacks details on parameter semantics and comprehensive behavioral traits. It's minimally complete for a search tool but leaves gaps that could confuse the agent, especially with Japanese-only parameter descriptions. The sibling tool context is addressed by the domain specificity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is only 29% (2 of 7 parameters have descriptions, both in Japanese). The description does not explain the meaning or usage of the remaining 5 parameters (limit, query, since, status, amountMin). Although the domain context helps with 'query' and 'field', the absence of explicit parameter semantics leaves the agent guessing. The description adds marginal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
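The coverage figures quoted in these assessments can be reproduced from the parameter tables. The sketch below assumes coverage is simply the share of parameters with a non-empty description, rounded to a whole percent; how the scorer actually computes it is not documented here, and the function name is hypothetical.

```python
def description_coverage(properties: dict) -> int:
    """Percentage of schema properties that carry a non-empty description."""
    if not properties:
        return 0  # arbitrary choice for a parameterless tool
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return round(100 * described / len(properties))


# Parameter descriptions as listed in the grant_watch_search table.
GRANT_WATCH_SEARCH = {
    "field": {"description": "研究分野"},
    "limit": {},
    "query": {},
    "since": {},
    "funder": {"description": "配分機関 (JST/AMED/NEDO 等)"},
    "status": {},
    "amountMin": {},
}

# Parameter descriptions as listed in the license_watch_search table.
LICENSE_WATCH_SEARCH = {
    "limit": {},
    "query": {"description": "業者名・登録番号・部分一致"},
    "since": {},
    "status": {},
    "licensor": {"description": "許認可権者"},
    "registry": {"description": "名簿種別"},
}

print(description_coverage(GRANT_WATCH_SEARCH))    # 2 of 7 described
print(description_coverage(LICENSE_WATCH_SEARCH))  # 3 of 6 described
```

Under that assumption, 2 of 7 rounds to 29% for grant_watch_search and 3 of 6 gives 50% for license_watch_search, matching the figures in the reviews.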
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches Japanese research-grant calls-for-proposals, using the specific verb 'search' and resource 'Japanese research-grant calls-for-proposals'. This distinguishes it from sibling tools like grant_watch_get and other *watch_search tools by specifying the domain (Japanese research grants) and mentioning hit fields (firstSeenAt, ledgerVerified).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, exclusions, or when to choose grant_watch_search over similar sibling tools like bid_watch_search or grant_watch_get. The usage context is only implied by the domain.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grant_watch_timeline (Grade A)
Time-ordered events only for a grant call (the differentiator: when it opened, deadline moved, closed, or closed early). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It mentions included fields (firstSeenAt, ledgerVerified) and time-ordering, but lacks details on sorting order, pagination, error conditions, or permissions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with core purpose, no extraneous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers the main purpose and hints at output fields, but lacks details on the full output structure or event types, which is noticeable given no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain the sole parameter 'itemId' (e.g., what it represents or how to specify it), failing to add meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'time-ordered events only for a grant call' and lists specific event types (opened, deadline moved, closed, closed early), which differentiates it from siblings like grant_watch_get or grant_watch_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by highlighting the differentiator and included fields, but does not explicitly state when not to use it or name alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grant_watch_verify_ledger (Grade B)
Verify the hash-chain integrity of a grant call (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses return fields (chainValid, brokenAt, etc.) but does not mention side effects, authentication needs, or whether it is read-only. With no annotations, more behavioral detail is needed, but the description provides basic output info.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first states the purpose and the second lists the key return values. It is front-loaded and contains no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers purpose and return fields but lacks parameter explanation and usage context. More detail about when to use and what the parameter means would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not explain the itemId parameter. Schema coverage is 0%, and the description adds no meaning to the single required parameter (e.g., what a grant call identifier is or how to obtain it).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: verifying hash-chain integrity for a grant call for tamper detection. It lists return fields, distinguishing it from siblings like grant_watch_get or other verify_ledger tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., other grant_watch tools or other verify_ledger tools). The description only explains what it does, not the context of use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
license_watch_get (Grade B)
Get a license registration detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. The description mentions return fields (firstSeenAt, ledgerVerified) but fails to disclose behavioral traits like idempotency, side effects, authorization requirements, or data freshness for a read operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two short sentences, front-loading the purpose and key return fields without unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, yet the description only mentions two return fields and a timeline. It lacks details on structure, pagination, errors, or parameter semantics, leaving users underinformed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and the description does not explain the 'itemId' parameter, providing no additional meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('license registration detail plus full event timeline'), distinguishing it from sibling tools like license_watch_timeline (which only returns timeline) and license_watch_recent_changes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when needing both detail and timeline, but lacks explicit guidance on when to use this tool versus alternatives such as license_watch_timeline or license_watch_search.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
license_watch_recent_changes (Grade B)
Recent appearance / revoked / suspended events across all license ledgers since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| registry | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It notes changes are across all ledgers and items include 'firstSeenAt' and 'ledgerVerified', but omits critical traits like read-only behavior, pagination, rate limits, or what happens with invalid timestamps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with no redundancy: first states purpose and key input, second adds return detail. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, no annotations, and 3 parameters, the description lacks necessary information on usage boundaries, error behavior, and full parameter semantics, making it incomplete for reliable tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must explain all parameters. It explains 'since' as 'ISO8601 timestamp' but does not clarify 'limit' (presumably page size) or 'registry' (possibly a filter). The description adds minimal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists 'recent appearance / revoked / suspended events across all license ledgers' since a timestamp, distinguishing it from sibling tools like license_watch_get (single entity) and license_watch_search (query-based). The description opens with a noun phrase rather than a verb, but together with the tool name the purpose is specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implicitly suggests usage when needing recent changes since a given timestamp, but lacks explicit guidance on when not to use it (e.g., for specific ledgers or historical queries) and does not mention alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
license_watch_search (Grade B)
Search Japanese license / registration ledgers (FSA menkyo: 金融商品取引業者, 預金取扱金融機関 …). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
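| limit | No | | |
| query | No | 業者名・登録番号・部分一致 (business name or registration number; partial match) | |
| since | No | | |
| status | No | | |
| licensor | No | 許認可権者 (関東財務局長・内閣総理大臣(金融庁)等) (licensing authority, e.g. Director-General of the Kanto Local Finance Bureau, Prime Minister (Financial Services Agency)) | |
| registry | No | 名簿種別 (fsa-kinyushohin / fsa-ginkou など) (registry type, e.g. fsa-kinyushohin, fsa-ginkou) | |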
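All six parameters are optional, so a caller will usually send only the filters it needs. The following sketch shows one way to assemble the `arguments` object while omitting unset filters. The class name and every field type are assumptions (the schema publishes no types here); only the parameter names and the `fsa-kinyushohin` registry value come from the table above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class LicenseSearch:
    """Optional filters for license_watch_search; types are assumed."""

    query: Optional[str] = None      # business name or registration number
    registry: Optional[str] = None   # e.g. "fsa-kinyushohin" or "fsa-ginkou"
    licensor: Optional[str] = None   # licensing authority
    status: Optional[str] = None     # accepted values are undocumented
    since: Optional[str] = None      # presumably an ISO8601 timestamp
    limit: Optional[int] = None      # presumably a cap on returned hits

    def to_arguments(self) -> dict:
        """Drop unset filters so the request carries only explicit values."""
        return {key: value for key, value in asdict(self).items() if value is not None}


arguments = LicenseSearch(registry="fsa-kinyushohin", limit=10).to_arguments()
print(arguments)
```

Leaving unset filters out, rather than sending `null`, avoids depending on how the server treats explicit nulls, which the description does not specify.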
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. The description adds that each hit includes 'firstSeenAt' and 'ledgerVerified', which is helpful. However, it does not disclose rate limits, authentication needs, pagination behavior, or search matching details (exact vs fuzzy).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences that efficiently convey the core purpose and a key result detail. It is front-loaded but could benefit from slightly more structure (e.g., listing supported registries).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, no output schema, and no annotations, the description is incomplete. It fails to explain return value structure beyond two fields, parameter usage, or expected behavior when filters are combined. Essential context is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 50% (3 of 6 parameters: 'query', 'licensor', and 'registry'), and the description adds no value to parameters beyond the schema. It does not explain the undescribed parameters 'limit', 'since', and 'status', or any defaults or formats. The description only mentions result fields, not parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches Japanese license/registration ledgers with specific examples (FSA menkyo). It distinguishes itself from sibling watch_search tools (e.g., bid_watch_search, grant_watch_search) by specifying the Japanese regulatory context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for Japanese license searches but provides no explicit guidance on when to use this versus other watch_search tools. No exclusion criteria or alternative suggestions are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
license_watch_timeline (Grade B)
Time-ordered events only for a license registration (the differentiator: when it appeared, when it was revoked / expired / suspended). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It mentions included fields but does not describe read-only nature, error handling, performance, or authentication needs. The agent cannot assess side effects or return completeness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no extraneous words. Front-loaded with purpose and differentiator.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a single parameter and no output schema, the description is minimally adequate. It explains purpose and included fields but lacks details on output structure or potential errors.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description should explain the itemId parameter. It indirectly implies itemId is a license registration ID but adds no specifics about format or validation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'time-ordered events only for a license registration' and specifies the differentiator: events like when it appeared, revoked, expired, or suspended. It distinguishes from sibling tools like license_watch_get (single record) and license_watch_search (filtered list).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies this tool is for viewing timeline events, but does not explicitly state when to use it over alternatives like license_watch_recent_changes or license_watch_search. No exclusions or prerequisites are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
license_watch_verify_ledger (Grade B)
Verify the hash-chain integrity of a license registration (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must cover behavioral traits. It lists return fields but does not disclose whether the operation is read-only, requires authentication, or has any side effects. The description lacks important transparency for a verification tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: one sentence states the purpose and a second lists the return fields. It is well-structured and avoids unnecessary text, though it could be slightly more detailed without losing conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one parameter and no output schema, the description covers the basic purpose and return fields. However, it lacks context on error handling, prerequisites, or when to use the tool among many similar siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain the single parameter 'itemId'. It only hints at 'license registration', but does not specify what the identifier is or how it should be formatted.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: verifying hash-chain integrity of a license registration for tamper detection. Naming the license domain sets it apart from the verify_ledger tools of other domains, and the verification action sets it apart from the other license_watch tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. While the description implies use for tamper detection, it does not specify exclusions or provide context for choosing among similar verify_ledger tools or other license_watch tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
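For readers unfamiliar with hash-chain tamper detection, the idea behind a verify_ledger result can be sketched as follows. This is a minimal illustration, not the server's implementation: the SHA-256 scheme, the canonical-JSON encoding, the `GENESIS` value, and the `prevHash` / `hash` / `checked` key names are all assumptions. Only `chainValid` and `brokenAt` are field names taken from the tool description.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed "previous hash" for the first event in a chain


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Append an event whose hash commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prevHash": prev, "hash": event_hash(prev, payload)})


def verify_chain(chain: list) -> dict:
    """Recompute every link; report the first index where the chain breaks."""
    prev = GENESIS
    for index, event in enumerate(chain):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(chain)}


if __name__ == "__main__":
    ledger = []
    append_event(ledger, {"type": "appeared"})
    append_event(ledger, {"type": "suspended"})
    print(verify_chain(ledger))            # chain is intact
    ledger[0]["payload"]["type"] = "revoked"  # simulate tampering with the first event
    print(verify_chain(ledger))            # break is reported at index 0
```

Because each hash covers the previous one, editing any past event invalidates that link, which is why a single `brokenAt` position is enough to localize the damage.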
ordinance_watch_get (grade C)
Get an ordinance detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must convey all behavioral traits. It only mentions that the tool returns firstSeenAt and ledgerVerified, but does not disclose whether it is read-only, requires authentication, has rate limits, or any side effects. The behavioral profile is under-specified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences, which is concise but omits critical details. It front-loads the purpose but sacrifices completeness; here, brevity leads to gaps.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (one parameter, no output schema), the description is incomplete. It does not specify the return structure beyond two fields, possible error conditions, or expected input format. More context is needed for reliable use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It provides no explanation of the itemId parameter, its format, or constraints. The tool description does not mention the parameter at all, so it adds no semantic value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool gets an ordinance detail plus full event timeline and returns specific fields (firstSeenAt, ledgerVerified). It distinguishes from sibling tools in other domains (e.g., bid_watch_get) but does not explicitly differentiate from the sibling ordinance_watch_timeline tool, which may cause confusion about which to use for timeline-related queries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like ordinance_watch_timeline, ordinance_watch_recent_changes, or others. The description lacks any context about prerequisites, appropriate scenarios, or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ordinance_watch_recent_changes (grade B)
Recent appearance / amendment / repeal events across all ordinances since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| since | Yes | ||
| issuerCode | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description should fully disclose behavioral traits. It mentions output fields but omits pagination, ordering, and error handling, and it leaves read-only behavior implied rather than stated. It also does not clarify that the optional issuerCode filter narrows the 'across all ordinances' scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded single sentence with clear action and scope, plus a second sentence adding output detail. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and 0% schema description coverage, the description lacks details on return format, pagination, ordering, and edge cases. It also fails to reconcile the 'all ordinances' phrasing with the optional issuerCode filter.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must explain all parameters. It explains 'since' (ISO8601 timestamp) but not 'limit' (default 100) or 'issuerCode' (filtering scope). Only partial compensation for the coverage gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description names specific event types (appearance / amendment / repeal) and a specific resource (ordinances), clearly distinguishing the tool from siblings like ordinance_watch_get or ordinance_watch_timeline, and from the recent_changes tools for other entities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives (e.g., ordinance_watch_timeline for full history, ordinance_watch_get for a single entity). The description implies use for recent changes but does not state exclusions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
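As an illustration of how the three parameters fit together, the sketch below builds an MCP `tools/call` JSON-RPC request for ordinance_watch_recent_changes. The helper name `recent_changes_request` is hypothetical, and the value types for `limit` and `issuerCode` are assumptions, since the schema carries no descriptions. `since` is rendered as an ISO8601 timestamp, as the description requires.

```python
import json
from datetime import datetime, timezone


def recent_changes_request(since: datetime, limit=None, issuer_code=None, request_id=1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` payload; optional arguments are omitted when unset."""
    if since.tzinfo is None:
        raise ValueError("since must be timezone-aware so the ISO8601 string is unambiguous")
    arguments = {"since": since.astimezone(timezone.utc).isoformat()}
    if limit is not None:
        arguments["limit"] = limit
    if issuer_code is not None:
        arguments["issuerCode"] = issuer_code  # assumed to narrow results to one issuer
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ordinance_watch_recent_changes", "arguments": arguments},
    }


if __name__ == "__main__":
    request = recent_changes_request(datetime(2025, 1, 1, tzinfo=timezone.utc), limit=20)
    print(json.dumps(request, indent=2))
```

Leaving unset optional arguments out of the payload, rather than sending nulls, avoids guessing how the server treats explicit null values.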
ordinance_watch_search (grade B)
Search Japanese national laws / ordinances (e-Gov 法令検索 [Law Search] v2; Stage 1 covers the national level only; 自治体例規 [municipal ordinances and regulations] ships in a follow-up). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
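| query | No | Law name / law number / abbreviation (partial match) | |
| since | No | ||
| status | No | ||
| issuerCode | No | Local government code per JIS X 0401/0402 (not needed for the national level) | |
| jurisdiction | No | '国' (national; Stage 1 only) | |
| ordinanceType | No | 法律 (act) / 政令 (cabinet order) / 省令 (ministerial ordinance) / 勅令 (imperial ordinance) / 規則 (rule) / 憲法 (constitution) … | |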
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the output fields (firstSeenAt, ledgerVerified) and the national-only stage limitation. However, it omits behavioral details like authentication requirements, rate limits, pagination, or error handling. It is moderately transparent but incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core function, and contains no redundant information. Every part serves a purpose, making it efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 7 parameters, no output schema, and no annotations, the description is insufficient. It lacks details on response format, pagination, error codes, and parameter interactions. The national-only warning helps, but overall completeness is low.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds little beyond the schema's parameter descriptions. For limit, since, and status, which have no schema descriptions, the description provides no additional meaning. It mentions output fields but does not clarify parameter usage or constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches Japanese national laws/ordinances using e-Gov, distinguishing it from sibling tools like get, recent_changes, timeline, verify_ledger. It specifies the scope (Stage 1 national only) and what each hit includes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidance on when to use this tool versus alternatives such as ordinance_watch_get or ordinance_watch_timeline. It only mentions the scope but no when-to-use or when-not-to-use instructions for an AI agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
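The Stage 1 restriction can be enforced on the client side before a search is sent. The following is a rough sketch under stated assumptions: `ordinance_search_args` is a hypothetical helper, the example values are illustrative only, and issuerCode is left out because the schema notes it is not needed at the national level.

```python
def ordinance_search_args(query=None, ordinance_type=None, status=None,
                          since=None, limit=None, jurisdiction="国") -> dict:
    """Assemble ordinance_watch_search arguments, rejecting non-national jurisdictions."""
    if jurisdiction != "国":
        # The tool description states that Stage 1 covers the national level only.
        raise ValueError("Stage 1 covers the national level only; use jurisdiction='国'")
    candidate = {
        "query": query,
        "ordinanceType": ordinance_type,
        "status": status,
        "since": since,
        "limit": limit,
        "jurisdiction": jurisdiction,
    }
    # Drop unset optional parameters instead of sending nulls.
    return {name: value for name, value in candidate.items() if value is not None}


if __name__ == "__main__":
    print(ordinance_search_args(query="民法", ordinance_type="法律", limit=5))
```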
ordinance_watch_timeline (grade B)
Time-ordered events only for an ordinance (the differentiator: when it appeared / was amended / was repealed). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It mentions data fields (firstSeenAt and ledgerVerified) and implies read-only, but does not clarify idempotency, error handling, rate limits, or safety profile.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences, front-loaded with the core purpose and differentiator. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers the purpose and key fields. However, it lacks details on pagination, ordering, or event types, leaving some gaps for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter (itemId) with no description, and the description does not explain what itemId represents or how to obtain it. With 0% schema coverage, the description should compensate but fails to add meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides time-ordered events for an ordinance, specifically when it appeared, was amended, or repealed. This distinguishes it from sibling tools like ordinance_watch_get (get details) and ordinance_watch_recent_changes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for timeline events but does not explicitly state when to use this versus other tools like ordinance_watch_search. The phrase 'the differentiator' gives context but lacks explicit when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
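The kind of "use X instead of Y when Z" guidance the reviews ask for could look like the routing table below. The mapping is inferred from the five ordinance_watch descriptions and is an assumption, not guidance published by the server; the `Intent` names are hypothetical.

```python
from enum import Enum


class Intent(Enum):
    FIND_BY_KEYWORD = "find_by_keyword"   # look up ordinances by name, number, or type
    FULL_RECORD = "full_record"           # one ordinance: detail plus full event timeline
    EVENTS_ONLY = "events_only"           # one ordinance: time-ordered events only
    CHANGES_SINCE = "changes_since"       # events across ordinances since a timestamp
    CHECK_INTEGRITY = "check_integrity"   # hash-chain tamper detection for one record


_TOOL_BY_INTENT = {
    Intent.FIND_BY_KEYWORD: "ordinance_watch_search",
    Intent.FULL_RECORD: "ordinance_watch_get",
    Intent.EVENTS_ONLY: "ordinance_watch_timeline",
    Intent.CHANGES_SINCE: "ordinance_watch_recent_changes",
    Intent.CHECK_INTEGRITY: "ordinance_watch_verify_ledger",
}


def choose_ordinance_tool(intent: Intent) -> str:
    """Return the ordinance_watch tool whose description best matches the intent."""
    return _TOOL_BY_INTENT[intent]


if __name__ == "__main__":
    print(choose_ordinance_tool(Intent.EVENTS_ONLY))
```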
ordinance_watch_verify_ledger (grade A)
Verify the hash-chain integrity of an ordinance record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It describes return values but does not state whether the operation is read-only/idempotent, any rate limits, or side effects. The description adds context beyond what annotations would provide, but misses behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences that concisely state the purpose and list the return fields. No wasted words, front-loaded with the main verb.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 1 parameter and no output schema, the description covers purpose and output fields. However, it could mention the tool's safety profile (likely read-only) for completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description should compensate. The description mentions 'ordinance record' and 'hash-chain integrity', which gives context for the itemId parameter, but does not add format or constraints. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool verifies hash-chain integrity for tamper detection, and lists specific return fields. It distinguishes from siblings (other verify_ledger tools) by specifying 'ordinance record'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool vs alternatives (e.g., other verify_ledger tools for different domains). Usage is implied by the tool's domain (ordinance) but no exclusions or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
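Since the tool has no output schema, a caller has to interpret the result defensively. The sketch below shows one way to do that using only the `chainValid` and `brokenAt` field names from the description. The type of `brokenAt` (index, event ID, or something else) is not documented, so it is echoed back unchanged. `summarize_verification` is a hypothetical helper.

```python
def summarize_verification(result: dict) -> str:
    """Turn a verify_ledger result into a short verdict without assuming undocumented fields."""
    chain_valid = result.get("chainValid")
    if chain_valid is True:
        return "chain intact"
    if chain_valid is False:
        # brokenAt is documented as present "if any"; its format is an open question.
        return f"chain broken at {result.get('brokenAt')!r}"
    return "no verdict: chainValid missing from result"


if __name__ == "__main__":
    print(summarize_verification({"chainValid": True, "brokenAt": None}))
    print(summarize_verification({"chainValid": False, "brokenAt": 3}))
```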
pharma_watch_get (grade C)
Get a pharmaceutical record detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It implies a read operation but does not explicitly state read-only behavior, permissions, rate limits, or side effects. The description is minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose and return fields. No redundancy, appropriate length for a simple get tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description mentions return fields but does not describe the full structure or timeline format. With no output schema and no annotations, the description is insufficient for complete understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain the itemId parameter at all. No information about expected format or value range is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get' and identifies the resource as 'pharmaceutical record detail plus full event timeline'. It also specifies return fields. However, it does not explicitly distinguish from other watch_get tools, though the name provides enough context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like pharma_watch_search or pharma_watch_timeline. The description only states what it does without any contextual recommendations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pharma_watch_recent_changes (grade B)
Recent approval / NHI-listed / price-revised events across all pharmaceutical records since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| since | Yes | ||
| category | No | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description must carry the full burden. It only mentions that items include firstSeenAt and ledgerVerified, but omits whether the tool is read-only, if pagination is needed, or any rate limits. Missing critical behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero filler. Front-loaded with purpose and output details. Efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and no annotations, the description should cover all necessary context. It fails to explain the output format beyond two fields, how to paginate (limit), or what category values are valid. Incomplete for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must explain all parameters. It addresses 'since' only implicitly via 'ISO8601 timestamp' and completely omits 'limit' (default 100) and 'category' (string). No additional meaning is provided for those two parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists recent approval/NHI-listed/price-revised events since a given ISO8601 timestamp, distinguishing it from sibling tools like pharma_watch_get (retrieve specific) or pharma_watch_search (search by criteria).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for recent events but provides no explicit when-not or alternative guidance. The sibling context exists but the description itself lacks direction for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
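Because the schema does not document formats, a client may want to validate arguments before calling pharma_watch_recent_changes. The sketch below is one cautious approach. The rule that `limit` must be a positive integer is an assumption, and `validate_recent_changes_args` is a hypothetical helper. Only the ISO8601 requirement for `since` comes from the description.

```python
from datetime import datetime


def validate_recent_changes_args(since: str, limit=None, category=None) -> dict:
    """Check `since` parses as ISO8601 with an offset, then return the arguments to send."""
    # datetime.fromisoformat rejects a trailing "Z" on older Python versions, so normalize it.
    normalized = since[:-1] + "+00:00" if since.endswith("Z") else since
    parsed = datetime.fromisoformat(normalized)  # raises ValueError on malformed input
    if parsed.tzinfo is None:
        raise ValueError("since should carry a UTC offset to avoid ambiguity")
    if limit is not None:
        if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
            raise ValueError("limit is assumed to be a positive integer")
    arguments = {"since": since}
    if limit is not None:
        arguments["limit"] = limit
    if category is not None:
        arguments["category"] = category  # valid category values are not documented
    return arguments


if __name__ == "__main__":
    print(validate_recent_changes_args("2025-04-01T00:00:00Z", limit=50))
```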
pharma_watch_search (grade B)
Search Japanese pharmaceutical approvals (PMDA) and NHI-listed drugs (MHLW yakka). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| query | No | Brand name / ingredient name (partial match) | |
| since | No | ||
| status | No | ||
| category | No | PMDA field (e.g. 第1) / MHLW segment (e.g. 内用薬, oral drugs) | |
| applicant | No | Marketing authorization holder / manufacturer | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It only mentions that results include 'firstSeenAt' and 'ledgerVerified', but lacks details on read-only behavior, pagination, ordering, or confidentiality. Insufficient for a search tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that front-load the core purpose and key result fields, with no wasted words or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 6 parameters, no output schema, and no annotations, the description is incomplete. It does not explain the meaning of 'firstSeenAt' or 'ledgerVerified', nor details on query semantics (e.g., partial match). The agent lacks crucial context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50%; the tool description adds no parameter info beyond what schema already provides. Parameters like limit, since, and status are not explained in the description, leaving gaps for the agent.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches Japanese pharmaceutical approvals (PMDA) and NHI-listed drugs (MHLW yakka), distinguishing it from sibling tools like pharma_watch_get or pharma_watch_timeline by specifying the generic search function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for searching but does not explicitly state when to use this tool over alternatives, such as retrieving a specific drug (pharma_watch_get) or viewing history (pharma_watch_timeline). No exclusions or contextual guidance provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pharma_watch_timeline (grade C)
Time-ordered events only for a pharma record (the differentiator: when it was approved / NHI-listed / price-revised). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must carry full behavioral disclosure. It mentions included fields (firstSeenAt, ledgerVerified) but does not state whether the operation is read-only, has side effects, requires authentication, or has rate limits. Lacks critical behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences efficiently convey purpose and key fields. Front-loaded with the main function. No wasted words, though the parameter info is missing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, so the agent lacks return structure info beyond mentioned fields. One parameter undocumented. The description is insufficient to fully use the tool correctly without prior knowledge. Sibling tools (e.g., pharma_watch_get) with more context suggest this is incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has one required parameter (itemId) with 0% schema description coverage. The description does not explain what itemId represents or how to obtain it. No additional semantics provided beyond the parameter name.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool returns time-ordered events for a pharma record, specifying the differentiator (approved, NHI-listed, price-revised). This distinguishes it from other pharma_watch siblings like get, search, or verify_ledger. The description lacks an explicit verb, which is slightly vague, but the purpose is clear overall.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implied usage: use when you need the timeline of a specific pharma record. However, no explicit guidance on when not to use it or alternatives among siblings like pharma_watch_get or pharma_watch_recent_changes. The 'differentiator' hint helps but is not fully explicit.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pharma_watch_verify_ledger (grade A)
Verify the hash-chain integrity of a pharma record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses that the tool is for tamper detection and lists return fields (chainValid, brokenAt, etc.), but does not mention side effects, authentication needs, or whether the operation is read-only. The description is minimally adequate for a verification operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences: one with a clear action and one listing the return fields. It is front-loaded with the verb and resource, and every word adds value. No extraneous information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, no output schema), the description adequately covers the purpose and return values. However, it lacks usage guidelines and error conditions, which would improve completeness for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter (itemId) with no description coverage. The tool description does not elaborate on the parameter's format, constraints, or example values. However, the name 'itemId' is inherently clear, and the single parameter structure reduces ambiguity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'verify' and the specific resource 'hash-chain integrity of a pharma record' with tamper detection. It distinguishes from sibling tools (e.g., bid_watch_verify_ledger, grant_watch_verify_ledger) by specifying the pharma domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not explicitly state when to use this tool vs alternatives, nor does it provide exclusion criteria. While the name implies verification, there is no guidance on when verification is appropriate compared to other pharma watch tools like get, search, or timeline.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubcom_watch_get (grade C)
Get a public-comment notice detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, so the description must cover behavior. It states it returns data but does not clarify if the operation is read-only, if authentication is needed, or any side effects. The mention of specific return fields is helpful but incomplete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short with two sentences. It is concise, but it sacrifices completeness. It front-loads the main purpose but lacks detail on parameters and behavioral context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple lookup tool with one parameter and no output schema, the description provides the essential purpose but omits details about the event timeline and the meaning of returned fields. It is adequate but not thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter (itemId) with 0% description coverage. The description does not explain what itemId is, its format, or how to obtain it. Since schema coverage is low, the description should compensate, but it fails to do so.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Get', the resource 'public-comment notice detail plus full event timeline', and mentions specific fields returned. This distinguishes it from sibling tools like pubcom_watch_search or pubcom_watch_timeline.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention scenarios like searching, listing recent changes, or verifying ledger, nor does it give conditions for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
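To make the review's points concrete, here is a hedged sketch of what a fuller definition for this tool could look like, written as a Python dict in the shape of an MCP tool entry. The description wording, the `itemId` parameter description, and the claim that the tool is read-only are all assumptions, not facts confirmed by the listing. `readOnlyHint` is a standard MCP tool annotation, and `undescribed_params` is a hypothetical helper.

```python
# Hypothetical rewritten definition; the server's real behavior is not confirmed here.
pubcom_watch_get = {
    "name": "pubcom_watch_get",
    "description": (
        "Get one public-comment notice plus its full event timeline. "
        "Returns firstSeenAt and ledgerVerified. "
        "Use pubcom_watch_search to find notices by keyword, and "
        "pubcom_watch_timeline when only the events are needed."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "itemId": {
                "type": "string",
                # Assumed meaning: the identifier of a single notice.
                "description": "Identifier of the notice to fetch.",
            }
        },
        "required": ["itemId"],
    },
    # Assumption: a lookup tool like this has no side effects.
    "annotations": {"readOnlyHint": True},
}


def undescribed_params(tool: dict) -> list:
    """Return the names of input parameters that lack a non-empty description."""
    props = tool["inputSchema"].get("properties", {})
    return [name for name, spec in props.items()
            if not spec.get("description", "").strip()]


missing = undescribed_params(pubcom_watch_get)
```

A definition along these lines would address the behavior, parameter, and usage-guidance dimensions at the cost of a slightly longer description.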
pubcom_watch_recent_changes (grade B)
Recent appearance / deadline-move / close / result-published events across all notices since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| agency | No | | |
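As an illustration of the parameters above, the sketch below builds a JSON-RPC `tools/call` request for this tool, which is the standard MCP method for invoking a tool. The helper name `recent_changes_request` is hypothetical. Whether the server expects a `Z` suffix or a `+00:00` offset in `since`, and what values `agency` accepts, are not stated in the listing, so treat those details as assumptions.

```python
from datetime import datetime, timezone


def recent_changes_request(since: datetime, limit=None, agency=None, request_id=1) -> dict:
    """Build a tools/call payload; optional arguments are omitted when not given."""
    if since.tzinfo is None:
        # A naive datetime would produce an ambiguous timestamp.
        raise ValueError("since must be timezone-aware")
    arguments = {"since": since.astimezone(timezone.utc).isoformat()}
    if limit is not None:
        arguments["limit"] = limit
    if agency is not None:
        arguments["agency"] = agency
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "pubcom_watch_recent_changes", "arguments": arguments},
    }


request = recent_changes_request(datetime(2024, 1, 1, tzinfo=timezone.utc), limit=20)
```

Leaving out unset optional parameters, rather than sending nulls, lets the server apply whatever defaults it defines.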
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It mentions that items include firstSeenAt and ledgerVerified, but does not disclose key behaviors such as pagination, ordering, rate limits, potential for empty results, or whether the events list is exhaustive. The read-only nature is implied but not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences long, front-loading the core functionality and then adding detail about returned fields. It is concise and avoids redundancy, though it could be slightly more structured with parameter hints.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema and low parameter coverage, the description is insufficient. It does not explain the output format beyond two fields, lacks information on sorting, event types' representation, or whether results are limited by default. Additional context about typical use cases or limitations would be valuable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It explains the required 'since' parameter as an ISO8601 timestamp, but does not explain the optional 'limit' (default 100) or 'agency' parameters. Two out of three parameters lack any contextual explanation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns recent events (appearance, deadline-move, close, result-published) across all notices since a timestamp. It distinguishes itself from tools in other domains by specifying the resource (pubcom notices), and from other pubcom tools like pubcom_watch_get or pubcom_watch_search by specifying the operation (recent changes).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies it is for polling recent changes after a given timestamp, but does not explicitly state when to use this tool versus alternatives like pubcom_watch_get, pubcom_watch_search, or pubcom_watch_timeline. No 'when not to use' guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubcom_watch_search (grade C)
Search e-Gov public-comment notices. Each hit includes firstSeenAt and ledgerVerified (hash-chain integrity).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | No | | |
| since | No | | |
| agency | No | Competent ministry or administrative agency (所管府省・行政機関) | |
| status | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It mentions output fields (firstSeenAt, ledgerVerified) but does not state read-only nature, authentication needs, rate limits, or any side effects. The tool is likely a read search, but this is not explicit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short (two sentences) but sacrifices essential parameter information. While it front-loads the purpose and output fields, the lack of parameter detail makes it under-specified rather than concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters, low schema coverage, no output schema, and similar sibling tools, the description is incomplete. It does not explain how to use parameters, constraints (e.g., date formats), or the full output structure. The tool's complexity is not fully captured.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is only 20% (only 'agency' has a description). The description adds no parameter guidance: it does not explain 'query', 'since', 'limit', or 'status' beyond what the schema provides. This is insufficient for a 5-parameter tool with low schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
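The coverage figure quoted in this review can be reproduced with a few lines. This is a sketch under the assumption that coverage means the share of input parameters carrying a non-empty `description`. The scorer's actual formula is not documented in the listing, and `description_coverage` is a hypothetical helper.

```python
def description_coverage(properties: dict) -> float:
    """Fraction of parameters whose schema entry has a non-empty description."""
    if not properties:
        return 0.0
    described = sum(
        1 for spec in properties.values() if spec.get("description", "").strip()
    )
    return described / len(properties)


# Parameters of pubcom_watch_search as shown in the table above;
# only 'agency' carries a description.
pubcom_search_params = {
    "limit": {},
    "query": {},
    "since": {},
    "agency": {"description": "所管府省・行政機関"},
    "status": {},
}

coverage = description_coverage(pubcom_search_params)
print(f"{coverage:.0%}")
```

With one of five parameters described, this yields the 20% stated in the review.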
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches e-Gov public-comment notices, distinguishing it from sibling tools that search other categories like bids or grants. The verb 'Search' and resource 'public-comment notices' are specific, and the mention of output fields adds clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Usage is implied by the resource type: use for public comment notices. However, no explicit guidance on when to use versus alternatives (e.g., pubcom_watch_get for single notice, pubcom_watch_recent_changes for changes). No exclusions or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubcom_watch_timeline (grade C)
Time-ordered events only for a notice (the differentiator: when it opened, deadline moved, closed, or result was published). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must disclose behavioral traits. It states included fields but omits whether the tool is read-only, authentication needs, rate limits, or side effects. For a watch tool, read-only is implied but not confirmed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with front-loaded key information. No redundancy. Could be slightly more structured but remains efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a simple tool with one parameter and no output schema. Describes purpose and key fields but lacks output format details, ordering, or differentiation from sibling timeline tools. Additional context would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and description adds minimal meaning above the schema. It mentions 'for a notice' but does not explicitly state that itemId is the notice identifier. The parameter definition is not enhanced.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides time-ordered events for a notice, listing specific event types (opened, deadline moved, closed, result published) and fields (firstSeenAt, ledgerVerified). The tool name and context differentiate it from sibling tools like pubcom_watch_get or pubcom_watch_recent_changes, though it could more explicitly state it returns a list of timeline events.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The description implies it is for a notice's timeline but does not distinguish from pubcom_watch_recent_changes or pubcom_watch_get. Lacks when-to-use, when-not-to-use, or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pubcom_watch_verify_ledger (grade B)
Verify the hash-chain integrity of a notice (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description carries full burden. Lists return fields but does not disclose side effects, performance, or error behavior (e.g., what happens if integrity fails). Barely adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One sentence plus a list of return fields. Front-loaded and free of fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 1-parameter verification tool with no output schema, the description lists return fields but lacks explanation of how to interpret them (e.g., what 'brokenAt' means). Adequate but not thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and description does not describe the single parameter 'itemId' at all. Agent receives no guidance on its meaning or expected format.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Verify' and resource 'hash-chain integrity of a notice', with explicit return fields. While sibling differentiation is implicit via naming, the purpose is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like pubcom_watch_get or other verify_ledger tools. No prerequisites or exclusions provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_watch_get (grade C)
Get a recall detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It only mentions return fields (firstSeenAt, ledgerVerified) but does not disclose any behavioral traits like read-only, authentication, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at two sentences, front-loading the main purpose and adding key return fields with no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple get tool, the description is incomplete: it lacks parameter semantics, usage context, and does not cover potential error scenarios or prerequisites despite having no output schema to compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage for the parameter 'itemId', and the description adds no meaning about its format, source, or constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action 'Get' and the resource 'recall detail plus full event timeline', distinguishing it from sibling tools like recall_watch_search or recall_watch_timeline by adding 'full event timeline' and specific return fields.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs. alternatives such as recall_watch_search or recall_watch_recent_changes. The description lacks any when-to-use or when-not-to-use information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_watch_recent_changes (grade C)
Recent appearance / severity-escalated events across all recalls since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| agency | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided. The description reveals that the tool returns events with firstSeenAt and ledgerVerified, but does not disclose read-only nature, rate limits, or any side effects. Basic behavioral context is missing.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, concise and front-loaded with key information, with no wasted words. It could still be improved by structuring parameter details separately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, so the description must provide sufficient context. It mentions two fields (firstSeenAt, ledgerVerified) but omits others, and parameter descriptions are missing. The description is too incomplete for an agent to use the tool correctly without additional knowledge.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must explain parameters. It mentions 'since ISO8601 timestamp' but does not explicitly link to the 'since' parameter. 'limit' and 'agency' are not described at all, leaving the agent with no semantic understanding of how to use them.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists recent appearance/severity-escalated events across all recalls since an ISO8601 timestamp, differentiating it from other recall watch tools by focusing on 'recent changes' and severity escalation. It mentions output fields (firstSeenAt, ledgerVerified), but does not fully distinguish from recall_watch_get or recall_watch_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives like recall_watch_search or recall_watch_timeline. The description implies usage for recent changes, but fails to exclude other scenarios or contrast with sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_watch_search (grade B)
Search Japanese product / food recall notices (consumer-affairs-agency aggregator). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
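| limit | No | | |
| query | No | Product name or business name, partial match (商品名 / 事業者名) | |
| since | No | | |
| agency | No | Competent agency, e.g. 消費者庁 (Consumer Affairs Agency) | |
| status | No | | |
| recallClass | No | Recall class, e.g. 返金/回収 (refund/collection), 回収命令 (recall order), 注意喚起 (advisory) | |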
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It only mentions output fields (firstSeenAt, ledgerVerified) but does not disclose whether the tool is read-only, requires authentication, has rate limits, or any other behavioral traits. The description is insufficient for a 6-parameter tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise – two sentences that front-load the purpose. Every sentence adds value with no redundancy. It efficiently communicates the tool's core function and key output fields.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having 6 parameters and no output schema, the description only mentions two output fields. It lacks guidance on parameter usage (e.g., how limit works, what since expects, how status and recallClass filter results). Essential context for effective use is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 50% (3 of 6 parameters described). The description adds no additional parameter information beyond the schema; it fails to explain the undocumented parameters (limit, since, status) or their usage. The mention of output fields does not compensate for the missing parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies it searches Japanese product/food recall notices from a consumer-affairs-agency aggregator, clearly stating the resource and action. It also mentions output fields (firstSeenAt, ledgerVerified), distinguishing it from sibling tools like bid_watch_search or grant_watch_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies this tool is for searching recall notices, but does not explicitly state when to use it over alternatives like recall_watch_get or recall_watch_timeline. No exclusions or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_watch_timeline (grade C)
Time-ordered events only for a recall (the differentiator: when it appeared, when severity escalated, when it was completed). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions output fields (firstSeenAt, ledgerVerified) but does not disclose behavioral traits such as whether it is read-only, destructive, requires authentication, or has rate limits. This is insufficient for a tool with no annotation support.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, consisting of two sentences. It front-loads the main purpose and lists key fields. It could be slightly more structured, but overall it is efficient and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema, lack of annotations, and no output schema, the description is incomplete. It does not explain the output format, whether results are paginated, or what ledgerVerified means. More context is needed for the agent to effectively use this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, itemId, is not explained in the description. Schema coverage is 0%, so the description should clarify what itemId represents (e.g., the recall ID). Without this, the agent may not know what value to provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states that the tool returns time-ordered events for a recall, specifying the types of events (appearance, severity escalation, completion). It distinguishes itself from sibling timeline tools by specifying 'for a recall' and from other recall_watch tools by focusing on the timeline. However, it could be more precise about the exact output.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides minimal guidance on when to use this tool. It mentions 'the differentiator' but does not explicitly state when to use it over alternatives like recall_watch_search or recall_watch_get. No exclusions or when-not-to-use are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recall_watch_verify_ledger (grade A)
Verify the hash-chain integrity of a recall record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It lists return fields (chainValid, brokenAt, etc.) which gives insight into behavior, but it does not disclose whether the tool is read-only, requires authentication, or has side effects. Adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence followed by a clear list of return fields. It is front-loaded with the purpose and wastes no words. Ideal for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (1 parameter, no output schema), the description covers the core function and return values. However, it lacks context on error conditions, prerequisites, or the meaning of the integrity check, which could impact an agent's ability to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, and the description does not explain the parameter 'itemId' beyond being a required string. The context of the tool name implies it is a recall record ID, but no format, source, or meaning is provided. The description fails to add value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'verify' and the resource 'hash-chain integrity of a recall record', with a parenthetical '(tamper detection)' that further clarifies. This distinguishes it from sibling *verify_ledger tools for other record types and other recall_watch tools that perform different operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for tamper detection but does not explicitly state when to use this tool versus alternatives like recall_watch_get or recall_watch_timeline. No direct guidance on prerequisites or exclusion criteria is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
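The listing does not say how the ledger's hash chain is built, so the following is only a minimal sketch of what a tamper-detection check of this kind generally does. The SHA-256 linking scheme, the all-zero genesis value, and the `payload` / `prevHash` / `hash` entry layout are assumptions made for illustration; only the result keys `chainValid` and `brokenAt` echo the tool description, and `checked` stands in for the "checked event count".

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash one event together with the hash of the event before it."""
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: list) -> list:
    """Link payloads into a chain; each entry stores its predecessor's hash."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = event_hash(prev, payload)
        chain.append({"payload": payload, "prevHash": prev, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list) -> dict:
    """Recompute every hash and report the first index that does not match."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        expected = event_hash(prev, entry["payload"])
        if entry["prevHash"] != prev or entry["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = entry["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(chain)}


if __name__ == "__main__":
    chain = build_chain([{"event": "appeared"}, {"event": "updated"}, {"event": "closed"}])
    print(verify_chain(chain))
    chain[1]["payload"]["event"] = "edited"  # simulate tampering with a stored event
    print(verify_chain(chain))
```

Because each hash covers the previous one, editing any stored event invalidates that entry and everything after it, which is why a single `brokenAt` position is enough to locate the first altered record.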
sanction_watch_get (grade: C)
Get a sanction detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden for behavioral disclosure. It mentions returns but does not disclose prerequisites (e.g., authentication), error handling, rate limits, or whether the operation is read-only. The description is too minimal for a tool with no annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single short sentence, which is concise but lacks necessary details. It could be considered under-specified rather than efficiently concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (one parameter, no output schema), the description is incomplete. It mentions two return fields but not the full structure of the response. More context about the timeline format or error responses would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, meaning the parameter 'itemId' is undocumented in the schema. The description does not explain what 'itemId' is or how to obtain it, adding no semantic value beyond the field name.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('sanction detail plus full event timeline'), and mentions specific return fields ('firstSeenAt and ledgerVerified'). This is specific and distinguishes the tool from the sibling get tools for other watch types.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like sanction_watch_search or other watch_get tools. There are many sibling tools with similar patterns, but no when-to-use or when-not-to-use information.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sanction_watch_recent_changes (grade: A)
Recent appearance / lift events across all sanctions since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| regulator | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that the tool returns 'appearance / lift events' and includes specific fields (firstSeenAt, ledgerVerified), indicating a read-only operation. However, it lacks details on auth requirements, rate limits, pagination, or side effects. The behavioral disclosure is adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the key action and constraints. Every sentence provides value with no redundancy or filler. It is appropriately concise for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low schema coverage (0%) and no output schema, the description should be more complete. It covers the 'since' parameter and output fields but omits details on 'limit' behavior, 'regulator' filtering, pagination, and result structure. For a tool with 3 parameters and no annotations, this is insufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It adds meaning for 'since' (ISO8601 timestamp) and output fields, but does not explain 'limit' (default 100) or 'regulator' (purpose unclear). Two out of three parameters are underspecified, leaving the agent with insufficient semantic guidance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool retrieves 'Recent appearance / lift events across all sanctions since the given ISO8601 timestamp.' It specifies the output includes 'firstSeenAt and ledgerVerified.' This distinguishes it from sibling tools like sanction_watch_get (specific sanction), sanction_watch_search (search), and sanction_watch_timeline (timeline).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for recent changes but does not explicitly state when to use this tool versus alternatives. No when-not-to-use guidance or comparison with sibling tools is provided. While the name and context suggest it, explicit guidelines are missing.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
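As a rough illustration of the required `since` argument, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for this tool. The tool name and parameter names come from the table above; the helper name `recent_changes_request`, the choice to normalise to UTC, and the idea that the server accepts an offset-qualified ISO8601 string are assumptions, since the listing only says "ISO8601 timestamp".

```python
import json
from datetime import datetime, timezone
from typing import Optional


def recent_changes_request(since: datetime,
                           limit: Optional[int] = None,
                           regulator: Optional[str] = None,
                           request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 tools/call payload for sanction_watch_recent_changes."""
    if since.tzinfo is None:
        # A naive datetime is ambiguous; require an explicit offset.
        raise ValueError("since must be timezone-aware")
    arguments = {"since": since.astimezone(timezone.utc).isoformat()}
    if limit is not None:
        arguments["limit"] = limit
    if regulator is not None:
        arguments["regulator"] = regulator
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "sanction_watch_recent_changes", "arguments": arguments},
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(json.dumps(recent_changes_request(start, limit=20), indent=2))
```

Optional arguments are omitted rather than sent as `null`, because the listing does not document how the server treats null values.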
sanction_watch_search (grade: B)
Search Japanese administrative sanctions (FSA jirei archive). Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
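| limit | No | | |
| query | No | Name of the sanctioned party (partial match) | |
| since | No | | |
| status | No | | |
| regulator | No | Sanctioning authority (e.g. FSA) | |
| sanctionType | No | Sanction type, e.g. 業務改善命令 (business improvement order) or 業務停止 (business suspension); partial match | |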
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses that each hit includes 'firstSeenAt' and 'ledgerVerified', giving some return value details. However, with no annotations, it fails to disclose pagination, rate limits, or authentication needs, leaving gaps in behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences without extraneous words. Every sentence adds value: one states the action and source, the other mentions key fields. Efficient and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, no output schema, and a family of sibling tools, the description is too brief. It lacks parameter guidance, usage context, and differentiation, making it incomplete for an agent to use effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds minimal meaning beyond the schema, only referencing the FSA jirei archive context. Schema coverage is 50% (some params have descriptions), but the description does not explain undocumented parameters like 'limit' or 'since', failing to compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool searches Japanese administrative sanctions from the FSA jirei archive, with a specific verb ('search') and resource ('sanctions'). It implicitly distinguishes itself from sibling tools like sanction_watch_get, but offers no explicit differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like sanction_watch_get or sanction_watch_recent_changes. The description does not mention context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
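Since the review notes that the six optional filters are only partly documented, a small sketch of how a client might assemble them can help. The field names match the parameter table; the `SanctionSearch` class itself, the positive-`limit` check, and the comments on the undocumented fields are assumptions, not behaviour confirmed by the listing.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SanctionSearch:
    """Optional filters listed in the sanction_watch_search parameter table."""
    query: Optional[str] = None         # name of the sanctioned party, partial match
    regulator: Optional[str] = None     # sanctioning authority, e.g. "FSA"
    sanctionType: Optional[str] = None  # e.g. "業務改善命令", partial match
    status: Optional[str] = None        # undocumented in the schema
    since: Optional[str] = None         # undocumented; assumed ISO8601 as in recent_changes
    limit: Optional[int] = None         # undocumented; assumed to cap the number of hits

    def to_arguments(self) -> dict:
        """Return only the filters that were set, so unset ones are not sent."""
        if self.limit is not None and self.limit <= 0:
            raise ValueError("limit must be a positive integer")
        return {key: value for key, value in asdict(self).items() if value is not None}


if __name__ == "__main__":
    search = SanctionSearch(regulator="FSA", sanctionType="業務改善命令", limit=10)
    print(search.to_arguments())
```

The resulting dictionary would go in the `arguments` field of a `tools/call` request for `sanction_watch_search`.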
sanction_watch_timeline (grade: C)
Time-ordered events only for a sanction (the differentiator: when it appeared and when it was lifted). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided. The description mentions output fields but does not disclose whether the tool is read-only, requires authentication, or has any side effects. For a tool with zero annotation coverage, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loads the core purpose, and avoids any unnecessary words. It is appropriately sized for the tool's simplicity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description partially covers return values (time-ordered events with firstSeenAt and ledgerVerified). However, it lacks detail on the event structure and does not explain the input parameter. It is adequate but not thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, itemId, is not explained in the description or schema (0% coverage). The description implies it identifies a sanction by the tool's context but does not clarify what itemId represents or how to obtain it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns 'time-ordered events only for a sanction' and highlights the differentiator (when it appeared and lifted) and included fields. This distinguishes it from other timeline tools like bid_watch_timeline, though it doesn't explicitly contrast with them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for sanction timeline events but provides no guidance on when to use this tool over alternatives such as sanction_watch_get or sanction_watch_search. No when-not-to-use or selection criteria are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sanction_watch_verify_ledger (grade: A)
Verify the hash-chain integrity of a sanction record (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose all behavioral traits. It lists return fields (chainValid, brokenAt, etc.) but does not explicitly state it is read-only, mention auth requirements, or describe side effects. The term 'verify' implies a read operation, but more transparency on safety profile would improve the score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence followed by a list of return fields. It is concise, front-loaded with the core purpose, and every word adds value. No redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple verification tool with one parameter, the description covers the purpose and return fields adequately. However, it omits explanation of the itemId parameter and does not mention error handling, rate limits, or prerequisites. While functional, it leaves minor gaps that reduce completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'itemId' is not described in the tool description at all. Schema coverage is 0%, and the description fails to explain what itemId represents or its format. This is a significant gap, as the agent must rely solely on the parameter name and context, which may be insufficient.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'verify' and the resource 'hash-chain integrity of a sanction record', with explicit mention of 'tamper detection'. It distinguishes itself from sibling tools through domain specificity (sanction vs bid, grant, etc.), making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context that this tool is for sanction records, implying usage for ledger verification. However, it lacks explicit when-not-to-use or alternative tool references. The domain context is sufficient for an AI agent to infer appropriate usage among similar verify_ledger siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subsidy_watch_get (grade: B)
Get a subsidy program detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| programId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses return fields (firstSeenAt, ledgerVerified) which adds value beyond the name, but does not describe other behavioral aspects like rate limits, permissions, or whether the timeline is complete.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence that is efficient and front-loaded with the core purpose. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple get-by-ID tool, the description provides core purpose and return fields, but lacks any mention of prerequisites, error handling, or scope of the timeline (e.g., all events or recent?).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description does not explain the programId parameter beyond its name. The return field mention is helpful but does not compensate for lack of parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool retrieves a subsidy program detail and full event timeline, and mentions specific return fields (firstSeenAt, ledgerVerified). Distinguishes from sibling watch_get tools by specifying the resource type.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus siblings like subsidy_watch_search or subsidy_watch_timeline. Does not specify that it retrieves by programId, which is the only parameter.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subsidy_watch_recent_changes (grade: B)
Recent appearance / change / close events across all programs since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| category | No | | |
| issuerCode | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full burden. It mentions firstSeenAt and ledgerVerified fields, but does not disclose pagination behavior, ordering, or other side effects. For a read-only tool, this is minimally adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two clear, front-loaded sentences with no redundant information. Every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the purpose is clear, the description is incomplete for a 4-parameter tool with no output schema and no annotations. It fails to explain parameters, ordering, or how to interpret results. Given many sibling tools, more detail is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%. The description only mentions the 'since' parameter context. It does not explain the 'category' or 'issuerCode' parameters, which could be misleading given the 'across all programs' phrasing. The agent lacks crucial parameter semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool lists recent appearance/change/close events across all programs since a timestamp. This distinguishes it from sibling tools like subsidy_watch_get (specific program) and subsidy_watch_search (search), but it does not explicitly contrast itself with subsidy_watch_timeline.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies the required parameter (since) but offers no guidance on when to use this tool versus similar sibling tools (e.g., subsidy_watch_timeline). No exclusions or alternatives are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
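One plausible way to use a `since`-based feed like this is incremental polling: after each call, advance `since` to the newest event time seen. The sketch below shows only that bookkeeping. The event field name `occurredAt` is hypothetical (the listing does not document the response shape), and whether `since` is inclusive or exclusive on the server is also unknown, so a real client may need to de-duplicate events at the boundary.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an offset-qualified ISO8601 timestamp, accepting a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def next_since(events: list, previous_since: str, ts_field: str = "occurredAt") -> str:
    """Return the latest timestamp among events, or previous_since if none is newer.

    All timestamps must carry a UTC offset so they can be compared safely.
    """
    latest = parse_ts(previous_since)
    for event in events:
        stamp = parse_ts(event[ts_field])
        if stamp > latest:
            latest = stamp
    return latest.isoformat()


if __name__ == "__main__":
    batch = [
        {"occurredAt": "2024-05-01T23:00:00Z"},
        {"occurredAt": "2024-05-02T09:00:00+09:00"},
    ]
    print(next_since(batch, "2024-05-01T00:00:00Z"))
```

Comparing parsed datetimes rather than raw strings matters here because timestamps with different offsets do not sort correctly as text.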
subsidy_watch_search (grade: C)
Search the current state of subsidy programs. Each hit includes firstSeenAt and ledgerVerified (hash-chain integrity).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | No | | |
| status | No | | |
| category | No | | |
| amountMin | No | | |
| issuerCode | No | JIS X 0401/0402 local government code | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It mentions that hits include firstSeenAt and ledgerVerified, but does not disclose behavior like pagination, auth needs, rate limits, or what happens with empty results.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose. Could be considered efficient, but lacks structure like bullet points for parameters. Still, every sentence is useful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 6 parameters, no output schema, and no annotations, the description is too sparse. It mentions two return fields but does not cover parameter semantics, search behavior, or error conditions. Incomplete for a search tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is only 17% (issuerCode described). The description adds no parameter-level meaning beyond listing fields in the schema. For example, it doesn't explain the role of 'query' or 'status'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Search the current state of subsidy programs,' which clearly identifies the verb and resource. However, it does not differentiate from sibling tools like bid_watch_search, though the name helps.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., other watch_search tools). The description lacks any context about appropriate use cases or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
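The only documented parameter here is `issuerCode`, described as a JIS X 0401/0402 local government code. JIS X 0401 assigns two-digit prefecture codes (01 to 47), and JIS X 0402 adds a three-digit municipality code, commonly written together as five digits. The sketch below checks only that shape before a search is sent. Which of these forms the server actually accepts, and whether it expects an additional check digit, is not stated in the listing, so treat the accepted lengths as an assumption.

```python
import re


def classify_issuer_code(code: str) -> str:
    """Classify a code as 'prefecture' (2 digits) or 'municipality' (5 digits).

    Raises ValueError for any other shape or for a prefecture prefix outside 01-47.
    """
    if not re.fullmatch(r"[0-9]{2}|[0-9]{5}", code):
        raise ValueError(f"expected 2 or 5 ASCII digits, got {code!r}")
    prefecture = int(code[:2])
    if not 1 <= prefecture <= 47:
        raise ValueError(f"prefecture prefix out of range: {code[:2]}")
    return "prefecture" if len(code) == 2 else "municipality"


if __name__ == "__main__":
    print(classify_issuer_code("13"))     # Tokyo prefecture code
    print(classify_issuer_code("13101"))  # Chiyoda ward, Tokyo
```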
subsidy_watch_timeline (grade: B)
Time-ordered events only for a program (the differentiator: when it appeared, changed, closed). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| programId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It states the tool returns 'Time-ordered events' and includes fields, but does not explicitly label it as a read-only operation, mention authentication needs, or describe any side effects. The behavioral profile is partially implied but not fully disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the core purpose ('Time-ordered events only for a program'), then adds the differentiator and key fields. No redundant words; every element earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description adequately states what it does and mentions two fields. However, it lacks details on output structure (e.g., full list of fields, ordering, pagination) and on scope: it covers events for a specific program but does not say whether the list is complete or limited. Adequate but with gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one required parameter (programId) with 0% description coverage. The description only indirectly references it by saying 'for a program', which is minimal. It adds no format, example, or additional meaning beyond the schema, failing to compensate for the low coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool provides 'Time-ordered events only for a program' and cites the differentiator: 'when it appeared, changed, closed'. This clearly identifies the verb (retrieve timeline) and resource (program events), distinguishing it from sibling tools like subsidy_watch_get (full record) or subsidy_watch_search (search).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by mentioning 'the differentiator', but does not explicitly state when to use this tool over alternatives (e.g., 'use subsidy_watch_get for full record details'). There is no mention of prerequisites or exclusions, leaving the agent to infer usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
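The missing "use X instead of Y when Z" guidance for the subsidy tools could be approximated on the client side with a small routing table. The tool names and required parameters below are taken from this listing; the intent labels, the `plan_call` helper, and the sample `programId` value are hypothetical and reflect only one reading of the descriptions.

```python
# Tool names and required parameters come from the listing; the intent labels
# on the left are hypothetical and exist only for this illustration.
SUBSIDY_TOOLS = {
    "find_programs": ("subsidy_watch_search", None),
    "program_detail": ("subsidy_watch_get", "programId"),
    "program_history": ("subsidy_watch_timeline", "programId"),
    "changes_since": ("subsidy_watch_recent_changes", "since"),
    "check_tampering": ("subsidy_watch_verify_ledger", "programId"),
}


def plan_call(intent: str, **arguments) -> dict:
    """Map an intent to a tool name and check that its required argument is present."""
    if intent not in SUBSIDY_TOOLS:
        raise KeyError(f"unknown intent: {intent}")
    name, required = SUBSIDY_TOOLS[intent]
    if required is not None and required not in arguments:
        raise ValueError(f"{name} requires '{required}'")
    return {"name": name, "arguments": arguments}


if __name__ == "__main__":
    # "example-program-id" is a made-up identifier, not a real programId.
    print(plan_call("program_history", programId="example-program-id"))
```

Under this reading, `subsidy_watch_timeline` is the choice when only the sequence of events matters, and `subsidy_watch_get` when the program detail is needed as well.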
subsidy_watch_verify_ledger (grade: C)
Verify the hash-chain integrity of a program (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| programId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses the main behavior and return fields (chainValid, brokenAt, etc.) but does not mention side effects, authentication needs, or rate limits. It is adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that front-loads the action and then lists return fields. It is concise without fluff, though additional structure for clarity could improve it slightly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema and no output schema, the description covers the primary purpose and return values. However, it lacks details on error handling, programId format, or usage constraints, making it incomplete for a fully seamless agent interaction.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not elaborate on the only parameter 'programId', despite 0% schema description coverage. It mentions 'program' in the purpose but does not connect it to the parameter, leaving the agent to infer its meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool verifies hash-chain integrity for tamper detection, with a specific verb and resource. Within the description itself, only the word 'program' distinguishes it from sibling tools; most of the domain specificity comes from the name 'subsidy_watch_verify_ledger'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like other verify_ledger tools or other watch operations. The description lacks context for appropriate usage scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tos_watch_get (quality grade C)
Get a ToS snapshot detail plus full event timeline. Returns firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It implies a read operation but does not state whether authentication is required, what rate limits apply, or how an invalid itemId is handled. It mentions only return fields and omits key behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is two short sentences with a clear verb-object structure followed by the output fields. It is short but not wasteful. It could be slightly improved by adding a hint about the itemId parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given one parameter and no output schema, description should provide complete context. It fails to explain itemId, what a 'ToS snapshot' entails, or any side effects. The tool is part of a family, but the description alone is insufficient for full understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and description provides no extra meaning for parameter 'itemId'. It does not explain what itemId represents (e.g., ToS snapshot ID) or constraints. This lack of semantics hinders correct invocation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Get a ToS snapshot detail plus full event timeline', clearly identifying the action and resource. It implies retrieval of one specific snapshot, which separates it from 'tos_watch_search', but it does not say how it differs from 'tos_watch_timeline', which may also return a timeline.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives. With many sibling tools (e.g., tos_watch_recent_changes, tos_watch_search, tos_watch_timeline), the description offers no usage context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tos_watch_recent_changes (quality grade B)
Recent revised events across all SaaS ToS documents since the given ISO8601 timestamp. Each item includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| since | Yes | | |
| vendor | No | | |
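As a rough illustration of how the three parameters above might be assembled, the sketch below builds an MCP `tools/call` request for `tos_watch_recent_changes`. Only `since` (an ISO8601 timestamp) is documented; the types assumed here for `limit` (an integer cap) and `vendor` (a slug such as `stripe`) are guesses, and the helper name `build_recent_changes_call` is hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone


def build_recent_changes_call(hours_back, limit=None, vendor=None, request_id=1, now=None):
    """Build a JSON-RPC tools/call payload for tos_watch_recent_changes (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    # ISO8601 timestamp with an explicit UTC offset, e.g. 2025-01-01T12:00:00+00:00
    since = (now - timedelta(hours=hours_back)).isoformat(timespec="seconds")
    arguments = {"since": since}  # the only required parameter
    if limit is not None:
        arguments["limit"] = limit  # assumption: integer cap on returned events
    if vendor is not None:
        arguments["vendor"] = vendor  # assumption: vendor slug such as "stripe"
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "tos_watch_recent_changes", "arguments": arguments},
    }


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2025, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
request = build_recent_changes_call(24, limit=10, vendor="stripe", now=fixed_now)
print(json.dumps(request, indent=2))
```

Sending the payload over Streamable HTTP is left to whichever MCP client is in use; the sketch only covers argument construction.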
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It discloses that the tool returns events across all documents and includes firstSeenAt and ledgerVerified, but lacks details on pagination, rate limits, or behavior for invalid timestamps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is two short sentences (19 words) and front-loads the core purpose. It contains no fluff and is efficiently structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters and no output schema, the description covers the purpose and key output fields but omits usage of limit and vendor. It is adequate for a simple watch tool but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Description mentions 'since the given ISO8601 timestamp' which corresponds to the required 'since' parameter, but does not explain 'limit' or 'vendor' parameters. With 0% schema description coverage, the description should compensate, but it adds minimal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that it returns recent revised events across SaaS ToS documents since a timestamp, and names the output fields firstSeenAt and ledgerVerified. It does not explicitly differentiate itself from sibling tools such as tos_watch_get or tos_watch_search; only the name 'recent_changes' provides that context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives such as tos_watch_timeline or tos_watch_search. The description only states what the tool does, not when it is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tos_watch_search (quality grade B)
Search Japanese / English-language SaaS Terms of Service snapshots (Stripe / Anthropic / AWS / Google Cloud / GitHub …). Stage 1 covers 'terms' docs. Each hit includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| query | No | Title / excerpt from the start of the body; partial match | |
| vendor | No | 'stripe' / 'anthropic' / 'aws' / 'gcp' / 'github' … | |
| docType | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that hits include firstSeenAt and ledgerVerified, and that Stage 1 covers 'terms' docs. No mention of read-only status, auth needs, or query behavior. Lacks detail expected without annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three short sentences, no wasted words. Purpose is front-loaded. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Missing important context: query syntax, pagination, the full vendor list, and the other docTypes that exist in the schema (privacy, pricing, sla) but are not mentioned. Incomplete for a search tool with 4 parameters and no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Description adds context that search covers 'terms' docs and result fields, but limit and query semantics are not explained beyond schema. Schema coverage is 50% so description partially compensates.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
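The "schema coverage" percentages quoted in these reviews appear to be the share of parameters that carry a non-empty description. The exact formula used by the scorer is not stated on this page, so the sketch below is only one plausible reading; the function name `description_coverage` and the parameter types in the sample schema are assumptions.

```python
def description_coverage(input_schema):
    """Return the fraction of parameters with a non-empty description, or None if there are none."""
    properties = input_schema.get("properties", {})
    if not properties:
        return None  # nothing to describe, so coverage is undefined
    described = sum(
        1 for spec in properties.values()
        if str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Sample schema mirroring the tos_watch_search table; the "type" values are assumed.
tos_watch_search_schema = {
    "type": "object",
    "properties": {
        "limit": {"type": "integer"},
        "query": {"type": "string", "description": "Title / excerpt from the start of the body; partial match"},
        "vendor": {"type": "string", "description": "'stripe' / 'anthropic' / 'aws' / 'gcp' / 'github' ..."},
        "docType": {"type": "string"},
    },
}

coverage = description_coverage(tos_watch_search_schema)
print(f"{coverage:.0%}")  # 2 of 4 parameters are described
```

Under this reading, a tool whose only parameter is an undescribed `itemId` scores 0%, which matches the figures given for the single-parameter tools.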
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it searches Terms of Service snapshots, specifies language and examples. Distinguishes from get/timeline tools but not from other watch search tools like bid_watch_search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use search vs get/recent_changes/timeline. No alternatives mentioned. Agent must infer from context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tos_watch_timeline (quality grade B)
Time-ordered events only for a ToS document (the differentiator: when it appeared and each revision since). Includes firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries full burden. It discloses that the tool returns timeline events with specific fields, but does not mention whether it is read-only, any side effects, pagination, or other behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and to the point, with two sentences. It avoids unnecessary words but could be slightly more structured by explicitly linking the parameter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Even for a simple tool (one parameter, no output schema), the description lacks essential details about the parameter and the return format, making it incomplete for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides no explanation for the sole parameter 'itemId', leaving its meaning entirely unclear. With 0% schema coverage, the description fails to compensate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool returns time-ordered events for a ToS document, specifically its first appearance and each revision since. It distinguishes itself from sibling timeline tools by focusing on 'ToS document' and mentions included fields.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when a timeline of ToS document events is needed, but it does not explicitly contrast with other ToS tools like tos_watch_get or tos_watch_recent_changes, nor does it provide when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tos_watch_verify_ledger (quality grade B)
Verify the hash-chain integrity of a ToS document (tamper detection). Returns chainValid, brokenAt (if any), checked event count, firstSeenAt and ledgerVerified.
| Name | Required | Description | Default |
|---|---|---|---|
| itemId | Yes | | |
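For readers unfamiliar with hash-chain tamper detection, the sketch below shows the general technique and how a result shaped like `chainValid` / `brokenAt` could be produced. It does not reflect the server's actual implementation: SHA-256, the `payload` / `prevHash` / `hash` event fields, the `GENESIS` seed, and the `checked` result key are all assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed seed value for the first event's prevHash


def event_hash(prev_hash, payload):
    """Hash an event's payload together with the previous event's hash."""
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads):
    """Link payloads into a chain where each event commits to its predecessor."""
    events, prev = [], GENESIS
    for payload in payloads:
        digest = event_hash(prev, payload)
        events.append({"payload": payload, "prevHash": prev, "hash": digest})
        prev = digest
    return events


def verify_chain(events):
    """Recompute every hash in order and report the first event that does not match."""
    prev = GENESIS
    for index, event in enumerate(events):
        expected = event_hash(prev, event["payload"])
        if event["prevHash"] != prev or event["hash"] != expected:
            return {"chainValid": False, "brokenAt": index, "checked": index + 1}
        prev = event["hash"]
    return {"chainValid": True, "brokenAt": None, "checked": len(events)}


chain = build_chain([
    {"event": "first_seen", "doc": "terms"},
    {"event": "revised", "doc": "terms"},
    {"event": "revised", "doc": "terms", "note": "second revision"},
])
print(verify_chain(chain))

# Editing a stored payload without recomputing later hashes breaks the chain at that event.
chain[1]["payload"]["event"] = "deleted"
print(verify_chain(chain))
```

Because each hash depends on the one before it, changing any historical event invalidates that event and would force every later hash to be recomputed, which is what makes the ledger tamper-evident.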
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses return fields but does not state if it is read-only, requires authentication, or has side effects. Minimal but no contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
One sentence plus a list of return fields. Extremely concise with no filler, and key information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Tool is simple with 1 parameter and no output schema. Description covers return fields but omits parameter explanation. Adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'itemId' is not explained in the description or schema. With 0% schema description coverage, the description should clarify what this ID represents, but it does not, leaving the agent to guess.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool verifies hash-chain integrity of a ToS document for tamper detection, using a specific verb and resource. It distinguishes from sibling tools like tos_watch_get or tos_watch_search by focusing on integrity verification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool vs alternatives. Does not specify prerequisites, typical scenarios, or contrast with other watch tools like tos_watch_timeline or tos_watch_recent_changes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
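A minimal sketch of generating that file's contents programmatically is shown below. The helper name `build_claim_file` is hypothetical, the email check is deliberately crude, and listing more than one maintainer is an assumption based only on `maintainers` being an array. Writing the output to the path your web server exposes as `/.well-known/glama.json` is left out because it depends on your hosting setup.

```python
import json

GLAMA_SCHEMA_URL = "https://glama.ai/mcp/schemas/connector.json"


def build_claim_file(emails):
    """Return the glama.json structure for the given maintainer emails (hypothetical helper)."""
    if not emails:
        raise ValueError("at least one maintainer email is required")
    for email in emails:
        # Crude sanity check only; real validation happens on Glama's side.
        if "@" not in email:
            raise ValueError(f"not an email address: {email!r}")
    return {
        "$schema": GLAMA_SCHEMA_URL,
        "maintainers": [{"email": email} for email in emails],
    }


claim = build_claim_file(["your-email@example.com"])
claim_text = json.dumps(claim, indent=2)
print(claim_text)
```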
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.