gateway

by ai.duvera

Server Details

Governed AI actions with signed, verifiable receipts: free keyless reads, human-approved writes.

Status: Healthy
Last Tested: 2026-07-23 01:31
Transport: Streamable HTTP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

B (3.3/5.0)

Tool Descriptions: B

Average 3.6/5 across 52 of 52 tools scored. Lowest: 2.2/5.

Server Coherence: A

Disambiguation: 5/5

Each tool is prefixed with its service name, and within each service, tools have distinct actions (e.g., mail_read vs. availability_find). The duvera tools cover different subdomains like dev, finance, and food with no overlap, making selection unambiguous.

Naming Consistency: 4/5

All tools follow the pattern service__action_object or service__category_action, using lowercase with underscores. The order of verb and noun varies slightly (e.g., package_track vs. boardingpass_show), but the naming is highly predictable and readable.
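The service prefix makes tool names easy to split mechanically. A minimal sketch, assuming the first double underscore always separates the service from the rest of the name; the helper `split_tool_name` is hypothetical and not part of the gateway:

```python
def split_tool_name(name: str) -> tuple[str, str]:
    """Split 'service__rest' at the first double underscore."""
    service, sep, rest = name.partition("__")
    if not sep or not service or not rest:
        raise ValueError(f"not a service-prefixed tool name: {name!r}")
    return service, rest


# Tool names taken from the listing on this page.
print(split_tool_name("amazon__package_track"))
print(split_tool_name("duvera__dev_npm_package"))
```

Only the service is separated out here; the remainder is left intact because the listing does not define how category, object, and action are ordered within it.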

Tool Count: 4/5

With 52 tools, the server is large but justified as a gateway aggregating many external services. Each tool corresponds to a common task for its service, so no tool feels extraneous, though the total number is high.

Completeness: 3/5

The tool surface covers a wide array of services but only provides one or two basic operations per service (mostly read-only). While this suits a quick-lookup gateway, deeper workflows (e.g., creating or updating resources) are missing, leaving gaps for many use cases.

Available Tools

52 tools

amazon__package_track (B)

[amazon · risk:low] Look up the delivery status of an Amazon order (read-only)

Parameters

Name	Required	Description	Default
`order_id`	No	The order or tracking id to look up (defaults to most recent)
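
In MCP, a tool is invoked with a JSON-RPC `tools/call` request. The sketch below builds that payload for this tool. The helper name `build_tool_call`, the request ids, and the sample order id are illustrative placeholders, and sending the request over the Streamable HTTP transport is left out.

```python
import json


def build_tool_call(name, arguments=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": dict(arguments or {})},
    }


# order_id is optional; per the table, omitting it looks up the most recent order.
latest = build_tool_call("amazon__package_track")

# "EXAMPLE-ORDER-ID" is a placeholder, not a real id format.
specific = build_tool_call(
    "amazon__package_track", {"order_id": "EXAMPLE-ORDER-ID"}, request_id=2
)

print(json.dumps(latest, indent=2))
```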

Tool Definition Quality

B (3.3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must convey behavioral traits itself. It states the tool is 'read-only' and 'risk:low', but does not disclose other behavioral aspects such as authentication requirements, rate limits, or what happens when the order ID is invalid or not found. The description is insufficient for full transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
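
The MCP specification defines optional tool annotations such as `readOnlyHint`, `destructiveHint`, `idempotentHint`, and `openWorldHint`. As a hedged illustration of what the missing annotations could look like for this tool, here is a sketch of an annotated definition. The annotation values and the `string` type for `order_id` are assumptions, not something this server publishes.

```python
annotated_tool = {
    "name": "amazon__package_track",
    "description": "[amazon \u00b7 risk:low] Look up the delivery status "
                   "of an Amazon order (read-only)",
    "inputSchema": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",  # assumed type; the listing does not state it
                "description": "The order or tracking id to look up "
                               "(defaults to most recent)",
            }
        },
    },
    "annotations": {
        "readOnlyHint": True,   # mirrors the "(read-only)" claim in the description
        "openWorldHint": True,  # assumption: the tool talks to an external service
    },
}


def is_declared_read_only(tool):
    """True only when the tool explicitly sets readOnlyHint to True."""
    return tool.get("annotations", {}).get("readOnlyHint") is True


print(is_declared_read_only(annotated_tool))
```

Annotations are hints rather than guarantees, so a description should still state auth requirements and failure behavior in prose.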

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that includes the risk label upfront. It is extremely concise with no extraneous information, earning top marks for efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple lookup tool with one optional parameter and no output schema, the description is adequate but minimal. It covers the basic purpose and read-only nature, but does not describe the return format or any edge cases. Given the lack of annotations, slightly more detail would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage: the single parameter 'order_id' has a clear description explaining its purpose and default behavior. The tool description adds no additional meaning beyond what the schema already provides, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: looking up delivery status of an Amazon order. The verb 'look up' and resource 'delivery status of an Amazon order' are specific. While it doesn't explicitly differentiate from sibling order-tracking tools, the domain and name make the distinction clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes 'read-only' and 'risk:low', which give some usage context. However, it does not provide explicit guidance on when to use this tool versus alternatives (e.g., other delivery tracking tools) or when not to use it. The usage is implied but not elaborated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

applehealth__heart_rate_read (B)

[applehealth · risk:medium] Read heart-rate data from Apple Health

Parameters

Name	Required	Description	Default
`end`	No	ISO-8601 end of the query window (defaults to now)
`start`	No	ISO-8601 start of the query window (defaults to today)
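
Both parameters take ISO-8601 timestamps. A minimal sketch of building an explicit query window instead of relying on the defaults; the helper `window_args` and the choice of UTC are assumptions, since the listing does not say which time zone or ISO-8601 variant the tool expects.

```python
from datetime import datetime, timedelta, timezone


def window_args(end, hours):
    """Return start/end arguments covering the `hours` leading up to `end`."""
    if end.tzinfo is None:
        raise ValueError("end must be timezone-aware")
    start = end - timedelta(hours=hours)
    return {"start": start.isoformat(), "end": end.isoformat()}


# Last 24 hours, expressed in UTC.
args = window_args(datetime.now(timezone.utc), hours=24)
print(args)
```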

Tool Definition Quality

B (3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose behavioral traits such as data limits, authorization requirements, or what the response contains. The tag '[risk:medium]' is ambiguous and not explained.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (one sentence) but includes a non-standard tag '[applehealth · risk:medium]' that is not part of the tool's purpose. This detracts from conciseness and clear structure.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite simplicity and absence of output schema, the description lacks completeness. It does not explain what kind of heart-rate data is returned (e.g., time series, average), nor any constraints like data recency or permissions, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the description is not required to add parameter details. However, it adds no value beyond the schema, merely repeating the parameters' existence without additional context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Read heart-rate data from Apple Health' using a specific verb ('Read') and resource ('heart-rate data from Apple Health'), effectively distinguishing it from siblings like 'steps_read' or 'sleep_read'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus its siblings (e.g., 'applehealth__steps_read'). There is no mention of prerequisites, alternatives, or scenarios where it is appropriate to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

applehealth__sleep_read (B)

[applehealth · risk:medium] Read sleep-analysis data from Apple Health

Parameters

Name	Required	Description	Default
`end`	No	ISO-8601 end of the query window (defaults to now)
`start`	No	ISO-8601 start of the query window (defaults to last night)

Tool Definition Quality

B (3.3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It only mentions 'Read' (implying non-destructive) and a 'risk:medium' tag, but does not explain the read behavior, such as whether it requires permissions, how data is returned, or any side effects. The tag is vague and not actionable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It is front-loaded with the key action and data type.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 optional params, no output schema), the description is minimally adequate but lacks completeness. It does not mention the default time windows implied by parameter defaults ('last night' for start, 'now' for end) or what the response looks like. More context would improve utility.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (both parameters are described with ISO-8601 format). The description adds no additional meaning beyond what the schema already provides, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Read sleep-analysis data from Apple Health' using a specific verb ('Read') and resource ('sleep-analysis data'), which distinguishes it from sibling tools like applehealth__heart_rate_read or applehealth__steps_read that read other types of health data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It lacks any context about prerequisites, typical scenarios, or when not to use it, which would help an agent decide between this and other Apple Health read tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

applehealth__steps_read (C)

[applehealth · risk:medium] Read step-count data from Apple Health

Parameters

Name	Required	Description	Default
`end`	No	ISO-8601 end of the query window (defaults to now)
`start`	No	ISO-8601 start of the query window (defaults to today)

Tool Definition Quality

C (2.7/5.0)

Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavioral traits, but it only says 'Read step-count data'. No information about historical scope, permissions, rate limits, or data granularity is provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly concise; a single sentence fails to convey necessary context. While it is front-loaded with the risk tag, it sacrifices completeness for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with no output schema, the description lacks completeness. It omits return format, date handling, and behavioral constraints, which are relevant given the presence of sibling health tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and parameter descriptions are adequate. The tool's description adds no extra meaning beyond the schema, so baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific verb 'Read' and resource 'step-count data' from Apple Health, distinguishing it from sibling tools like heart_rate and sleep reads.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, nor any prerequisites or exclusions. The risk tag 'medium' is too vague to serve as usage guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bluesky__trending_get (A)

[bluesky · risk:low] Fetch the current trending topics on Bluesky. Returns topic names, display labels, and links to the topic feed. No account or API key required.
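
Every description on this page starts with a `[service · risk:level]` tag. A minimal sketch of separating that tag from the prose, assuming the tag always has this exact shape; the regular expression and the helper `parse_description` are illustrative, not part of the gateway:

```python
import re

# \u00b7 is the middle dot used as the separator inside the tag.
TAG_RE = re.compile(
    r"^\[(?P<service>[^\s\]]+)\s*\u00b7\s*risk:(?P<risk>[a-z]+)\]\s*(?P<text>.*)$",
    re.DOTALL,
)


def parse_description(description):
    """Return (service, risk, text), or None if the leading tag is absent."""
    match = TAG_RE.match(description)
    if match is None:
        return None
    return match.group("service"), match.group("risk"), match.group("text")


sample = "[bluesky \u00b7 risk:low] Fetch the current trending topics on Bluesky."
print(parse_description(sample))
```

A client could use the parsed risk level to decide which calls need human approval, though the listing does not define what each level means.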

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description takes full responsibility. It discloses that no authentication is needed and describes the return data. It does not mention rate limits or data freshness, but these are not critical for such a simple read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, using three short sentences to state the purpose, return values, and authentication requirements. It is front-loaded with the purpose and contains no superfluous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has no parameters and no output schema, the description adequately covers its behavior and return values. It could mention error handling or response format in more detail, but the current level is sufficient for such a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has no parameters, so the baseline is 4. The description does not need to add parameter information because there are none.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Fetch' and the resource 'current trending topics on Bluesky', along with the specific return values. It is distinct from any sibling tools, as no other Bluesky tool is present.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states that no account or API key is required, providing clear guidance on when to use this tool. It implicitly indicates it is for current trends, but lacks explicit alternatives or when-not-to-use guidance, though not needed given the simplicity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

chase__balance_check (B)

[chase · risk:low] Read the current balance of a Chase account (read-only)

Parameters

Name	Required	Description	Default
`account`	No	The Chase account to check (defaults to primary)

Tool Definition Quality

B (3.3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It only states 'read-only' but omits details on authentication, rate limits, error states, or what happens if the account is invalid. This is insufficient for a tool with no annotation support.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, consisting of a single sentence plus a tag. Every element serves a purpose with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read operation with 0 required params and no output schema, the description is barely adequate. It does not explain the return format or any side effects, but the tool's simplicity partly compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the description adds no extra meaning beyond the schema's own description of the 'account' parameter. The tag '[chase · risk:low]' provides context but does not enhance parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Read' and the resource 'current balance of a Chase account', and explicitly marks it as read-only. It distinguishes the tool from siblings by specifying the financial institution.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. The description does not mention any prerequisites, conditions, or scenarios where other tools would be preferred.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

datadog__logs_view (B)

[datadog · risk:low] Search and read logs from Datadog over a time range

Parameters

Name	Required	Description
`to`	No	End of the time range (ISO 8601 or relative)
`from`	No	Start of the time range (ISO 8601 or relative)
`query`	Yes	Datadog log search query
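
Since `query` is the only required parameter, a client can catch an incomplete call before sending it. A minimal sketch of such a pre-flight check; the `LOGS_VIEW_PARAMS` mapping is hand-copied from the table above, the helper `check_arguments` is hypothetical, and the sample query and timestamps are illustrative values only.

```python
# Mirrors the parameter table above: name -> required?
LOGS_VIEW_PARAMS = {"query": True, "from": False, "to": False}


def check_arguments(spec, arguments):
    """Return a list of problems; an empty list means the call looks well-formed."""
    problems = [
        f"missing required parameter: {name}"
        for name, required in spec.items()
        if required and name not in arguments
    ]
    problems += [
        f"unknown parameter: {name}" for name in arguments if name not in spec
    ]
    return problems


good = {
    "query": "service:web status:error",  # sample search query
    "from": "2026-07-22T00:00:00Z",
    "to": "2026-07-23T00:00:00Z",
}
print(check_arguments(LOGS_VIEW_PARAMS, good))
print(check_arguments(LOGS_VIEW_PARAMS, {"from": "2026-07-22T00:00:00Z"}))
```

The check uses ISO 8601 timestamps because the listing does not specify which relative time syntax the tool accepts.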

Tool Definition Quality

B (3.2/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only states it is a read/search operation. Lacks disclosure of behavioral traits such as authentication needs, rate limits, or error handling, which is insufficient for a safe agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise single-sentence description with no unnecessary words. Front-loaded with risk level tags, efficiently conveying purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Description does not specify return format, pagination, default time range behavior, or how to handle no results. Incomplete for a search tool with no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with descriptions for all parameters. The description adds minimal context ('over a time range') but does not significantly enhance understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly specifies action (search and read), resource (logs from Datadog), and scope (over a time range). It effectively distinguishes from sibling tools which are for other services.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description does not mention prerequisites, nor when not to use it, leaving the agent to infer from the name alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delta__boardingpass_show (B)

[delta · risk:low] Pull up the boarding pass for a Delta flight

Parameters

Name	Required	Description	Default
`passenger`	No	Passenger last name
`confirmation_code`	Yes	Booking confirmation code

Tool Definition Quality

B (3.4/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It indicates a low-risk read operation via the risk tag, but does not disclose prerequisites like authentication, data availability, or any potential side effects. The description is too minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that conveys the essential purpose without any wasted words. It is appropriately sized and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, the description lacks context about return format, authentication requirements, or any limitations. For a straightforward tool, more completeness would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents both parameters sufficiently. The description adds no additional meaning beyond what is in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Pull up' and the resource 'boarding pass', and specifies 'Delta flight', which distinguishes it from generic boarding pass tools like wallet__boardingpass_show. The risk tag provides additional context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While the description implies use for Delta flights by naming the airline, it does not explicitly state when to use this tool over alternatives or when not to use it. No exclusions or alternative references are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__dev_npm_package (A)

[duvera · risk:low] Look up the latest version, description, license, and dependencies of an npm package. Works for scoped packages too (e.g. "@types/node"). No account required.

Parameters

Name	Required	Description	Default
`package`	Yes	npm package name, e.g. "express" or "@types/node".

Tool Definition Quality

A (4.3/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description carries full burden. It explains return fields (version, description, license, dependencies) and indicates read-only (low risk). Misses details like rate limits or caching, but adequate for simple tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no redundancy. The risk-level prefix is helpful. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description lists return fields and notes no account required. Single parameter is well-documented. Complete for a simple lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with example. Description adds value by explicitly mentioning scoped packages (e.g., '@types/node'), reinforcing parameter usage beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it looks up npm package details (version, description, license, dependencies). It distinguishes from siblings like duvera__dev_pypi_package by specifying npm, and mentions scoped packages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States 'No account required' implying public packages only, but does not explicitly say when not to use or compare with alternatives. Sibling tools provide implicit differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__dev_pypi_package (A)

[duvera · risk:low] Look up the current version, summary, license, and homepage of a Python package on PyPI. No account required.

Parameters

Name	Required	Description	Default
`package`	Yes	PyPI package name, e.g. "requests" or "fastapi".

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description indicates a read-only operation ('Look up') and requires no authentication. It does not cover error handling or rate limits, but for a simple lookup this is adequate. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, uses two sentences, and front-loads useful metadata. Every sentence is necessary and informative, with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description adequately covers the return fields. It could mention the return format (likely JSON) but is otherwise complete for a lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the parameter 'package' having a clear description and examples. The tool description adds no additional meaning beyond the schema, meriting the baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool looks up a Python package on PyPI and lists the specific fields returned (version, summary, license, homepage). It distinguishes itself from sibling tools like duvera__dev_npm_package by specifying 'Python package' and mentioning PyPI.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes 'No account required', which covers prerequisites. It implicitly indicates when to use this tool (for Python package info) rather than siblings for other domains, though it gives no explicit when-not-to-use guidance or fallback scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__dev_stackoverflow_search (grade A)

[duvera · risk:low] Search Stack Overflow questions by keyword. Returns top 5 by relevance with title, link, score, and answer status. No account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Search terms, e.g. "goroutine leak detection" or "pandas merge on index".

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description reveals key behaviors: returns top 5 results by relevance, includes specific fields (title, link, score, answer status), and requires no account. It does not mention potential rate limits or error handling, but covers the essential behavior for a read-only search.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, fitting the key information into three short sentences. Every word serves a purpose, including the risk label and the output specifics.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description adequately explains what the user gets back (top 5 results with fields). It covers the core functionality for a simple search tool, though it omits details like error handling or result count variation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'query' has a detailed schema description with examples. The tool description reinforces 'by keyword' but adds no new meaning beyond the schema. With 100% schema coverage, the baseline is 3, and the description does not elevate it further.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches Stack Overflow questions by keyword, specifying the resource (Stack Overflow) and action (search). It distinguishes itself from siblings like duvera__github_search_repos or duvera__news_search by naming the platform explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for Stack Overflow queries with 'No account required' indicating ease of use. However, it does not explicitly mention when to use this tool over alternatives (e.g., other search tools in the sibling list), though the platform specificity provides implicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__finance_crypto_price (grade A)

[duvera · risk:low] Current spot price for one or more cryptocurrencies (CoinGecko ids like "bitcoin,ethereum,solana") in a fiat currency (default USD). Read-only market data. No account required.

Parameters

Name	Required	Description	Default
`ids`	Yes	Comma-separated CoinGecko coin ids, e.g. "bitcoin" or "bitcoin,ethereum,solana".
`vs_currencies`	No	Comma-separated fiat/quote currencies (default "usd"), e.g. "usd,eur".
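
Both parameters are comma-separated strings, which is easy to get wrong when the caller starts from a list. A minimal sketch of building the `arguments` object follows; the helper name `crypto_price_arguments` is hypothetical, and trimming and lowercasing the values is an assumption made here for tidiness, not something the tool is documented to require.

```python
from typing import Dict, Iterable, Optional


def crypto_price_arguments(
    ids: Iterable[str], vs_currencies: Optional[Iterable[str]] = None
) -> Dict[str, str]:
    """Build the `arguments` object for duvera__finance_crypto_price (hypothetical helper)."""
    # ASSUMPTION: normalising to trimmed lowercase ids is a convenience, not a documented rule.
    cleaned = [coin.strip().lower() for coin in ids if coin.strip()]
    if not cleaned:
        raise ValueError("`ids` is required: pass at least one CoinGecko coin id")
    arguments = {"ids": ",".join(cleaned)}
    # `vs_currencies` is optional; omitting it lets the tool apply its "usd" default.
    if vs_currencies:
        arguments["vs_currencies"] = ",".join(c.strip().lower() for c in vs_currencies)
    return arguments


default_quote = crypto_price_arguments(["bitcoin", "ethereum", "solana"])
two_quotes = crypto_price_arguments(["bitcoin"], ["usd", "eur"])
print(default_quote)
print(two_quotes)
```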

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full burden. It states read-only and no account required, but lacks details on rate limits, data freshness, or error behavior. Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description packs the key information (purpose, example ids, default currency, safety) into one long sentence and two short fragments. It is efficient, but the parenthetical-heavy first sentence could benefit from structured formatting (e.g., bullet points) for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with two parameters and no output schema, the description covers purpose, parameters (via schema), and safety. No critical missing information for basic usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The description adds an example and default currency note, but does not significantly enhance understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets current spot prices for cryptocurrencies in fiat, with examples and default currency. It distinguishes from siblings like duvera__finance_exchange_rates by specifying crypto prices.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions read-only and no account required, implying safe usage, but does not explicitly state when to use this tool versus alternatives or provide exclusions. With many sibling tools, some guidance would improve clarity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__finance_exchange_rates (grade A)

[duvera · risk:low] Live USD exchange rates against major currencies. Read-only, no account required.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explicitly states 'Read-only, no account required,' which discloses key behavioral traits (safety, no authentication) beyond what annotations would provide. Since no annotations exist, the description adequately covers behavioral transparency for this simple tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: a tag followed by two short fragments stating purpose and key traits. Every word is meaningful and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no parameters and no output schema, the description provides sufficient information: purpose, read-only nature, and no account requirement. It could mention which major currencies are covered, but overall it is complete enough for selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has no parameters, so the baseline is 4. The description adds no parameter information, but none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool provides live USD exchange rates against major currencies. It is a noun phrase with no explicit verb, but the resource (exchange rates) is specific, and the scope (USD vs major currencies) distinguishes it from sibling tools like duvera__finance_crypto_price.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not explicitly state when to use this tool over alternatives or provide conditions for use. While the context implies it's for retrieving fiat exchange rates, no guidance on when not to use or what alternatives exist is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__food_recipe_search (grade A)

[duvera · risk:low] Search recipes by dish name (e.g. "arrabiata", "pad thai"). Returns ingredients, instructions, category, and cuisine. No account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Dish name to search for, e.g. "carbonara".

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses 'No account required' and implies read-only operation. With no annotations, description adds useful behavioral context beyond schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences: the first defines purpose and input, the second specifies output, and the third states the access requirement. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simplicity (1 param, no output schema), description covers what tool does, what it returns, and notable behavior (no account). Complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes 'query' parameter (100% coverage). Description adds concrete examples ('arrabiata', 'pad thai'), enhancing understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states action 'search', resource 'recipes', input 'dish name', and outputs 'ingredients, instructions, category, and cuisine'. Differentiates from sibling 'duvera__food_restaurant_search'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for finding recipe details by dish name, but no explicit comparison to alternatives or when-not-to-use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__food_restaurant_search (grade A)

[duvera · risk:low] Search for restaurants and food places by name or location using OpenStreetMap. Read-only, no account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Search query, e.g. "pizza in San Francisco" or "sushi near me".

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses nondestructive nature and no-auth requirement, but lacks details on rate limits, data freshness, or error behavior. With no annotations, description carries full burden; basic transparency provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences cover tool identity, risk level, purpose, data source, and behavioral traits. No redundancy; every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema), the description fully addresses what the agent needs to know: what it does, how to query, and that it's safe to invoke.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'query' is fully described with examples ('pizza in San Francisco', 'sushi near me') and context from the main description confirms it accepts name/location. Schema coverage is 100% and description adds meaningful guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action (Search) and resource (restaurants and food places) with location/name qualifier and data source (OpenStreetMap). It distinguishes from sibling 'duvera__food_recipe_search' by focusing on eateries rather than recipes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States 'Read-only, no account required' implying safe usage context, but does not explicitly mention when to prefer this over sibling tools or note any limitations like search result volume.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__geo_geocode (grade A)

[duvera · risk:low] Convert a city or place name (e.g. "Berlin", "San Francisco") to latitude/longitude, country, timezone, and population. Use this before weather.current or weather.air-quality when you only have a place name. No account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Place name to look up, e.g. "Berlin" or "Springfield, Illinois".
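
The "geocode first, then weather" chaining that the description recommends can be sketched as a two-step call. Several things here are assumptions, because this page does not document them: the weather tool's full name (`duvera__weather_current`), its `latitude`/`longitude` argument keys, and the shape of the geocode result. `call_tool` stands in for whatever MCP client function is actually used, and `fake_call_tool` returns canned demonstration data only.

```python
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def weather_for_place(place: str, call_tool: ToolCaller) -> Dict[str, Any]:
    """Resolve a place name to coordinates, then query weather for them."""
    # Step 1: the geocode tool takes the place name as `query`.
    location = call_tool("duvera__geo_geocode", {"query": place})
    # Step 2: ASSUMED tool name and argument keys for the weather lookup.
    return call_tool(
        "duvera__weather_current",
        {"latitude": location["latitude"], "longitude": location["longitude"]},
    )


def fake_call_tool(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real MCP client; echoes calls and returns canned coordinates."""
    if name == "duvera__geo_geocode":
        return {"latitude": 52.52, "longitude": 13.41}  # canned demo values
    return {"tool": name, "arguments": arguments}


result = weather_for_place("Berlin", fake_call_tool)
print(result)
```

Passing the caller in as a parameter keeps the chaining logic testable without a live gateway connection.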

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Implies read-only behavior, states that no account is required, and lists specific outputs (lat/lng, country, timezone, population). No annotations provided, so description adequately covers behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences, no wasted words. Front-loaded with purpose, followed by usage guidance and the access requirement. Efficiently conveys all necessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Single-parameter tool with well-documented input. Description specifies all output fields and provides context for integration with weather tools. Adequate for the tool's low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a good description for the parameter. The tool description adds usage context and examples but does not significantly enhance meaning beyond the schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool converts a city/place name to latitude/longitude, country, timezone, and population. Specific verb 'Convert' with listed outputs. Distinguishes from sibling weather tools by being a prerequisite.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises to use this tool before weather.current or weather.air-quality when only a place name is known. Provides clear context on when it is needed, though does not explicitly list alternative approaches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__github_latest_release (grade A)

[duvera · risk:low] Get the latest published release (tag, name, notes, date) of a public GitHub repository. Returns 404 for repos that publish no releases. No auth required.

Parameters

Name	Required	Description	Default
`repo`	Yes	Repository name, e.g. "express".
`owner`	Yes	Repository owner, e.g. "expressjs".
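
Because the tool reports a 404 for repositories without releases, a caller may want to treat that case as "no release" rather than as a hard failure. The sketch below assumes the 404 surfaces as an MCP tool result with `isError` set and a text part mentioning "404"; the page does not say how the gateway reports it, so that check is a guess. The helper name and the sample results are illustrative.

```python
from typing import Any, Dict, Optional


def latest_release_text(result: Dict[str, Any]) -> Optional[str]:
    """Return the release text, or None when the repo publishes no releases."""
    # Concatenate all text parts of the MCP tool result.
    text = "".join(
        part.get("text", "")
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        # ASSUMPTION: the documented 404 appears as an error result mentioning "404".
        if "404" in text:
            return None
        raise RuntimeError(f"tool call failed: {text}")
    return text


# Illustrative results only; not real gateway output.
found = {"content": [{"type": "text", "text": "tag: v1.2.3"}], "isError": False}
missing = {"content": [{"type": "text", "text": "HTTP 404: Not Found"}], "isError": True}
print(latest_release_text(found))
print(latest_release_text(missing))
```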

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description discloses key behaviors: risk level (low), no auth required, and 404 for repos without releases. However, it omits potential rate limits or time constraints. The disclosure is adequate for a simple read tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three short sentences with no fluff. It front-loads the core action and includes essential details without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity and lack of output schema, the description covers the return fields and failure case. It is complete enough for an agent to understand what to expect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds no new meaning beyond the parameter descriptions in the schema. The description does not elaborate on parameter usage or values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets the latest release of a public GitHub repo, specifying the exact fields returned (tag, name, notes, date). It distinguishes from sibling duvera tools like duvera__github_read_file and duvera__github_search_repos by focusing on releases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when it's not applicable (repos with no releases) and notes no auth is required. It does not explicitly contrast with sibling tools but implicitly defines its scope.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__github_read_file (grade B)

[duvera · risk:low] Read a file from a public GitHub repository. Read-only.

Parameters

Name	Required	Description	Default
`path`	No	File path within the repository, e.g. "README.md". Defaults to README.md.
`repo`	Yes	Repository name, e.g. "Hello-World".
`owner`	Yes	Repository owner (username or org), e.g. "octocat".
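
Since `path` is optional and falls back to README.md, the `arguments` object only needs `owner` and `repo` in the common case. A minimal sketch, with the helper name `read_file_arguments` being hypothetical:

```python
from typing import Dict, Optional


def read_file_arguments(owner: str, repo: str, path: Optional[str] = None) -> Dict[str, str]:
    """Build the `arguments` object for duvera__github_read_file (hypothetical helper)."""
    if not owner or not repo:
        raise ValueError("`owner` and `repo` are both required")
    arguments = {"owner": owner, "repo": repo}
    # `path` is optional; when it is omitted the tool reads README.md.
    if path:
        arguments["path"] = path
    return arguments


readme_args = read_file_arguments("octocat", "Hello-World")
explicit_args = read_file_arguments("octocat", "Hello-World", "README.md")
print(readme_args)
print(explicit_args)
```

Leaving `path` out instead of hard-coding "README.md" on the client keeps the default in one place, the tool itself.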

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Only discloses 'Read-only' behavior. No annotations provided, so description carries full burden but omits rate limits, authentication, error handling, or response details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise with no wasted words. Front-loaded with risk indicator and purpose. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Simple tool but lacks details on return format, max file size, or encoding. No output schema, so description should compensate but doesn't. Adequate but not complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so description adds no extra meaning beyond what schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'read', resource 'file from a public GitHub repository', and notes 'Read-only'. Distinguishes from sibling tools like duvera__github_latest_release and duvera__github_search_repos.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. Does not mention prerequisites, limitations, or when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__github_search_repos (grade A)

[duvera · risk:low] Search public GitHub repositories by keyword. Returns top 5 results by stars. No auth required.

Parameters

Name	Required	Description	Default
`query`	Yes	Search query (e.g. "machine learning python", "react component library").

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses no auth required, top 5 results by stars, and risk level. Adds value beyond schema, but does not mention rate limits or output details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: three short sentences plus risk label. No wasted words, front-loaded with essential info.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description explains what is returned (top 5 by stars). Sufficient for a simple search tool, though lacks format details.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage for the single 'query' parameter with examples. Tool description does not add further semantics beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'Search' and resource 'public GitHub repositories'. Distinct from sibling tools like duvera__github_latest_release and duvera__github_read_file.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for searching repos but lacks explicit when-to-use/alternatives guidance. Does not mention when another tool would be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__news_search (grade A)

[duvera · risk:low] Search Hacker News stories by keyword, ranked by relevance. Returns title, URL, points, author, and comment count. No account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Search terms, e.g. "model context protocol" or "postgres performance".

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description includes a risk label ('risk:low') and explicitly states no account is required, which adds behavioral transparency beyond the schema. However, it does not disclose any rate limits, pagination, or result count limits, which would further improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, efficient and front-loaded with the core purpose. Every word adds value, no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has only one parameter and no output schema, the description adequately covers what the tool does and what it returns. Minor gap: no mention of pagination or result count, but the simplicity of the tool makes it sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema already describes the query parameter, including examples. The description notes that results are ranked by relevance, which is useful but not essential. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches Hacker News stories by keyword, ranked by relevance, and specifies exactly what is returned (title, URL, points, author, comment count). This differentiates it from sibling tools like hackernews__stories_top, which likely retrieves top stories without keyword search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives context that no account is required, but does not explicitly state when to use this tool versus alternatives like duvera__news_top_stories or when not to use it. The usage guidance is implied but not explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__news_top_stories (grade B)

[duvera · risk:low] Top stories from Hacker News. Read-only, no account required.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavior. It states read-only and no account required, but omits details about rate limits, pagination, output format, or how many stories are returned. The agent lacks critical behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two short fragments that front-load the risk label and purpose. Every word adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description should clarify what is returned (e.g., list of story titles, URLs). It only states 'top stories' without describing the response structure, leaving the agent underinformed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are zero parameters and 100% schema coverage; the description adds no parameter-specific information but also does not need to. According to guidelines, the baseline for 0 params is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides top stories from Hacker News and is read-only with no account required. However, it does not distinguish itself from the sibling tool 'hackernews__stories_top' which likely serves a similar purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives like 'duvera__news_search' or 'hackernews__stories_top'. The agent receives no context for choosing this over similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__reference_dictionary (A)

[duvera · risk:low] Definitions, phonetics, part of speech, and examples for an English word. No account required.

Parameters

Name	Required	Description	Default
`word`	Yes	English word to define, e.g. "governance".

Tool Definition Quality

A3.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description must convey behavioral traits. It mentions 'risk:low' and 'No account required' but does not disclose whether it is read-only, how errors are handled, or any rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the main purpose. Every word adds value. Minimal and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple one-parameter tool, the description covers the output types. However, it lacks details on error handling and word-form handling (for example, inflected forms), which would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema describes the 'word' parameter as 'English word to define'. The description adds value by listing output components but does not significantly enhance parameter semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides definitions, phonetics, part of speech, and examples for an English word. This distinguishes it from sibling tools such as duvera__wiki_search and duvera__news_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for English word lookups and notes no account required, but does not explicitly exclude alternative tools or state when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__reference_public_holidays (A)

[duvera · risk:low] Public holidays for a given year and ISO country code (e.g. 2026 + "US"). Includes national and regional holidays with dates and names. No account required.

Parameters

Name	Required	Description	Default
`year`	Yes	Calendar year, e.g. 2026.
`country`	Yes	ISO 3166-1 alpha-2 country code, e.g. "US", "DE", "JP".
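
The two parameters can be checked before a call is made. This is a minimal sketch, assuming only what the schema states (an integer year and a two-letter ISO 3166-1 alpha-2 code); the helper name `build_holiday_args` is hypothetical, and the check covers the code's format only, not whether the country exists or is supported by the tool.

```python
import re

# Format check only: two ASCII letters, as in "US", "DE", "JP".
ALPHA2 = re.compile(r"[A-Z]{2}")


def build_holiday_args(year: int, country: str) -> dict:
    """Validate and normalize arguments for duvera__reference_public_holidays."""
    code = country.strip().upper()
    if not ALPHA2.fullmatch(code):
        raise ValueError(f"expected a two-letter country code, got {country!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(year, bool) or not isinstance(year, int) or year < 1:
        raise ValueError(f"expected a positive integer year, got {year!r}")
    return {"year": year, "country": code}


print(build_holiday_args(2026, "us"))
```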

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full behavioral burden. It states the tool returns national and regional holidays with dates and names, and includes a risk label 'risk:low'. It does not mention potential limitations like year range, rate limits, or result size, but the core behavior is disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three short sentences with a brief prefix, front-loading the tool's purpose and an example. Every word adds value, with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity and lack of output schema, the description adequately explains what the tool returns (dates and names of national/regional holidays). It does not cover error handling or edge cases, but for a straightforward lookup tool, it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already defines the parameters. The description adds an illustrative example and confirms the country code format, which is helpful but not essential. It does not add new semantic information beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it provides public holidays for a given year and ISO country code, including national and regional holidays with dates and names. The example and clear verb 'includes' leave no ambiguity about the tool's purpose, and it is distinct from sibling tools like duvera__reference_dictionary or duvera__time_now.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear example (2026 + 'US') and explicitly notes 'No account required,' which guides usage. However, it does not discuss when to use this tool versus alternatives (e.g., other holiday or reference tools) or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__time_now (A)

[duvera · risk:low] Current date and time in an IANA timezone (e.g. "America/New_York", "Asia/Tokyo"). Includes day of week and DST status. No account required.

Parameters

Name	Required	Description	Default
`timezone`	Yes	IANA timezone name, e.g. "America/New_York", "Europe/Berlin", "Asia/Tokyo".
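
To illustrate what an IANA timezone name, day of week, and DST status look like in practice, here is a local sketch using Python's standard `zoneinfo` module. It does not call the tool, and the returned field names (`iso`, `day_of_week`, `dst_active`) are assumptions, since the tool's response format is not documented on this page.

```python
from datetime import datetime
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def describe_time(timezone: str, moment: Optional[datetime] = None) -> dict:
    """Return the local time, weekday, and DST flag for an IANA timezone name."""
    try:
        tz = ZoneInfo(timezone)
    except (ZoneInfoNotFoundError, ValueError) as exc:
        raise ValueError(f"unknown IANA timezone: {timezone!r}") from exc
    # Use the supplied timezone-aware moment, or the current time if none is given.
    local = moment.astimezone(tz) if moment is not None else datetime.now(tz)
    return {
        "iso": local.isoformat(),
        "day_of_week": local.strftime("%A"),
        "dst_active": bool(local.dst()),  # a zero offset means DST is not in effect
    }


print(describe_time("Asia/Tokyo"))
```

Rejecting unknown names locally avoids a round trip for typos such as "America/NewYork".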

Tool Definition Quality

A4.0/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It declares the tool is low risk and requires no account, which are helpful behavioral hints. It also specifies output includes day of week and DST status. However, it does not mention idempotency or rate limits, though for a simple read operation this is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at three short sentences. The first front-loads the risk level and core functionality; the others add output details and access requirements. No wasted words, perfectly sized for a simple tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description helpfully mentions the output includes day of week and DST status. It could be more complete by specifying time format or example outputs, but for a straightforward tool, it is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema already describes the 'timezone' parameter with examples. The description repeats the parameter info and provides more examples, but does not add deeper semantic meaning beyond what the schema offers. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns the current date and time for a given IANA timezone, including day of week and DST status. It names a specific resource ('current date and time' in an 'IANA timezone'), and the output details distinguish it from any potential siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'No account required' which guides readiness, but lacks explicit when-to-use or when-not-to-use guidance relative to other tools. It does not name alternatives, though the sibling list includes many unrelated tools, so differentiation is not needed here.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__travel_flight_search (A)

[duvera · risk:low] List aircraft currently airborne within a radius of a point (adsb.lol ADS-B data). Use geo.geocode to get a city's coordinates first. Returns callsign, altitude, speed, and position. No account required.

Parameters

Name	Required	Description
`latitude`	Yes	Latitude of the search center (e.g. 40.64 for JFK).
`longitude`	Yes	Longitude of the search center (e.g. -73.78 for JFK).
`radius_nm`	Yes	Search radius in nautical miles (1-250), e.g. 50.
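
The ranges in the table above can be enforced before the call. This is a minimal sketch: the 1-250 nautical mile limit comes from the schema, the latitude and longitude bounds are the usual geographic ranges, and the helper name `build_flight_search_args` is hypothetical.

```python
NM_TO_KM = 1.852  # one nautical mile is defined as exactly 1.852 km


def build_flight_search_args(latitude: float, longitude: float, radius_nm: float) -> dict:
    """Validate arguments for duvera__travel_flight_search."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    if not 1 <= radius_nm <= 250:
        raise ValueError(f"radius_nm must be between 1 and 250, got {radius_nm}")
    return {"latitude": latitude, "longitude": longitude, "radius_nm": radius_nm}


# The JFK example from the parameter descriptions, with a 50 nm radius.
args = build_flight_search_args(40.64, -73.78, 50)
print(args, f"(about {args['radius_nm'] * NM_TO_KM:.1f} km)")
```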

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses that no account is required, the data source (adsb.lol), and the returned fields (callsign, altitude, speed, position). It could mention rate limits or limitations, but overall transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four short sentences with the main action front-loaded. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains return values. The tool is simple: three required parameters, no nested objects. The description covers its purpose and usage completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds no parameter semantics beyond what the schema provides, apart from the hint to obtain coordinates from geo.geocode.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists aircraft currently airborne within a radius, using ADS-B data. It distinguishes itself from similar tools like opensky__flights_live by specifying the data source and usage pattern.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises using geo.geocode to get coordinates first, providing clear context for when to use this tool. It does not mention alternatives or when not to use, but the guidance is specific and helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__weather_air_quality (A)

[duvera · risk:low] Current air quality (US AQI, PM2.5, PM10, ozone) for a latitude/longitude. Use geo.geocode first if you only have a place name. No account required.

Parameters

Name	Required	Description	Default
`latitude`	Yes	Latitude of the location (e.g. 37.77 for San Francisco).
`longitude`	Yes	Longitude of the location (e.g. -122.42 for San Francisco).
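
The description lists US AQI among the returned values. As a rough guide to reading that number, the sketch below maps a US AQI value to the standard US EPA category names. It interprets a number only; how the tool names or nests the AQI field in its response is not documented here, so no response parsing is shown.

```python
# Upper bound of each US EPA AQI band, paired with its category name.
AQI_BANDS = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def us_aqi_category(aqi: float) -> str:
    """Return the US EPA category name for a US AQI value."""
    if aqi < 0:
        raise ValueError(f"AQI cannot be negative: {aqi}")
    for upper, name in AQI_BANDS:
        if aqi <= upper:
            return name
    return "Hazardous"  # anything above 300


print(us_aqi_category(42))
```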

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully covers behavior: it's a read-only, current-data retrieval tool, requiring only lat/lng, with no authentication or special permissions needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences, front-loaded with the key purpose and metrics, no unnecessary words. Efficient and easy to scan.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists return components (AQI, PM2.5, PM10, ozone) and clarifies 'Current air quality'. Covers prerequisites and limitations adequately for a simple read tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The tool description adds value by hinting at the coordinate source (geo.geocode), which places it slightly above the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns current air quality data (US AQI, PM2.5, PM10, ozone) for a lat/lng, distinguishing it from siblings like duvera__weather_current (weather) and geo.geocode (place name to coordinates).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using geo.geocode first if only a place name is available, which differentiates it from related tools and provides clear usage context. Also notes 'No account required' for barrier-free use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__weather_current (A)

[duvera · risk:low] Current temperature, conditions, wind, and humidity for a latitude/longitude (Open-Meteo). Use geo.geocode first if you only have a city or place name. No account required.

Parameters

Name	Required	Description	Default
`latitude`	Yes	Latitude of the location (e.g. 37.77 for San Francisco).
`longitude`	Yes	Longitude of the location (e.g. -122.42 for San Francisco).
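
The "geocode first" guidance in the description amounts to a small decision: call the weather tool directly when coordinates are known, otherwise resolve the place name first. The sketch below only plans the sequence of calls. The geocoding tool is referred to as `geo.geocode`, as in the description, and its `query` parameter name is an assumption, since its schema is not shown in this section.

```python
from typing import List, Optional, Tuple

# Placeholder marking values that must be filled in from the geocoding result.
FROM_GEOCODE = "<from geo.geocode result>"


def plan_weather_calls(
    place: Optional[str] = None,
    latitude: Optional[float] = None,
    longitude: Optional[float] = None,
) -> List[Tuple[str, dict]]:
    """Return the ordered (tool name, arguments) steps for a current-weather lookup."""
    if latitude is not None and longitude is not None:
        # Coordinates are already known: one call is enough.
        return [("duvera__weather_current", {"latitude": latitude, "longitude": longitude})]
    if place:
        return [
            ("geo.geocode", {"query": place}),  # "query" is an assumed parameter name
            ("duvera__weather_current", {"latitude": FROM_GEOCODE, "longitude": FROM_GEOCODE}),
        ]
    raise ValueError("provide either a place name or both coordinates")


for step in plan_weather_calls(place="San Francisco"):
    print(step)
```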

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description discloses that no account is required (no authentication) and implies a read-only operation. It does not mention rate limits or data source refresh, but the core behaviors are clear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three short sentences, front-loaded with the risk tag and core functionality. Every word contributes value, with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately lists the returned data fields (temperature, conditions, wind, humidity). It does not specify response format or data freshness, but for a simple current weather tool, this is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds minimal extra meaning beyond the schema, just noting that coordinates are for a location and mentioning geocode. It does not elaborate on parameter constraints or formatting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool returns 'Current temperature, conditions, wind, and humidity' for a latitude/longitude pair. It distinguishes itself from sibling tools like 'open_meteo__weather_forecast' and 'duvera__weather_air_quality' by specifying 'Current' and listing returned elements.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises using 'geo.geocode first if you only have a city or place name', which is a clear usage instruction. However, it does not explicitly mention alternatives like 'open_meteo__weather_forecast' for forecast data, leaving minor ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__wiki_search (A)

[duvera · risk:low] Search Wikipedia articles by keyword. Returns matching page titles, keys, and excerpts. Pass a result's key to wiki.summary for the article summary. No account required.

Parameters

Name	Required	Description	Default
`query`	Yes	Search terms, e.g. "zero trust architecture" or "Ada Lovelace".

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must carry behavioral disclosure. It mentions low risk, no account required, and return structure (titles, keys, excerpts). However, lacks details on pagination, result limits, or error behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences with clear front-loading: the first states the purpose, and the rest cover the return fields, the follow-up call to the sibling summary tool, and access requirements. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple single-parameter search tool with no output schema, the description adequately explains input (keyword), output (titles, keys, excerpts), and follow-up action. Could note whether results are ranked or paginated, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and parameter description already defines usage with examples. Description adds minimal extra value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Search Wikipedia articles by keyword' with specific verb and resource. It distinguishes itself from the sibling tool duvera__wiki_summary by instructing users to pass keys to that tool for summaries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance: no account required, and tells how to proceed with results by using duvera__wiki_summary. This clarifies when to use this tool versus the summary tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

duvera__wiki_summary (A)

[duvera · risk:low] Get the lead summary of a Wikipedia article by title (e.g. "Zero_trust_architecture"). Use wiki.search first to find the exact page key. No account required.

Parameters

Name	Required	Description	Default
`title`	Yes	Article title or page key, e.g. "Ada Lovelace" or "Zero_trust_architecture" (from wiki.search results).
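
The search-then-summary workflow described above can be sketched as a two-step chain. This is an illustration under stated assumptions: `call_tool` stands in for whatever MCP client function is in use, and the response shapes (a `results` list whose items carry `title`, `key`, and `excerpt`) are guesses based on the field names in the search tool's description, not a documented format.

```python
from typing import Callable


def summarize_first_hit(query: str, call_tool: Callable[[str, dict], dict]) -> dict:
    """Search Wikipedia, then fetch the summary for the top result's page key."""
    results = call_tool("duvera__wiki_search", {"query": query}).get("results", [])
    if not results:
        raise LookupError(f"no Wikipedia match for {query!r}")
    # Pass the page key, not the display title, to the summary tool.
    return call_tool("duvera__wiki_summary", {"title": results[0]["key"]})


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stub client for demonstration; real response shapes may differ."""
    if name == "duvera__wiki_search":
        return {"results": [{"title": "Ada Lovelace", "key": "Ada_Lovelace", "excerpt": "(stubbed excerpt)"}]}
    return {"title": arguments["title"], "summary": "(stubbed summary)"}


print(summarize_first_hit("Ada Lovelace", fake_call_tool))
```

Injecting the client function keeps the chain testable without network access.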

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

While annotations are absent, the description compensates by stating the risk level (low), that no account is needed, and that only the lead summary is returned. It does not detail error handling or rate limits, but for a simple read tool with a single parameter, this is adequate transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three short sentences long, front-loads the risk label and core action, and every word contributes meaning. There is no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one required parameter, no output schema, high schema coverage), the description covers all essential aspects: what it does, how to invoke it, prerequisites, and access requirements. No gaps are present.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema already describes the 'title' parameter. The description adds value by explaining the tool's dependency on wiki.search for obtaining the exact page key, providing an underscore example, and reinforcing the parameter's purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it retrieves the lead summary of a Wikipedia article by title, using the specific verb 'Get' and resource 'lead summary'. It differentiates itself from its sibling 'duvera__wiki_search' by instructing the agent to use search first, establishing clear separation of concerns.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to use wiki.search first to find the exact page key, provides an example title format, and notes that no account is required. This gives clear guidance on when and how to use the tool, including prerequisites and access requirements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gmail__mail_read (B)

[gmail · risk:low] Read recent emails from the Gmail inbox

Parameters

Name	Required	Description	Default
`label`	No	Optional label/folder to read from (defaults to INBOX)

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only says 'read recent emails' without specifying how many, pagination, or any behavioral details like authentication requirements or limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise at one sentence, but lacks critical information about the tool's behavior and usage context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity (1 optional parameter, no output schema), the description is somewhat complete but misses details like the number of recent emails or default behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the parameter description already present. The description adds no additional meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool reads recent emails from the Gmail inbox, with a specific verb and resource. It distinguishes itself from sibling tools like google_workspace__mail_search, which focuses on search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives (e.g., google_workspace__mail_search). No mention of prerequisites or context for use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

googlecalendar__availability_find (B)

[googlecalendar · risk:low] Find open time slots within a date range in Google Calendar

Parameters

Name	Required	Description
`end`	Yes	End of the search window in ISO 8601 format
`start`	Yes	Start of the search window in ISO 8601 format
`duration_minutes`	No	Desired slot length in minutes
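
The schema asks for ISO 8601 strings but gives no example. A minimal sketch of building the arguments from timezone-aware `datetime` values follows. The helper name `build_availability_args` is hypothetical, and requiring an explicit UTC offset is an assumption made here to avoid ambiguity; the schema itself says only "ISO 8601 format".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def build_availability_args(
    start: datetime, end: datetime, duration_minutes: Optional[int] = None
) -> dict:
    """Build arguments for googlecalendar__availability_find."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if end <= start:
        raise ValueError("end must be after start")
    args = {"start": start.isoformat(), "end": end.isoformat()}
    if duration_minutes is not None:
        # A slot longer than the whole window can never be found.
        if duration_minutes <= 0 or timedelta(minutes=duration_minutes) > end - start:
            raise ValueError("duration_minutes must be positive and fit in the window")
        args["duration_minutes"] = duration_minutes
    return args


window_start = datetime(2026, 7, 23, 9, 0, tzinfo=timezone.utc)
args = build_availability_args(window_start, window_start + timedelta(hours=8), 30)
print(args)
```

Omitting `duration_minutes` leaves the optional parameter out of the request rather than sending a null value.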

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. It only includes a risk tag ('risk:low') but does not describe side effects, authentication needs, rate limits, or whether the operation is idempotent. For a read-like tool, more transparency is expected.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
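
As a rough illustration of what fuller behavioral disclosure could look like, the sketch below builds a tool definition that carries the annotation hints defined by the MCP specification (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`). The hint values, the parameter types, and the extra "Read-only" sentence are assumptions made for illustration; the listing does not show the server's actual definition.

```python
import json


def availability_tool_definition() -> dict:
    """Hypothetical, annotated definition for googlecalendar__availability_find."""
    return {
        "name": "googlecalendar__availability_find",
        "description": (
            "[googlecalendar · risk:low] Find open time slots within a date "
            "range in Google Calendar. "
            # Assumed wording: the listing does not confirm this behavior.
            "Read-only: does not create or modify events."
        ),
        "inputSchema": {
            "type": "object",
            "properties": {
                # Parameter types are assumptions; the listing shows none.
                "start": {
                    "type": "string",
                    "description": "Start of the search window in ISO 8601 format",
                },
                "end": {
                    "type": "string",
                    "description": "End of the search window in ISO 8601 format",
                },
                "duration_minutes": {
                    "type": "integer",
                    "description": "Desired slot length in minutes",
                },
            },
            "required": ["start", "end"],
        },
        # Annotation keys come from the MCP specification; values are assumed.
        "annotations": {
            "readOnlyHint": True,
            "destructiveHint": False,
            "idempotentHint": True,
            "openWorldHint": True,
        },
    }


if __name__ == "__main__":
    print(json.dumps(availability_tool_definition(), indent=2, ensure_ascii=False))
```

Annotations are hints rather than guarantees, so a description would still need to state auth requirements and rate limits in prose.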

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise: one sentence with a tag. It front-loads the core purpose and avoids unnecessary words. However, the brevity borders on underspecification, missing context that could be added without much length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description should explain the return format (e.g., list of time ranges) but does not. The tool has three parameters but no guidance on output. For a search tool, the description is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds minimal meaning beyond the schema: it explains that the parameters define a 'date range' and 'desired slot length', which aligns with the schema descriptions. No additional constraints or examples are given.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
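
One parameter relationship the schema alone cannot express is that `end` must fall after `start`. The sketch below checks that before building an MCP `tools/call` request. The helper name `build_availability_call` is hypothetical, and the rule that `duration_minutes` must fit inside the window is an assumption, not documented server behavior.

```python
from datetime import datetime
from typing import Optional


def build_availability_call(
    start: str,
    end: str,
    duration_minutes: Optional[int] = None,
    request_id: int = 1,
) -> dict:
    """Validate the search window, then build a JSON-RPC tools/call request."""
    # fromisoformat raises ValueError for strings it cannot parse.
    start_dt = datetime.fromisoformat(start)
    end_dt = datetime.fromisoformat(end)

    # Naive and offset-aware datetimes cannot be compared, so reject a mix.
    if (start_dt.tzinfo is None) != (end_dt.tzinfo is None):
        raise ValueError("start and end must both carry a UTC offset or both omit it")
    if end_dt <= start_dt:
        raise ValueError("end must be later than start")

    arguments = {"start": start, "end": end}
    if duration_minutes is not None:
        window_minutes = (end_dt - start_dt).total_seconds() / 60
        # Assumed constraint: a slot cannot be longer than the window itself.
        if duration_minutes <= 0 or duration_minutes > window_minutes:
            raise ValueError("duration_minutes must be positive and fit in the window")
        arguments["duration_minutes"] = duration_minutes

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "googlecalendar__availability_find",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    print(build_availability_call(
        "2026-07-23T09:00:00+00:00", "2026-07-23T17:00:00+00:00", 30
    ))
```

The example uses explicit `+00:00` offsets because `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward.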

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Find') and resource ('open time slots within a date range in Google Calendar'). It distinguishes the tool from siblings such as microsoft365__calendar_list, which lists events rather than availability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. It does not mention prerequisites, when not to use, or suggest alternative tools for related tasks. The context is implied but not elaborated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

google_workspace__mail_search

[google-workspace · risk:low] Search Mail via Gmail, governed by Duvera.

Parameters

Name	Required	Description	Default
`query`	Yes	Search query, e.g. 'invoices from Acme last month'.

Tool Definition Quality

B (3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavior. It only notes 'risk:low', but lacks details on authentication, rate limits, or whether it returns full emails or summaries.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise sentence front-loaded with a risk tag. No unnecessary words, though it could be expanded slightly to add value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema and no annotations. The description does not explain return format, pagination, or authentication requirements, leaving significant gaps for agent execution.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'query' has a description with an example ('invoices from Acme last month'), adding concrete guidance beyond the schema's type definition. Schema coverage is 100%.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Search Mail via Gmail', clearly indicating the verb and resource. It distinguishes the tool from siblings such as 'gmail__mail_read' and 'microsoft365__mail_search' by naming the service, though it could be more specific about search scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like 'microsoft365__mail_search' or other search tools. No exclusions or prerequisites mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

grubhub__order_track

[grubhub · risk:low] Check the live status and ETA of a Grubhub order

Parameters

Name	Required	Description	Default
`order_id`	Yes	Identifier of the order to track

Tool Definition Quality

A (3.8/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explicitly states that it checks live status and ETA, implying a read-only operation. The prefix '[grubhub · risk:low]' adds context about risk level. However, it does not mention potential error conditions or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It includes a useful prefix for risk context. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple one-parameter read tool with no output schema, the description covers the core functionality (check live status and ETA). It omits details about the response format but is otherwise sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description for the sole parameter ('Identifier of the order to track'). The description does not add additional meaning beyond the schema, which is adequate. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses the specific verb 'Check' and clearly identifies the resource as 'live status and ETA of a Grubhub order'. It distinguishes the tool from siblings such as 'amazon__package_track' by specifying the service and what is tracked.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not state prerequisites, when-not-to-use, or mention sibling tools for comparison.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hackernews__stories_top

[hackernews · risk:low] Fetch the current list of top-story ids on Hacker News. Returns an array of item ids (resolve details via the item endpoint). No account or API key required; the hacker-news.firebaseio.com endpoint is open and unmetered.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses return type (array of ids) and need to resolve details via item endpoint. Notes open and unmetered. Without annotations, this adequately describes behavior for a simple read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
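
Because the tool returns only ids, a caller needs a second step to get titles and URLs. The sketch below shows that two-step pattern against the public Hacker News Firebase API (`/v0/topstories.json`, then `/v0/item/<id>.json`). It calls the API directly rather than through the gateway, and the helper names `fetch_json` and `top_stories` are hypothetical.

```python
import json
from typing import Any, Callable, List
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url: str) -> Any:
    """Fetch a URL and decode its JSON body (performs a network request)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def top_stories(limit: int = 5, fetch: Callable[[str], Any] = fetch_json) -> List[dict]:
    """Resolve the first `limit` top-story ids into full item records."""
    # Step 1: the top-stories endpoint returns a plain array of item ids.
    story_ids = fetch(f"{BASE_URL}/topstories.json")
    # Step 2: each id is resolved with its own request to the item endpoint.
    return [fetch(f"{BASE_URL}/item/{story_id}.json") for story_id in story_ids[:limit]]


if __name__ == "__main__":
    for story in top_stories(limit=3):
        print(story.get("title"), story.get("url"))
```

Passing `fetch` as a parameter keeps the resolution logic testable without network access. Each story costs one extra request, so `limit` is worth keeping small.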

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences, front-loaded with purpose, no wasted words. Concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Completely covers purpose, return type, and usage details. No output schema needed; tool is simple and description suffices.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters, so description adds value by explaining operation. Baseline 4 for 0 parameters; no additional param info needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it fetches top-story ids from Hacker News, with specific verb 'fetch' and resource. Distinguishes from siblings by specifying Hacker News, not other news sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States no account or API key required and endpoint is open/unmetered, guiding ease of use. Lacks explicit comparison to sibling news tools, but context implies it's for HN.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

instacart__menu_search

[instacart · risk:low] Search for grocery items and stores in the Instacart app

Parameters

Name	Required	Description	Default
`query`	Yes	Search terms, e.g. "organic bananas"
`store`	No	Store name or id to scope the search

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It only states a risk label and the search action, but lacks details on authentication, rate limits, result format, or behavior on no results.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise single sentence plus risk tag. However, it is slightly too minimal, missing some useful context that could be included without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with no output schema, the description is incomplete. It does not indicate pagination, result structure, or error handling, leaving the agent without essential behavioral information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds only general context and the risk label, not significantly more than the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Search', the resource 'grocery items and stores', and the context 'Instacart app'. It is specific enough to distinguish from unrelated sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. No when-not-to-use or prerequisites are mentioned. The implied usage is basic.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

maps__eta_share

[maps · risk:low] Share your estimated time of arrival with a contact from Apple Maps

Parameters

Name	Required	Description	Default
`contact`	Yes	Contact name or number to share ETA with
`destination`	No	Destination address or place name for the active trip

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only says 'share', which implies a mutation but does not disclose effects (e.g., sends a message, requires permissions, modifies any state). Little behavioral transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single short sentence, no wasted words. However, it is so concise that it omits useful context; still efficient in form.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, and the description does not explain what happens after sharing (e.g., confirmation, success indicator). Missing prerequisites like active trip. Incomplete for a mutation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with descriptions for both parameters. The description adds no extra meaning beyond the schema, so baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (share ETA) and resource (with a contact from Apple Maps), distinguishing it from other tools. It is specific but does not specify that an active trip is required.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives, nor prerequisites such as having an active navigation session. The description implies usage but lacks context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

microsoft365__calendar_list

[microsoft365 · risk:low] List Calendar via Outlook Calendar, governed by Duvera.

Parameters

Name	Required	Description	Default
`range`	No	Date range, e.g. 'this week'.

Tool Definition Quality

C (2.6/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description does not disclose read/write behavior, authentication requirements, rate limits, or return format. The phrase 'governed by Duvera' is vague about any restrictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise (one line with a tag and a phrase). No extra words, but lacks front-loading of the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one optional param, no output schema), the description fails to state what is returned (events or calendars), default behavior, or when to use. It is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single 'range' parameter. The schema describes it as "Date range, e.g. 'this week'"; the tool description adds no further semantics, such as the default behavior when the parameter is omitted or the exact syntax accepted.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description says 'List Calendar via Outlook Calendar', which is ambiguous: it could mean listing calendars or listing events. The optional 'range' parameter suggests events, but the resource is not clearly defined. Distinguishes from siblings only by platform, not by function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like googlecalendar__availability_find. No mention of prerequisites, scope, or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

microsoft365__mail_search

[microsoft365 · risk:low] Search Mail via Outlook / Exchange, governed by Duvera.

Parameters

Name	Required	Description	Default
`query`	Yes	Search query, e.g. 'invoices from Acme last month'.

Tool Definition Quality

B (3.2/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must convey behavioral traits. It states risk:low and governance by Duvera, but lacks details on read-only behavior, result format, pagination, authentication, or rate limits. The description is insufficient for an agent to understand side effects or constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short (one sentence plus risk tag) and to the point. It front-loads the risk level, which is helpful. While concise, it could include more detail without becoming verbose. It avoids wordiness, earning a high score for efficiency.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, no output schema, no nested objects), the description is minimally adequate. It states the main action and system. However, it omits return value behavior, error scenarios, and differentiation from similar siblings, leaving gaps for complex queries.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (one parameter with description and example). The tool description itself adds no further parameter semantics beyond the schema. Baseline at 3 is appropriate as the schema does the job without additional elaboration from the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Search Mail via Outlook / Exchange'. The verb 'Search' and resource 'Mail' are specific, and the system (Outlook/Exchange) is named, distinguishing it from sibling tools like google_workspace__mail_search or gmail__mail_read.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives (e.g., when to choose this over google_workspace__mail_search or slack__message_search). It does not mention prerequisites, limitations, or context for appropriate use. The 'governed by Duvera' hint is vague.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

notion__notes_search

[notion · risk:low] Search pages and notes in Notion by query

Parameters

Name	Required	Description	Default
`query`	Yes	Search query string

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description only includes a static 'risk:low' tag. Does not disclose pagination, rate limits, authentication needs, or result details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Short, front-loaded sentence. Efficient but could be more informative without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple search tool with one parameter and no output schema, the description is adequate but lacks details on search behavior (e.g., full-text, advanced operators).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single parameter 'query' is fully described in schema. The description adds no extra meaning beyond schema coverage, so baseline score applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it searches pages and notes in Notion by query. It does not differentiate the tool from siblings such as obsidian__notes_search, although the platform is indicated in the name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool over alternatives. The description only implies its use for Notion search but no exclusions or context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

obsidian__notes_search

[obsidian · risk:low] Search notes in an Obsidian vault by query

Parameters

Name	Required	Description	Default
`query`	Yes	Search query string

Tool Definition Quality

B (3.1/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose any behavioral traits beyond the core search action. It lacks information on performance, limitations, authentication requirements, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words, but it is minimally informative. For a simple tool, this is acceptable but could provide slightly more context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, the description lacks important context such as search scope (entire vault or specific folders), result format, pagination, or whether it's full-text or title-only search. This leaves an agent without enough information to determine appropriateness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the parameter 'query' described as 'Search query string'. The description adds no additional meaning beyond the schema, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'search', the resource 'notes in an Obsidian vault', and the mechanism 'by query'. It is specific and distinguishes the tool from sibling search tools such as slack__message_search or twitter__tweets_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, nor does it mention any prerequisites or exclusions. It only states the basic function.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

open_meteo__weather_forecast

[open-meteo · risk:low] Current temperature, conditions, humidity, and wind for a latitude/longitude, from Open-Meteo.

Parameters

Name	Required	Description	Default
`latitude`	Yes	Latitude of the place (e.g. 37.77 for San Francisco).
`longitude`	Yes	Longitude of the place (e.g. -122.42 for San Francisco).

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full responsibility for behavioral disclosure. It states 'risk:low' but does not clarify that this is a read-only operation, nor does it mention error handling, rate limits, or authentication requirements. The behavioral traits are insufficiently documented.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise, using a single sentence to convey the core purpose. The bracketed risk indicator is a minor structural distraction, but overall the description is efficient and front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple 2-parameter schema and no output schema, the description provides adequate context about the output (temperature, conditions, humidity, wind) but omits details such as units, time resolution, and what constitutes 'current'. It is minimally complete for a straightforward tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema provides 100% coverage with descriptions for both latitude and longitude. The description adds no additional parameter meaning beyond what is already in the schema, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
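
The valid ranges for the two coordinates are an example of what the description could spell out. The sketch below range-checks them and builds a request URL for Open-Meteo's public forecast endpoint. The helper name `build_forecast_url` is hypothetical, and the `current` variable list is an assumption about what a gateway might request; the listing does not say which fields the tool actually fetches.

```python
from urllib.parse import urlencode

OPEN_METEO_URL = "https://api.open-meteo.com/v1/forecast"

# Assumed variable selection matching "temperature, conditions, humidity, wind".
CURRENT_VARIABLES = "temperature_2m,relative_humidity_2m,weather_code,wind_speed_10m"


def build_forecast_url(latitude: float, longitude: float) -> str:
    """Range-check the coordinates and return a forecast request URL."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")

    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current": CURRENT_VARIABLES,
    }
    # safe="," keeps the comma-separated variable list readable in the URL.
    return f"{OPEN_METEO_URL}?{urlencode(params, safe=',')}"


if __name__ == "__main__":
    # San Francisco, using the example values from the parameter table.
    print(build_forecast_url(37.77, -122.42))
```

Rejecting out-of-range values locally means that swapped latitude and longitude often fail fast instead of returning weather for the wrong place.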

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly lists the output (current temperature, conditions, humidity, wind) and specifies the input (latitude/longitude) and source (Open-Meteo). However, it does not explicitly distinguish from sibling tools like duvera__weather_current, which may offer similar current weather data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives. The description lacks context about suitable scenarios, prerequisites, or exclusions, which is problematic given the sibling list includes multiple weather tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

opensky__flights_live (A)

[opensky · risk:low] List aircraft currently airborne within a latitude/longitude box, from OpenSky's live ADS-B feed.

Parameters

Name	Required	Description
`lamax`	Yes	North edge of the box — maximum latitude (e.g. 40.9 for NYC).
`lamin`	Yes	South edge of the box — minimum latitude (e.g. 40.6 for NYC).
`lomax`	Yes	East edge of the box — maximum longitude (e.g. -73.7 for NYC).
`lomin`	Yes	West edge of the box — minimum longitude (e.g. -74.1 for NYC).
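
As a rough illustration of how the four edges relate, the sketch below builds an arguments object for this tool from the NYC example values in the table and checks that the box is well formed. The helper name `build_bbox_args` and the range checks are assumptions made for illustration; the listing does not say what validation the server itself performs.

```python
def build_bbox_args(lamin: float, lamax: float, lomin: float, lomax: float) -> dict:
    """Build arguments for opensky__flights_live (hypothetical helper)."""
    # South edge must lie below north edge, both within valid latitudes.
    if not (-90.0 <= lamin < lamax <= 90.0):
        raise ValueError("latitude must satisfy -90 <= lamin < lamax <= 90")
    # West edge must lie left of east edge; this simplification rejects
    # boxes that cross the antimeridian.
    if not (-180.0 <= lomin < lomax <= 180.0):
        raise ValueError("longitude must satisfy -180 <= lomin < lomax <= 180")
    return {"lamin": lamin, "lamax": lamax, "lomin": lomin, "lomax": lomax}


# NYC box, using the example values from the parameter table.
nyc_args = build_bbox_args(lamin=40.6, lamax=40.9, lomin=-74.1, lomax=-73.7)
```

Checking the edge order before the call catches the most likely mistake, swapping a minimum and a maximum, without guessing at limits such as a maximum box size that the description does not document.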

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It indicates low risk and lists the data source, but does not disclose response format, pagination, rate limits, or whether the operation is read-only. It adds some value beyond the schema but lacks depth.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single efficient sentence that front-loads the risk level and source. Every word adds value, making it concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity (4 required params, no output schema), the description is minimal. It explains what the tool does but omits information about the response format, typical fields returned, or any limitations like maximum box size. Adequate but not complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema provides 100% coverage with clear descriptions for all four bounding box parameters (e.g., 'North edge of the box'). The description adds no extra meaning beyond confirming the box is lat/lon. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists currently airborne aircraft within a latitude/longitude box, using a specific verb ('List') and resource ('aircraft'). This distinguishes it from sibling tools that focus on flight status, package tracking, or other domains.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context (live ADS-B feed within a geographic box) but does not explicitly mention when to use this tool versus alternatives like flight status tools or other search tools. The context is clear but lacks exclusionary guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

postgres__sql_read (A)

[postgres · risk:low] Execute a read-only SQL query against a Postgres database

Parameters

Name	Required	Description	Default
`query`	Yes	Read-only SQL query (SELECT) to execute
`database`	Yes	Target database name
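
Since the schema only says the query should be a read-only SELECT, a caller may want a conservative pre-check before sending anything. The sketch below is one such heuristic. The function names and the database name `analytics` are hypothetical, and this is not the server's actual enforcement rule, which the listing does not describe.

```python
import re


def looks_read_only(query: str) -> bool:
    """Heuristic pre-check: accept only a single statement starting with SELECT."""
    stripped = query.strip().rstrip(";").strip()
    # Any remaining semicolon suggests more than one statement. This also
    # rejects semicolons inside string literals, which is deliberately strict.
    if ";" in stripped:
        return False
    return re.match(r"(?i)^select\b", stripped) is not None


def build_sql_args(query: str, database: str) -> dict:
    """Build arguments for postgres__sql_read (hypothetical helper)."""
    if not looks_read_only(query):
        raise ValueError("this sketch only accepts a single SELECT statement")
    return {"query": query, "database": database}


args = build_sql_args("SELECT id, name FROM users LIMIT 10;", "analytics")
```

A prefix check like this cannot prove that a statement has no side effects, so it complements rather than replaces whatever read-only enforcement the server or database role applies.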

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries the burden. It discloses the tool is read-only and low risk, which covers the most critical behavioral trait. However, it omits other details like connection handling, query limits, error behavior, or result size constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with embedded risk tag. Extremely concise with no fluff. Information is front-loaded and every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with two well-documented parameters. No output schema is needed. The description, combined with schema, fully defines the tool's behavior for a read-only SQL executor. Could optionally note result format, but not essential for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers both parameters with descriptions. The tool description adds no additional semantic value beyond restating that the query is read-only. Since schema coverage is 100%, baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action ('Execute'), the resource ('a read-only SQL query'), and the target ('Postgres database'). The 'read-only' qualifier distinguishes it from write operations. Among siblings, it is the only SQL tool, so purpose is unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides no guidance on when to use this tool versus alternatives. It only states it is read-only and low risk, but does not mention prerequisites, when not to use it, or how it compares to other database tools (if any exist in the toolset).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

shazam__audio_identify (B)

[shazam · risk:low] Identify the song currently playing using Shazam

Parameters

Name	Required	Description	Default
`duration_ms`	No	How long to listen before identifying, in milliseconds

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It only states 'risk:low' but does not explain behavioral aspects like whether it requires microphone access, how long listening takes relative to duration_ms, or what happens if identification fails. With zero annotation coverage, description is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single short sentence, front-loaded with the bracket prefix. Every word serves a purpose, though the prefix could be integrated. Concise without being overly sparse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no output schema and no annotations, the description lacks details on return format, expected behavior on failure, and whether the tool is blocking or async. It is missing critical context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter 'duration_ms' already documented. The description adds no additional explanation about the parameter's meaning, units, or default behavior. Baseline of 3 is appropriate as schema does the work.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: 'Identify the song currently playing using Shazam'. The verb 'identify' and resource 'song' are specific, and no sibling tool overlaps, making purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. While no direct sibling tool exists for audio identification, the description does not provide context on prerequisites (e.g., must be near audio) or when to choose this over other search tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

slack__message_search (B)

[slack · risk:low] Search recent messages across Slack channels

Parameters

Name	Required	Description	Default
`query`	Yes	Search query string

Tool Definition Quality

B3.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility for behavioral disclosure. It labels risk as 'low' and implies read-only behavior, but does not explain rate limits, authentication needs, response format, or what 'recent' means in terms of time range.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently communicates the tool's purpose and risk level. Every word earns its place, with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple search tool with one parameter and no output schema, the description is minimally adequate but lacks details on result count limits, pagination, or search syntax. While low complexity reduces the burden, gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'query' is already described in the schema as 'Search query string'. With 100% schema coverage, the description adds minimal extra meaning (only context that it searches messages). Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Search' and the resource 'recent messages across Slack channels', making the purpose and scope obvious. It distinguishes well from sibling tools, which are mostly unrelated domains (e.g., finance, weather, travel).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, no prerequisites, and no exclusions. It does not mention what constitutes 'recent' or how to refine searches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

southwest__flight_status (C)

[southwest · risk:low] Check the status of a Southwest flight

Parameters

Name	Required	Description	Default
`date`	No	Flight date (YYYY-MM-DD)
`flight_number`	Yes	Flight number, e.g. WN1234

Tool Definition Quality

C2.9/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description bears full responsibility for disclosing behavioral traits. It only states 'Check the status' without explaining what status entails (e.g., delays, cancellations, gates) or behavioral details such as how fresh the data is.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence that efficiently states the airline and action. While concise, it could incorporate more context without becoming verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of an output schema, the description should explain the nature of the returned status (e.g., on-time, delayed, gate info). This is missing, making the tool incomplete for agents needing to interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% as both parameters (flight_number, date) have clear schema descriptions. The tool description adds no extra detail beyond what the schema already provides, earning a baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Check' and resource 'status of a Southwest flight', which precisely identifies the tool's function. It also includes the airline name 'southwest' to distinguish it from sibling tools like united__flight_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, such as when a specific airline flight status is needed versus general flight tracking. There are no use cases, prerequisites, or exclusions mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

telegram__message_read (C)

[telegram · risk:low] Read recent messages from a Telegram chat

Parameters

Name	Required	Description	Default
`chat`	Yes	Chat id or @username to read from

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description only notes 'risk:low' without explaining the tool's read-only nature, its authentication requirements, or its behavior on errors. Behavioral disclosure is minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with no redundant words. However, the brevity sacrifices some essential details that could fit in a single sentence.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of an output schema, the description should explain what the tool returns (e.g., message list, formatting) and any limits on 'recent.' It falls short on these aspects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already fully describes the single parameter 'chat' with a clear description. The description adds context by specifying 'recent messages,' but this is a marginal improvement.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Read') and the resource ('recent messages from a Telegram chat'). It distinguishes the tool from siblings by specifying Telegram, but does not elaborate on what constitutes 'recent'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives no indication of when to use this tool versus alternatives like Slack search or Gmail read. No when-not-to-use or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

twitter__tweets_search (A)

[twitter · risk:low] Search for recent tweets matching a keyword, hashtag, or phrase using the Twitter v2 API. Pass your Twitter Bearer Token via X-Duvera-Service-Token header — Duvera never stores it. Get a free Bearer Token at developer.twitter.com.

Parameters

Name	Required	Description	Default
`query`	Yes	Twitter search query. Supports operators like "from:user", "#hashtag", "-word".
`max_results`	No	Number of tweets to return (10-100, default 10).
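
To make the documented constraints concrete, the sketch below assembles the header and arguments for a search call, enforcing the 10-100 range on `max_results` from the table and placing the Bearer Token in the `X-Duvera-Service-Token` header named in the description. The helper name and the placeholder token are assumptions, and how headers are attached to a tool call depends on the MCP client in use.

```python
def build_search_request(query: str, bearer_token: str, max_results: int = 10) -> tuple:
    """Return (headers, arguments) for twitter__tweets_search (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    # The parameter table documents a 10-100 range with a default of 10.
    if not 10 <= max_results <= 100:
        raise ValueError("max_results must be between 10 and 100")
    headers = {"X-Duvera-Service-Token": bearer_token}
    arguments = {"query": query, "max_results": max_results}
    return headers, arguments


# Placeholder token; the query uses operators listed in the schema.
headers, arguments = build_search_request(
    "from:user #hashtag -word", "YOUR_BEARER_TOKEN", max_results=25
)
```

Rejecting out-of-range values locally, rather than silently clamping them, keeps the caller aware that a request for fewer than 10 results is not expressible through this parameter.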

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description mentions risk:low, uses Twitter API, and states token is not stored. Does not cover rate limits, result format, or limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no redundancy. Purpose and authentication details are efficiently presented.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks output schema and does not describe return values or pagination. Adequate for a simple search tool with two parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%. The description adds little beyond the schema, which already documents operators like 'from:user' for the query parameter. Parameters are adequately described in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it searches for recent tweets by keyword, hashtag, or phrase using Twitter v2 API. It is specific and distinct from unrelated sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides instructions for authentication (Bearer Token via header) and a link to get a token. No explicit when-not-to-use or alternatives, but sufficient given no similar siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

uber__fare_estimate (B)

[uber · risk:low] Estimate the fare for a trip in the Uber app

Parameters

Name	Required	Description
`pickup`	No	Pickup address (defaults to current location)
`product`	No	Ride product, e.g. UberX, Comfort, Black
`destination`	Yes	Drop-off address or place name

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full responsibility for behavioral disclosure. Beyond the risk tag, it only says 'Estimate the fare' and does not state whether the operation is read-only, whether it requires authentication, what rate limits apply, or what happens with invalid inputs. This is insufficient for an agent to gauge side effects or safety.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with the core purpose. The risk tag adds context. However, it is extremely brief and could be expanded without losing conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lacks information about the output format (e.g., estimated fare range, currency, surge pricing) and does not mention any limitations. Given that there is no output schema, the description should compensate, but it fails to do so.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All three parameters have descriptions in the input schema, achieving 100% coverage. The description adds no additional semantic value beyond what the schema already provides, so the baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('estimate'), the resource ('fare'), and the context ('Uber app'). It distinguishes the tool from the wide range of sibling tools, none of which are related to fare estimation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, prerequisites (e.g., whether an Uber account or location services are needed), or exclusions. The single sentence does not address usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

united__flight_status (A)

[united · risk:low] Check the status of a United flight

Parameters

Name	Required	Description	Default
`date`	No	Flight date (YYYY-MM-DD)
`flight_number`	Yes	Flight number, e.g. UA123
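
Because this tool and southwest__flight_status share the same parameter shape, an agent has to pick between them from the flight number alone. The sketch below shows one way to do that. The prefix table is inferred from the schema examples (WN1234, UA123), and the helper name and the flight-number pattern are assumptions for illustration only.

```python
import re
from datetime import date
from typing import Optional

# Carrier prefixes taken from the examples in the two parameter tables.
TOOL_BY_PREFIX = {
    "WN": "southwest__flight_status",
    "UA": "united__flight_status",
}


def route_flight_status(flight_number: str, flight_date: Optional[date] = None) -> tuple:
    """Return (tool_name, arguments) for a flight status lookup (hypothetical helper)."""
    # Assumed pattern: two-letter carrier code followed by 1-4 digits.
    match = re.fullmatch(r"([A-Z]{2})(\d{1,4})", flight_number.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized flight number: {flight_number!r}")
    tool = TOOL_BY_PREFIX.get(match.group(1))
    if tool is None:
        raise ValueError(f"no flight status tool for carrier {match.group(1)}")
    arguments = {"flight_number": match.group(0)}
    if flight_date is not None:
        # The schema asks for YYYY-MM-DD, which is what isoformat() produces.
        arguments["date"] = flight_date.isoformat()
    return tool, arguments


tool, arguments = route_flight_status("ua123", date(2026, 7, 23))
```

The `date` argument is left out when no date is supplied, since the listing does not say what default the server applies in that case.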

Tool Definition Quality

A3.5/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It only states 'Check the status' and includes a risk label, but does not disclose what 'status' means (e.g., departure/arrival, delays), whether it's real-time, or how errors are handled.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no fluff. It includes a risk tag for quick assessment. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with two parameters and no output schema, the description is adequate but leaves gaps: it does not specify the date format, default behavior if date is omitted, or the kind of status details returned. Still, it covers the core action.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the parameters are well-documented in the schema. The description does not add any extra meaning beyond the schema; it merely restates the tool's purpose. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Check the status of a United flight', with a specific verb ('Check') and resource ('United flight'). It distinguishes this tool from sibling tools like 'southwest__flight_status' by naming the airline, making selection unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a United flight status is needed, but lacks explicit guidance on when not to use it or alternatives. The sibling list provides context, but the description itself offers no when/when-not statements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

venmo__payment_request (A)

[venmo · risk:low] Request a payment from a contact via Venmo (no money leaves your account)

Parameters

Name	Required	Description
`note`	No	Optional note explaining the request
`recipient`	Yes	The Venmo handle, phone, or contact to request from
`amount_cents`	Yes	Amount being requested, in cents
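
The `amount_cents` parameter takes an integer number of cents, so a caller starting from a dollar amount needs an exact conversion. The sketch below uses `decimal.Decimal` to avoid floating-point rounding surprises. The helper names, the recipient handle, and the requirement that the amount be positive are assumptions not stated in the listing.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def dollars_to_cents(amount: str) -> int:
    """Convert a dollar string such as '12.50' to integer cents."""
    cents = (Decimal(amount) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    if cents <= 0:
        raise ValueError("amount must be positive")
    return int(cents)


def build_request_args(recipient: str, amount: str, note: Optional[str] = None) -> dict:
    """Build arguments for venmo__payment_request (hypothetical helper)."""
    args = {"recipient": recipient, "amount_cents": dollars_to_cents(amount)}
    if note:
        # note is optional in the schema, so it is only sent when provided.
        args["note"] = note
    return args


request_args = build_request_args("@example-user", "12.50", note="Dinner split")
```

Passing the amount as a string keeps the input exact; converting a binary float such as 19.99 to cents by multiplication can be off by one cent after truncation.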

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description solely carries the burden of behavioral disclosure. It correctly notes that no money leaves the account (safe operation) and includes a risk label. However, it omits other behaviors like reversibility, limits, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence with an informative prefix ('venmo · risk:low'). Every word serves a purpose, and it is front-loaded with key context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with 3 parameters and no output schema, the description covers the core action and a critical constraint. It could mention that the request is sent to the recipient's Venmo account, but overall it is sufficient for the low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the parameter descriptions are clear. The tool description does not add new meaning beyond what the schema already provides, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Request a payment'), the resource (a contact via Venmo), and a key distinguishing trait ('no money leaves your account'). It also includes a risk label, making the purpose highly specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not explicitly state when to use this tool versus alternatives, nor does it mention prerequisites (e.g., needing a Venmo account). The purpose implies it's for requesting payments, but no contextual guidance is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

wallet__boardingpass_showCInspect

[wallet · risk:low] Pull up a boarding pass stored in Apple Wallet

ParametersJSON Schema

Name	Required	Description	Default
`pass_name`	No	Name or description of the pass to display
`flight_number`	No	Flight number associated with the pass
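
As a rough illustration of how the two optional parameters above might be passed, the following sketch builds an MCP JSON-RPC `tools/call` request for this tool. The helper name `build_tool_call` and the example flight number are hypothetical. The listing does not document what the server returns or how it resolves a call that supplies both, one, or neither parameter.

```python
import json


def build_tool_call(pass_name=None, flight_number=None, request_id=1):
    """Build an MCP JSON-RPC `tools/call` request for wallet__boardingpass_show.

    Hypothetical helper: only the tool name and the two parameter names
    come from the listing; everything else is illustrative.
    """
    # Both parameters are optional per the table, so include only those given.
    arguments = {}
    if pass_name is not None:
        arguments["pass_name"] = pass_name
    if flight_number is not None:
        arguments["flight_number"] = flight_number

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "wallet__boardingpass_show",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    # "DL123" is a made-up flight number used purely as an example.
    print(json.dumps(build_tool_call(flight_number="DL123"), indent=2))
```

Omitting unset parameters, rather than sending them as null, keeps the request within what the schema describes, since the table lists no defaults.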

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description says only 'pull up' alongside a cryptic '[wallet · risk:low]' tag. It does not disclose side effects, required permissions, or what happens when the pass is not found.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with minimal waste. The prefix '[wallet · risk:low]' is extra but does not detract significantly. Purpose is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

There is no output schema and there are no annotations, but the tool is simple, with two optional parameters. The description omits the return value and error handling, making it barely adequate for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers both parameters with descriptions (100% coverage). Description adds no additional meaning beyond schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool pulls up a boarding pass from Apple Wallet, using a specific verb and resource. Naming Apple Wallet sets it apart from siblings such as delta__boardingpass_show, though the description never contrasts the two explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives such as delta__boardingpass_show or other flight-related tools. Missing context on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
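
To make the "use X instead of Y when Z" advice concrete, here is a sketch of how the description for wallet__boardingpass_show might be expanded to address the Behavior and Usage Guidelines gaps noted above. Every behavioral claim in the revised text (read-only access, the not-found response, the split with delta__boardingpass_show) is an assumption made for illustration and would need to be confirmed by the server author. Only the tool names, parameter names, and parameter descriptions come from the listing.

```python
# Hypothetical revised tool definition, using the MCP field names
# `name`, `description`, and `inputSchema`.
REVISED_TOOL = {
    "name": "wallet__boardingpass_show",
    "description": (
        "[wallet · risk:low] Display a boarding pass stored in Apple Wallet. "
        # Assumption: the tool only reads the pass and changes nothing.
        "Read-only: the pass is not modified or removed. "
        # Assumption: how the two optional parameters interact.
        "Provide pass_name or flight_number to select a pass. "
        # Assumption: behavior when no pass matches.
        "Returns an error if no matching pass is found. "
        # Assumption: division of labor with the sibling tool.
        "Use delta__boardingpass_show instead when the pass has not "
        "been added to Apple Wallet."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "pass_name": {
                "type": "string",
                "description": "Name or description of the pass to display",
            },
            "flight_number": {
                "type": "string",
                "description": "Flight number associated with the pass",
            },
        },
        # Neither parameter is required, matching the parameter table.
    },
}

if __name__ == "__main__":
    print(REVISED_TOOL["description"])
```

The revised text stays within a few sentences, so it should not cost much on the Conciseness dimension while giving an agent a reason to pick one boarding-pass tool over the other.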

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
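
A minimal sketch of generating that file, assuming a static site whose web root is a local directory. The helper names `build_claim_file` and `write_claim_file` and the basic email check are illustrative and not part of any Glama tooling; only the `$schema` URL and the `maintainers` structure come from the snippet above.

```python
import json
from pathlib import Path

SCHEMA_URL = "https://glama.ai/mcp/schemas/connector.json"


def build_claim_file(emails):
    """Return the glama.json payload for the given maintainer emails."""
    if not emails:
        raise ValueError("at least one maintainer email is required")
    for email in emails:
        # Deliberately loose sanity check, not full address validation.
        if "@" not in email:
            raise ValueError(f"not an email address: {email!r}")
    return {
        "$schema": SCHEMA_URL,
        "maintainers": [{"email": email} for email in emails],
    }


def write_claim_file(web_root, emails):
    """Write <web_root>/.well-known/glama.json and return its path."""
    target = Path(web_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = json.dumps(build_claim_file(emails), indent=2) + "\n"
    target.write_text(payload, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Replace with the email associated with your Glama account.
    print(json.dumps(build_claim_file(["your-email@example.com"]), indent=2))
```

The file then has to be served from the server's domain at /.well-known/glama.json. How that is deployed depends on the hosting setup.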
