DataNexus MCP
Server Details
Public data intelligence for AI agents — CVE, compliance, patents, contracts, domains.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.6/5 across 55 of 55 tools scored. Lowest: 3.6/5.
Tools are grouped into clear domain prefixes (compliance, domain, frontend_security, etc.) with distinct purposes. Minor overlap exists between frontend_security_detect_typosquatting and security_detect_typosquatting, but descriptions clarify the different scope.
Most tools follow a consistent verb_noun pattern with snake_case. Irregularities like 'fetch' vs 'audit' and two 'detect_typosquatting' tools exist, but overall naming is predictable within domains.
55 tools is high for a single server given the breadth of domains. Some redundancy (e.g., two typosquatting tools) suggests possible trimming, but the count is justified by the wide coverage.
The tool surface covers key operations across domains like compliance, domain, security, legal, and nonprofit. Minor gaps exist, such as limited frontend audit beyond package.json and no general-purpose code scanning.
Available Tools
55 tools
apikeys_generate_api_key (A)
Generate a DataNexus API key for the given email address. Anonymous callers get 10 free lookups/week; a registered free key unlocks 100/week. Store the returned key — it is shown only once. Pass it as the X-Api-Key header on future requests. Rate limit: 3 keys per IP per 24 hours.
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | Email address to associate with the new API key. Used for delivery and repeat-signup lookup. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
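As a rough illustration of how an agent might invoke this tool, the sketch below builds a `tools/call` request body in the JSON-RPC 2.0 shape used by the Model Context Protocol. The helper name `build_tool_call` and the email address are placeholders introduced here, and the transport details (server URL, session handling) are left out because this page does not show them.

```python
import json
from itertools import count

# Simple increasing ids for JSON-RPC requests (an illustrative choice).
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The email below is a placeholder, not a real account.
request = build_tool_call(
    "apikeys_generate_api_key",
    {"email": "agent-owner@example.com"},
)
print(json.dumps(request, indent=2))
```

Because the returned key is shown only once, a caller would want to persist the response before doing anything else with it.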
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds behavioral context beyond annotations: key is shown only once, different lookup limits, and rate limit of 3 keys per IP per 24 hours. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each providing distinct value: purpose, limits, storage instruction, and rate limit. No redundant or unnecessary content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple single-parameter tool with an output schema, the description covers purpose, behavioral traits, rate limits, and usage context thoroughly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The parameter 'email' is already well-described in the input schema (100% coverage). Description does not add further meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it generates a DataNexus API key for a given email address. It distinguishes from sibling tools (revoke, rotate) which handle key lifecycle differently.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the consequence of anonymous vs registered usage and rate limits, implying this is the initial key generation step. However, it does not explicitly contrast with sibling tools, though the sibling names make it clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
apikeys_revoke_api_key (A, Destructive, Idempotent)
⚠️ DESTRUCTIVE — requires human confirmation before use in automated pipelines. Permanently revoke a DataNexus API key. The key will stop working immediately. This action cannot be undone — generate a new key if access is needed again.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | API key (dnx_...) to permanently revoke. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
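The description asks for human confirmation before this tool runs in an automated pipeline. One way a client could enforce that is a small guard in front of the call, sketched below. The names `guard_tool_call`, `ConfirmationRequired`, and `confirmed_by` are hypothetical and not part of DataNexus; the set of guarded tools simply lists the two API key tools whose descriptions carry the DESTRUCTIVE warning.

```python
from typing import Optional

# Tools on this server whose descriptions carry the DESTRUCTIVE warning.
DESTRUCTIVE_TOOLS = {"apikeys_revoke_api_key", "apikeys_rotate_api_key"}


class ConfirmationRequired(Exception):
    """Raised when a destructive tool is requested without human sign-off."""


def guard_tool_call(
    tool_name: str, arguments: dict, confirmed_by: Optional[str] = None
) -> dict:
    """Return call params, refusing destructive tools that lack a confirmer."""
    if tool_name in DESTRUCTIVE_TOOLS and not confirmed_by:
        raise ConfirmationRequired(
            f"{tool_name} is destructive; a human must confirm it first"
        )
    return {"name": tool_name, "arguments": arguments}


# Usage: the key value is a placeholder in the documented dnx_... format.
params = guard_tool_call(
    "apikeys_revoke_api_key", {"key": "dnx_example"}, confirmed_by="alice"
)
```

Recording who confirmed the call (rather than a bare boolean) is a design choice that makes the audit trail more useful.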
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already indicate destructiveHint=true and idempotentHint=true. The description adds specific behavioral details: the key stops working immediately and the action is irreversible. It also mentions the need for human confirmation, which goes beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of three short sentences that front-load the most critical information (destructive warning). Every sentence adds value without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the tool has a single parameter with 100% schema coverage, an output schema, and annotations that cover destructive and idempotent hints, the description is complete. It explains the effect, irreversibility, and safety requirements without missing essential context.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has only one parameter 'key', which is fully described in the input schema (100% coverage). The description does not add any additional semantics about the parameter beyond what the schema provides, so it meets the baseline of 3.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: permanently revoke a DataNexus API key. It uses strong verbs like 'revoke', 'will stop working', and 'cannot be undone'. It distinguishes from sibling tools like apikeys_generate_api_key by implicitly contrasting with generating a new key.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes a strong warning that human confirmation is required before use in automated pipelines, and it emphasizes irreversibility. However, it does not explicitly mention when to use an alternative like rotate instead of revoke, so it falls short of perfect guidance.
apikeys_rotate_api_key (A, Destructive)
⚠️ DESTRUCTIVE — requires human confirmation before use in automated pipelines. Revoke the current API key and issue a replacement. Returns the new key once — store it immediately. Pass keys as the X-DataNexus-Key header.
| Name | Required | Description | Default |
|---|---|---|---|
| current_key | Yes | Existing active API key (dnx_...) to revoke and replace. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the destructiveHint annotation, the description adds crucial behavioral context: requires human confirmation, returns the new key only once, and specifies the header for passing keys. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise: two sentences plus a warning prefix. Each sentence carries essential information without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter) and presence of an output schema, the description covers all necessary aspects: destructive nature, usage context, and output behavior.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already describes the 'current_key' parameter with 100% coverage. The description adds value by instructing to pass keys via the X-DataNexus-Key header and reinforcing the key format (dnx_...).
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it revokes and replaces an API key, distinguishing it from siblings like apikeys_revoke_api_key (revokes only) and apikeys_generate_api_key (creates new). The verb 'rotate' is well-defined.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly warns about human confirmation and immediate storage of the new key, indicating when to use the tool (a key needs rotating) and when not to (the key only needs revoking). It could name the sibling tools explicitly, but the context is clear.
compliance_check_sam_exclusion (A, Read-only, Idempotent)
Check whether an entity is on the US federal exclusions list (debarred from government contracts). Read-only. No side effects. Idempotent. US only. name_or_ein: Entity name or 9-digit EIN with or without dash e.g. Acme Corp or 13-1234567. Required. Name match is fuzzy — verify EIN for exact results. Returns excluded: true/false, exclusion type, and exclusion dates if found. Use this before awarding federal contracts or grants. Use govcon_search_contract_awards instead to find what contracts an entity has won. Verified source: SAM.gov. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="compliance_check_sam_exclusion", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| name_or_ein | Yes | Entity name or EIN to check SAM exclusions. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
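Since `name_or_ein` accepts either a name or a 9-digit EIN with or without a dash, a client may want to classify the input before calling, so it knows whether to expect a fuzzy or an exact match. The sketch below is one way to do that; `normalize_name_or_ein` is a hypothetical helper, and emitting the dashed `NN-NNNNNNN` form is an assumption made here for readability (the tool is documented as accepting both forms).

```python
import re
from typing import Tuple

# Two digits, an optional dash, then seven digits: a 9-digit EIN.
_EIN_PATTERN = re.compile(r"^(\d{2})-?(\d{7})$")


def normalize_name_or_ein(value: str) -> Tuple[str, str]:
    """Classify input as ("ein", "NN-NNNNNNN") or ("name", trimmed name)."""
    text = value.strip()
    if not text:
        raise ValueError("name_or_ein is required")
    match = _EIN_PATTERN.match(text)
    if match:
        # Exact-match path: the description recommends EIN for exact results.
        return "ein", f"{match.group(1)}-{match.group(2)}"
    # Anything else is treated as a name, which the tool matches fuzzily.
    return "name", text


kind, value = normalize_name_or_ein("131234567")
```

A caller could use the returned kind to decide whether a positive result from a name search should be re-checked with the EIN.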
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds beyond annotations: 'Read-only. No side effects. Idempotent. US only. Verified source: SAM.gov. 24-hour cache.' Also describes fuzzy matching and exact via EIN. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with clear sections, but slightly verbose. Every sentence adds value, but could be more compact. Still above average.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple check tool with 1 parameter, description fully covers purpose, input, output, usage, source, cache, and feedback. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds examples and details: fuzzy match vs exact via EIN, format with dash, which is valuable beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'check' and resource 'SAM exclusion list'. Explicitly distinguishes from sibling govcon_search_contract_awards by stating to use that instead for finding contracts.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'before awarding federal contracts or grants'. Also provides alternative and context: 'US only'.
compliance_fetch_finra_broker (A, Read-only, Idempotent)
Fetch FINRA BrokerCheck registration for a US broker or investment adviser by CRD number. Read-only. No side effects. Idempotent. US only. crd_number: Central Registration Depository number as a string of digits e.g. 1234567. Required. CRD number only — name lookup is not supported. Returns registration status, qualifications, disclosure history, and employment history. Use this when you have the CRD number. Use compliance_search_npi_by_name instead for healthcare providers, not financial advisers. Verified source: FINRA BrokerCheck. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="compliance_fetch_finra_broker", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| crd_number | Yes | FINRA CRD number e.g. 149777. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
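The description ends with a fallback: when a result does not serve the user's need, call `report_feedback` with four named fields. A minimal sketch of assembling those arguments follows. The field names and the `agent_gap` value come from the description; the helper name `build_feedback_arguments` and the non-empty checks are assumptions added for illustration.

```python
def build_feedback_arguments(
    tool_id: str, intended_query: str, gap_description: str
) -> dict:
    """Assemble arguments for the report_feedback tool named in the description."""
    fields = {
        "tool_id": tool_id,
        "intended_query": intended_query,
        "gap_description": gap_description,
    }
    for field_name, value in fields.items():
        # Assumption: empty feedback is not useful, so reject it client-side.
        if not value.strip():
            raise ValueError(f"{field_name} must not be empty")
    return {"feedback_type": "agent_gap", **fields}


# Example values are invented for illustration.
feedback = build_feedback_arguments(
    tool_id="compliance_fetch_finra_broker",
    intended_query="Look up a broker by name",
    gap_description="Tool only accepts a CRD number; name lookup is unsupported",
)
```

The same helper would work for any tool on this server that includes the feedback instruction, since only `tool_id` changes.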
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, idempotentHint, and destructiveHint. The description adds context: 'Read-only. No side effects. Idempotent. US only. 24-hour cache.' and lists specific return fields. It does not contradict annotations. It could mention the openWorldHint (additional fields may appear), but it still adds value beyond the structured fields.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but each sentence contributes distinct info: purpose, properties, usage, alternatives, source/cache, fallback. It is front-loaded and concise given the amount of content. Could be slightly more structured (e.g., bullet points) but is effective.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter, annotations covering safety/idempotency, and an output schema present, the description covers: purpose, input format, usage guidance, alternatives, source credibility, cache, and fallback. No major gaps; it is fully adequate for an AI agent to select and invoke correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description adds 'Central Registration Depository number as a string of digits e.g. 1234567. Required. CRD number only — name lookup is not supported.' This provides format, example, and constraints, adding meaning beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (fetch FINRA BrokerCheck registration), the resource (FINRA BrokerCheck for US broker or investment adviser), and the input (CRD number). It distinguishes from sibling tools like compliance_search_npi_by_name, which is for healthcare providers. This meets the standard for a specific verb+resource+scope.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('Use this when you have the CRD number') and when not to, with a named alternative ('Use compliance_search_npi_by_name instead for healthcare providers'). Also provides a fallback via report_feedback if the tool fails. This is comprehensive guidance.
compliance_fetch_npi_provider (A, Read-only, Idempotent)
Fetch NPI registration details for a US healthcare provider by NPI number. Read-only. No side effects. Idempotent. US only. npi_number: 10-digit NPI number e.g. 1003000126. Required. Do not include dashes or spaces. Returns provider name, credential type, speciality taxonomy, practice address, and active status. Use this when you have the exact 10-digit NPI. Use compliance_search_npi_by_name instead when you only have the provider name. Verified source: NPPES NPI Registry (CMS). 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="compliance_fetch_npi_provider", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| npi_number | Yes | 10-digit NPI number e.g. 1003000126. No dashes. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
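Because `npi_number` must be exactly 10 digits with no dashes or spaces, a client can clean user input before calling. The following is a small sketch of that step; `clean_npi` is a hypothetical helper, and stripping separators (rather than rejecting them outright) is a convenience assumed here.

```python
import re


def clean_npi(raw: str) -> str:
    """Strip spaces and dashes, then require exactly 10 digits."""
    digits = re.sub(r"[\s-]", "", raw)
    if not re.fullmatch(r"\d{10}", digits):
        raise ValueError("npi_number must be exactly 10 digits")
    return digits


# The example NPI is the one used in the tool description.
arguments = {"npi_number": clean_npi("1003 000 126")}
```

If the input cannot be reduced to 10 digits, the description points to `compliance_search_npi_by_name` as the alternative.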
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, idempotentHint, openWorldHint, destructiveHint=false. Description adds no side effects, 24-hour cache, verified source (NPPES), and feedback mechanism. Adds value beyond annotations without contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph with front-loaded purpose. Every sentence provides necessary guidance: requirement, alternatives, caching, feedback. No redundant or missing information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given one required parameter, output schema present, and annotations, the description covers source, caching, feedback, parameter format, return fields, and usage alternatives. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. Description reinforces parameter details: '10-digit NPI number e.g. 1003000126. Required. Do not include dashes or spaces.' Adds example and clarification on formatting, slightly exceeding schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Fetch NPI registration details for a US healthcare provider by NPI number.' Specific verb (fetch), resource (NPI registration details), and constraints (US, read-only, idempotent). Distinguishes from sibling tool compliance_search_npi_by_name.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use: 'Use this when you have the exact 10-digit NPI. Use compliance_search_npi_by_name instead when you only have the provider name.' Also includes caching and feedback fallback instructions.
compliance_search_npi_by_name (A, Read-only, Idempotent)
Search the NPPES NPI Registry by provider name with optional state and speciality filters. Read-only. No side effects. Idempotent. US only. Returns up to 10 matches. name: Full or partial provider name. Required. state: Two-letter US state code e.g. CA. Optional. speciality: Speciality keyword e.g. Cardiology. Optional. Returns NPI number, name, speciality, and address for each match. Use this when you do not have the NPI number. Use compliance_fetch_npi_provider instead when you have the exact 10-digit NPI. Verified source: NPPES NPI Registry (CMS). 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="compliance_search_npi_by_name", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Full or partial provider name. Required. | |
| state | No | Two-letter US state code e.g. CA. Optional. | |
| speciality | No | Speciality keyword e.g. Cardiology. Optional. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
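With one required and two optional parameters, the arguments for this tool are easy to get subtly wrong (for example, sending an empty `state`). The sketch below assembles them and omits unset filters. `build_npi_search_arguments` is a hypothetical helper, and upper-casing the state code is an assumption; the page only says the value is a two-letter US state code.

```python
import re
from typing import Optional


def build_npi_search_arguments(
    name: str, state: Optional[str] = None, speciality: Optional[str] = None
) -> dict:
    """Build arguments for compliance_search_npi_by_name, omitting unset filters."""
    cleaned_name = name.strip()
    if not cleaned_name:
        raise ValueError("name is required")
    arguments = {"name": cleaned_name}
    if state is not None:
        code = state.strip().upper()  # assumption: normalise to upper case
        if not re.fullmatch(r"[A-Z]{2}", code):
            raise ValueError("state must be a two-letter code such as CA")
        arguments["state"] = code
    if speciality and speciality.strip():
        arguments["speciality"] = speciality.strip()
    return arguments


# Example values are invented for illustration.
search = build_npi_search_arguments("Smith", state="ca", speciality="Cardiology")
```

Since the tool returns at most 10 matches, adding the state or speciality filter is the main way to narrow a common surname.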
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations, description adds read-only, idempotent, US-only, 24-hour cache, verified source, and a fallback feedback mechanism. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured and front-loaded, but slightly long due to unnecessary repetition of parameter info and feedback instructions. Could be trimmed slightly.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given simple 3-parameter tool, schema coverage 100%, output schema, and rich annotations, the description covers all relevant aspects including source, cache, and fallback.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, description repeats parameter details but adds usage context. Baseline 3 is appropriate; no additional parameter-specific insights beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches the NPPES NPI Registry by provider name with optional filters, and distinguishes it from compliance_fetch_npi_provider.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use this tool (when NPI number is unknown) and when to use the sibling (when NPI is known). Also mentions US-only scope and up to 10 matches.
domain_check_email_security (A, Read-only, Idempotent)
Check SPF, DMARC, and DKIM email authentication for a domain.
domain: Domain without protocol e.g. "google.com".
Returns: overall_grade (A–F), spf_score, dmarc_score, dkim_score (each 0–10), spf_record, dmarc_record, dkim_selectors_found. Scores reflect live DNS via Cloudflare DoH — no cache.
SPF: -all=10 (strict), ~all=7, ?all=4, none=2, +all=0 (open relay). DMARC: p=reject=10, p=quarantine=7, p=none=4, absent=0; +1 for rua set. DKIM: selector found=10, none=0. Checks 10 common selectors in parallel.
Example: check_email_security(domain="google.com")
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. google.com. Required. |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
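The scoring rules in the description can be read as a small lookup, which the sketch below reproduces to make the numbers concrete. It is not the server's implementation. Several points are assumptions: a missing SPF record and a record with no `all` term both score 2 (the description only says "none=2"), the DMARC score is capped at 10 because scores are stated as 0 to 10, and a bare `all` is scored as `+all` because `+` is the default SPF qualifier. The mapping from scores to `overall_grade` is not documented, so it is omitted.

```python
from typing import List, Optional

# Scores taken from the tool description.
SPF_SCORES = {"-all": 10, "~all": 7, "?all": 4, "+all": 0, "all": 0}
DMARC_POLICY_SCORES = {"reject": 10, "quarantine": 7, "none": 4}


def score_spf(record: Optional[str]) -> int:
    """Score an SPF record by its `all` term; 2 when none is found."""
    if record:
        for term in record.lower().split():
            if term in SPF_SCORES:
                return SPF_SCORES[term]
    return 2  # assumption: covers both "no record" and "no all term"


def score_dmarc(record: Optional[str]) -> int:
    """Score a DMARC record by its policy, plus 1 when rua is set."""
    if not record:
        return 0
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    score = DMARC_POLICY_SCORES.get(tags.get("p", "").lower(), 0)
    if tags.get("rua"):
        score += 1
    return min(score, 10)  # assumption: keep within the stated 0-10 range


def score_dkim(selectors_found: List[str]) -> int:
    """10 when at least one DKIM selector was found, else 0."""
    return 10 if selectors_found else 0


# Example records are invented for illustration.
spf = score_spf("v=spf1 include:_spf.example.com ~all")
dmarc = score_dmarc("v=DMARC1; p=quarantine; rua=mailto:reports@example.com")
dkim = score_dkim([])
```

Under these rules, a domain with a soft-fail SPF record, a quarantine policy with reporting, and no discoverable DKIM selector would score 7, 8, and 0 respectively.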
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds value beyond annotations by detailing live DNS via Cloudflare DoH with no cache, and explains the scoring criteria (e.g., SPF -all=10, ~all=7). Annotations already indicate readOnly, idempotent, non-destructive, so the description provides additional behavioral context.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear purpose, parameter explanation, return fields, and scoring details. It is concise but informative; the example at the end is slightly redundant but not harmful.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has annotations and an output schema, the description adequately explains the return fields and scoring. It covers the necessary context for a domain email security check, though it could mention authorization or rate limits.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (domain parameter described). The description adds 'Domain without protocol e.g. google.com' which largely duplicates the schema. No extra meaning beyond the schema, so baseline score of 3 applies.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool checks SPF, DMARC, and DKIM email authentication for a domain. The verb 'Check' and the specific resource (email authentication records) are precise, and the purpose distinguishes it from similar tools like domain_fetch_dns_records.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when email security info is needed but lacks explicit guidance on when to use this tool versus alternative sibling tools (e.g., domain_fetch_dns_records). No when-not or alternative recommendations are provided.
domain_fetch_dns_records (A, Read-only, Idempotent)
Fetch current DNS records for a domain via Cloudflare DNS over HTTPS. Read-only. No side effects. Idempotent. domain: Domain name without protocol e.g. cloudflare.com. Required. record_types: List of DNS record types to fetch. Required. Valid values: A, AAAA, MX, TXT, NS, CNAME, SOA. Example: ["A", "MX", "TXT"]. Returns all matching records currently in effect. Use this when you need live DNS resolution. Use domain_fetch_domain_rdap instead when you need registration metadata not DNS records. Verified source: Cloudflare DoH. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_dns_records", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. anthropic.com. Required. | |
| record_types | Yes | DNS record types e.g. ['A','MX','TXT']. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
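The description lists seven valid record types and asks for a domain without protocol, so both can be checked before the call is made. The sketch below shows one way to do so. `build_dns_arguments` is a hypothetical helper, and lower-casing the domain and upper-casing the record types are normalisation choices assumed here.

```python
from typing import List

# Valid values as listed in the tool description.
VALID_RECORD_TYPES = ("A", "AAAA", "MX", "TXT", "NS", "CNAME", "SOA")


def build_dns_arguments(domain: str, record_types: List[str]) -> dict:
    """Validate inputs for domain_fetch_dns_records and return its arguments."""
    host = domain.strip().lower()
    if not host:
        raise ValueError("domain is required")
    if "://" in host or "/" in host:
        raise ValueError("pass the bare domain, without protocol or path")
    normalized = [record_type.strip().upper() for record_type in record_types]
    if not normalized:
        raise ValueError("record_types must list at least one type")
    unknown = [t for t in normalized if t not in VALID_RECORD_TYPES]
    if unknown:
        raise ValueError(f"unsupported record types: {unknown}")
    return {"domain": host, "record_types": normalized}


dns_arguments = build_dns_arguments("cloudflare.com", ["a", "MX", "TXT"])
```

Rejecting unsupported types client-side avoids spending a lookup on a request the server is documented not to serve.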
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds 'Read-only. No side effects. Idempotent.' and discloses a 4-hour cache, which adds value beyond annotations. No contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is compact and front-loaded with purpose. Each sentence serves a purpose, though the feedback mechanism sentence could be integrated elsewhere.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple fetch operation, full schema coverage, presence of output schema, and rich annotations, the description is complete. It includes cache detail and fallback feedback instruction.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% but schema's record_types property lacks items type. Description adds valid values ('A, AAAA, MX, TXT, NS, CNAME, SOA') and example. Domain parameter description adds 'without protocol'. This adds meaningful detail.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Fetch', resource 'current DNS records', and context 'via Cloudflare DNS over HTTPS'. It distinguishes from sibling tool domain_fetch_domain_rdap by specifying registration metadata vs DNS records.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this when you need live DNS resolution' and 'Use domain_fetch_domain_rdap instead when you need registration metadata not DNS records'. Also provides fallback feedback mechanism.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
domain_fetch_domain_history (A, Read-only, Idempotent)
Fetch historical SSL certificate issuance for a domain from Certificate Transparency logs. Read-only. No side effects. Idempotent. domain: Domain name without protocol e.g. example.com. Required. Returns all past certificates with issuer, validity dates, and SANs in reverse chronological order. Use this to detect domain hijacking or audit unexpected historical certificate issuance. Use domain_fetch_ssl_certificate_chain instead when you only need the current certificate chain. Verified source: crt.sh Certificate Transparency. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_domain_history", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. example.com. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
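The description above ends with the same `report_feedback` template used across this server's tools. A minimal sketch of filling in that template follows; the four field names come from the description, while the helper name `build_feedback_arguments` and the non-empty check are assumptions added for illustration.

```python
def build_feedback_arguments(tool_id, intended_query, gap_description):
    """Fill the report_feedback template quoted in the tool descriptions.

    Hypothetical helper: only the four field names are taken from the source.
    """
    fields = {
        "tool_id": tool_id,
        "intended_query": intended_query,
        "gap_description": gap_description,
    }
    for name, value in fields.items():
        if not value or not value.strip():
            raise ValueError(f"{name} must not be empty")
    return {"feedback_type": "agent_gap", **fields}


if __name__ == "__main__":
    print(
        build_feedback_arguments(
            tool_id="domain_fetch_domain_history",
            intended_query="certificates issued for example.com in the last 30 days",
            gap_description="result had no way to filter by issuance date",
        )
    )
```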
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, destructiveHint. Description adds 'Read-only. No side effects. Idempotent.' and mentions 4-hour cache and source (crt.sh), providing useful context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is clear and structured, but could be slightly more concise. Front-loaded with main action and properties.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given one parameter, rich annotations, and output schema, the description covers purpose, usage, parameter, source, cache, and error handling via report_feedback. Very complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and description restates the parameter definition without adding new meaning. No additional semantics beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states verb 'Fetch' and resource 'historical SSL certificate issuance for a domain', and distinguishes from sibling 'domain_fetch_ssl_certificate_chain' by specifying when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly describes use cases ('detect domain hijacking or audit unexpected historical certificate issuance') and provides alternative ('Use domain_fetch_ssl_certificate_chain instead for current chain'). Also instructs on feedback for gaps.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
domain_fetch_domain_rdap (A, Read-only, Idempotent)
Fetch domain registration details via IANA RDAP (the modern structured replacement for WHOIS). Read-only. No side effects. Idempotent. domain: Domain name without protocol e.g. example.com not https://example.com. Required. Returns registrar, registration date, expiry date, nameservers, and registrant info where publicly available. Use this when you need registration metadata. Use domain_fetch_ssl_certificate_chain instead when you need certificate history. Use domain_fetch_dns_records instead when you need live DNS resolution. Verified source: IANA RDAP. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_domain_rdap", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. example.com. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
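The description asks for `example.com`, not `https://example.com`. A caller that receives URLs from users could normalise them first. The sketch below is one way to do that with the standard library; the function name `normalize_domain` is hypothetical, and stripping ports and paths is an assumption about what the tool expects.

```python
from urllib.parse import urlsplit


def normalize_domain(value):
    """Reduce a URL or host string to a bare lowercase domain (illustrative helper)."""
    value = value.strip()
    # urlsplit only recognises the host when a scheme or a leading "//" is present.
    candidate = value if "://" in value else "//" + value
    host = urlsplit(candidate).hostname
    if not host:
        raise ValueError(f"could not find a domain in {value!r}")
    return host.rstrip(".")


if __name__ == "__main__":
    print(normalize_domain("https://Example.com/about"))
```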
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, destructiveHint. Description adds 'No side effects', 'Idempotent', '4-hour cache', and 'Verified source: IANA RDAP', providing additional context beyond structured fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Each sentence provides necessary information: purpose, nature, parameter, returns, usage guidance, cache, and fallback instructions. Slightly long but every part earns its place, with front-loaded core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 parameter, output schema present), the description is thorough. It covers purpose, behavior, usage, return data, cache, and even error handling via report_feedback, leaving no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter (domain) with 100% schema coverage. The description's guidance ('e.g. example.com not https://example.com') adds a bit more clarity than the schema's description, so it earns above the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches domain registration details via IANA RDAP, and specifies the returned data (registrar, dates, nameservers, registrant info). It distinguishes itself from sibling tools by naming alternatives and their different purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use ('Use this when you need registration metadata') and when not to, providing specific alternative tools (domain_fetch_ssl_certificate_chain, domain_fetch_dns_records). Also notes read-only, idempotent, and cache behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
domain_fetch_reverse_ip (A, Read-only, Idempotent)
Find domains co-hosted on the same IP address (reverse IP lookup). Read-only. No side effects. Idempotent. domain_or_ip: Domain name (e.g. shared.dreamhost.com) or IPv4 address (e.g. 1.2.3.4). Required. If a domain is given, it is first resolved to its IPv4 A record. IPv6-only domains are not supported. Returns list of co-hosted domains on the same IP. Useful for identifying shared hosting risk and mapping corporate infrastructure. Daily quota guard: 100 calls/day free tier. Verified source: HackerTarget API. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_reverse_ip", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain_or_ip | Yes | Domain e.g. shared.dreamhost.com or IPv4 e.g. 1.2.3.4. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
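Because `domain_or_ip` accepts either a domain or an IPv4 address, a caller may want to classify the input before spending one of the 100 daily calls. The sketch below shows one way to do so with the standard `ipaddress` module. Rejecting IPv6 literals up front is an assumption based on the description naming only IPv4; the function name `classify_target` is hypothetical.

```python
import ipaddress


def classify_target(domain_or_ip):
    """Return "ipv4" or "domain" for a reverse-IP lookup input (illustrative helper)."""
    value = domain_or_ip.strip()
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        # Not an IP literal, so treat it as a domain the tool will resolve to an A record.
        return "domain"
    if address.version == 6:
        raise ValueError("IPv6 is not supported; pass an IPv4 address or a domain")
    return "ipv4"


if __name__ == "__main__":
    for target in ("1.2.3.4", "shared.dreamhost.com"):
        print(target, "->", classify_target(target))
```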
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes read-only, idempotent, no side effects, daily quota, caching, source, and domain resolution behavior. Adds significant context beyond annotations like readOnlyHint and idempotentHint.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with front-loaded purpose, but includes extra details like feedback instructions. Each sentence adds value, though slightly lengthy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers input, behavior, limitations, quota, source, cache, and feedback mechanism. With output schema present, return values are adequately summarized. Very complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% but description adds resolution behavior (domain to IPv4) and example formats, providing meaning beyond the schema's description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Find domains co-hosted on the same IP address (reverse IP lookup).' Uses specific verb and resource, distinguishing it from sibling tools like domain_fetch_dns_records or domain_fetch_subdomains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides context for use cases (shared hosting risk, infrastructure mapping) and input constraints (IPv4 only, domain resolution behavior). Lacks explicit comparison to alternatives or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
domain_fetch_ssl_certificate_chain (A, Read-only, Idempotent)
Fetch SSL certificate history for a domain from Certificate Transparency logs. Read-only. No side effects. Idempotent. domain: Domain name without protocol e.g. github.com. Required. Does not support IP addresses or wildcard domains. Returns issuer, subject, validity period, and Subject Alternative Names for each logged cert. Use this to detect unexpected certificate issuance or audit certificate history. Use domain_fetch_domain_rdap instead when you need registration data not certificate data. Verified source: crt.sh Certificate Transparency. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_ssl_certificate_chain", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. github.com. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, idempotentHint, etc.), the description adds key behavioral traits: no support for IP addresses or wildcard domains, 4-hour cache, and specific return fields (issuer, subject, validity, SANs). It reinforces read-only and idempotent nature.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is front-loaded with core purpose and includes multiple valuable details. The feedback instruction is long but contributes to usability. Could be slightly more concise, but all sentences earn their place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter, the description covers purpose, usage, constraints, return values, source, cache, and error handling. With annotations and likely output schema present, it leaves no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds extra semantics: 'Does not support IP addresses or wildcard domains' beyond the schema description. This provides useful constraint information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Fetch SSL certificate history' and the resource 'from Certificate Transparency logs'. It distinguishes from sibling tool 'domain_fetch_domain_rdap' by specifying when to use each. It also clarifies scope limitations (no IP or wildcard domains).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool: 'detect unexpected certificate issuance or audit certificate history'. Provides an alternative: 'Use domain_fetch_domain_rdap instead when you need registration data'. Also includes feedback instruction if the tool fails to serve the user's need.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
domain_fetch_subdomains (A, Read-only, Idempotent)
Enumerate subdomains for a domain via Certificate Transparency logs. Read-only. No side effects. Idempotent. domain: Domain name without protocol e.g. anthropic.com. Required. Returns deduplicated list of known subdomains. Primary source: crt.sh Certificate Transparency (free). Fallback source: RapidDNS (free, passive CT + DNS) — used automatically when crt.sh is unavailable. Response includes source field indicating which source was used. Results are cached 24h — second call returns in under 500ms. First call may be slower (8s max per source). Circuit breaker trips after 3 timeouts or 5xx errors within 600s. Verified sources: crt.sh Certificate Transparency, RapidDNS. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="domain_fetch_subdomains", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain without protocol e.g. anthropic.com. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
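The description mentions a circuit breaker that trips after 3 timeouts or 5xx errors within 600s, plus an automatic fallback from crt.sh to RapidDNS. The sketch below shows one common way such a breaker can be built, as a sliding window of failure timestamps. It is not the server's implementation: the class and function names are hypothetical, counting timeouts and 5xx errors together is one reading of the description, and tying the fallback to the breaker state is an assumption.

```python
from collections import deque


class SlidingWindowBreaker:
    """Opens when `max_failures` failures fall inside the last `window_seconds`."""

    def __init__(self, max_failures=3, window_seconds=600.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()

    def record_failure(self, now):
        """Record one timeout or 5xx response observed at time `now` (seconds)."""
        self._failures.append(now)

    def is_open(self, now):
        """Drop failures older than the window, then compare the count to the limit."""
        cutoff = now - self.window_seconds
        while self._failures and self._failures[0] <= cutoff:
            self._failures.popleft()
        return len(self._failures) >= self.max_failures


def choose_source(primary_breaker, now):
    """Use the fallback source while the primary source's breaker is open (assumed policy)."""
    return "RapidDNS" if primary_breaker.is_open(now) else "crt.sh"


if __name__ == "__main__":
    breaker = SlidingWindowBreaker()
    for timestamp in (0.0, 100.0, 200.0):
        breaker.record_failure(timestamp)
    print(choose_source(breaker, now=200.0))
    print(choose_source(breaker, now=650.0))
```

Passing `now` explicitly rather than reading the clock inside the class keeps the breaker easy to test with fixed timestamps.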
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Provides extensive behavioral details beyond annotations, including caching behavior, circuit breaker, fallback source, and deduplication, enabling agent understanding of performance and reliability.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with core purpose first, but the description is lengthy; while every sentence is relevant, it could be more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all relevant aspects: sources, caching, performance, deduplication, error handling, and feedback mechanism, making it fully contextual for decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description repeats the parameter description without adding new semantic meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses the specific verb 'Enumerate' and clearly identifies the resource (subdomains) and method (Certificate Transparency logs), distinguishing it from sibling tools like domain_fetch_dns_records.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for subdomain enumeration but does not explicitly state when to use or not use this tool compared to siblings, lacking guidance on alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
frontend_security_audit_ci_pipeline (A, Read-only, Idempotent)
Scan GitHub Actions, Vercel, or Netlify CI configs for exposed secrets, missing lockfile enforcement, and unpinned dependencies. Paste your config content — no filesystem access required. config: Raw YAML/TOML content of your CI config. Required. 500 KB max. config_type: github_actions (full check suite), vercel, or netlify (secrets only in Sprint 8). Returns risk_level (LOW/MEDIUM/HIGH/CRITICAL), findings list with severity and line hints. NOTE: ${{ secrets.FOO }} and ${{ env.FOO }} references are NOT flagged — only literal secret values. Read-only. No side effects. Idempotent. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="frontend_security_audit_ci_pipeline", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| config | Yes | Raw YAML/TOML content of your CI config. Required. 500 KB max. | |
| config_type | No | CI config type: github_actions, vercel, or netlify. Default github_actions. | github_actions |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
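The NOTE in the description draws a line between `${{ secrets.FOO }}` references, which are not flagged, and literal secret values, which are. The sketch below illustrates that distinction with a deliberately simple line-based heuristic. The regular expressions, the key-name list, and the function name `find_literal_secrets` are all assumptions for illustration; the tool's actual detection rules are not documented on this page.

```python
import re

# A value that is entirely a ${{ ... }} expression is a reference, not a literal.
EXPRESSION = re.compile(r"^\$\{\{.*\}\}$")
# Key names that commonly hold credentials (illustrative list only).
SECRET_KEY = re.compile(r"(?i)(secret|token|password|api[_-]?key)")
# A simple "key: value" YAML line with a non-empty value.
ASSIGNMENT = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_-]*)\s*:\s*(.+?)\s*$")


def find_literal_secrets(config):
    """Return (line_number, key) pairs where a secret-like key has a literal value."""
    findings = []
    for line_number, line in enumerate(config.splitlines(), start=1):
        match = ASSIGNMENT.match(line)
        if not match:
            continue
        key = match.group(1)
        value = match.group(2).strip("'\"")
        if SECRET_KEY.search(key) and value and not EXPRESSION.match(value):
            findings.append((line_number, key))
    return findings


if __name__ == "__main__":
    sample = "env:\n  API_TOKEN: ${{ secrets.API_TOKEN }}\n  DB_PASSWORD: hunter2\n"
    print(find_literal_secrets(sample))
```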
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already indicate read-only, idempotent, non-destructive behavior. The description confirms these and adds important behavioral details: what it does not flag (variable references), the output structure (risk_level, findings), and the Sprint 8 limitation for Vercel/Netlify. It provides additional context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph of about 6 sentences. It front-loads the main action and then provides details. Every sentence adds value, though some redundancy with annotations exists (e.g., 'Read-only. No side effects. Idempotent.'). It could benefit from bullet points but remains clear and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has two parameters with well-documented schemas and an output schema, the description covers purpose, usage, limitations, output fields, and fallback behavior. It also informs about the Sprint 8 limitation, making it complete for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds meaning by specifying the behavior for each config_type value: 'github_actions (full check suite)' vs 'vercel, or netlify (secrets only in Sprint 8)'. It also imposes constraints like '500 KB max' which reinforces the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans CI configs (GitHub Actions, Vercel, Netlify) for exposed secrets, missing lockfile enforcement, and unpinned dependencies. This distinguishes it from sibling tools like frontend_security_audit_manifest or other security tools focused on packages and SBOM.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use the tool ('Paste your config content — no filesystem access required') and when not to use it (it does NOT flag variable references like ${{ secrets.FOO }}). It also provides a fallback instruction to call report_feedback if the result is insufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
frontend_security_audit_manifest (A, Read-only, Idempotent)
Audit a frontend package.json for security risks — returns a single SHIP/CAUTION/BLOCK verdict with licence risks and abandonment signals. Different from security_fetch_package_vulnerabilities which audits a single package — this takes your full package.json. manifest: Contents of package.json as a string. Required. 500 KB max. lockfile: Contents of package-lock.json or yarn.lock (optional). If provided, audits pinned versions; otherwise audits semver ranges. BLOCK: any critical CVE in direct deps OR GPL-3.0 in commercial context. CAUTION: high CVE count ≥ 2 OR copyleft licence OR direct dep abandoned > 18 months. Sources: OSV.dev (CVEs), deps.dev (licences), npm registry (abandonment). Read-only. No side effects. Idempotent. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="frontend_security_audit_manifest", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| lockfile | No | Contents of package-lock.json or yarn.lock. Optional. | |
| manifest | Yes | Contents of package.json as a string. Required. 500 KB max. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
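The BLOCK and CAUTION thresholds in the description can be read as a small decision function. The sketch below encodes them in that order, with SHIP as the remaining case. The function name, its parameters, and the contents of the `COPYLEFT` set are assumptions for illustration; the description does not say which licences the tool treats as copyleft or how it determines a commercial context.

```python
# Illustrative set only: the source does not list which licences count as copyleft.
COPYLEFT = {"GPL-2.0", "GPL-3.0", "AGPL-3.0", "LGPL-3.0"}


def manifest_verdict(critical_cves, high_cves, licences, max_months_abandoned, commercial=True):
    """Apply the BLOCK / CAUTION rules quoted in the tool description (hypothetical helper).

    critical_cves and high_cves count findings in direct dependencies;
    max_months_abandoned is the longest gap since a direct dependency was updated.
    """
    licence_set = set(licences)
    # BLOCK: any critical CVE in direct deps OR GPL-3.0 in a commercial context.
    if critical_cves > 0 or (commercial and "GPL-3.0" in licence_set):
        return "BLOCK"
    # CAUTION: high CVE count >= 2 OR a copyleft licence OR abandonment > 18 months.
    if high_cves >= 2 or licence_set & COPYLEFT or max_months_abandoned > 18:
        return "CAUTION"
    return "SHIP"


if __name__ == "__main__":
    print(manifest_verdict(0, 2, ["MIT", "Apache-2.0"], max_months_abandoned=3))
```

Checking BLOCK before CAUTION means a manifest that meets both sets of conditions gets the stricter verdict.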
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, idempotentHint), describes verdict logic, data sources, and behavior based on lockfile presence. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Packs significant information but is slightly long (though each sentence is necessary). Front-loaded with purpose and verdict types.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given output schema exists, description adequately covers tool behavior, verdict criteria, sources, and feedback mechanism. No obvious gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds value beyond schema: specifies manifest size limit (500 KB), optional lockfile effect on version pinning, and clear required/optional status.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it audits a frontend package.json for security risks and returns a SHIP/CAUTION/BLOCK verdict. Distinguishes from sibling 'security_fetch_package_vulnerabilities' which audits single packages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly differentiates from a sibling tool and provides verdict criteria to guide usage. However, does not explicitly state when to avoid using it or mention prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
frontend_security_detect_typosquatting (A, Read-only, Idempotent)
Typosquatting detection optimised for the top 500 frontend packages (React, Vite, Axios, Lodash, etc.). Fewer false positives than a full npm scan. For backend packages, use security_detect_typosquatting instead. package_name: Package name to check. Required. ecosystem: npm or pypi — default npm. Uses Damerau-Levenshtein distance ≤ 2 against a curated frontend-package corpus. Returns is_likely_typosquat, closest_match, distance, and risk_level (LOW/MEDIUM/HIGH). Read-only. No side effects. Idempotent. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="frontend_security_detect_typosquatting", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ecosystem | No | Package ecosystem: npm or pypi. Default npm. | npm |
| package_name | Yes | Package name e.g. requests. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
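To make the "Damerau-Levenshtein distance ≤ 2" rule concrete, the sketch below implements the optimal string alignment variant of that distance (insertions, deletions, substitutions, and adjacent transpositions) and applies the threshold to a tiny corpus. Which variant the tool uses is not stated, so the choice here is an assumption, as are the function names and the four-package `CORPUS`. The returned keys mirror three of the fields named in the description; `risk_level` is omitted because its rules are not documented.

```python
def osa_distance(a, b):
    """Optimal string alignment distance, a restricted form of Damerau-Levenshtein."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ae" <-> "ea".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


# Hypothetical stand-in for the curated corpus of top frontend packages.
CORPUS = ["react", "vite", "axios", "lodash"]


def check_typosquat(package_name, corpus=CORPUS, max_distance=2):
    """Flag names within max_distance of a known package, excluding exact matches."""
    closest = min(corpus, key=lambda known: osa_distance(package_name, known))
    distance = osa_distance(package_name, closest)
    return {
        "is_likely_typosquat": 0 < distance <= max_distance,
        "closest_match": closest,
        "distance": distance,
    }


if __name__ == "__main__":
    print(check_typosquat("raect"))
```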
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint, idempotentHint, etc. Description adds algorithm details (Damerau-Levenshtein distance ≤2), curated corpus, and return fields, enriching beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with purpose, then usage guidance, parameters, algorithm, safety, and fallback. Each sentence serves a purpose though slightly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete coverage for a detection tool: purpose, scope, alternative tool, input syntax, algorithm, output fields, safety, and fallback. No gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. Description repeats schema info (required package_name, default ecosystem) but adds no new parameter semantics beyond examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it detects typosquatting for top 500 frontend packages. Distinguishes from sibling security_detect_typosquatting focused on backend packages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly specifies use for frontend packages only and directs to sibling for backend packages. Also includes feedback instructions for when tool response is inadequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
frontend_security_fetch_package_risk_brief · A · Read-only · Idempotent
SHIP/CAUTION/BLOCK risk brief for an npm package with frontend-specific context. Wraps security_fetch_package_risk_brief restricted to npm, and adds weekly_downloads and is_ui_component signals. package_name: npm package name. Required. version: Optional pinned version — latest resolved if omitted. Returns verdict, CVE counts, licence risk, maintainer health, weekly_downloads, is_ui_component. Use security_fetch_package_risk_brief for non-npm ecosystems. Read-only. No side effects. Idempotent. Sources: OSV.dev, deps.dev, npm registry. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="frontend_security_fetch_package_risk_brief", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| version | No | Package version e.g. 2.28.0. Optional. | |
| package_name | Yes | Package name e.g. requests. Required. | |
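
For orientation, a call to this tool over MCP is a JSON-RPC `tools/call` request whose `arguments` follow the table above. The sketch below only builds that payload; the helper name `build_risk_brief_call` is hypothetical, and transport details (endpoint URL, headers, session handling) are deliberately left out because they are not given here.

```python
import json
from typing import Optional


def build_risk_brief_call(
    package_name: str, version: Optional[str] = None, request_id: int = 1
) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` payload for the risk brief tool."""
    if not package_name:
        raise ValueError("package_name is required")
    arguments = {"package_name": package_name}
    # Leave `version` out entirely when it is not pinned, so the server
    # can resolve the latest version as the description states.
    if version is not None:
        arguments["version"] = version
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "frontend_security_fetch_package_risk_brief",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_risk_brief_call("react"), indent=2))
```

Omitting the optional key, rather than sending an empty string, avoids having to guess how the server treats blank values.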
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, idempotentHint: true, destructiveHint: false. Description reinforces these and adds context: sources (OSV.dev, deps.dev, npm registry), wrapper nature, and return fields. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with front-loaded summary. Contains all necessary info including usage, parameters, returns, sources, and feedback instruction. Slightly verbose but still concise overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description sufficiently summarizes the return fields. Covers npm-only scope, sources, and feedback mechanism. Complete for the tool's purpose.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. Description adds minor context (e.g., 'npm package name', 'pinned version — latest resolved if omitted'). Does not significantly enhance beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it provides a SHIP/CAUTION/BLOCK risk brief for npm packages with frontend-specific context. Distinguishes from sibling by specifying 'Use security_fetch_package_risk_brief for non-npm ecosystems.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use this tool vs alternatives: 'Use security_fetch_package_risk_brief for non-npm ecosystems.' Also provides guidance on calling report_feedback if the response doesn't serve the user's need.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
govcon_fetch_open_solicitations · A · Read-only · Idempotent
Fetch currently open government contract solicitations matching a keyword. Read-only. No side effects. Idempotent. keyword: Description of goods or services sought e.g. cloud computing services. Required. Encode special characters — + becomes %2B. agency: Awarding agency name. Optional, defaults to all agencies. jurisdiction: One of US, EU, or UK. Optional. Default US. Returns solicitation title, agency, response deadline, estimated value, and NAICS code. Use this when looking for active bid opportunities. Use govcon_search_contract_awards instead when you need historical awards not open solicitations. Verified source: SAM.gov + USASpending.gov. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="govcon_fetch_open_solicitations", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| agency | No | Awarding agency name. Optional, defaults to all agencies. | |
| keyword | Yes | Description of goods or services sought e.g. cloud computing. Required. | |
| jurisdiction | No | Jurisdiction: US, EU, or UK. Default US. Optional. | US |
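
The description asks callers to encode special characters in `keyword`, giving `+` to `%2B` as the example. A minimal sketch of one way to do that with the standard library follows. It assumes spaces may stay literal, since the description's own sample keyword contains plain spaces; that assumption and the helper names are not confirmed by the source.

```python
from typing import Optional
from urllib.parse import quote

ALLOWED_JURISDICTIONS = ("US", "EU", "UK")


def encode_keyword(keyword: str) -> str:
    """Percent-encode special characters, e.g. '+' becomes '%2B'.

    safe=" " keeps spaces literal (an assumption); letters, digits and
    "_.-~" are never encoded by urllib.parse.quote.
    """
    return quote(keyword, safe=" ")


def build_solicitation_args(
    keyword: str, agency: Optional[str] = None, jurisdiction: str = "US"
) -> dict:
    """Arguments for govcon_fetch_open_solicitations, per the table above."""
    if not keyword.strip():
        raise ValueError("keyword is required")
    if jurisdiction not in ALLOWED_JURISDICTIONS:
        raise ValueError(f"jurisdiction must be one of {ALLOWED_JURISDICTIONS}")
    args = {"keyword": encode_keyword(keyword), "jurisdiction": jurisdiction}
    if agency:
        args["agency"] = agency  # omitted means all agencies
    return args
```

Note that encoding an already-encoded keyword would turn `%` into `%25`, so this should be applied exactly once to the raw text.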
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnly, idempotent, and non-destructive. Description adds caching (4-hour cache) and source verification (SAM.gov + USASpending.gov), and feedback mechanism, enhancing trust and predictability.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured: purpose, traits, parameters, output, usage, alternative, source, cache, feedback. Every sentence adds value without repetition. Front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all key aspects: what, how, when, where (source), and fallback (feedback). Output schema exists, so return values are documented. Minor gap: no mention of result limits or pagination, but overall complete for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. Description adds encoding guidance for keyword (special characters) and clarifies default values and allowed values for jurisdiction, going beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Verb 'Fetch' clearly indicates retrieval, resource is 'open government contract solicitations' with keyword matching. Explicitly distinguishes from sibling 'govcon_search_contract_awards' for historical awards.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States when to use: 'looking for active bid opportunities'. Explicit alternative: 'Use govcon_search_contract_awards instead when you need historical awards'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
govcon_fetch_vendor_contract_history · A · Read-only · Idempotent
Fetch the complete federal contract award history for a specific vendor. Read-only. No side effects. Idempotent. vendor_name: Company or organisation name e.g. Booz Allen Hamilton. Required. Fuzzy match used. jurisdiction: One of US, EU, or UK. Optional. Default US. Returns total award value, top awarding agencies, contract types, and recent awards with amounts and dates. Use this when researching a specific company's government contracting history. Use govcon_search_contract_awards instead when exploring a topic area without a specific vendor. Verified source: USASpending.gov. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="govcon_fetch_vendor_contract_history", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| vendor_name | Yes | Vendor or company name to search e.g. Booz Allen Hamilton. Required. | |
| jurisdiction | No | Jurisdiction: US, EU, or UK. Default US. Optional. | US |
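
The tool descriptions on this page end with the same fallback: call `report_feedback` with `feedback_type="agent_gap"`, the `tool_id`, an `intended_query`, and a `gap_description`. The sketch below assembles those four arguments; the helper name is hypothetical, and the rejection of blank fields is an assumption rather than a documented server rule.

```python
def build_feedback_arguments(tool_id: str, intended_query: str, gap_description: str) -> dict:
    """Arguments for a report_feedback call, using the field names quoted
    in the tool descriptions."""
    fields = {
        "tool_id": tool_id,
        "intended_query": intended_query,
        "gap_description": gap_description,
    }
    for label, value in fields.items():
        # Assumption: an empty field makes the report useless, so reject it early.
        if not value.strip():
            raise ValueError(f"{label} must not be empty")
    return {"feedback_type": "agent_gap", **fields}


feedback = build_feedback_arguments(
    tool_id="govcon_fetch_vendor_contract_history",
    intended_query="Contract history for a vendor's subsidiaries",
    gap_description="Result covered only the parent company name",
)
```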
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds context beyond annotations: 'Read-only', 'No side effects', 'Idempotent', fuzzy match for vendor_name, and 4-hour cache. All consistent with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise, front-loaded with main action. Each sentence adds value (source, cache, fallback). No fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists; description explains return structure (total award value, agencies, contract types, recent awards). Includes feedback fallback. Complete for tool complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. Description adds examples for vendor_name, notes fuzzy match, and lists jurisdiction options with default. Adds value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it fetches federal contract award history for a specific vendor. Distinguishes from sibling govcon_search_contract_awards by specifying 'for a specific vendor' vs exploring a topic area.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use (researching a specific company's history) and when not to (use alternative for topic exploration). Also mentions verified source and cache behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
govcon_search_contract_awards · A · Read-only · Idempotent
Search government contract awards by keyword, agency, and date range.
keyword: Contract scope e.g. "cybersecurity software". agency: Awarding agency e.g. "Department of Defense". Optional. date_from: Earliest award date ISO 8601 e.g. "2024-01-31". Optional. jurisdiction: "US", "EU", or "UK". Default "US".
Returns: award amounts, recipient vendors, NAICS codes, award dates. Use govcon_fetch_vendor_contract_history for all contracts by a specific vendor. Use govcon_fetch_open_solicitations for active bids, not past awards. Source: USASpending.gov + SAM.gov. 4-hour cache.
Example: govcon_search_contract_awards(keyword="cybersecurity software", agency="Department of Defense")
| Name | Required | Description | Default |
|---|---|---|---|
| agency | No | Awarding agency name e.g. Department of Defense. Optional. | |
| keyword | Yes | Search terms describing the contract scope e.g. cybersecurity software. Required. | |
| date_from | No | Earliest award date ISO 8601 e.g. 2024-01-31. Optional. | |
| jurisdiction | No | Jurisdiction: US, EU, or UK. Default US. Optional. | US |
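
Since `date_from` must be an ISO 8601 date and `jurisdiction` one of three codes, a client can validate both before calling the tool. The following is a minimal sketch under those stated constraints; the helper name is hypothetical, and the choice to fail locally instead of letting the server reject the call is a design assumption.

```python
from datetime import date
from typing import Optional

ALLOWED_JURISDICTIONS = ("US", "EU", "UK")


def build_award_search_args(
    keyword: str,
    agency: Optional[str] = None,
    date_from: Optional[str] = None,
    jurisdiction: str = "US",
) -> dict:
    """Arguments for govcon_search_contract_awards, per the table above."""
    if not keyword.strip():
        raise ValueError("keyword is required")
    if jurisdiction not in ALLOWED_JURISDICTIONS:
        raise ValueError(f"jurisdiction must be one of {ALLOWED_JURISDICTIONS}")
    args = {"keyword": keyword, "jurisdiction": jurisdiction}
    if agency:
        args["agency"] = agency
    if date_from is not None:
        # date.fromisoformat raises ValueError for malformed or impossible
        # dates such as "31/01/2024" or "2024-02-30".
        args["date_from"] = date.fromisoformat(date_from).isoformat()
    return args
```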
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds valuable context: returns award amounts, recipient vendors, NAICS codes, award dates; sources (USASpending.gov + SAM.gov); and a 4-hour cache. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (10 lines) and well-structured: purpose line, parameter list, return summary, alternative tool recommendations, source note, and example. Every sentence serves a purpose with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description does not need to explain return values. It covers purpose, all parameters, return fields, alternatives, data source, and caching. This is complete for a search tool with good annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has 100% coverage with descriptions, so baseline is 3. The description adds further nuance (e.g., 'keyword: Contract scope e.g. cybersecurity software', 'agency: Awarding agency e.g. Department of Defense', 'date_from: Earliest award date ISO 8601', 'jurisdiction: US, EU, or UK. Default US.') and includes an example call, which enhances understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches government contract awards by keyword, agency, and date range. It explicitly distinguishes from sibling tools by naming them and explaining their different use cases.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool versus alternatives: 'Use govcon_fetch_vendor_contract_history for all contracts by a specific vendor. Use govcon_fetch_open_solicitations for active bids, not past awards.' It also mentions data source and caching behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
legal_fetch_inventor_portfolio · A · Read-only · Idempotent
Fetch the patent portfolio for a named inventor with optional assignee filter. Read-only. No side effects. Idempotent. inventor_name: Inventor surname or full name e.g. Smith or John Smith. Required. Fuzzy match — common names may return many results. assignee: Company or organisation name to narrow results e.g. Apple Inc. Optional. Returns patent numbers, titles, filing dates, jurisdictions, and current status. Use this when researching an inventor's work or a company's patent portfolio. Use legal_search_patents_by_keyword instead when you need patents by topic not by inventor. Verified source: EPO OPS + USPTO. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="legal_fetch_inventor_portfolio", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| assignee | No | Company name to filter results e.g. Apple Inc. Optional. | |
| inventor_name | Yes | Inventor surname or full name e.g. John Smith. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds behavioral context beyond the annotations: it states the tool is read-only, idempotent, has no side effects, uses fuzzy matching (which may return many results for common names), has a 24-hour cache, and cites verified sources (EPO OPS + USPTO). There is no contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear flow: purpose, safety, parameters, use cases, alternatives, source/caching, and fallback. Each sentence adds value, though it is slightly verbose. It is front-loaded with the core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers all essential context: what the tool returns (patent numbers, titles, filing dates, jurisdictions, status), safety (read-only, idempotent), caching, data sources, and fallback mechanism (report_feedback). With an output schema present, this is fully sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds meaning by explaining the inventor_name parameter format (surname or full name, required) and the assignee parameter (company/organization, optional). It also mentions fuzzy match behavior, which adds value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches a patent portfolio for a named inventor with an optional assignee filter. It uses specific verbs (Fetch) and resources (patent portfolio). It distinguishes itself from the sibling 'legal_search_patents_by_keyword' by specifying it is for inventor-based research, not topic-based.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use this tool ('when researching an inventor's work or a company's patent portfolio') and when to use the alternative ('use legal_search_patents_by_keyword instead when you need patents by topic not by inventor'). It also provides guidance on what to do if the response is inadequate (call report_feedback).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
legal_fetch_patent_by_number · A · Read-only · Idempotent
Fetch full patent details by patent number and jurisdiction. Read-only. No side effects. Idempotent. patent_number: Patent number in EPODOC format e.g. EP1000000 for European, CN120586032 for Chinese, JP2020123456 for Japanese, WO2020123456 for PCT, US10000000 for US. Required. jurisdiction: Optional hint — one of EP, CN, JP, KR, US, WO, etc. Default EP. The tool normalises the patent number automatically; passing CN120586032 with jurisdiction EP is valid. Returns title, abstract, inventors, assignees, filing date, claims summary, and citation count. Use this when you have a specific patent number. Use legal_search_patents_by_keyword instead when you only have keywords and need to find patents. Verified source: EPO OPS. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="legal_fetch_patent_by_number", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| jurisdiction | No | Patent office code: EP, US, WO. Default EP. Optional. | EP |
| patent_number | Yes | Patent number e.g. EP3456789 or US10123456. Required. | |
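
The description says the tool normalises patent numbers itself, so client-side parsing is optional. Still, a quick sanity check can catch obviously malformed input before a lookup is spent. The sketch below splits a number into a two-letter country prefix, digits, and an optional kind code. The regular expression, the kind-code handling, and the fallback to the default jurisdiction for bare digits are all assumptions, not the server's documented logic.

```python
import re
from typing import NamedTuple, Optional

# Two-letter office prefix, digits, optional kind code such as "B1" (assumed shape).
PATENT_RE = re.compile(r"^([A-Z]{2})(\d+)([A-Z]\d?)?$")


class PatentNumber(NamedTuple):
    country: str
    number: str
    kind: Optional[str]


def parse_patent_number(raw: str, default_jurisdiction: str = "EP") -> PatentNumber:
    """Split a patent number into country prefix, digits and kind code."""
    # Drop whitespace and common separators, then upper-case.
    cleaned = re.sub(r"[\s,./-]", "", raw).upper()
    match = PATENT_RE.match(cleaned)
    if match:
        return PatentNumber(match.group(1), match.group(2), match.group(3))
    if cleaned.isdigit():
        # No prefix given: fall back to the jurisdiction hint (default EP).
        return PatentNumber(default_jurisdiction, cleaned, None)
    raise ValueError(f"unrecognised patent number: {raw!r}")
```

For example, `parse_patent_number("CN120586032")` yields the `CN` prefix regardless of the jurisdiction hint, which mirrors the description's note that a mismatched hint is valid.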
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true. Description reinforces these and adds that the tool normalizes patent numbers automatically, has a 24-hour cache, and instructs to call report_feedback if needed. Some redundancy with annotations reduces score slightly.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is a single paragraph that is well-organized and front-loaded with purpose. All sentences add value, though it's slightly verbose. Could benefit from bullet points but remains effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists; description lists return fields (title, abstract, inventors, etc.). Includes source (EPO OPS) and cache behavior. For a fetch tool, this is complete and no gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds significant meaning: explains EPODOC format with examples for patent_number, clarifies jurisdiction as an optional hint, and states that passing mismatched jurisdiction is valid due to normalization.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Fetch full patent details by patent number and jurisdiction' with a specific verb and resource. It distinguishes from the sibling tool legal_search_patents_by_keyword, aiding in correct tool selection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance provided: 'Use this when you have a specific patent number. Use legal_search_patents_by_keyword instead when you only have keywords.' Also explains optional jurisdiction and defaults.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
legal_fetch_patent_citations · A · Read-only · Idempotent
Fetch forward and backward citation chains for a specific patent. Read-only. No side effects. Idempotent. patent_number: Patent number in EPODOC format e.g. EP1000000 for European, CN120586032 for Chinese, JP2020123456 for Japanese, WO2020123456 for PCT, US10000000 for US. Required. jurisdiction: Optional hint — one of EP, US, WO, CN, JP, KR, etc. Default EP. The tool normalises the patent number automatically; passing CN120586032 with jurisdiction EP is valid. Returns citing patents (forward citations) and cited patents (backward citations) with filing dates and titles. Use this when building a prior art citation chain for a specific patent you already have. Use legal_search_patents_by_keyword instead when you need to find patents by topic not by citation. Verified source: EPO OPS. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="legal_fetch_patent_citations", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| jurisdiction | No | Patent office code: EP, US, WO. Default EP. Optional. | EP |
| patent_number | Yes | Patent number e.g. EP3456789 or US10123456. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses read-only nature, no side effects, idempotency, automatic patent number normalization, 24-hour cache, and data source (EPO OPS). Adds significant context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with purpose and safety traits upfront. Slightly verbose as it reiterates read-only, no side effects, and idempotent which are already in annotations, but each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers input format, behavior, caching, data source, and feedback fallback. Output schema exists so return values are documented elsewhere. Complete for a tool with two clear parameters.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Provides detailed examples of valid patent numbers in EPODOC format, explains jurisdiction as an optional hint with default EP, and describes normalization behavior. Adds substantial meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fetches forward and backward citation chains for a specific patent, distinguishing it from the sibling tool legal_search_patents_by_keyword which searches by topic.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use this tool (building citation chains for a known patent) and when to use legal_search_patents_by_keyword (finding patents by topic). Also provides fallback instruction to report_feedback if the tool doesn't serve the user's need.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
legal_search_patents_by_keyword · A · Read-only · Idempotent
Search patents by keyword across EPO, USPTO, or WIPO. Read-only. No side effects. Idempotent. Returns up to 10 matches. keywords: Search terms describing the invention e.g. neural network image classification. Required. jurisdiction: One of EP, US, or WO. Optional. Default EP. date_from: Earliest filing date in ISO 8601 format e.g. 2020-01-31. Optional, defaults to no lower bound. Returns patent numbers, titles, and filing dates. Use this when finding prior art or exploring a technology landscape without a specific number. Use legal_fetch_patent_by_number instead when you have the patent number already. Verified source: EPO OPS + USPTO. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="legal_search_patents_by_keyword", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| keywords | Yes | Search keyword or phrase e.g. CRISPR gene editing. Required. | |
| date_from | No | Earliest filing date ISO 8601 e.g. 2020-01-31. Optional. | |
| jurisdiction | No | Patent office code: EP, US, WO. Default EP. Optional. | EP |
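
The description draws a line between this tool (keywords only) and `legal_fetch_patent_by_number` (number already known). An agent-side router for that choice could look like the sketch below. The pattern used to recognise a patent number is a heuristic assumption, and the function name is hypothetical.

```python
import re
from typing import Dict, Tuple

# Heuristic (assumed): two letters, at least five digits, optional kind code.
LOOKS_LIKE_NUMBER = re.compile(r"^[A-Z]{2}\d{5,}(?:[A-Z]\d?)?$")


def choose_patent_tool(query: str) -> Tuple[str, Dict[str, str]]:
    """Pick the tool name and required argument for a free-text patent query."""
    compact = query.replace(" ", "").upper()
    if LOOKS_LIKE_NUMBER.match(compact):
        # A specific number is known: fetch it directly.
        return "legal_fetch_patent_by_number", {"patent_number": compact}
    # Otherwise treat the input as search terms describing the invention.
    return "legal_search_patents_by_keyword", {"keywords": query}
```

Only the required argument is set in each branch, so the documented defaults (jurisdiction `EP`, no lower date bound) apply unless the caller adds more.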
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, etc. The description adds value by stating 'Read-only. No side effects. Idempotent. Returns up to 10 matches.' and discloses verification sources (EPO OPS + USPTO) and 24-hour cache. This goes beyond annotations to set expectations on result limits and data freshness.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is moderately detailed but every sentence serves a purpose: core function, behavioral traits, parameter guidance, return info, usage context, and feedback instructions. It is front-loaded with the key purpose. While a bit long, it does not waste words given the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 3 parameters (all documented), comprehensive annotations, and existence of an output schema, the description fully covers what the tool does, how to use parameters, behavioral constraints, and when to prefer alternatives. It also includes source verification and caching details, making it self-contained for informed use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 3 parameters. The description adds practical examples for keywords ('neural network image classification') and date_from ('2020-01-31'), clarifies that keywords is required, and states the default jurisdiction (EP). This enhances usability beyond the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches patents by keyword across EPO, USPTO, or WIPO, and explicitly distinguishes it from the sibling legal_fetch_patent_by_number by specifying when to use each. The verb 'search' and resource 'patents by keyword' are precise, and the tool's read-only, idempotent nature further clarifies its role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use guidance: use this for prior art or exploring a technology landscape without a patent number, and use legal_fetch_patent_by_number when the number is known. It also includes a fallback instruction to call report_feedback if the response is insufficient, offering a clear alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_fetch_charity_uk (A, Read-only, Idempotent)
Fetch UK registered charity details by charity number or organisation name. Read-only. No side effects. Idempotent. UK only. charity_number_or_name: UK registered charity number (7 digits, e.g. 1234567) or full/partial organisation name. Required. Returns registration status, income, expenditure, activities, and trustee count. Use this for UK charities. Use nonprofit_fetch_nonprofit_by_ein or nonprofit_search_nonprofits_by_name for US nonprofits. Verified source: UK Charity Commission OGL v3. 24-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_fetch_charity_uk", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| charity_number_or_name | Yes | UK charity number e.g. 1089464 or name substring. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
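As a rough illustration of how this tool is invoked, MCP clients send a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds the request body; the helper name `tools_call_request` is hypothetical, passing the charity number as a string is an assumption, and session setup and headers for the Streamable HTTP transport are left out.

```python
import json


def tools_call_request(tool_name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Example charity number taken from the parameter table above.
request = tools_call_request(
    "nonprofit_fetch_charity_uk",
    {"charity_number_or_name": "1089464"},
)
print(json.dumps(request, indent=2))
```

The same envelope applies to every tool on the server; only `name` and `arguments` change.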
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it as read-only and idempotent. Description adds cache duration and data source, providing useful behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise yet comprehensive. Purpose first, then safety, parameter details, return info, usage guidance, and fallback. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 1-param tool with output schema and good annotations, the description covers purpose, usage, parameters, return data, source, cache, and feedback mechanism. Fully adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already describes the parameter with 100% coverage. The description adds specific format (7 digits) and examples, enhancing understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches UK registered charity details by charity number or name, with explicit differentiation from US nonprofit tools. Verb and resource are specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use (for UK charities) and when not to (use alternative tools for US). Also mentions verified source and cache behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_fetch_nonprofit_by_ein (A, Read-only, Idempotent)
Fetch IRS 990 filing data for any US nonprofit by EIN. Read-only. No side effects. Idempotent. US only. ein: 9-digit Employer ID with or without dash, e.g. 46-5734087 or 465734087. Required. Returns name, revenue, expenses, assets, NTEE code, and mission from the most recent 990 filing. Use this when you have the exact EIN. Use nonprofit_search_nonprofits_by_name instead when you only have a name. Verified source: IRS EO BMF + IRS TEOS. 7-day cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_fetch_nonprofit_by_ein", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ein | Yes | EIN in format XX-XXXXXXX e.g. 46-5734087. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
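The description accepts an EIN with or without the dash, while the schema shows the dashed form XX-XXXXXXX. A small client-side normaliser can reconcile the two. This is a sketch under that assumption; `normalize_ein` is a hypothetical helper, not part of the server.

```python
import re

# Two digits, an optional dash, then seven digits.
EIN_PATTERN = re.compile(r"(\d{2})-?(\d{7})")


def normalize_ein(raw):
    """Return the EIN in XX-XXXXXXX form, or raise ValueError."""
    match = EIN_PATTERN.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a 9-digit EIN: {raw!r}")
    return f"{match.group(1)}-{match.group(2)}"


print(normalize_ein("465734087"))   # prints 46-5734087
print(normalize_ein("46-5734087"))  # prints 46-5734087
```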
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, idempotentHint, destructiveHint. The description adds 'No side effects', 'Idempotent', 'US only', source verification, and cache duration (7-day cache), providing useful behavioral details beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the main action. It contains necessary details but is slightly verbose; however, every sentence serves a purpose (flags, format, alternatives, source, caching, feedback).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool with one parameter and existing output schema, the description is complete: it covers purpose, usage, alternatives, source reliability, caching, and error handling via report_feedback. No gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the description adds value by specifying both dashed and undashed EIN formats with examples, supplementing the schema's format description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fetches IRS 990 filing data for a US nonprofit by EIN, using specific verbs and resource. It distinguishes itself from the sibling 'nonprofit_search_nonprofits_by_name' by specifying the exact identifier needed.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit usage guidance: 'Use this when you have the exact EIN. Use nonprofit_search_nonprofits_by_name instead when you only have a name.' Also includes instruction for reporting gaps via report_feedback, providing clear context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_fetch_nonprofit_financial_trends (A, Read-only, Idempotent)
5-year financial trend for any US nonprofit. Revenue growth, expense ratios, reserve trajectory, and health score history from IRS Form 990 data via ProPublica. Returns trend_direction (GROWING/STABLE/DECLINING/VOLATILE/INSUFFICIENT_DATA), CAGR, and year-by-year revenue, expense, and asset trends. years parameter: 1–10, default 5. Rate limit: 30/minute. No auth required. Complements nonprofit_fetch_nonprofit_full_profile by adding multi-year context. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_fetch_nonprofit_financial_trends", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ein | Yes | EIN in format XX-XXXXXXX e.g. 46-5734087. Required. | |
| years | No | Number of years of trend data 1-10. Default 5. Optional. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
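The tool reports a CAGR alongside the yearly figures. How the server computes it is not documented here, so the sketch below uses the standard compound annual growth rate formula as an assumption, together with a check of the documented `years` range. Both helper names are hypothetical, and the revenue figures are made up for illustration.

```python
def cagr(first_value, last_value, periods):
    """Standard compound annual growth rate over `periods` yearly intervals."""
    if first_value <= 0 or last_value < 0 or periods <= 0:
        raise ValueError("first_value and periods must be positive, last_value non-negative")
    return (last_value / first_value) ** (1 / periods) - 1


def trend_args(ein, years=5):
    """Build tool arguments, enforcing the documented 1-10 range for years."""
    if not 1 <= years <= 10:
        raise ValueError("years must be between 1 and 10")
    return {"ein": ein, "years": years}


# Illustrative numbers only: revenue growing from 1.00M to 1.21M over two intervals.
growth = cagr(1_000_000, 1_210_000, periods=2)
print(f"{growth:.1%}")  # prints 10.0%
print(trend_args("46-5734087"))
```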
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint, openWorldHint, idempotentHint, and non-destructive. The description adds value by specifying the data source (IRS Form 990 via ProPublica), rate limit (30/minute), and auth requirements (none). It also describes the output format (trend_direction, CAGR, yearly trends), which is not covered by annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loads the core purpose ('5-year financial trend for any US nonprofit'). Every sentence adds value: purpose, data source, output summary, parameter details, usage hints. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description correctly avoids redundant return value details. It covers purpose, parameter semantics, behavioral notes, and links to a sibling tool. The tool has only 2 parameters (one required), and the description adequately addresses both. The fallback instruction also enhances completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds practical details: default years value (5), range (1-10), and EIN format example (XX-XXXXXXX). This enhances the schema information, justifying a score above the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool provides '5-year financial trend' data for US nonprofits, detailing specific metrics like revenue growth, expense ratios, and health score history. It distinguishes itself from sibling tool 'nonprofit_fetch_nonprofit_full_profile' by adding multi-year context, fulfilling the requirement for a specific verb+resource with differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly complements 'nonprofit_fetch_nonprofit_full_profile' and provides a fallback instruction to call 'report_feedback' if the result doesn't serve the need. While it doesn't give explicit when-not-to-use guidance, the context is clear for when trend data is needed, and the fallback is a useful usage hint.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_fetch_nonprofit_full_profile (A, Read-only, Idempotent)
Complete nonprofit due diligence in one call. Revenue trends, executive pay, risk flags, and a health score from IRS 990 data. Uses ProPublica Nonprofit Explorer API with IRS e-File fallback. Data refreshed on each call. Returns financials, executive_compensation, risk_flags, health_score (0–100), programme_ratio, fundraising_sustainability, and upstream_status. Rate limit: 30/minute. No auth required. For grant-makers, investors, and compliance teams performing nonprofit due diligence. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_fetch_nonprofit_full_profile", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ein | Yes | EIN in format XX-XXXXXXX e.g. 46-5734087. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
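A client that calls this tool in a loop may want to stay under the stated limit of 30 calls per minute. The following sliding-window limiter is one possible sketch. The class name is hypothetical, and whether the server counts the limit per IP, per key, or per tool is not stated in the description, so treat the scope as an assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=30, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self._stamps = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.period - (now - self._stamps[0])

    def record(self):
        """Register that a call was just made."""
        self._stamps.append(self.clock())

    def acquire(self):
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            time.sleep(delay)
        self.record()
```

Calling `limiter.acquire()` before each tool call keeps the client within the window; the injectable `clock` makes the limiter easy to test without sleeping.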
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: it states data is refreshed each call, rate limit of 30/minute, no auth required, and the underlying API (ProPublica with IRS fallback). Annotations already indicate read-only, idempotent, non-destructive, which are consistent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is structured with a clear opening statement, followed by key features, data source, and usage notes. It is slightly verbose with the report_feedback instruction but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (not shown but referenced), the description provides comprehensive context: it lists return fields (financials, executive_compensation, risk_flags, health_score, etc.), covers rate limits, authentication, and data refresh behavior. No gaps are apparent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already fully describes the single parameter 'ein' with required format and example. The description does not add further parameter details beyond what is in the schema, so a baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as 'Complete nonprofit due diligence in one call' and lists key outputs. However, it does not explicitly differentiate from sibling tools like 'nonprofit_fetch_nonprofit_by_ein' or 'nonprofit_fetch_nonprofit_financial_trends'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies target users ('grant-makers, investors, and compliance teams') and includes a fallback instruction to report feedback if the tool is not helpful. It does not explicitly state when not to use this tool or mention alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_search_nonprofits_by_category (A, Read-only, Idempotent)
Search US nonprofits by mission category and state. Returns up to 25 results with revenue, assets, and health scores (0–100). Category maps to NTEE codes: education, healthcare, arts, environment, human_services, civil_rights, international, religion, science, sports. Raw NTEE letter (A–Z) also accepted. Uses ProPublica Nonprofit Explorer API. Rate limit: 30/minute. No auth required. Starting point for nonprofit due diligence — follow with nonprofit_fetch_nonprofit_full_profile for deep dive on a specific EIN. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_search_nonprofits_by_category", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| state | No | Two-letter US state code e.g. CA. Optional. | |
| category | Yes | NTEE category e.g. education, healthcare, arts. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
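The description lists ten named categories and also accepts a raw NTEE letter. A client can check the `category` and `state` values before calling. This is a sketch only: `category_search_args` is a hypothetical helper, and the case normalisation (lowercase names, uppercase letters and state codes) is an assumption about what the server expects.

```python
import re

# Named categories listed in the tool description.
NAMED_CATEGORIES = {
    "education", "healthcare", "arts", "environment", "human_services",
    "civil_rights", "international", "religion", "science", "sports",
}


def category_search_args(category, state=None):
    """Validate category/state and return the tool arguments."""
    value = category.strip()
    if value.lower() in NAMED_CATEGORIES:
        value = value.lower()
    elif re.fullmatch(r"[A-Za-z]", value):
        value = value.upper()  # raw NTEE letter, A to Z
    else:
        raise ValueError(f"unknown category: {category!r}")
    args = {"category": value}
    if state is not None:
        if not re.fullmatch(r"[A-Za-z]{2}", state.strip()):
            raise ValueError(f"state must be a two-letter code: {state!r}")
        args["state"] = state.strip().upper()
    return args


print(category_search_args("Education", state="ca"))
```

An EIN taken from the search results can then be passed to nonprofit_fetch_nonprofit_full_profile, as the description suggests.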
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses external API (ProPublica), rate limit of 30/minute, no auth required, and max 25 results. Annotations already indicate read-only and idempotent, but description adds concrete operational boundaries.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Compact, front-loaded with purpose and key details. Every sentence adds value: capability, limits, API source, rate limit, follow-up, fallback.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all aspects: purpose, parameters, output shape, source, rate limit, auth, follow-up actions, and error handling. Complete for a search tool with good annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds meaning beyond schema by explaining category maps to NTEE codes and accepts raw NTEE letters. Schema only gave examples; description provides fuller context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Search US nonprofits by mission category and state' with specific details on results (up to 25 with revenue, assets, health scores). Distinguishes from siblings like nonprofit_search_nonprofits_by_name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly frames as starting point for due diligence, suggests follow-up with nonprofit_fetch_nonprofit_full_profile, and provides fallback to report_feedback.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nonprofit_search_nonprofits_by_name (A, Read-only, Idempotent)
Search US nonprofits by name with optional state filter. Read-only. No side effects. Idempotent. US only. Returns up to 25 matches. name: Full or partial organisation name. Required. state: Two-letter US state code e.g. CA, NY. Optional, defaults to all states. Returns EIN, name, state, revenue, and NTEE code for each match. Use this when you have a name but not the EIN. Use nonprofit_fetch_nonprofit_by_ein instead when you have the exact EIN for a precise single lookup. Verified source: IRS EO BMF. 7-day cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="nonprofit_search_nonprofits_by_name", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | Organization name to search e.g. Red Cross. Required. | |
| state | No | Two-letter US state code e.g. CA. Optional. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
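The guidance above (name search when only a name is known, EIN fetch when the exact EIN is known) can be expressed as a small routing function. This is a sketch of one way an agent might apply it; `choose_nonprofit_lookup` is a hypothetical helper, and treating any 9-digit input as an EIN is an assumption.

```python
import re

EIN_PATTERN = re.compile(r"(\d{2})-?(\d{7})")


def choose_nonprofit_lookup(query, state=None):
    """Return (tool_name, arguments) for a US nonprofit lookup."""
    text = query.strip()
    match = EIN_PATTERN.fullmatch(text)
    if match:
        # Exact EIN known: use the precise single-record fetch.
        ein = f"{match.group(1)}-{match.group(2)}"
        return "nonprofit_fetch_nonprofit_by_ein", {"ein": ein}
    # Only a name is known: fall back to the name search.
    args = {"name": text}
    if state:
        args["state"] = state.upper()
    return "nonprofit_search_nonprofits_by_name", args


print(choose_nonprofit_lookup("465734087"))
print(choose_nonprofit_lookup("Red Cross", state="ca"))
```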
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds details beyond annotations: read-only, no side effects, idempotent, US only, returns up to 25 matches, verified source, 7-day cache. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with purpose, includes necessary details, but slightly verbose with fallback feedback instruction. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given low complexity, rich annotations, and presence of output schema, the description covers data source, caching, limit, and even handles inadequate results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds format examples (e.g., 'CA, NY') and default behavior (defaults to all states) and mentions return fields, adding value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Search US nonprofits by name with optional state filter' and distinguishes from sibling tools by specifying when to use this versus `nonprofit_fetch_nonprofit_by_ein`. It also mentions read-only, idempotent, and US scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this when you have a name but not the EIN. Use nonprofit_fetch_nonprofit_by_ein instead when you have the exact EIN.' Also provides alternative tool name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
regulatory_fetch_docket_details (A, Read-only, Idempotent)
Fetch full details for a specific regulatory docket by ID. Read-only. No side effects. Idempotent. US federal only. docket_id: Docket identifier in agency format e.g. EPA-HQ-OAR-2021-0317 or FTC-2024-0041. Required. Timeout is 30 seconds — large dockets may be slow. Returns docket title, agency, status, comment period dates, total comment count, and list of related documents. Use this when you have a docket ID from a search. Use regulatory_search_open_rulemakings instead when you need to find dockets by topic first. Verified source: Regulations.gov + Federal Register fallback. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="regulatory_fetch_docket_details", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| docket_id | Yes | Docket ID e.g. EPA-HQ-OAR-2021-0668. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
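The description ends with a fallback: when the result does not serve the user's need, call report_feedback with four named fields. The sketch below assembles those arguments. The field names come from the description; the helper name `feedback_args` and the example query text are hypothetical.

```python
def feedback_args(tool_id, intended_query, gap_description):
    """Build the arguments for a report_feedback call of type agent_gap."""
    for label, value in (("tool_id", tool_id),
                         ("intended_query", intended_query),
                         ("gap_description", gap_description)):
        if not value or not value.strip():
            raise ValueError(f"{label} must not be empty")
    return {
        "feedback_type": "agent_gap",
        "tool_id": tool_id,
        "intended_query": intended_query,
        "gap_description": gap_description,
    }


# Illustrative values only.
payload = feedback_args(
    "regulatory_fetch_docket_details",
    "Full text of the comments filed on docket EPA-HQ-OAR-2021-0317",
    "Result listed the comment count but not the comment text",
)
print(payload)
```

The returned dictionary would be passed as the `arguments` of a call to the report_feedback tool.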
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, openWorldHint, destructiveHint. The description adds important behavioral context: timeout of 30 seconds, large dockets may be slow, 4-hour cache, verified source (Regulations.gov + Federal Register fallback), and what fields are returned. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly long but well-organized with clear sentences. It includes feedback instructions which are useful but slightly increase length. However, every sentence adds value, and it remains focused.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (as indicated by context signals), the description doesn't need to detail return values. It explains what is returned (docket title, agency, status, etc.) and covers edge cases (slow large dockets). The feedback mechanism ensures completeness for unexpected needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers the single parameter docket_id with description. The description adds value by providing example formats (EPA-HQ-OAR-2021-0317 or FTC-2024-0041) and reiterating it's required. This helps the agent form correct parameter values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fetches full details for a specific regulatory docket by ID. It specifies read-only, no side effects, and idempotent. It also distinguishes from the sibling tool regulatory_search_open_rulemakings by saying when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool ('when you have a docket ID from a search') and when to use an alternative ('regulatory_search_open_rulemakings instead when you need to find dockets by topic'). Also provides guidance on what to do if the tool doesn't serve the user's need (call report_feedback).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
regulatory_fetch_federal_register_notices (A, Read-only, Idempotent)
Fetch recent Federal Register notices and rules for a specific agency. Read-only. No side effects. Idempotent. US federal only. agency: Agency name or abbreviation e.g. SEC, Food and Drug Administration, EPA. Required. keyword: Optional topic filter e.g. cryptocurrency. Optional, defaults to all notices. date_from: Earliest publication date in ISO 8601 format e.g. 2024-01-31. Optional, defaults to last 90 days. Returns document type, title, publication date, effective date, and CFR citations. Use this to monitor recent regulatory activity for an agency. Use regulatory_search_open_rulemakings instead when filtering by topic across all agencies. Verified source: Federal Register API. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="regulatory_fetch_federal_register_notices", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| agency | Yes | Agency name or abbreviation e.g. SEC, EPA. Required. | |
| keyword | No | Optional topic filter e.g. cryptocurrency. Optional. | |
| date_from | No | Earliest publication date ISO 8601 e.g. 2024-01-31. Optional. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
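When `date_from` is omitted the tool defaults to the last 90 days, so a client only needs to send it for a different window. The sketch below derives an ISO 8601 `date_from` from a lookback in days and leaves optional fields out when they are not set. The helper name `notices_args` and its `lookback_days` and `today` parameters are hypothetical.

```python
from datetime import date, timedelta


def notices_args(agency, keyword=None, lookback_days=None, today=None):
    """Build arguments for the notices tool, omitting unset optional fields."""
    if not agency or not agency.strip():
        raise ValueError("agency is required")
    args = {"agency": agency.strip()}
    if keyword:
        args["keyword"] = keyword
    if lookback_days is not None:
        # `today` is injectable so the result is reproducible in tests.
        reference = today or date.today()
        args["date_from"] = (reference - timedelta(days=lookback_days)).isoformat()
    return args


# 30 days before 2024-03-01 is 2024-01-31 (2024 is a leap year).
print(notices_args("SEC", keyword="cryptocurrency", lookback_days=30, today=date(2024, 3, 1)))
```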
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, destructiveHint, idempotentHint. The description adds 'US federal only', '4-hour cache', and a fallback instruction, providing extra behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is structured and front-loaded with purpose. Some redundancy with schema but each sentence contributes value; slightly lengthy but acceptable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (not shown in context), the description covers all necessary aspects: purpose, parameters, usage, source, cache, and fallback. Complete for a read-only fetch tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description adds default values (date_from defaults to last 90 days) and lists return fields, providing supplementary information to the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches recent Federal Register notices and rules for a specific agency, distinguishing it from the sibling tool 'regulatory_search_open_rulemakings' by specifying when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use this tool ('monitor recent regulatory activity for an agency') and when to use an alternative ('regulatory_search_open_rulemakings instead when filtering by topic across all agencies'), and mentions the data source and caching.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
regulatory_search_open_rulemakings (A, Read-only, Idempotent)
Search open rulemakings and public comment periods on Regulations.gov and the Federal Register. Read-only. No side effects. Idempotent. US federal only. keyword: Topic keywords e.g. artificial intelligence, data privacy. Required. agency: Agency abbreviation e.g. FTC, FDA, SEC, EPA. Optional, defaults to all agencies. status: One of open, closed, or all. Optional. Default open. Returns docket title, agency, comment deadline, docket ID, and document count. Use this when monitoring regulatory activity on a topic. Use regulatory_fetch_docket_details instead when you have a docket ID and need full detail. Verified source: Regulations.gov + Federal Register. 4-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="regulatory_search_open_rulemakings", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| agency | No | Agency abbreviation e.g. FTC, FDA, SEC. Optional. | |
| status | No | Filter: open, closed, or all. Default open. Optional. | open |
| keyword | Yes | Topic keywords, e.g. artificial intelligence. Required. | |
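A minimal sketch of assembling arguments for this tool, assuming the client passes them as a plain dict. The helper name `build_rulemaking_args` is hypothetical; only the parameter names, the allowed `status` values, and the `open` default come from the table above.

```python
ALLOWED_STATUS = {"open", "closed", "all"}


def build_rulemaking_args(keyword, agency=None, status="open"):
    """Hypothetical helper: validate and assemble the tool arguments."""
    if not keyword or not keyword.strip():
        raise ValueError("keyword is required")
    if status not in ALLOWED_STATUS:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUS)}")
    args = {"keyword": keyword.strip(), "status": status}
    if agency:
        # Leaving agency out means "all agencies" per the description.
        args["agency"] = agency
    return args


args = build_rulemaking_args("data privacy", agency="FTC")
print(args)
```

Validating `status` locally avoids spending a call on a value the description does not list.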
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, idempotentHint, destructiveHint. Description adds details: 'Read-only', 'No side effects', 'Idempotent', 'US federal only', 'Verified source', '4-hour cache', and what is returned. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is fairly long but well-structured: main action first, then parameter details, then usage guidance, then fallback. Each sentence adds value, though some repetition of annotation info (e.g., idempotent) exists. Overall efficient for the amount of information conveyed.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Comprehensive: explains what the tool does, parameters, usage context, alternatives, caching behavior, source verification, and a fallback mechanism. Schema coverage is 100%, output schema exists, and annotations are rich, so the description adds necessary context without gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, baseline 3. Description adds examples for keyword, clarifies agency abbreviation examples and defaults, and explicitly lists status options and default. This provides more context than the schema alone.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'search' and resource 'open rulemakings and public comment periods' on specific sources (Regulations.gov and Federal Register). Distinguishes from sibling tool regulatory_fetch_docket_details, which is for when a docket ID is available.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use ('when monitoring regulatory activity on a topic'), provides alternative tool for detailed docket info, and includes a fallback mechanism (report_feedback) if the response doesn't serve the user's need.
report_feedback (A, Read-only)
Report a data quality issue or agent intent gap for a DataNexus tool response.
tool_id: e.g. "T10" or "security_fetch_cve_detail".
query_hash: From the query_hash field of the response.
signal: incorrect_data | missing_field | stale_data | not_useful | wrong_entity | data_quality.
comment: Issue description. Max 500 chars.
missing_fields: Absent or wrong field names.
feedback_type: "user_feedback" (default) or "agent_gap".
intended_query: Agent's goal. Max 256 chars.
gap_description: What was missing. Max 256 chars.
Example: report_feedback(tool_id="T10", query_hash="abc123", signal="incorrect_data")
| Name | Required | Description | Default |
|---|---|---|---|
| signal | Yes | One of incorrect_data, missing_field, stale_data, not_useful, wrong_entity, or data_quality. Required for user_feedback. | |
| comment | No | Description of the issue. Optional. Max 500 characters. | |
| tool_id | Yes | Tool identifier, e.g. T04 or security_fetch_cve_detail. Required. | |
| query_hash | Yes | Hash from the response being reported — found in the query_hash field of any response. Required. | |
| feedback_type | No | user_feedback (default) or agent_gap. Use agent_gap when the tool returned a valid response but did not serve the user's actual need. | user_feedback |
| intended_query | No | What the agent was trying to accomplish — used when feedback_type=agent_gap. Optional. Max 256 chars. | |
| missing_fields | No | List of field names that are absent or wrong. Optional. | |
| gap_description | No | What was missing or wrong in the result; used when feedback_type=agent_gap. Optional. Max 256 chars. | |
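The length limits and the `signal` values above can be enforced before the call is made. The sketch below is one way to do that; `build_feedback` is a hypothetical helper, and truncating over-long text (rather than rejecting it) is an assumption, since the page does not say how the server treats oversized fields.

```python
SIGNALS = {"incorrect_data", "missing_field", "stale_data",
           "not_useful", "wrong_entity", "data_quality"}
FEEDBACK_TYPES = {"user_feedback", "agent_gap"}
# Character limits taken from the parameter table.
LIMITS = {"comment": 500, "intended_query": 256, "gap_description": 256}


def build_feedback(tool_id, query_hash, signal,
                   feedback_type="user_feedback", **text_fields):
    """Hypothetical helper: assemble report_feedback arguments."""
    if signal not in SIGNALS:
        raise ValueError(f"unknown signal: {signal}")
    if feedback_type not in FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback_type: {feedback_type}")
    payload = {"tool_id": tool_id, "query_hash": query_hash,
               "signal": signal, "feedback_type": feedback_type}
    for name, value in text_fields.items():
        if name not in LIMITS:
            raise ValueError(f"unexpected field: {name}")
        # Assumption: truncate locally instead of letting the call fail.
        payload[name] = value[:LIMITS[name]]
    return payload


payload = build_feedback(
    "security_fetch_cve_detail", "abc123", "missing_field",
    feedback_type="agent_gap",
    intended_query="find the fixed version for a CVE",
    gap_description="x" * 300,
)
print(len(payload["gap_description"]))
```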
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description describes the tool as reporting issues, which implies a write operation. However, annotations set readOnlyHint=true, indicating no state modification. This is a direct contradiction, and the description does not address it. No other behavioral traits (e.g., auth, rate limits) are mentioned. Score 1 due to contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with a one-sentence summary, then lists parameters in a clear bullet-like format, and ends with an example. While it is somewhat lengthy due to parameter repetition, every sentence adds useful information. It could be trimmed slightly but is generally well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 8 parameters and an output schema. The description covers parameter usage and provides an example. However, it does not explain what happens after reporting (e.g., confirmation, persistence) or clarify the read-only contradiction. Given the complexity, the description is adequate but has gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds extra context for parameters, such as max lengths for 'comment' (500 chars) and 'intended_query' (256 chars), and clarifies enum values for 'signal'. It also provides an example call, adding value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a specific verb and resource: 'Report a data quality issue or agent intent gap for a DataNexus tool response.' This clearly distinguishes it from sibling tools, which are primarily data retrieval or analysis tools. The addition of parameter details and an example reinforces the purpose.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use different feedback types: 'Use agent_gap when the tool returned a valid response but did not serve the user's actual need.' It provides example calls and parameter guidance. While it does not explicitly state when not to use the tool, the context is clear for a feedback-specific tool.
report_mcpize_link (A, Read-only, Idempotent)
Check MCPize subscription status for a DataNexus tool.
tool_id: DataNexus tool identifier e.g. "T10". Pass the tool the user is asking about.
Returns: status ("free" | "subscription_required" | "not_configured"), message, tool_id, and upgrade_url when subscription is required.
Example: report_mcpize_link(tool_id="T10")
| Name | Required | Description | Default |
|---|---|---|---|
| tool_id | Yes | DataNexus tool identifier to check, e.g. "T01", "T07", "T10". Pass the ID of the tool the user is asking about. | |
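The three documented `status` values lend themselves to a small dispatch on the client side. The following is a sketch only: `summarize_subscription` is a hypothetical helper, the sample response is fabricated for illustration, and the wording chosen for `not_configured` is a guess at its meaning.

```python
def summarize_subscription(response):
    """Hypothetical helper: turn a report_mcpize_link result into a message."""
    status = response.get("status")
    tool_id = response.get("tool_id", "unknown tool")
    if status == "free":
        return f"{tool_id} is free to use."
    if status == "subscription_required":
        # upgrade_url is only documented for this status.
        url = response.get("upgrade_url", "no upgrade_url returned")
        return f"{tool_id} requires a subscription: {url}"
    if status == "not_configured":
        return f"{tool_id} has no subscription configuration."
    return f"Unrecognized status for {tool_id}: {status!r}"


# Fabricated sample response, shaped after the documented return fields.
sample = {
    "status": "subscription_required",
    "message": "Upgrade to continue.",
    "tool_id": "T10",
    "upgrade_url": "https://example.com/upgrade",
}
print(summarize_subscription(sample))
```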
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses the return format (status, message, tool_id, upgrade_url) and is consistent with annotations (readOnly, idempotent). It adds value beyond annotations by specifying return structure.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three sentences, including an example. Every sentence serves a purpose, no redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, output schema exists), the description covers purpose, parameter, example, and return values. It is complete for an agent to use correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description adds an example ('T10') and clarifies the parameter usage, improving semantics beyond the schema description.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it checks MCPize subscription status for a DataNexus tool. It uses a specific verb and resource, and while it doesn't explicitly distinguish from siblings, the function is unique among listed tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description instructs to pass the tool_id of the user's query, effectively indicating when to use it. It doesn't provide when-not-to-use or alternatives, but the context makes it clear this is a status check tool.
search_datanexus_tools (A, Read-only, Idempotent)
Find the right DataNexus tool by describing your task in plain English. Read-only. No side effects. Call this before any other DataNexus tool to reduce context load from 40000 to 800 tokens. query: Plain English description of your task e.g. check if a Python package has CVEs or look up a UK charity by name. Required. domain: Restrict results to one sub-server: nonprofit, security, compliance, domain, legal, govcon, or regulatory. Optional. Returns matching tool names and parameter hints you can call directly. Do not call this recursively or to validate results — use validate_tool_output for that. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="search_datanexus_tools", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Plain English description of your task, e.g. 'check if a Python package has CVEs' or 'look up a UK charity by name'. Required. | |
| domain | No | Restrict results to one sub-server: nonprofit, security, compliance, domain, legal, govcon, or regulatory. Optional. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description reinforces annotations by stating 'Read-only. No side effects.' It adds context beyond annotations: reducing context load from 40000 to 800 tokens, the prohibition on recursion, and the feedback mechanism. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is lengthier than average but every sentence serves a purpose. It front-loads the primary use case and follows with parameter details, usage constraints, and fallback instructions. Minor redundancy (e.g., 'Required.' after schema already marks it required) but overall efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description only needs to summarize return values, which it does ('Returns matching tool names and parameter hints'). It also covers error handling via report_feedback, and explicitly states when to use alternative tools (validate_tool_output). This is comprehensive for a search tool with rich annotations.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with both parameters already documented. The description adds value by providing examples (e.g., 'check if a Python package has CVEs' for query), clarifying required vs. optional, listing domain values explicitly, and indicating that the output contains 'matching tool names and parameter hints'.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Find the right DataNexus tool by describing your task in plain English.' It explicitly distinguishes itself from sibling domain-specific tools by acting as a meta-discovery tool, and instructs to call it before any other DataNexus tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use ('Call this before any other DataNexus tool to reduce context load'), when-not-to-use ('Do not call this recursively or to validate results — use validate_tool_output for that'), and provides a fallback action if the response does not serve the user's need (call report_feedback with specific parameters).
security_audit_licence_compatibility (A, Read-only, Idempotent)
Audit the licence compatibility of your entire dependency list. Input package names (with ecosystem) or SPDX IDs; get a COMPATIBLE/CONFLICT verdict with specific conflicting pairs and recommended action. Uses static SPDX compatibility table — no network call for spdx_ids path. Package path resolves licences from deps.dev (max 10 concurrent). Max 50 items. Rate limit: 60/minute. No auth required. For developers and compliance teams auditing open source licence risk before shipping. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_audit_licence_compatibility", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| packages | No | List of {name, ecosystem} dicts to check compatibility. Optional. | |
| spdx_ids | No | List of SPDX licence identifiers to check compatibility. Optional. | |
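Because the description caps a request at 50 items, a longer dependency list has to be split before calling the tool. The sketch below shows one way to batch it; `batch_packages` is a hypothetical helper, and the `npm` ecosystem value and package names are placeholders, not values confirmed by this page.

```python
MAX_ITEMS = 50  # "Max 50 items" per the tool description


def batch_packages(packages, size=MAX_ITEMS):
    """Hypothetical helper: split a dependency list into per-call argument dicts."""
    return [{"packages": packages[i:i + size]}
            for i in range(0, len(packages), size)]


# Placeholder dependency list; names and ecosystem are illustrative only.
deps = [{"name": f"pkg-{n}", "ecosystem": "npm"} for n in range(120)]
batches = batch_packages(deps)
print([len(b["packages"]) for b in batches])
```

One caveat with this approach: a conflict between two packages that land in different batches would not show up in either verdict, so batching suits a first pass more than a final audit.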
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations (readOnlyHint, idempotentHint, destructiveHint) are complemented by description details: uses static SPDX table (no network call for spdx_ids), resolves licenses from deps.dev (max 10 concurrent), max 50 items, rate limit 60/minute, no auth. These disclose behavioral traits beyond annotations without contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences front-loaded with main action. Each sentence provides value: inputs, outputs, internal behavior, limits, rate limit, auth, audience, and feedback mechanism. Slightly long but justified for the information density.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (two optional params, output schema exists), the description covers input formats, behavioral differences, limits, rate limit, auth, intended users, and fallback feedback. No gaps remain for an agent to select or invoke the tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with parameter descriptions already present. The description adds format guidance (e.g., '{name, ecosystem} dicts' vs 'SPDX identifiers') but does not significantly enhance meaning beyond schema. Baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it audits license compatibility of dependencies, specifies inputs (package names with ecosystem or SPDX IDs) and outputs (COMPATIBLE/CONFLICT verdict with conflicting pairs and recommended action). It distinguishes from sibling tools like security_fetch_licence_analysis by focusing on compatibility auditing of the entire dependency list.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description says it's for developers and compliance teams auditing open source license risk before shipping, and includes fallback feedback instructions. However, it does not explicitly state when not to use it or compare to similar siblings (e.g., security_fetch_licence_analysis), leaving usage context implied.
security_audit_sbom_continuous (A, Destructive)
Persistent SBOM watch. Register once, check anytime for new CVEs affecting your dependency snapshot. Silent permanent watch — CycloneDX and SPDX supported. Uses OSV.dev for vulnerability lookup, Redis for persistence with 90-day TTL. Supports CycloneDX 1.4/1.5 and SPDX 2.3 JSON. Input size limit: 500 KB. Returns go_no_go signal on register; new_findings on check. Rate limit: 10/minute. No auth required. For DevSecOps teams monitoring production dependency exposure. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_audit_sbom_continuous", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| sbom | Yes | CycloneDX or SPDX SBOM as JSON string. Required for register action. | |
| action | Yes | Action: register, check, or deregister the SBOM watch. Required. | |
| watch_id | Yes | Unique watch identifier for this SBOM watch. Required. | |
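The register, check, and deregister actions can be sketched as a single argument builder. Several things here are assumptions: `watch_args` is a hypothetical helper, "500 KB" is read as 500 × 1024 bytes, and `sbom` is sent only on `register`, following the parameter description even though the table marks it as required.

```python
import json

MAX_SBOM_BYTES = 500 * 1024  # assumption: "500 KB" means 512,000 bytes
ACTIONS = {"register", "check", "deregister"}


def watch_args(action, watch_id, sbom=None):
    """Hypothetical helper: build arguments for one SBOM watch action."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    args = {"action": action, "watch_id": watch_id}
    if action == "register":
        if sbom is None:
            raise ValueError("sbom is required for the register action")
        sbom_json = json.dumps(sbom)  # the tool expects a JSON string
        if len(sbom_json.encode("utf-8")) > MAX_SBOM_BYTES:
            raise ValueError("SBOM exceeds the 500 KB input limit")
        args["sbom"] = sbom_json
    return args


register = watch_args("register", "prod-api", {"bomFormat": "CycloneDX"})
check = watch_args("check", "prod-api")
print(check)
```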
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral context beyond annotations: persistence (Redis, 90-day TTL), input size limit, rate limit, no auth required, return signals per action, and underlying services (OSV.dev). Annotations already indicate destructiveness and open-world, which align; no contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy but well-structured: main idea first, followed by technical details and fallback instruction. It is front-loaded and each sentence adds useful information, though minor trimming could improve conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (3 params, multiple actions, output schema exists), the description covers purpose, constraints (size limit, rate limit), behavioral quirks, and even a fallback mechanism. It is complete for an agent to understand and invoke correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by clarifying conditional requirements (e.g., 'Required for register action' for sbom) and enumerating action options. This warrants a 4.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as a 'Persistent SBOM watch' that registers once and checks anytime for new CVEs. It distinguishes from one-shot vulnerability checks by emphasizing continuous monitoring. Specifics like supported formats (CycloneDX, SPDX), output signals, and rate limits further clarify.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly targets 'DevSecOps teams monitoring production dependency exposure' and outlines three actions (register, check, deregister). It provides a fallback instruction for when results don't serve the user. However, it does not directly differentiate from sibling tools like 'security_audit_sbom_vulnerabilities' which might be for one-shot checks.
security_audit_sbom_license_policy (A, Read-only, Idempotent)
Audit a CycloneDX or SPDX SBOM against an SPDX licence policy and return a PASS/WARN/BLOCK verdict. sbom: Full SBOM as a JSON string — CycloneDX or SPDX format. Required. 500 KB max. policy: Optional dict with block/warn/allow arrays of exact SPDX licence identifiers (e.g. GPL-3.0, MIT). Defaults to block GPL-3.0 and AGPL-3.0, warn LGPL-2.1/MPL-2.0/BSD-4-Clause, allow MIT/Apache-2.0/BSD-2-Clause/BSD-3-Clause. No glob patterns — exact SPDX IDs only. Unlisted licences default to WARN. Returns verdict (PASS/WARN/BLOCK), blocked_packages, warned_packages, and the policy applied. Use security_audit_sbom_vulnerabilities for CVE auditing instead. Sources: deps.dev (Google). 1-hour cache per package. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_audit_sbom_license_policy", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| sbom | Yes | CycloneDX or SPDX SBOM as JSON string. Required. 500 KB max. | |
| policy | No | Policy dict with block/warn/allow arrays of SPDX licence IDs. Optional. | |
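The policy rules in the description (exact SPDX IDs, a default block/warn/allow split, unlisted licences treated as WARN) can be illustrated with a local sketch. This is not the server's implementation: the `audit` function, the rule that `block` takes precedence, and the way the overall verdict is aggregated are assumptions, and the package names are made up.

```python
# Default policy as stated in the tool description.
DEFAULT_POLICY = {
    "block": ["GPL-3.0", "AGPL-3.0"],
    "warn": ["LGPL-2.1", "MPL-2.0", "BSD-4-Clause"],
    "allow": ["MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause"],
}


def audit(package_licences, policy=DEFAULT_POLICY):
    """Illustrative check of {package: SPDX ID} pairs against a policy."""
    blocked, warned = [], []
    for name, licence in package_licences.items():
        if licence in policy["block"]:  # exact match, no glob patterns
            blocked.append(name)
        elif licence not in policy["allow"]:
            # Covers both the warn list and unlisted licences.
            warned.append(name)
    # Assumption: the worst per-package result decides the verdict.
    verdict = "BLOCK" if blocked else "WARN" if warned else "PASS"
    return {"verdict": verdict,
            "blocked_packages": blocked,
            "warned_packages": warned}


result = audit({"left-pad": "MIT", "some-lib": "MPL-2.0", "odd-lib": "Unlicense"})
print(result)
```

Here `odd-lib` lands in `warned_packages` because `Unlicense` appears in none of the three lists.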
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnly, idempotent, non-destructive. Description adds substantial behavior: sources from deps.dev, 1-hour cache, no glob patterns, default policy details, return structure (verdict, blocked/warned packages, policy applied). No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured and informative, front-loading purpose. Slight redundancy in parameter descriptions (repeats schema info) but every sentence earns its place. Could be slightly tighter but still very effective.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given output schema exists and annotations provide safety profile, description completes the picture: input constraints, default policy, output fields, source attribution, caching, and a feedback mechanism. Comprehensive for a two-parameter tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds critical nuance: sbom must be CycloneDX or SPDX format, 500 KB max; policy structure explained with examples, exact SPDX IDs only, unlisted licenses default to WARN. Greatly enhances schema definitions.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool audits an SBOM against a license policy and returns a verdict. Specifically mentions the SBOM formats (CycloneDX, SPDX) and explicitly distinguishes from the sibling tool security_audit_sbom_vulnerabilities for CVE auditing.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use context: auditing against SPDX license policy. Names an alternative tool (security_audit_sbom_vulnerabilities) and includes a fallback to report_feedback. Could mention more siblings like security_audit_licence_compatibility but still strong guidance.
security_audit_sbom_vulnerabilities (A, Read-only, Idempotent)
Audit a Software Bill of Materials for known vulnerabilities across all listed packages. Read-only. No side effects. Idempotent. sbom_json: CycloneDX or SPDX SBOM as a JSON string. Required. Large SBOMs (100+ packages) may take up to 10 seconds. Returns CVEs grouped by package with severity and fixed versions. Use this when you have a full SBOM to audit. Use security_fetch_package_vulnerabilities instead when checking a single package version. Verified source: Google OSV.dev batch API. 1-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_audit_sbom_vulnerabilities", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| sbom_json | Yes | CycloneDX or SPDX SBOM as JSON string. Required. | |
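Since `sbom_json` must be a JSON string rather than an object, the SBOM has to be serialized before the call. The sketch below builds a minimal CycloneDX-style document with one component; the package and version are arbitrary examples, and whether the server accepts an SBOM this small is an assumption.

```python
import json

# Minimal CycloneDX-style document with a single library component.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {
            "type": "library",
            "name": "requests",
            "version": "2.31.0",
            "purl": "pkg:pypi/requests@2.31.0",
        },
    ],
}

# Serialize to a string, which is what the parameter table asks for.
args = {"sbom_json": json.dumps(sbom)}
print(type(args["sbom_json"]).__name__)
```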
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
States 'Read-only. No side effects. Idempotent.' which matches annotations. Adds performance note (up to 10 seconds for large SBOMs), return format, and caching details. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information is front-loaded and logically ordered. Could be slightly more concise, but each sentence adds value. Not overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, behavior, parameter, performance, caching, source, and usage guidance. Complete for a single-parameter tool with output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage 100% with description of sbom_json as CycloneDX/SPDX JSON string. Description adds performance constraint for large SBOMs, which is valuable beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Audit a Software Bill of Materials for known vulnerabilities across all listed packages.' Specifies verb (audit) and resource (SBOM). Distinguishes from sibling tool security_fetch_package_vulnerabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use this when you have a full SBOM to audit. Use security_fetch_package_vulnerabilities instead when checking a single package version.' Also provides a feedback fallback mechanism.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_detect_typosquatting (A, Read-only, Idempotent)
Detect typosquatting attacks against a package name. Compares using Damerau-Levenshtein distance ≤ 2 against top-10,000 packages. Returns similar_packages with anomaly scores, and a SUSPICIOUS or CLEAN verdict. Uses PyPI and npm download stats stored in Redis. Cold-start fetch on first call (≤ 30s). Rate limit: 60/minute. No auth required. For security engineers auditing supply-chain package names before inclusion. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_detect_typosquatting", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ecosystem | Yes | Package ecosystem: npm, pypi, cargo, go. Required. | |
| package_name | Yes | Package name e.g. requests. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
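To illustrate the distance check the description mentions, here is a sketch of the optimal-string-alignment variant of Damerau-Levenshtein distance, which counts insertions, deletions, substitutions, and adjacent transpositions. Which variant the server uses is not stated, so this is an assumption; `osa_distance`, `near_matches`, and the sample package list are hypothetical names for illustration.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal-string-alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "eu" <-> "ue".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def near_matches(name: str, popular: list, max_distance: int = 2) -> list:
    """Return popular names within max_distance of name, excluding exact matches."""
    return [p for p in popular if p != name and osa_distance(name, p) <= max_distance]


suspects = near_matches("reqeusts", ["requests", "numpy", "flask"])
```

A candidate name that sits within distance 2 of a popular package without being that package is the kind of input the tool would presumably flag for closer review.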
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses behavioral traits beyond annotations: cold-start fetch (≤30s), rate limit (60/minute), no auth required, and algorithm details. Annotations already indicate readOnly, openWorld, idempotent, non-destructive, and the description reinforces and adds context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense and well-structured, opening with the core purpose. Each sentence serves a purpose (algorithm, output, performance, rate limit, target audience, fallback). It is concise but could be slightly shorter by condensing the report_feedback instruction.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and the presence of an output schema, the description covers all necessary aspects: purpose, algorithm, output format, performance characteristics, rate limit, authentication, target audience, and fallback action. It is comprehensive for an AI agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters (package_name and ecosystem). The description adds minimal extra value, such as an example package name 'requests', but does not significantly enhance understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool detects typosquatting attacks against a package name, using Damerau-Levenshtein distance. It specifies the scope of comparison (top-10,000 packages) and outputs (similar_packages with scores, verdict). This distinguishes it from sibling security tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly targets security engineers auditing supply-chain package names. It provides guidance on when to use the tool and includes a fallback action (report_feedback) if the response is insufficient. However, it does not explicitly state when not to use it or compare alternatives directly.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cisa_kev (A, Read-only, Idempotent)
Check whether a CVE is in the CISA Known Exploited Vulnerabilities (KEV) catalog. Read-only. No side effects. Idempotent. cve_id: CVE identifier in format CVE-YYYY-NNNNN e.g. CVE-2021-44228. Required. Returns in_kev (bool), date_added, due_date, ransomware_use, and notes from the CISA KEV catalog. KEV status answers 'Is this being actively exploited?' — a critical triage question not available in NIST NVD. Verified source: CISA KEV catalog (updated daily, cached). Use security_fetch_cve_detail for full CVE severity. Use security_fetch_cve_epss for exploit probability. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_cisa_kev", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier e.g. CVE-2021-44228. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
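A small sketch of validating a cve_id on the client before calling any of the CVE tools. It assumes the standard CVE ID syntax (a four-digit year followed by a sequence number of four or more digits); whether the server applies the same check is not stated in the listing, and `normalize_cve_id` is a hypothetical helper name.

```python
import re

# Standard CVE ID syntax: "CVE-", a 4-digit year, "-", then 4 or more digits.
CVE_ID_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def normalize_cve_id(raw: str) -> str:
    """Trim and upper-case a CVE ID, raising ValueError if it is malformed."""
    candidate = raw.strip().upper()
    if not CVE_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a CVE identifier: {raw!r}")
    return candidate


cve_id = normalize_cve_id(" cve-2021-44228 ")
```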
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already convey safety (read-only, idempotent). Description adds 'Read-only. No side effects. Idempotent.' and source freshness details, contributing beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured and informative, but slightly lengthy. Each section adds value; could be slightly more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Completely covers purpose, usage, output fields, source, and fallback. No gaps given the tool's simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a description for cve_id. Description reinforces required format and example, slightly enhancing clarity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Check whether a CVE is in the CISA Known Exploited Vulnerabilities (KEV) catalog.' It specifies the resource and action, and differentiates from sibling tools like security_fetch_cve_detail and security_fetch_cve_epss.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool (KEV status is a critical triage question not answered by NVD) and when not to, with direct referrals to sibling tools. Also includes a fallback instruction to call report_feedback.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cve_detail (A, Read-only, Idempotent)
Fetch full detail for a specific CVE by ID. Read-only. No side effects. Idempotent. cve_id: CVE identifier in format CVE-YYYY-NNNNN e.g. CVE-2021-44228. Required. Returns description, CVSS base score, affected products, patch references, and publish date. Use this when you have a CVE ID and need complete detail beyond what a package scan returns. Use security_fetch_package_vulnerabilities instead when you want all CVEs for a package version. Verified source: NIST NVD. 1-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_cve_detail", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier e.g. CVE-2021-44228. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
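For orientation, this sketch shows the general shape of an MCP tool invocation, which is a JSON-RPC 2.0 request with method `tools/call`. It only builds the request body; session initialization, the Streamable HTTP transport, and any headers are left out, and `tools_call_request` is a hypothetical helper name.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


request_body = tools_call_request(
    "security_fetch_cve_detail",
    {"cve_id": "CVE-2021-44228"},
)
```

In practice an MCP client library would normally construct and send this request, so hand-building it is mainly useful for debugging or logging.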
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. Description adds 'Read-only. No side effects. Idempotent,' plus '1-hour cache' and 'Verified source: NIST NVD,' providing behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is front-loaded with purpose and is relatively concise. The inclusion of the report_feedback instruction adds a few extra sentences but serves a useful purpose for error handling.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a simple tool: explains what it does, when to use, what it returns, cache behavior, source, and error reporting. No gaps given the tool complexity and presence of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already covers the single parameter with example and required flag (100% coverage). Description repeats the format and example, adding minimal new meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Fetch full detail for a specific CVE by ID.' It uses a specific verb and resource, and distinguishes from sibling tool security_fetch_package_vulnerabilities by naming it explicitly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'when you have a CVE ID and need complete detail beyond what a package scan returns.' Also states when not to use and recommends the alternative sibling tool. Provides cache and feedback instructions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cve_epss (A, Read-only, Idempotent)
EPSS exploit probability score for a CVE — predicts likelihood of exploitation in the next 30 days.
cve_id: CVE identifier e.g. "CVE-2021-44228".
Returns: epss (float 0.0–1.0) and percentile (float 0.0–100.0). Thresholds: >0.7 patch immediately, 0.3–0.7 patch soon, <0.3 monitor. Use with security_fetch_cve_detail to prioritize patching — EPSS measures urgency, CVSS measures severity. Source: FIRST.org. 6-hour cache.
Example: fetch_cve_epss(cve_id="CVE-2021-44228")
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier e.g. CVE-2021-44228. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
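The thresholds in the description can be sketched as a small mapping from the returned epss value to an action. The function name `epss_action` and the returned strings are illustrative; the cut-offs (above 0.7, 0.3 to 0.7, below 0.3) are taken from the description.

```python
def epss_action(epss: float) -> str:
    """Map an EPSS probability (0.0 to 1.0) to the suggested action."""
    if not 0.0 <= epss <= 1.0:
        raise ValueError(f"EPSS must be between 0.0 and 1.0, got {epss}")
    if epss > 0.7:
        return "patch immediately"
    if epss >= 0.3:
        return "patch soon"
    return "monitor"
```

Because EPSS measures urgency and CVSS measures severity, a caller would probably combine this result with the CVSS score from security_fetch_cve_detail rather than act on it alone.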
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses return fields (epss, percentile), thresholds, source (FIRST.org), cache duration (6-hour). No annotation contradiction; annotations (readOnly, openWorld, idempotent) are consistent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with key info; uses brief sentences and bullet-like presentation. No wasted text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given single parameter, rich output schema, and annotations, the description fully covers the tool's behavior, return values, and use case.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% with format hint, but description adds an explicit example and context on the required parameter. Adds value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb-resource: 'EPSS exploit probability score for a CVE' with prediction horizon. Distinct from siblings like security_fetch_cve_detail which deals with CVSS severity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use the tool: alongside security_fetch_cve_detail to prioritize patching. Provides action thresholds (patch immediately, patch soon, monitor) and explains what EPSS measures versus CVSS.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cve_risk_summary (A, Read-only, Idempotent)
Instant CVE risk verdict. Combines CVSS severity, CISA KEV exploitation status, and EPSS probability in one parallel call. Returns CRITICAL_EXPLOIT, HIGH_RISK, MODERATE, LOW, or UNKNOWN verdict with patch availability from vendor advisories. UNKNOWN means all upstream sources were unreachable — not that risk is low. Rate limit: 60/minute. No auth required. For security engineers triaging vulnerabilities after fetch_cve_watch fires. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_cve_risk_summary", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier e.g. CVE-2021-44228. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
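A sketch of how a caller might branch on the verdict, with particular attention to the point that UNKNOWN means the upstream sources were unreachable rather than that risk is low. The verdict names come from the description; the step names returned here (`retry_later`, `escalate`, `schedule`) are hypothetical.

```python
VERDICTS = ("CRITICAL_EXPLOIT", "HIGH_RISK", "MODERATE", "LOW", "UNKNOWN")


def next_step(verdict: str) -> str:
    """Choose a follow-up step for a risk verdict (step names are illustrative)."""
    if verdict not in VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    if verdict == "UNKNOWN":
        # All upstream sources were unreachable: no information, not low risk.
        return "retry_later"
    if verdict in ("CRITICAL_EXPLOIT", "HIGH_RISK"):
        return "escalate"
    return "schedule"
```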
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that UNKNOWN means sources unreachable, rate limit 60/minute, no auth required. Annotations already indicate read-only and idempotent; description adds valuable nuance beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise but covers all essential aspects: purpose, behavior, usage context, error interpretation, and fallback. Slightly verbose due to full sentences, but each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description adequately covers verdicts, patch availability, error interpretation, rate limit, and use case. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear description for cve_id. The tool description adds no additional parameter semantics beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides an 'Instant CVE risk verdict' combining CVSS, CISA KEV, EPSS, and lists possible verdicts. It distinguishes from siblings by positioning itself as a composite tool used after fetch_cve_watch.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides usage context ('after fetch_cve_watch fires') and includes a fallback instruction to call report_feedback with specific parameters if the response does not serve the user. No contradictions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cve_watch (A, Destructive)
Persistent CVE watchlist. Create once, check anytime for new events since your last visit — patch releases, KEV listings, PoC publications, exploitation detected. Uses Redis for persistence, NVD + CISA KEV + EPSS for daily background refresh. Returns has_new_events, events (list), call_back_in="24h" on check. Rate limit: 60/minute. No auth required. For security engineers tracking CVE exposure over time. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_cve_watch", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | Action: create, check, or delete the watchlist. Required. | |
| cve_ids | Yes | List of CVE IDs to watch e.g. ['CVE-2021-44228']. Required for create. | |
| watch_id | Yes | Unique watch identifier to create, check, or delete. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
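A sketch of assembling arguments for the three actions. The table marks cve_ids as required while also saying it is required for create, so it is unclear whether it may be omitted for check and delete; this sketch always sends it and uses an empty list for those two actions, which is an assumption. `watch_arguments` is a hypothetical helper name.

```python
from typing import Optional, Sequence

ACTIONS = {"create", "check", "delete"}


def watch_arguments(
    action: str,
    watch_id: str,
    cve_ids: Optional[Sequence[str]] = None,
) -> dict:
    """Validate and assemble arguments for security_fetch_cve_watch."""
    if action not in ACTIONS:
        raise ValueError(f"action must be one of {sorted(ACTIONS)}")
    if not watch_id:
        raise ValueError("watch_id is required")
    if action == "create" and not cve_ids:
        raise ValueError("cve_ids is required for create")
    # Assumption: an empty list is acceptable for check and delete.
    return {"action": action, "watch_id": watch_id, "cve_ids": list(cve_ids or [])}


create_args = watch_arguments("create", "watch-1", ["CVE-2021-44228"])
check_args = watch_arguments("check", "watch-1")
```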
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description goes beyond annotations by detailing internal mechanisms (Redis persistence, daily refresh from NVD, CISA KEV, EPSS), rate limit (60/min), authentication requirement (none), and return fields. It accurately reflects the destructive nature via annotations and describes the stateful behavior of check returning new events since last visit.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively verbose, with several sentences providing background and a fallback instruction. While informative, it could be more concise. The structure is logical but not tightly written.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of a stateful watchlist tool, the description covers persistence, refresh, return values, and usage context. With an output schema and annotations present, the description provides sufficient additional context to use the tool effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds minimal parameter detail beyond what the schema provides, mostly restating that cve_ids are required for create and watch_id is always required. No additional format or constraints are explained.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool manages a persistent CVE watchlist with create, check, and delete actions. It specifies the purpose: track new CVE events over time, distinguishing it from other CVE lookup tools by emphasizing persistence and background refresh.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context for when to use the tool: for security engineers tracking CVE exposure over time. It includes an explicit fallback instruction to call report_feedback if the response does not serve the user's need, which is helpful. However, it does not explicitly mention when not to use or compare with sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_cve_watch_status (A, Read-only, Idempotent)
Check all specified CVE watches for new events since your last poll. Returns only watches with new events, making it efficient to run on a schedule. watch_ids: List of watch IDs to check — same IDs used when creating watches with security_fetch_cve_watch. Required. Uses a per-user cursor (last_polled timestamp) stored in Redis. First call returns events from the last 30 days. Subsequent calls return only events newer than the last poll. Sources: Redis (existing watch data written by security_fetch_cve_watch). No external API calls — instant response. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_cve_watch_status", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| watch_ids | Yes | List of watch IDs to check e.g. ['watch-1','watch-2']. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
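The cursor behavior described above (a 30-day window on the first call, then only events newer than the last poll) can be modeled in memory as follows. This is a sketch of the idea only, not the server's implementation: the real cursor lives in Redis, and the `PollCursor` class and the event dict shape with an `at` timestamp are hypothetical.

```python
from datetime import datetime, timedelta, timezone

FIRST_POLL_WINDOW = timedelta(days=30)


class PollCursor:
    """In-memory model of a last_polled cursor."""

    def __init__(self) -> None:
        self.last_polled = None  # no poll has happened yet

    def poll(self, events: list, now: datetime) -> list:
        """Return events newer than the cursor, then advance the cursor to now."""
        if self.last_polled is None:
            since = now - FIRST_POLL_WINDOW  # first call: look back 30 days
        else:
            since = self.last_polled
        fresh = [e for e in events if since < e["at"] <= now]
        self.last_polled = now
        return fresh


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    {"id": "old", "at": now - timedelta(days=45)},
    {"id": "recent", "at": now - timedelta(days=2)},
]
cursor = PollCursor()
first = cursor.poll(events, now)

events.append({"id": "new", "at": now + timedelta(hours=12)})
second = cursor.poll(events, now + timedelta(days=1))
```

One consequence of this design is that each poll advances the cursor, so an event is reported once per user and a scheduled job should persist whatever it receives.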
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the annotations (readOnlyHint, openWorldHint, idempotentHint, destructiveHint), the description adds rich behavioral details: uses per-user cursor stored in Redis, first call returns 30 days, subsequent calls return new events, no external API calls, instant response. Also describes the fallback mechanism.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main purpose and efficiency claim, then expands on parameter, behavior, sources, and fallback. Every sentence adds unique value without redundancy, achieving conciseness despite its length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (cursor-based polling, multiple sources, no external API), the description fully covers purpose, parameter, cursor behavior, source, performance, and fallback. Output schema exists to document return values, so no further explanation is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with a clear description of the single required parameter. The description adds value by linking watch_ids to the creation tool (security_fetch_cve_watch) and restating that the parameter is required, going beyond what the schema alone defines.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Check') and resource ('CVE watches'), and explains the key behavior (returns only watches with new events, efficient for scheduling). Clearly distinguishes from sibling tools like security_fetch_cve_watch which creates watches.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the cursor-based polling behavior and suggests scheduling. It includes a fallback instruction for report_feedback if the tool doesn't meet the need. However, it does not explicitly state when not to use this tool or compare with alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_dependency_graph (A, Read-only, Idempotent)
Fetch the full dependency tree for a package version including transitive dependencies. Read-only. No side effects. Idempotent. Hard 8-second timeout — large dependency trees may return partial results. package: Package name. Required. version: Exact version string e.g. 1.2.3. Required. ecosystem: One of PyPI, npm, Maven, Go, Cargo, NuGet, RubyGems. Required. Returns all direct and transitive dependencies with version constraints. Use this to understand full supply chain exposure. Use security_fetch_package_vulnerabilities instead when you only need CVEs for a single package. Verified source: deps.dev (Google). 1-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_dependency_graph", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| package | Yes | Package name e.g. requests. Required. | |
| version | Yes | Package version e.g. 2.28.0. Required. | |
| ecosystem | Yes | Package ecosystem: npm, pypi, cargo, go, maven, nuget. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
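A sketch of validating the three required arguments before a call. The description lists PyPI, npm, Maven, Go, Cargo, NuGet, and RubyGems, while the parameter table uses lower-case names and omits RubyGems; this sketch lower-cases the input and accepts the description's full list, both of which are assumptions about what the server accepts. `dependency_graph_arguments` is a hypothetical helper name.

```python
# Ecosystems named in the tool description, lower-cased as in the parameter table.
DESCRIPTION_ECOSYSTEMS = {"pypi", "npm", "maven", "go", "cargo", "nuget", "rubygems"}


def dependency_graph_arguments(package: str, version: str, ecosystem: str) -> dict:
    """Validate and assemble arguments for security_fetch_dependency_graph."""
    if not package or not version:
        raise ValueError("package and version are both required")
    eco = ecosystem.strip().lower()  # assumption: lower-case spelling is accepted
    if eco not in DESCRIPTION_ECOSYSTEMS:
        raise ValueError(f"unsupported ecosystem: {ecosystem!r}")
    return {"package": package, "version": version, "ecosystem": eco}


graph_args = dependency_graph_arguments("requests", "2.28.0", "PyPI")
```

Given the hard 8-second timeout, a caller should also be prepared for partial results on large trees rather than treating the returned list as complete.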
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds behavioral details beyond annotations: read-only, no side effects, idempotent, 8-second timeout with partial results, verified source (deps.dev), and 1-hour cache. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured and front-loaded with core purpose, but includes a lengthy feedback instruction that could be more concise. Still, it is mostly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description covers purpose, usage guidelines, constraints (timeout, partial results), source, caching, and alternative tool. Highly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and description restates parameter details with examples, but does not add significant meaning beyond the schema's own descriptions. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Fetch the full dependency tree for a package version including transitive dependencies.' It uses a specific verb and resource, and distinguishes from sibling tool security_fetch_package_vulnerabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when to use this tool ('understand full supply chain exposure') and when to use an alternative ('Use security_fetch_package_vulnerabilities instead when you only need CVEs for a single package'). Also notes the hard 8-second timeout and potential partial results.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_licence_analysis · A · Read-only · Idempotent
Understand any software licence in plain English. Returns obligations, permissions, limitations, risk level, and OSI/FSF status for any SPDX licence identifier. Static bundle covers top-50 common licences (no network call). Falls back to spdx.org API for rare identifiers. All risk levels assume proprietary/commercial use. Rate limit: 60/minute. No auth required. For security engineers and developers understanding what a licence allows before including a dependency. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_licence_analysis", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| spdx_id | Yes | SPDX licence identifier e.g. MIT, Apache-2.0, GPL-3.0. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
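As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` request body in JSON-RPC 2.0 form. The helper name `build_tool_call` is hypothetical, nothing is sent over the network, and the transport details (session headers, endpoint URL) are left out because the listing does not state them.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build an MCP JSON-RPC 2.0 `tools/call` request body (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# spdx_id is the tool's only parameter, per the table above.
request = build_tool_call("security_fetch_licence_analysis", {"spdx_id": "Apache-2.0"})
print(json.dumps(request, indent=2))
```

The same helper shape would apply to any other tool on this page; only `name` and `arguments` change.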
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and idempotentHint=true. The description adds additional behavioral context: rate limit (60/minute), no auth required, and the assumption that risk levels are for proprietary/commercial use. It does not disclose error behavior or exact fallback latency, but the added info is valuable and non-contradictory.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficient and well-structured. It starts with a clear purpose statement, followed by return types, behavioral notes (static bundle vs. API fallback), risk assumption, rate limit, auth, and intended audience, ending with a fallback instruction. Every sentence adds value; there is no redundancy or vague filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the tool has a simple input (one string parameter) and an output schema exists (so return values are documented elsewhere), the description covers all necessary aspects: purpose, behavior, fallback, rate limits, authentication, and intended use. It is complete for effective agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema provides a description for the single parameter spdx_id that is already clear ('SPDX licence identifier e.g. MIT, Apache-2.0, GPL-3.0. Required.'). With 100% schema description coverage, the description adds no further semantic detail. The baseline of 3 is appropriate because the schema already handles the meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Understand any software licence in plain English.' It specifies the returns (obligations, permissions, limitations, risk level, OSI/FSF status) and the input type (SPDX licence identifier). This distinguishes it from sibling tools like security_fetch_package_licence, which focus on fetching licence identifiers rather than providing a semantic analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states the target audience ('security engineers and developers') and the context ('before including a dependency'). It explains the behavior: static bundle for top-50 common licences, fallback to API for rare identifiers. It also provides a clear alternative action if the response is insufficient: call report_feedback with specific parameters. This is comprehensive guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_package_licence · A · Read-only · Idempotent
Fetch the SPDX licence identifier for an open source package version. Read-only. No side effects. Idempotent. package: Package name e.g. flask. Required. version: Exact version string e.g. 2.3.0. Required. ecosystem: One of PyPI, npm, Maven, Go, Cargo, NuGet, RubyGems. Required. Returns the SPDX licence identifier e.g. MIT, Apache-2.0, GPL-3.0. Use this to verify licence compatibility before including a dependency. Use security_fetch_package_vulnerabilities instead when checking for security issues not licences. Verified source: deps.dev (Google). 1-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_package_licence", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| package | Yes | Package name e.g. requests. Required. | |
| version | Yes | Package version e.g. 2.28.0. Required. | |
| ecosystem | Yes | Package ecosystem: npm, pypi, cargo, go, maven, nuget. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
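The description above ends with a `report_feedback` instruction. A minimal sketch of assembling those arguments follows; the field names come from the description, while the helper `build_feedback_arguments` and the sample strings are assumptions for illustration only.

```python
def build_feedback_arguments(tool_id: str, intended_query: str, gap_description: str) -> dict:
    """Assemble report_feedback arguments for an agent_gap report (hypothetical helper)."""
    fields = {
        "tool_id": tool_id,
        "intended_query": intended_query,
        "gap_description": gap_description,
    }
    # Reject blank values so an empty report is never filed.
    for label, value in fields.items():
        if not value.strip():
            raise ValueError(f"{label} must not be empty")
    return {"feedback_type": "agent_gap", **fields}


feedback = build_feedback_arguments(
    tool_id="security_fetch_package_licence",
    intended_query="licence for flask 2.3.0 on PyPI",       # illustrative text
    gap_description="no licence identifier was returned",   # illustrative text
)
print(feedback)
```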
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description repeats the annotations (read-only, no side effects, idempotent) but adds useful behavioral details beyond annotations: the 1-hour cache and verified source (deps.dev). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with key information upfront. Every sentence adds value, although the feedback instruction is somewhat lengthy, if standard. It avoids unnecessary fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is comprehensive: it covers purpose, parameters, return value, usage context, alternatives, data provenance, caching, and a feedback mechanism. For a simple tool, it leaves no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with basic examples, but the description adds value by providing more detailed examples (flask, 2.3.0) and a complete list of ecosystems including RubyGems, which was missing from the schema. This enhances understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches the SPDX licence identifier for a specific package version. It distinguishes itself from the sibling tool security_fetch_package_vulnerabilities by specifying its use for licence compatibility, not security issues.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool (verify licence compatibility before including a dependency) and when to use an alternative (security_fetch_package_vulnerabilities for security issues). It also provides context about the data source and caching.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_package_maintainer_history · A · Read-only · Idempotent
Analyse ownership and release history for an npm or PyPI package to detect supply-chain risk. Uses PyPI JSON API and npm registry — data refreshed on each call, 1-hour cache. Returns maintainer_count, recent_changes, ownership_transfers, account_ages, anomaly_score (0.0–1.0), and maintainer_health (healthy | stale | abandoned | suspicious). Rate limit: 60/minute. No auth required. For security engineers auditing open-source dependencies before inclusion in production builds. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_package_maintainer_history", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| ecosystem | Yes | Package ecosystem: npm, pypi, cargo, go. Required. | |
| package_name | Yes | Package name e.g. requests. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
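The description names two bounded output fields: `anomaly_score` (0.0 to 1.0) and `maintainer_health` (one of four values). A client could sanity-check a response against those stated bounds before acting on it. This is only a sketch: the flat response shape and the sample dictionaries are assumptions, since the listing shows no output schema.

```python
HEALTH_VALUES = {"healthy", "stale", "abandoned", "suspicious"}


def check_maintainer_report(report: dict) -> list[str]:
    """Return a list of problems found in a (hypothetically flat) response dict."""
    problems = []
    score = report.get("anomaly_score")
    is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
    if not is_number or not 0.0 <= score <= 1.0:
        problems.append(f"anomaly_score out of range or missing: {score!r}")
    health = report.get("maintainer_health")
    if health not in HEALTH_VALUES:
        problems.append(f"unexpected maintainer_health: {health!r}")
    return problems


# Hypothetical sample responses, not real tool output.
ok_report = {"maintainer_count": 2, "anomaly_score": 0.15, "maintainer_health": "healthy"}
bad_report = {"anomaly_score": 1.7, "maintainer_health": "unknown"}
print(check_maintainer_report(ok_report))
print(check_maintainer_report(bad_report))
```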
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description adds significant behavioral information beyond annotations: data source (PyPI JSON API and npm registry), caching (1-hour), rate limit (60/minute), no auth required, and output fields including anomaly_score and maintainer_health enum values. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with purpose, then provides data source, caching, rate limit, target audience, and feedback fallback. No wasted sentences; each adds value. Slightly wordy but efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers most aspects: purpose, data sources, output fields, rate limits, and a fallback mechanism. Could include more detail on interpreting anomaly_score or limitations, but given the output schema exists, it is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. Description provides an example for package_name ('e.g. requests') but adds minimal semantic value beyond schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'analyse', specific resource 'ownership and release history for npm or PyPI package', and explicit purpose 'detect supply-chain risk'. Distinguishes from sibling security tools by focusing on maintainer history.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states target users (security engineers) and use case (auditing dependencies before production). Includes fallback to report_feedback if response is inadequate. Lacks explicit comparison to sibling tools but provides good context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_package_risk_brief · A · Read-only · Idempotent
Single SHIP/CAUTION/BLOCK verdict for any package. Combines CVEs, licence, maintainer health, and transitive count in one call. Uses OSV.dev, deps.dev, PyPI, and npm registry — data refreshed on each call. Returns verdict (SHIP/CAUTION/BLOCK), critical_cve_count, high_cve_count, licence_risk, maintainer_health, transitive_count, resolved_version, upstream_status, and reasoning. Rate limit: 30/minute. No auth required. For security engineers performing pre-inclusion package review. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_package_risk_brief", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| version | No | Package version e.g. 2.28.0. Required. | |
| ecosystem | Yes | Package ecosystem: npm, pypi, cargo, go, maven. Required. | |
| package_name | Yes | Package name e.g. requests. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
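One way a pipeline might consume the SHIP/CAUTION/BLOCK verdict is to map it to a process exit code. The policy below (CAUTION passes unless a strict flag is set, unrecognised verdicts return 2) is an assumption made for this sketch, not something the tool prescribes, and `gate_exit_code` is a hypothetical name.

```python
def gate_exit_code(brief: dict, fail_on_caution: bool = False) -> int:
    """Map a risk brief's verdict to an exit code under an assumed policy."""
    verdict = brief.get("verdict")
    if verdict == "SHIP":
        return 0
    if verdict == "CAUTION":
        return 1 if fail_on_caution else 0
    if verdict == "BLOCK":
        return 1
    # Anything else is treated as an error rather than a pass.
    return 2


# Hypothetical response fragments, not real tool output.
print(gate_exit_code({"verdict": "SHIP"}))
print(gate_exit_code({"verdict": "CAUTION"}, fail_on_caution=True))
print(gate_exit_code({"verdict": "BLOCK"}))
```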
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description goes beyond annotations by stating the data sources (OSV.dev, deps.dev, etc.), that data is refreshed on each call, a rate limit of 30/minute, and that no auth is required. Annotations already indicate readOnly, idempotent, and destructive false, so the description adds valuable operational context without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured: purpose first, then data sources, output fields, operational info, use case, and fallback. It is informative but slightly lengthy; could be more concise by omitting some repetitive details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (implied), the description adequately covers behavior, rate limits, auth, and use case. Missing error handling or default version behavior, but overall sufficient for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for all three parameters. The tool description does not add any parameter details beyond what the schema provides. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a single SHIP/CAUTION/BLOCK verdict for a package, combining CVEs, licence, maintainer health, and transitive count. It distinguishes itself from sibling tools by emphasizing that it is a consolidated risk assessment, in contrast to more granular tools like security_fetch_package_vulnerabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description specifies the target audience ('security engineers performing pre-inclusion package review') and provides a feedback fallback if the tool fails to serve the need. However, it does not explicitly mention alternative sibling tools or when to prefer this over them.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
security_fetch_package_vulnerabilities · A · Read-only · Idempotent
Fetch all known CVEs for an open source package version or a batch of packages. Read-only. No side effects. Idempotent. Single-package mode: package (e.g. requests), version (e.g. 2.28.0), ecosystem (PyPI/npm/Maven/Go/Cargo/NuGet/RubyGems). Batch mode: packages array of {name, version, ecosystem} objects — max 50 per call. If packages array is provided and non-empty, batch mode is used and package/version/ecosystem are ignored. Batch returns {results: [...], partial: bool, failed_count: int}. Each result has vuln_count and vulnerabilities list. Returns CVE ID, severity, CVSS score, affected range, and fixed version. Use security_fetch_cve_detail for full detail by CVE ID. Use security_audit_sbom_vulnerabilities for SBOM files. Verified source: Google OSV.dev. 1-hour cache. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="security_fetch_package_vulnerabilities", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| package | No | Package name e.g. requests. Required in single-package mode. | |
| version | No | Package version e.g. 2.28.0. Required in single-package mode. | |
| packages | No | Batch list of {name, version, ecosystem} objects. Max 50. | |
| ecosystem | No | Package ecosystem: npm, pypi, cargo, go, maven, nuget. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
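Because batch mode accepts at most 50 packages per call, a longer dependency list has to be split across several calls. The sketch below shows one way to do that; `batch_arguments` is a hypothetical helper, and the generated package entries are placeholder data that only follow the `{name, version, ecosystem}` shape from the description.

```python
MAX_BATCH = 50  # per-call limit stated in the tool description


def batch_arguments(packages: list[dict]) -> list[dict]:
    """Split packages into argument dicts of at most MAX_BATCH entries each."""
    return [
        {"packages": packages[start:start + MAX_BATCH]}
        for start in range(0, len(packages), MAX_BATCH)
    ]


# Placeholder dependency list: 120 made-up entries.
dependencies = [
    {"name": f"example-pkg-{i}", "version": "1.0.0", "ecosystem": "pypi"}
    for i in range(120)
]
calls = batch_arguments(dependencies)
print([len(call["packages"]) for call in calls])
```

Each resulting dict omits `package`, `version`, and `ecosystem` at the top level, since the description says those are ignored whenever a non-empty `packages` array is present.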
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnly, idempotent, non-destructive. Description reinforces these and adds source (OSV.dev), cache (1-hour), and feedback instruction. This adds value beyond annotations but is not critical new information.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is fairly concise, front-loaded with purpose. Every sentence adds value, though some redundancy with annotations (e.g., 'Read-only. No side effects. Idempotent.'). Could be slightly tighter but still efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, rich annotations, output schema hint, and sibling context, the description covers modes, constraints, fallback behavior, and even provides a feedback mechanism for gaps. Highly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover all 4 parameters. Description adds meaning by explaining mode selection logic (single vs batch), which parameters are ignored in batch mode, and the structure of batch results. This extra context improves understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches CVEs for packages, specifies single vs batch mode, and distinguishes from siblings like security_fetch_cve_detail and security_audit_sbom_vulnerabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use (fetch CVEs for package version or batch) and when not to (use security_fetch_cve_detail for CVE details, security_audit_sbom_vulnerabilities for SBOM files). Also mentions the maximum batch size and that batch mode takes precedence when a non-empty packages array is supplied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
validate_tool_output · A · Read-only · Idempotent
Validate a DataNexus tool response for data quality issues using two-layer validation: deterministic rules first, then AI review for ambiguous cases. Read-only. Never blocks. tool_id: DataNexus tool identifier e.g. T04, T10, T22. Required. Find in the tool_id field of any response. query_hash: Hash from the response you are validating. Required. Enables feedback correlation. response_json: Full tool response serialised as a JSON string. Required. Returns pass or issues_found, with issues from each layer and whether feedback was auto-filed. Both layers must agree before feedback is filed. Use validate_tool_output to check data quality. Use report_feedback instead to manually report an issue you have already identified. If this tool's response does not serve the user's need, call report_feedback with feedback_type="agent_gap", tool_id="validate_tool_output", intended_query="{what the user needed}", gap_description="{what was missing or wrong in the result}".
| Name | Required | Description | Default |
|---|---|---|---|
| tool_id | Yes | DataNexus tool identifier, e.g. T04, T10, T22 — found in the tool_id field of any response. Required. | |
| query_hash | Yes | Hash from the response being validated — found in the query_hash field of any response. Enables feedback correlation. Required. | |
| response_json | Yes | The full tool response, serialised as a JSON string, to validate for data quality issues. Required. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| No output parameters | | |
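The three parameters are all derived from a single earlier response: `tool_id` and `query_hash` are read from its fields, and `response_json` is the whole response serialised to a string. A minimal sketch of that derivation follows; the sample response, its `data` field, and the helper name are assumptions for illustration.

```python
import json


def build_validation_arguments(response: dict) -> dict:
    """Derive validate_tool_output arguments from a prior tool response."""
    return {
        "tool_id": response["tool_id"],
        "query_hash": response["query_hash"],
        # The tool expects a JSON string here, not a nested object.
        "response_json": json.dumps(response),
    }


# Hypothetical prior response; only tool_id and query_hash are named in the listing.
prior_response = {"tool_id": "T10", "query_hash": "abc123", "data": {"licence": "MIT"}}
arguments = build_validation_arguments(prior_response)
print(arguments["response_json"])
```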
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare read-only, idempotent, non-destructive. Description adds 'Never blocks' and explains the two-layer validation and feedback filing condition, providing extra behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with purpose, then parameters, then usage guidance. Every sentence adds value, no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 required params, 100% schema coverage, and an output schema (not shown), the description covers behavior, parameters, usage, and conditions. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage, but description adds practical context for each parameter: sources of tool_id and query_hash, purpose of feedback correlation, and requirement for JSON serialization.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool validates DataNexus tool responses for data quality, using two-layer validation. Distinguishes from sibling report_feedback.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use validate_tool_output vs report_feedback, and provides fallback instructions for when the tool's response is insufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
    {
      "$schema": "https://glama.ai/mcp/schemas/connector.json",
      "maintainers": [{ "email": "your-email@example.com" }]
    }
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
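For server owners who generate static files from a script, the claim file could be written as sketched below. The function name `write_glama_claim` and the use of a temporary directory as the web root are assumptions for the demo; in practice the target would be whatever directory your server exposes at the domain root.

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(web_root: Path, email: str) -> Path:
    """Write .well-known/glama.json under web_root and return its path."""
    target = web_root / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    claim = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(claim, indent=2) + "\n", encoding="utf-8")
    return target


# Demo against a throwaway directory standing in for the real web root.
with tempfile.TemporaryDirectory() as tmp:
    path = write_glama_claim(Path(tmp), "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    relative = path.relative_to(tmp).as_posix()
print(relative)
```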
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.