dns
Server Details
DNS & email security scanner — 51 tools for SPF, DMARC, DKIM, DNSSEC, SSL, and more.
- Status: Healthy
- Transport: Streamable HTTP
- Repository: MadaBurns/bv-mcp
- GitHub Stars: 5
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 3.8/5 across 51 of 51 tools scored. Lowest: 3.2/5.
Most tools have distinct purposes focused on specific DNS, email security, or compliance checks, with clear boundaries like check_dmarc for DMARC validation versus check_spf for SPF. However, some overlap exists, such as check_dane and check_dane_https both handling DANE verification, which could cause minor confusion.
Tool names follow a highly consistent verb_noun pattern throughout, using snake_case uniformly. Examples include check_dmarc, generate_dkim_config, and simulate_attack_paths, making the set predictable and easy to parse.
With 51 tools, the count is high for a DNS security server and risks overwhelming agents with redundant options. While the domain is broad, many tools could be consolidated (e.g., the multiple check_* tools that perform similar audits), which makes the set feel heavy and loosely scoped.
The tool set provides comprehensive coverage for DNS and email security, including scanning, validation, generation, remediation, and compliance mapping. It supports end-to-end workflows (e.g., scan, analyze, generate, validate) with no obvious gaps, so agents can handle complete tasks.
Available Tools
51 tools

analyze_drift (Grade A, read-only, idempotent)
Compare current security posture against a previous baseline. Shows what improved, regressed, or changed.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to analyze drift for | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| baseline | Yes | Previous ScanScore JSON or "cached" to use last cached scan | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, open-world, idempotent, and non-destructive behavior. The description adds context by specifying that it 'shows what improved, regressed, or changed,' which clarifies the output's comparative nature, though it doesn't detail rate limits or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded and concise with two sentences that efficiently convey the tool's purpose and output, with no wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity, rich annotations, and no output schema, the description is mostly complete but could benefit from more detail on output format or behavioral constraints. It adequately covers the core functionality without being exhaustive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in the schema. The description does not add meaning beyond the schema, as it doesn't explain parameter interactions or usage nuances, meeting the baseline score for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('compare', 'shows') and resources ('current security posture', 'previous baseline'), and distinguishes it from siblings like 'compare_baseline' by focusing on drift analysis of improvements, regressions, and changes rather than general comparison.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for analyzing security drift against a baseline, but does not explicitly state when to use this tool versus alternatives like 'compare_baseline' or 'scan_domain', nor does it provide exclusions or prerequisites for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
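The drift comparison analyze_drift describes can be sketched as a diff of two score maps, bucketing each check as improved, regressed, or unchanged. The field layout below is hypothetical; the real ScanScore JSON is not documented here.

```python
def diff_scores(baseline: dict, current: dict) -> dict:
    """Bucket each check by how its score moved between two scans."""
    drift = {"improved": [], "regressed": [], "unchanged": []}
    for check, old in baseline.items():
        new = current.get(check, old)  # missing checks count as unchanged
        if new > old:
            drift["improved"].append(check)
        elif new < old:
            drift["regressed"].append(check)
        else:
            drift["unchanged"].append(check)
    return drift
```

An agent could feed the cached baseline and a fresh scan through a function like this to answer "what regressed since last week?" in one pass.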
assess_spoofability (Grade B, read-only, idempotent)
Composite email spoofability score (0-100).
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide key behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true), so the description doesn't need to repeat these. It adds value by specifying the output as a 'composite score (0-100),' which gives context on what the tool returns. However, it lacks details on rate limits, authentication needs, or how the score is computed, leaving some behavioral aspects unclear.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—a single sentence that directly states the tool's output. It's front-loaded with the core purpose and wastes no words, making it efficient and easy to parse for an AI agent. Every part of the sentence earns its place by conveying essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (assessing spoofability) and lack of an output schema, the description is somewhat incomplete. It mentions the score range but doesn't explain what the score means, how it's derived, or what factors contribute to it. With annotations covering safety and idempotency, it's minimally adequate, but more context on the output would enhance completeness for a security assessment tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters ('domain' and 'format'). The description doesn't add any semantic details beyond the schema, such as explaining what 'composite' entails or how 'format' affects the score output. Since the schema does the heavy lifting, a baseline score of 3 is appropriate, as the description doesn't compensate with extra insights.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: to compute a 'composite email spoofability score (0-100).' It specifies the verb ('assess') and resource ('email spoofability'), making it clear what the tool does. However, it doesn't differentiate from sibling tools like 'check_dmarc' or 'check_spf,' which might also relate to email security, so it doesn't fully distinguish from alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools related to email and domain security (e.g., 'check_dmarc,' 'check_spf'), there's no indication of context, prerequisites, or exclusions. This leaves the agent without direction on appropriate usage scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
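A composite 0-100 spoofability score is typically a weighted sum of individual control failures, capped at 100. The weights and finding names below are illustrative only; the server does not publish its actual formula.

```python
# Hypothetical weights: higher score = easier to spoof this domain.
WEIGHTS = {
    "spf_missing": 30,
    "dmarc_missing": 40,
    "dmarc_p_none": 25,   # DMARC present but policy is p=none
    "dkim_missing": 30,
}

def spoofability(findings: set[str]) -> int:
    """Sum the weights of observed findings, capped at 100."""
    raw = sum(WEIGHTS.get(f, 0) for f in findings)
    return min(raw, 100)
```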
batch_scan (Grade A, read-only, idempotent)
Scan up to 10 domains at once. Returns score, grade, and finding counts per domain.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output verbosity. Auto-detected if omitted. | |
| domains | Yes | Domains to scan (max 10 per request) | |
| force_refresh | No | Bypass cache and run fresh scans. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds valuable behavioral context about the 10-domain limit and the specific return format (score, grade, finding counts), which goes beyond what annotations convey.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences that are perfectly front-loaded with essential information. The first sentence covers purpose and constraints, the second covers output. Zero wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (readOnly, idempotent, openWorld) and full schema coverage, the description provides good context about batch capability and return format. However, without an output schema, it could benefit from more detail about the structure of returned scores/grades/finding counts.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents all three parameters. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline expectation without adding extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('scan'), resource ('domains'), scope ('up to 10 at once'), and output ('score, grade, and finding counts per domain'). It distinguishes from sibling 'scan_domain' by specifying batch capability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for batch scanning of domains, but doesn't explicitly state when to use this versus 'scan_domain' or other scanning tools. It provides clear context about batch processing but lacks explicit alternatives or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
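The 10-domain cap the description states is a client-observable contract, and a caller can enforce it before issuing the request. The sketch below assumes a stand-in scan_one callable rather than the server's real endpoint.

```python
MAX_DOMAINS = 10  # limit stated in the batch_scan description

def batch_scan(domains: list[str], scan_one) -> dict:
    """Validate the batch size, then collect one result per domain."""
    if not domains:
        raise ValueError("at least one domain is required")
    if len(domains) > MAX_DOMAINS:
        raise ValueError(f"max {MAX_DOMAINS} domains per request, got {len(domains)}")
    return {d: scan_one(d) for d in domains}
```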
check_bimi (Grade A, read-only, idempotent)
Validate BIMI record and VMC evidence.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide excellent behavioral context (readOnlyHint: true, openWorldHint: true, idempotentHint: true, destructiveHint: false). The description adds value by specifying what exactly gets validated (BIMI record and VMC evidence), which isn't covered by annotations. No contradictions exist between the description and annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise at just 6 words, front-loading the essential action ('validate') and targets. Every word earns its place with zero wasted verbiage, making it immediately scannable and understandable.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (which cover safety, idempotence, and open-world behavior) and complete parameter documentation, the description provides adequate context. However, without an output schema, the description doesn't explain what validation results look like or what 'VMC evidence' entails, leaving some gaps for a validation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents both parameters. The description doesn't add any meaningful parameter semantics beyond what's in the schema (domain to check, output format options). This meets the baseline expectation when schema coverage is complete.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('validate') and target resources ('BIMI record and VMC evidence'), making it immediately understandable. However, it doesn't explicitly differentiate from sibling tools like 'check_dmarc' or 'check_spf' that also perform domain validation checks, which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools performing various domain checks (e.g., 'check_dmarc', 'check_spf', 'scan_domain'), there's no indication of when BIMI validation is specifically needed or how it differs from other validation tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
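For context on what a BIMI check involves: the assertion record is a TXT record published at `<selector>._bimi.<domain>` (selector `default` unless overridden), with tag=value pairs such as `v=BIMI1`, a logo URL (`l=`), and an optional VMC evidence URL (`a=`). A minimal sketch of the name construction and record parsing:

```python
def bimi_query_name(domain: str, selector: str = "default") -> str:
    # BIMI assertion records live at <selector>._bimi.<domain>
    return f"{selector}._bimi.{domain}"

def parse_bimi(record: str) -> dict:
    """Split a 'v=BIMI1; l=...; a=...' TXT record into tag/value pairs."""
    tags = {}
    for part in record.split(";"):
        part = part.strip()
        if "=" in part:
            k, v = part.split("=", 1)
            tags[k.strip()] = v.strip()
    return tags
```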
check_caa (Grade A, read-only, idempotent)
Look up CAA records for a domain. Shows which Certificate Authorities are authorized to issue certificates.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds useful context about what the tool returns ('Shows which Certificate Authorities are authorized to issue certificates'), which isn't covered by annotations. However, it doesn't mention rate limits, authentication needs, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero wasted words. The first sentence states the core purpose, and the second explains the output. It's front-loaded and efficiently structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only query tool with good annotations and full schema coverage, the description provides adequate context. It explains what the tool does and what information it returns. However, without an output schema, it could benefit from more detail about the return format (e.g., structured data vs plain text).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for both parameters. The description doesn't add any parameter-specific details beyond what's in the schema. The baseline score of 3 is appropriate since the schema fully documents the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Look up CAA records') and resource ('for a domain'), with explicit mention of the output ('Shows which Certificate Authorities are authorized to issue certificates'). It distinguishes itself from sibling tools like check_dnssec or check_spf by focusing specifically on CAA records.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for domain security checks but doesn't explicitly state when to use this tool versus alternatives like check_ssl or check_dmarc. It provides basic context (checking CAA records) but lacks guidance on exclusions or specific scenarios where this tool is preferred over other DNS/security checks.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
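A CAA record's rdata has three fields: flags, a tag (`issue`, `issuewild`, or `iodef`), and a quoted value. Deriving "which CAs may issue" from a record set reduces to collecting the `issue` values, as this sketch shows:

```python
def parse_caa(rdata: str) -> dict:
    """Parse CAA rdata of the form: <flags> <tag> "<value>"."""
    flags, tag, value = rdata.split(" ", 2)
    return {"flags": int(flags), "tag": tag, "value": value.strip('"')}

def authorized_cas(records: list[str]) -> set[str]:
    """CAs allowed to issue ordinary certificates for the domain."""
    return {r["value"] for r in map(parse_caa, records) if r["tag"] == "issue"}
```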
check_dane (Grade A, read-only, idempotent)
Verify DANE/TLSA certificate pinning.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description doesn't contradict these annotations. It adds context about what gets verified (certificate pinning), though it could mention more about network behavior, rate limits, or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core action and resource, making it easy to parse quickly. Every word earns its place in conveying essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the annotations cover safety and idempotency, and the schema fully documents parameters, the description provides adequate context for a read-only verification tool. However, without an output schema, it could benefit from mentioning what the verification result includes (e.g., success/failure, details). The complexity is moderate, and the description is complete enough for basic use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (domain and format). The description doesn't add any additional parameter semantics beyond what the schema provides, such as explaining domain validation rules or format implications. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('verify') and resource ('DANE/TLSA certificate pinning'), distinguishing it from sibling tools like check_dane_https, check_ssl, or check_tlsrpt which focus on different aspects of domain security. The verb 'verify' is precise and the target 'DANE/TLSA certificate pinning' is a well-defined technical concept.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for DANE/TLSA verification but doesn't explicitly state when to use this tool versus alternatives like check_dane_https or check_ssl. No guidance is provided about prerequisites, dependencies, or scenarios where this tool is preferred over others in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_dane_https (Grade A, read-only, idempotent)
Verify DANE certificate pinning for HTTPS via TLSA records at _443._tcp.{domain}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true, covering safety and idempotency. The description adds valuable context about what gets verified (DANE certificate pinning for HTTPS via specific TLSA records), which isn't captured in annotations. No contradictions exist between description and annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, dense sentence with zero wasted words. It front-loads the core purpose ('Verify DANE certificate pinning for HTTPS') and efficiently specifies the mechanism ('via TLSA records at _443._tcp.{domain}'). Every element earns its place by providing essential technical context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only, idempotent diagnostic tool with full schema coverage and no output schema, the description is mostly complete. It clearly states what the tool does and how it operates. However, it could benefit from mentioning typical use cases or output characteristics (e.g., what 'verified' means in practice) to fully guide the agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('domain' and 'format') well-documented in the schema. The description doesn't add any parameter-specific details beyond what's in the schema, such as explaining TLSA record formats or domain validation rules. Baseline 3 is appropriate given the comprehensive schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Verify DANE certificate pinning for HTTPS') and resource ('via TLSA records at _443._tcp.{domain}'), distinguishing it from sibling tools like 'check_dane' (general DANE check) and 'check_ssl' (general SSL check). It precisely defines the scope as HTTPS-specific DANE verification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implicitly provides usage context by specifying the protocol (HTTPS) and record type (TLSA at _443._tcp), which helps differentiate it from other DNS security tools. However, it lacks explicit guidance on when to use this tool versus alternatives like 'check_dane' or 'check_ssl', or any prerequisites for effective use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
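The `_443._tcp.{domain}` name in the description follows the TLSA naming scheme from RFC 6698: `_<port>._<protocol>.<domain>`. The same construction covers the sibling check_dane tool for SMTP (port 25):

```python
def tlsa_name(domain: str, port: int = 443, proto: str = "tcp") -> str:
    """Build the DNS name where TLSA records are published (RFC 6698)."""
    return f"_{port}._{proto}.{domain}"
```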
check_dbl (Grade A, read-only, idempotent)
Check domain reputation against DNS-based Domain Block Lists (Spamhaus DBL, URIBL, SURBL). Returns listing status with decoded return codes.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true, covering safety and idempotency. The description adds valuable context by specifying the blocklists used (Spamhaus DBL, URIBL, SURBL) and that it 'Returns listing status with decoded return codes,' which clarifies the output format beyond what annotations offer.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the tool's purpose, method, and output. Every word earns its place, with no redundant or vague phrasing, making it easy to parse and understand quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (domain reputation check), rich annotations (covering safety and behavior), and no output schema, the description is mostly complete. It specifies the blocklists used and output format, but could benefit from mentioning rate limits or error handling. However, it provides sufficient context for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters ('domain' and 'format'). The description does not add any additional semantic details about parameters beyond what the schema provides, such as explaining domain format requirements or format implications. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Check domain reputation'), identifies the resource ('domain'), and specifies the method ('against DNS-based Domain Block Lists (Spamhaus DBL, URIBL, SURBL)'). It distinguishes itself from sibling tools like 'check_rbl' by focusing on domain-specific blocklists rather than general reputation or IP-based lists.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for domain reputation checking against specific blocklists, but does not explicitly state when to use this tool versus alternatives like 'check_rbl' (which might check IP-based lists) or 'scan_domain' (which performs broader scans). No exclusions or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_dkim (Grade A, read-only, idempotent)
Look up DKIM records for a domain. Probes common selectors and validates key strength and algorithm.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| selector | No | DKIM selector. Omit to probe common ones. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable behavioral context beyond annotations: it reveals the tool probes multiple selectors automatically and performs validation (key strength, algorithm). This provides operational insight not captured in structured fields.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with zero waste: first states core purpose, second adds important behavioral details (probing, validation). Every word earns its place, and information is front-loaded appropriately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only diagnostic tool with good annotations and full schema coverage, the description provides adequate context about what the tool does and how it behaves. The main gap is no output schema, so return format isn't described, but the description compensates somewhat by hinting at validation results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all parameters. The description doesn't add any parameter-specific semantics beyond what's in the schema (e.g., it doesn't clarify which 'common selectors' are probed or what 'validates' covers). A baseline of 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('look up DKIM records'), target resource ('for a domain'), and scope ('probes common selectors and validates key strength and algorithm'). It distinguishes from siblings like 'generate_dkim_config' (creation) and 'check_dmarc' (different protocol).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context through 'probes common selectors' and 'validates key strength and algorithm', suggesting this is for diagnostic/validation purposes. However, it doesn't explicitly state when to use this vs. alternatives like 'check_dnssec' or 'validate_fix', nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_dmarc (A) · Read-only · Idempotent · Inspect
Look up and validate DMARC record for a domain. Shows policy enforcement, alignment mode, and reporting config.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
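A DMARC record is a semicolon-separated tag list, so the policy, alignment, and reporting fields the description mentions can be extracted with a small parser. A hedged sketch (Python; the tag defaults follow RFC 7489, but the output shape is illustrative, not this tool's actual return format):

```python
def parse_dmarc(txt: str) -> dict:
    """Extract policy, alignment, and reporting fields from a DMARC TXT record."""
    tags = dict(
        part.strip().split("=", 1)
        for part in txt.split(";")
        if "=" in part
    )
    return {
        "policy": tags.get("p", "none"),
        # Subdomain policy falls back to the main policy when 'sp' is absent.
        "subdomain_policy": tags.get("sp", tags.get("p", "none")),
        "dkim_alignment": tags.get("adkim", "r"),  # 'r' (relaxed) is the default
        "spf_alignment": tags.get("aspf", "r"),
        "aggregate_reports": tags.get("rua"),
    }
```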
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide strong hints (readOnlyHint=true, destructiveHint=false, idempotentHint=true, openWorldHint=true), so the bar is lower. The description adds useful context about what information is returned (policy enforcement, alignment mode, reporting config) and implies a DNS lookup operation, which complements the annotations well without contradicting them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the tool's purpose and output. Every word earns its place, with no redundant information or unnecessary elaboration, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (DNS lookup/validation), rich annotations covering safety and behavior, and 100% schema coverage, the description provides adequate context. It clearly states what the tool does and what information it returns, though it lacks output format details (no output schema exists) and explicit usage guidance compared to siblings.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (domain and format). The description doesn't add any parameter-specific information beyond what's in the schema, but it does imply the domain parameter is required for the DMARC lookup. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('look up and validate') and resource ('DMARC record for a domain'), and distinguishes it from siblings by focusing on DMARC-specific validation rather than other DNS checks like SPF, DKIM, or TLS. It explicitly mentions what information is returned (policy enforcement, alignment mode, reporting config).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for DMARC validation but doesn't explicitly state when to use this tool versus alternatives like check_spf, check_dkim, or generate_dmarc_record. No guidance is provided about prerequisites, timing, or exclusion criteria, leaving the agent to infer context from the tool name and description alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_dnssec (A) · Read-only · Idempotent · Inspect
Check DNSSEC status for a domain. Verifies DNSKEY/DS records and validation chain.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
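The DNSKEY/DS verification the description mentions hinges on one computation from RFC 4034: the DS digest is a hash over the owner name in DNS wire format followed by the DNSKEY RDATA. A minimal sketch (Python; fetching the records and comparing against the parent zone's DS record are omitted):

```python
import hashlib

def wire_name(name: str) -> bytes:
    """Encode a domain name in DNS wire format (length-prefixed labels)."""
    out = b""
    for label in name.rstrip(".").split("."):
        out += bytes([len(label)]) + label.lower().encode()
    return out + b"\x00"

def ds_digest(owner: str, dnskey_rdata: bytes) -> str:
    """DS digest type 2 (SHA-256) over owner name + DNSKEY RDATA (RFC 4034)."""
    return hashlib.sha256(wire_name(owner) + dnskey_rdata).hexdigest().upper()
```

Validation then amounts to checking that this digest matches the DS record published in the parent zone.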
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true, covering safety and idempotency. The description adds useful context about what gets verified (DNSKEY/DS records and validation chain), but doesn't mention rate limits, authentication needs, or output format details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with zero wasted words. The first states the core purpose, the second adds technical detail. Perfectly front-loaded and appropriately sized for this tool.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only diagnostic tool with comprehensive annotations and full schema coverage, the description provides adequate context. The main gap is lack of output format details (no output schema exists), but the purpose and verification scope are clearly communicated.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters well-documented in the schema. The description doesn't add any parameter-specific information beyond what's already in the schema, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Check DNSSEC status'), target resource ('for a domain'), and technical scope ('Verifies DNSKEY/DS records and validation chain'). It distinguishes from siblings like 'check_dnssec_chain' by focusing on status verification rather than chain analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for DNSSEC verification but provides no explicit guidance on when to choose this tool over alternatives like 'check_dnssec_chain' or 'check_zone_hygiene'. It lacks any when-not-to-use statements or prerequisite context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_dnssec_chain (A) · Read-only · Idempotent · Inspect
Walk the DNSSEC chain of trust from root to target domain. Reports DS/DNSKEY records, algorithm usage, and linkage status at each zone level.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
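Walking the chain of trust means visiting each zone cut from the root down to the target. A sketch of just the traversal order (Python; the per-zone DS/DNSKEY checks themselves are omitted):

```python
def zone_chain(domain: str) -> list[str]:
    """List the zones to walk from the root down to the target domain."""
    labels = domain.rstrip(".").split(".")
    chain = ["."]  # start at the root zone
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain
```

At each adjacent pair in the chain, the parent zone's DS record must match a digest of a DNSKEY in the child zone for the linkage to hold; a break anywhere leaves everything below it unsigned.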
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds valuable context about the tool's behavior: it 'walks' the chain (implying iterative queries), reports specific record types and statuses, and operates at 'each zone level', which clarifies scope beyond what annotations convey.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the core action and efficiently lists outputs. Every phrase adds value without redundancy, making it appropriately concise for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity, rich annotations, and 100% schema coverage, the description is largely complete. It explains the tool's purpose and behavior well. However, without an output schema, it could benefit from more detail on return format (e.g., structured vs. textual), but the annotations and description provide sufficient context for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter descriptions in the schema. The description does not add meaning beyond the schema, as it mentions no parameters. Baseline score of 3 is appropriate since the schema adequately documents the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Walk the DNSSEC chain of trust') and resource ('from root to target domain'), with explicit output details ('Reports DS/DNSKEY records, algorithm usage, and linkage status at each zone level'). It distinguishes from sibling tools like 'check_dnssec' by focusing on chain traversal rather than general DNSSEC validation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for DNSSEC chain analysis but does not explicitly state when to use this tool versus alternatives like 'check_dnssec' or 'validate_fix'. No exclusions or prerequisites are mentioned, leaving usage context inferred rather than clearly defined.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_fast_flux (A) · Read-only · Idempotent · Inspect
Detect fast-flux DNS behavior by performing multiple rounds of A/AAAA queries with delays. Compares IP answer sets and TTLs across rounds to identify rotating infrastructure.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| rounds | No | Number of query rounds (3-5). | 3 |
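The round-comparison logic the description outlines reduces to set arithmetic over the answers. A sketch (Python; the churn formula and the thresholds are illustrative assumptions, not the tool's actual scoring):

```python
def flux_indicators(rounds: list[set[str]], ttls: list[int]) -> dict:
    """Heuristics over several query rounds: rotating answer sets plus
    very low TTLs are the classic fast-flux signature."""
    all_ips = set().union(*rounds)
    stable = set.intersection(*rounds)  # IPs seen in every round
    return {
        "unique_ips": len(all_ips),
        "churn": 1 - len(stable) / len(all_ips) if all_ips else 0.0,
        "min_ttl": min(ttls),
        # Assumed thresholds: far more IPs than any single answer, or sub-minute TTLs.
        "suspicious": len(all_ips) > 2 * max(len(r) for r in rounds)
        or min(ttls) < 60,
    }
```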
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=true, openWorldHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds valuable context by explaining the method ('multiple rounds of A/AAAA queries with delays') and analysis ('compares IP answer sets and TTLs'), which is not covered by annotations, enhancing behavioral understanding without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose in the first sentence and adds method details in the second, with no wasted words. It efficiently conveys essential information in two concise sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, no output schema) and rich annotations, the description is mostly complete. It explains the detection method and analysis but could benefit from mentioning output format or result interpretation, though annotations cover safety aspects well.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for 'domain', 'format', and 'rounds'. The description does not add meaning beyond the schema, such as explaining parameter interactions or default behaviors, so it meets the baseline of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('detect fast-flux DNS behavior') and methods ('performing multiple rounds of A/AAAA queries with delays'), distinguishing it from siblings like 'check_dnssec' or 'check_mx' by focusing on behavioral analysis rather than configuration or security checks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for detecting DNS fast-flux behavior but does not explicitly state when to use this tool versus alternatives like 'check_dnssec' or 'scan_domain'. No exclusions or specific contexts are provided, leaving usage to inference from the purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_http_security (B) · Read-only · Idempotent · Inspect
Audit HTTP security headers (CSP, COOP, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
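An audit like this boils down to checking a response's header map against a recommended set. A sketch (Python; the header list is a common security baseline, not necessarily the exact set this tool checks):

```python
EXPECTED_HEADERS = {
    "content-security-policy": "CSP",
    "strict-transport-security": "HSTS",
    "x-content-type-options": "MIME-sniffing protection",
    "x-frame-options": "clickjacking protection",
    "cross-origin-opener-policy": "COOP",
}

def audit_headers(response_headers: dict) -> dict:
    """Report which recommended security headers are present or missing.
    Header names are case-insensitive per the HTTP spec."""
    present = {k.lower() for k in response_headers}
    return {
        "present": sorted(h for h in EXPECTED_HEADERS if h in present),
        "missing": sorted(h for h in EXPECTED_HEADERS if h not in present),
    }
```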
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide strong behavioral hints: readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds minimal context beyond this—it mentions 'audit' which aligns with read-only behavior, but doesn't disclose details like rate limits, authentication needs, or what specific headers are checked. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise—a single sentence that directly states the tool's purpose with no wasted words. It's front-loaded with the core action and includes examples (CSP, COOP) for clarity. Every part of the description earns its place efficiently.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (security auditing), rich annotations (covering safety and idempotency), and no output schema, the description is minimally adequate. It states what the tool does but lacks details on output format, error handling, or scope limitations. For a security tool among many siblings, more context would be helpful, but annotations provide critical behavioral information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters. The description doesn't add any meaningful semantics beyond the schema—it mentions 'domain' implicitly but provides no extra details on parameter usage, constraints, or interactions. With high schema coverage, the baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Audit HTTP security headers (CSP, COOP, etc.)'. It specifies the action (audit) and target (HTTP security headers), with examples of specific headers. However, it doesn't explicitly differentiate this tool from sibling tools like 'check_ssl' or 'check_tlsrpt', which also involve security checks but focus on different aspects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools performing various security checks (e.g., 'check_ssl', 'check_dmarc'), there's no indication of whether this tool is for general HTTP header auditing, how it relates to other checks, or any prerequisites. Usage is implied by the domain parameter but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_lookalikes (A) · Read-only · Idempotent · Inspect
Detect active typosquat/lookalike domains. Standalone.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
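Typosquat detection typically starts by generating candidate permutations and then checking which of them actually resolve (the 'active' part). A sketch of the generation step only (Python; the permutation rules shown are a small illustrative subset):

```python
def typosquat_candidates(domain: str) -> set[str]:
    """Generate simple lookalike permutations: character omission,
    adjacent-character swap, and a few common homoglyph substitutions."""
    name, _, tld = domain.partition(".")
    variants = set()
    for i in range(len(name)):  # character omission
        variants.add(name[:i] + name[i + 1:])
    for i in range(len(name) - 1):  # adjacent swap
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    for src, dst in [("o", "0"), ("l", "1"), ("rn", "m")]:  # homoglyphs
        if src in name:
            variants.add(name.replace(src, dst))
    variants.discard(name)
    return {f"{v}.{tld}" for v in variants if v}
```

The check for 'active' domains would then be an A/NS lookup on each candidate.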
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds minimal behavioral context beyond this, only implying it's a 'Standalone' check, which doesn't fully disclose traits like rate limits, authentication needs, or output format. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with only two sentences, front-loaded with the core purpose and followed by a clarifying note ('Standalone'). Every word earns its place, with no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 parameters, no output schema) and rich annotations covering safety and idempotency, the description is mostly complete. It clearly states the purpose and usage context but could benefit from more detail on behavioral aspects like what 'active' means or expected output, though annotations mitigate some gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (domain and format). The description does not add any meaning beyond what the schema provides, such as explaining what 'active' detection entails or how 'compact' vs. 'full' formats differ. A baseline of 3 is appropriate when the schema handles parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Detect') and resource ('active typosquat/lookalike domains'), and distinguishes it from siblings by noting it's 'Standalone', implying it performs a focused check rather than broader analysis like 'analyze_drift' or 'scan_domain'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for usage by specifying it detects 'active' lookalike domains, which suggests it's for real-time threat assessment rather than historical analysis. However, it does not explicitly state when not to use it or name alternatives among the many sibling tools, such as 'check_shadow_domains' or 'simulate_attack_paths'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_mta_sts (B) · Read-only · Idempotent · Inspect
Validate MTA-STS SMTP encryption policy.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
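Validating MTA-STS involves fetching the policy file at `https://mta-sts.<domain>/.well-known/mta-sts.txt` and checking its fields. A sketch of the parsing step (Python; key names follow RFC 8461, while the fetch and the companion `_mta-sts` DNS TXT record check are omitted):

```python
def parse_mta_sts_policy(text: str) -> dict:
    """Parse an MTA-STS policy file: 'key: value' lines, where 'mx'
    may repeat once per allowed mail host."""
    policy = {"mx": []}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "mx":
            policy["mx"].append(value)
        elif key:
            policy[key] = value
    return policy
```

A validator would then confirm `version` is `STSv1`, that `mode` is one of enforce/testing/none, and that the live MX hosts match the `mx` patterns.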
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide key behavioral hints: readOnlyHint=true, destructiveHint=false, openWorldHint=true, idempotentHint=true. The description doesn't contradict these. It adds minimal context by specifying 'SMTP encryption policy,' but doesn't disclose additional traits like rate limits, authentication needs, or what 'validate' entails beyond what annotations cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence: 'Validate MTA-STS SMTP encryption policy.' It's front-loaded with the core purpose, has zero wasted words, and is appropriately sized for a simple validation tool. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 parameters, no output schema) and rich annotations covering safety and idempotency, the description is minimally adequate. It states what the tool does but lacks context on usage relative to siblings or behavioral details. For a read-only validation tool with good annotations, it meets basic needs but could be more informative.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (domain and format). The description adds no parameter-specific information beyond what's in the schema. According to scoring rules, with high schema coverage, the baseline is 3 even without param details in the description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Validate MTA-STS SMTP encryption policy.' It specifies the action (validate) and the resource (MTA-STS policy), making it easy to understand. However, it doesn't explicitly differentiate from sibling tools like 'check_dmarc' or 'check_spf', which are also security validation tools for different protocols.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools for domain security checks (e.g., check_dmarc, check_spf, check_tlsrpt), there's no indication of when MTA-STS validation is appropriate or how it relates to other checks. Usage is implied by the name but not explained.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_mx (A) · Read-only · Idempotent · Inspect
Look up MX records for a domain. Shows mail servers, email provider detection, and validates configuration.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
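The 'email provider detection' the description mentions is usually a suffix match on the MX hostnames. A sketch (Python; the pattern table is an illustrative guess at how such detection works, not this server's actual list):

```python
# Hypothetical suffix-to-provider mapping.
PROVIDER_PATTERNS = {
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "outlook.com": "Microsoft 365",
    "pphosted.com": "Proofpoint",
    "mimecast.com": "Mimecast",
}

def detect_provider(mx_hosts: list[str]) -> str:
    """Guess the email provider from MX hostname suffixes."""
    for host in mx_hosts:
        for suffix, provider in PROVIDER_PATTERNS.items():
            if host.rstrip(".").lower().endswith(suffix):
                return provider
    return "unknown/self-hosted"
```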
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context about what the tool returns ('shows mail servers, email provider detection, and validates configuration'), which goes beyond the annotations. No contradictions with annotations exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('Look up MX records for a domain') and adds supplementary functions without unnecessary elaboration. Every part of the sentence provides value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations (readOnlyHint, idempotentHint) and full schema coverage, the description provides adequate context about what the tool does and returns. However, without an output schema, it could benefit from more detail on return format or error handling, though the 'format' parameter partially addresses this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters. The description doesn't add any parameter-specific details beyond what's in the schema, but it implies the 'domain' parameter is used for MX lookups and the 'format' parameter affects output verbosity, which aligns with the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('look up', 'shows', 'validates') and resources ('MX records for a domain'), including additional functions like email provider detection and configuration validation. It distinguishes itself from siblings like 'check_dnssec' or 'check_ns' by focusing specifically on MX records.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for MX record analysis, email provider detection, and configuration validation, but doesn't explicitly state when to use this tool versus alternatives like 'check_mx_reputation' or 'scan_domain'. No explicit exclusions or prerequisites are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_mx_reputation (B) · Read-only · Idempotent · Inspect
Check MX blocklist status and reverse DNS.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
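Both checks named in the description are DNS-name constructions: reverse DNS queries the `in-addr.arpa` tree, and IP blocklist lookups prepend the reversed octets to an RBL zone. A sketch (Python; `zen.spamhaus.org` is a real public RBL, but using it as the default here is an illustrative choice):

```python
import ipaddress

def ptr_name(ip: str) -> str:
    """Reverse-DNS name for an address, e.g. 1.2.3.4 -> 4.3.2.1.in-addr.arpa."""
    return ipaddress.ip_address(ip).reverse_pointer

def rbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """IP-based blocklist query: reversed octets prepended to the RBL zone."""
    reversed_octets = ".".join(reversed(ip.split(".")))
    return f"{reversed_octets}.{zone}"
```

The tool would resolve each MX host to its addresses first, then run both lookups per address.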
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide excellent coverage (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true). The description adds minimal behavioral context beyond annotations: mentioning 'blocklist status' and 'reverse DNS' gives some operational context, but doesn't elaborate on rate limits, authentication needs, or what specific blocklists are checked.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (8 words) and front-loaded with the core purpose. Every word earns its place, with no redundant information or unnecessary elaboration. The structure is optimal for a tool with comprehensive annotations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the comprehensive annotations (covering safety, idempotency, and world openness) and 100% schema coverage, the description is minimally adequate. However, with no output schema and many similar sibling tools, the description could better explain what 'MX blocklist status' entails and how results differ from tools like 'check_mx' or 'check_rbl'.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters well-documented in the schema. The description adds no parameter-specific information beyond what's already in the structured schema. The baseline of 3 is appropriate since the schema carries the full parameter documentation burden.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('check' and 'reverse DNS') and identifies the resource ('MX blocklist status'). It distinguishes from some siblings like 'check_mx' by mentioning blocklist status specifically, but doesn't fully differentiate from all DNS-related tools in the extensive sibling list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With 40+ sibling tools including many DNS/security checks (check_mx, check_rbl, check_dnssec, etc.), there's no indication of when this specific MX reputation check is appropriate versus other domain validation tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_ns (Grade A) · Read-only · Idempotent
Look up NS (nameserver) records for a domain. Shows DNS provider, delegation, and redundancy.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
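Redundancy, one of the things this tool reports, can be gauged by whether the delegated nameservers all sit under a single provider. A rough sketch (grouping by the last two labels is a naive stand-in for proper registered-domain parsing, and the function name is illustrative):

```python
def provider_diversity(ns_hosts):
    """Group nameserver hostnames by parent zone (naively, the last two
    labels) to gauge whether delegation depends on a single DNS provider."""
    providers = {}
    for host in ns_hosts:
        provider = ".".join(host.rstrip(".").split(".")[-2:])
        providers.setdefault(provider, []).append(host)
    return providers

groups = provider_diversity(["ns1.examplehost.net", "ns2.examplehost.net"])
single_provider = len(groups) == 1  # True here: no provider redundancy
```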
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context about what information is returned (DNS provider, delegation, redundancy) that goes beyond the safety profile indicated by annotations. No contradictions exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. First sentence states purpose and parameters, second sentence specifies output content. Every word earns its place, and information is front-loaded appropriately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with comprehensive annotations and full schema coverage, the description provides adequate context about what information is returned. The main gap is lack of output format details (no output schema), but the description compensates by specifying the three categories of information shown.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are fully documented in the schema. The description doesn't add any parameter-specific details beyond what the schema provides (domain format, format enum values). Baseline 3 is appropriate when the schema carries the full burden.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Look up NS records'), target resource ('for a domain'), and scope ('Shows DNS provider, delegation, and redundancy'). It distinguishes from sibling tools by focusing exclusively on NS records rather than other DNS checks like MX, SPF, or DMARC.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (DNS analysis) but doesn't explicitly state when to use this tool versus alternatives like 'check_dnssec' or 'check_zone_hygiene'. No guidance on prerequisites, exclusions, or named alternatives is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_nsec_walkability (Grade A) · Read-only · Idempotent
Assess zone walkability risk by analyzing NSEC3PARAM configuration. Detects plain NSEC zones, weak NSEC3 parameters, and opt-out flags.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
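The weaknesses this tool detects map to concrete NSEC3PARAM fields: the opt-out bit in the flags, the extra hash iteration count, and the salt. A hedged sketch of such a heuristic (function name and exact wording are illustrative; the iterations/salt guidance comes from RFC 9276, which recommends zero extra iterations and an empty salt):

```python
def nsec3_warnings(algorithm, flags, iterations, salt):
    """Flag NSEC3PARAM settings commonly considered weak,
    following RFC 9276 guidance."""
    warnings = []
    if flags & 0x01:
        warnings.append("opt-out flag set: unsigned delegations lack denial proofs")
    if iterations > 0:
        warnings.append(f"{iterations} extra hash iterations (RFC 9276 recommends 0)")
    if salt:
        warnings.append("non-empty salt adds cost without preventing zone walking")
    return warnings

print(nsec3_warnings(1, 1, 10, "ab"))  # three warnings: opt-out, iterations, salt
```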
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide excellent coverage (readOnlyHint, openWorldHint, idempotentHint, destructiveHint), so the bar is lower. The description adds valuable context about what specific security risks it detects (plain NSEC zones, weak parameters, opt-out flags), which goes beyond the annotations. It doesn't mention rate limits, authentication needs, or response format details, but with comprehensive annotations, this is acceptable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise: a single sentence that packs maximum information density. Every word earns its place: 'Assess' (action), 'zone walkability risk' (purpose), 'by analyzing NSEC3PARAM configuration' (method), and the three specific detection types. No wasted words or redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the comprehensive annotations (covering safety, idempotence, and world assumptions) and 100% schema coverage, the description provides excellent contextual completeness. The main gap is the lack of output schema, so the agent doesn't know what format the assessment results will take. However, the description's specificity about detection types partially compensates for this.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing complete parameter documentation. The description doesn't add any parameter-specific information beyond what's in the schema. The baseline score of 3 is appropriate when the schema does all the parameter documentation work, and the description focuses on tool purpose rather than parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Assess zone walkability risk'), the resource ('NSEC3PARAM configuration'), and the specific detection capabilities ('plain NSEC zones, weak NSEC3 parameters, and opt-out flags'). It distinguishes itself from sibling tools like 'check_dnssec' or 'check_zone_hygiene' by focusing specifically on NSEC/NSEC3 walkability analysis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (DNS security analysis) but doesn't explicitly state when to use this tool versus alternatives like 'check_dnssec' or 'check_zone_hygiene'. There's no guidance on prerequisites, timing considerations, or specific scenarios where this tool is most appropriate versus other DNS security checks.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_rbl (Grade A) · Read-only · Idempotent
Check MX server IP reputation against 8 DNS-based Real-time Blocklists (Spamhaus ZEN, SpamCop, UCEProtect, Mailspike, Barracuda, PSBL, SORBS). Resolves MX hosts to IPs first.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
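The DNSBL lookup convention behind this tool is simple: reverse the IPv4 octets, prepend them to the blocklist zone, and check whether an A record exists (a listing is conventionally signalled by an answer in 127.0.0.0/8). A minimal sketch (the function name is illustrative):

```python
def rbl_query_name(ip, rbl_zone):
    """Build the DNSBL query name: reversed IPv4 octets prepended to the
    blocklist zone. An A-record answer means the IP is listed."""
    return ".".join(reversed(ip.split("."))) + "." + rbl_zone

# zen.spamhaus.org is one of the blocklists the description names
print(rbl_query_name("192.0.2.10", "zen.spamhaus.org"))
# 10.2.0.192.zen.spamhaus.org
```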
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds useful context about resolving MX hosts to IPs first and listing the 8 specific RBLs, which helps understand the tool's behavior beyond annotations. However, it doesn't describe output format, rate limits, or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with zero waste. The first sentence clearly states the purpose and names the RBLs checked (seven are listed, though the count says eight), while the second adds important behavioral context about MX resolution. Every word earns its place, and it's front-loaded with the core functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with good annotations and 100% schema coverage, the description provides sufficient context about what the tool does and how it works (resolving MX to IPs first). However, without an output schema, the description doesn't explain what the return values look like (e.g., list of blocklists with status), leaving some uncertainty about the tool's output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('domain' and 'format') well-documented in the schema. The description does not add any parameter-specific information beyond what the schema provides, such as explaining the 'full' vs 'compact' format differences. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Check MX server IP reputation') and target resource ('against 8 DNS-based Real-time Blocklists'), naming seven of the RBLs outright. It distinguishes from siblings like 'check_mx' (which likely checks MX records) and 'check_mx_reputation' (which might check reputation differently) by specifying it resolves MX hosts to IPs first and checks against specific blocklists.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context for checking domain reputation against RBLs, but does not explicitly state when to use this tool versus alternatives like 'check_mx_reputation' or 'check_dbl'. It provides a clear action but lacks explicit guidance on exclusions or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_resolver_consistency (Grade B) · Read-only · Idempotent
Check DNS consistency across 4 public resolvers.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| record_type | No | Record type. Omit for A/AAAA/MX/TXT/NS. | |
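Consistency across resolvers reduces to comparing the answer sets each resolver returns for the same question. A sketch of that comparison, with resolver answers supplied as assumed mock data rather than live queries (the function name is illustrative):

```python
def consistency_report(answers):
    """Compare per-resolver answer sets for one record type.
    `answers` maps resolver address -> set of records it returned."""
    by_answer = {}
    for resolver, records in answers.items():
        by_answer.setdefault(frozenset(records), []).append(resolver)
    return {"consistent": len(by_answer) == 1,
            "groups": sorted(by_answer.values())}

report = consistency_report({
    "1.1.1.1": {"192.0.2.1"},
    "8.8.8.8": {"192.0.2.1"},
    "9.9.9.9": {"192.0.2.1", "192.0.2.2"},  # a stale or filtered view
})
# report["consistent"] is False: two distinct answer groups exist
```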
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover key behavioral traits (read-only, open-world, idempotent, non-destructive), so the description adds minimal value. It mentions '4 public resolvers' as context, but doesn't disclose rate limits, auth needs, or specific resolver identities, which could be useful. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It's front-loaded with the core purpose and avoids unnecessary elaboration, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, no output schema) and rich annotations, the description is adequate but minimal. It lacks details on output format, error handling, or resolver specifics, which could help the agent anticipate results. Annotations compensate somewhat, but more context would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter documentation. The description doesn't add any semantic details beyond the schema, such as explaining interactions between parameters or default behaviors. Baseline 3 is appropriate since the schema fully documents inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Check DNS consistency') and scope ('across 4 public resolvers'), providing a specific verb and resource. However, it doesn't explicitly differentiate from sibling tools like 'check_dnssec' or 'check_mx', which also involve DNS checking but focus on different aspects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description doesn't mention prerequisites, exclusions, or compare it to sibling tools like 'compare_domains' or 'check_dnssec', leaving the agent without context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_shadow_domains (Grade B) · Read-only · Idempotent
Find TLD variants with email auth gaps. Standalone.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
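Shadow-domain scanning starts from generating the TLD variants to probe for missing SPF/DMARC coverage. A sketch with an illustrative TLD sample (the tool's real variant list is not documented here):

```python
def tld_variants(domain, tlds=("com", "net", "org", "co", "io")):
    """Generate sibling registrations of the same name under other TLDs.
    The TLD tuple is a small illustrative sample, not the tool's real set."""
    base, _, tld = domain.rpartition(".")
    return [f"{base}.{t}" for t in tlds if t != tld]

print(tld_variants("example.com"))
# ['example.net', 'example.org', 'example.co', 'example.io']
```

Each variant would then be checked for SPF and DMARC records; a registered variant with neither is the "auth gap" the description refers to.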
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide strong behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true). The description adds value by specifying the focus on 'email auth gaps' and 'TLD variants,' which gives context beyond annotations. However, it doesn't disclose rate limits, authentication needs, or detailed output behavior (no output schema exists).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise (two short phrases) and front-loaded with the core purpose. Every word earns its place: 'Find TLD variants with email auth gaps' states the action and scope, and 'Standalone' provides operational context without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (security analysis), rich annotations cover safety and idempotency, but no output schema exists. The description provides the core purpose but lacks details on return values, error conditions, or integration with sibling tools. It's minimally adequate but has clear gaps for a security tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (domain and format). The description doesn't add any parameter-specific semantics beyond what's in the schema. According to rules, baseline is 3 when schema coverage is high (>80%).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Find TLD variants with email auth gaps.' It specifies the verb ('Find') and resource ('TLD variants'), and the 'Standalone' qualifier distinguishes it from batch operations. However, it doesn't explicitly differentiate from sibling tools like 'check_lookalikes' or 'check_spoofability' that might also analyze domain security gaps.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides minimal guidance: 'Standalone' implies this is a single-domain check vs. batch operations, but it doesn't specify when to use this tool versus alternatives like 'check_lookalikes' or 'assess_spoofability' from the sibling list. No explicit when/when-not instructions or prerequisite context is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_spf (Grade A) · Read-only · Idempotent
Look up and validate SPF record for a domain. Shows authorized senders, syntax issues, and trust surface.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
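Syntax issues and trust surface both fall out of tokenizing the `v=spf1` record. A simplified parser sketch (the real SPF grammar per RFC 7208 is richer; this only separates mechanisms from the terminal 'all' qualifier, and the function name is illustrative):

```python
def parse_spf(record):
    """Split a v=spf1 TXT record into its mechanisms and the 'all' term."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    mechanisms = [t for t in terms[1:] if not t.endswith("all")]
    all_term = next((t for t in terms[1:] if t.endswith("all")), None)
    return mechanisms, all_term

mechs, policy = parse_spf("v=spf1 ip4:192.0.2.0/24 include:_spf.example.net ~all")
# mechs == ['ip4:192.0.2.0/24', 'include:_spf.example.net'], policy == '~all'
```

The 'trust surface' the description mentions is roughly the set of networks and included domains those mechanisms authorize to send mail.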
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide hints (readOnly, openWorld, idempotent, non-destructive), but the description adds valuable context beyond this: it discloses that the tool 'shows authorized senders, syntax issues, and trust surface', giving insight into what information is returned. This enhances transparency about the tool's output behavior, though it could mention rate limits or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded and concise, consisting of a single sentence that efficiently conveys the tool's purpose and key outputs. Every word earns its place, with no redundant or unnecessary information, making it easy to understand at a glance.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 parameters, no output schema), the description is fairly complete: it states the purpose, what it shows, and aligns with annotations. However, it could be more comprehensive by mentioning potential errors or limitations, such as DNS lookup failures or unsupported domain formats, to fully guide an agent in all scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters ('domain' and 'format'). The description does not add significant meaning beyond the schema, as it doesn't explain parameter interactions or provide additional syntax details. The baseline score of 3 is appropriate since the schema adequately documents the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('look up and validate') and resource ('SPF record for a domain'), and distinguishes it from siblings by focusing on SPF-specific validation rather than other DNS checks like DKIM or DMARC. It explicitly mentions what it shows: authorized senders, syntax issues, and trust surface.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by specifying it checks SPF records, which helps differentiate it from siblings like 'check_dkim' or 'check_dmarc'. However, it does not explicitly state when to use this tool versus alternatives like 'resolve_spf_chain' or 'generate_spf_record', nor does it provide exclusions or prerequisites for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_srv (Grade B) · Read-only · Idempotent
Probe SRV records for service footprint.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
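SRV probing relies on the RFC 2782 naming convention, where each candidate service maps to a predictable owner name. A sketch (helper name and candidate list are illustrative; the tool's actual probe set is not documented here):

```python
def srv_name(service, proto, domain):
    """Build the SRV owner name per RFC 2782: _service._proto.domain."""
    return f"_{service}._{proto}.{domain}"

# A footprint probe walks a list of well-known service/protocol pairs:
candidates = [("sip", "tcp"), ("xmpp-server", "tcp"), ("autodiscover", "tcp")]
names = [srv_name(s, p, "example.com") for s, p in candidates]
# ['_sip._tcp.example.com', '_xmpp-server._tcp.example.com',
#  '_autodiscover._tcp.example.com']
```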
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide excellent coverage (readOnlyHint: true, destructiveHint: false, openWorldHint: true, idempotentHint: true), so the agent knows this is a safe, read-only, idempotent operation. The description adds minimal behavioral context beyond annotations: 'probe' implies an active network query, and 'service footprint' hints at discovering services, but doesn't explain what constitutes a service footprint or how results are structured.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. 'Probe SRV records for service footprint' is perfectly front-loaded and contains only essential information. Every word earns its place, making this an excellent example of conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (covering safety, idempotence, and open-world behavior) and 100% schema coverage, the description provides adequate context for a read-only DNS probing tool. The main gap is the lack of output schema, but the description hints at what information will be returned ('service footprint'). For a tool with such comprehensive structured metadata, the description is reasonably complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters well-documented in the schema. The description adds no parameter-specific information beyond what's already in the structured schema. The baseline score of 3 is appropriate since the schema fully documents the domain parameter (with examples) and format parameter (with enum values and auto-detection behavior).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Probe SRV records for service footprint' clearly states the action (probe) and resource (SRV records), with 'service footprint' providing specific context about what information is being gathered. It distinguishes itself from siblings like check_mx or check_ns by focusing specifically on SRV records rather than other DNS record types.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools performing various DNS checks (check_mx, check_ns, check_dnssec, etc.), there's no indication of when SRV record probing is appropriate versus other DNS validation tools. The description assumes the user already knows when SRV record checking is needed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_ssl (Grade A) · Read-only · Idempotent
Check SSL/TLS certificate for a domain. Shows issuer, expiry, protocol versions, and HTTPS configuration.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
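Expiry, one of the fields this tool shows, can be derived from the `notAfter` string in a peer certificate using Python's standard `ssl` module; the wrapper below is an illustrative helper, not this server's implementation:

```python
import ssl
import time

def days_until_expiry(not_after, now=None):
    """Days remaining before a certificate's notAfter timestamp, where
    `not_after` is the string form returned by SSLSocket.getpeercert()."""
    expires = ssl.cert_time_to_seconds(not_after)  # epoch seconds, UTC
    return (expires - (time.time() if now is None else now)) / 86400

# A fixed 'now' keeps the example deterministic:
remaining = days_until_expiry(
    "Jun 26 21:41:46 2025 GMT",
    now=ssl.cert_time_to_seconds("Jun 16 21:41:46 2025 GMT"),
)
# remaining == 10.0
```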
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds context about what information is returned (issuer, expiry, etc.), which is useful but doesn't disclose behavioral traits like rate limits, authentication needs, or error conditions. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the tool's purpose and output details without unnecessary words. It is front-loaded with the core action and resource, making it easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 parameters, no output schema), the description adequately covers what the tool does but lacks details on usage context, prerequisites, or output format. With annotations providing safety and idempotency info, and schema covering parameters, the description is minimally complete but could benefit from more contextual guidance.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (domain and format). The description doesn't add any parameter-specific details beyond what the schema provides, such as clarifying domain format examples or format implications. With high schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Check SSL/TLS certificate') and resources ('for a domain'), and distinguishes it from siblings by specifying what information it provides ('issuer, expiry, protocol versions, and HTTPS configuration'). It goes beyond just restating the name/title.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools focused on domain security checks (e.g., check_dnssec, check_dmarc, check_http_security), there is no indication of when this SSL/TLS certificate check is appropriate versus other security assessments.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
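To illustrate the kind of check this SSL/TLS tool performs, here is a minimal stdlib sketch of computing days-to-expiry from a `getpeercert()`-style `notAfter` string. The server's actual implementation is not shown in this listing; a real check would first fetch the peer certificate over a TLS connection.

```python
import ssl
import time

def days_until_expiry(not_after: str) -> int:
    # ssl.cert_time_to_seconds parses the 'notAfter' format returned by
    # SSLSocket.getpeercert(), e.g. 'Jun  1 12:00:00 2026 GMT'.
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - time.time()) // 86400)

print(days_until_expiry("Jan  1 00:00:00 2050 GMT"))
```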
check_subdomailing (A) · Read-only · Idempotent
Detect SubdoMailing risk by analyzing SPF include chain for takeover-vulnerable domains.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
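The SPF include chain this tool analyzes can be walked by extracting `include:` mechanisms from the record. A minimal stdlib sketch of that extraction step; the tool's real logic (recursive resolution and takeover checks against each included domain) is not shown here:

```python
import re

def spf_includes(spf_record: str) -> list[str]:
    """Extract the domains referenced by include: mechanisms in an SPF record."""
    if not spf_record.startswith("v=spf1"):
        return []
    return re.findall(r"\binclude:(\S+)", spf_record)

record = "v=spf1 include:_spf.example.net include:mail.vendor.example -all"
print(spf_includes(record))  # → ['_spf.example.net', 'mail.vendor.example']
```

Each returned domain would then be checked for registrability or dangling delegation, which is the SubdoMailing takeover condition.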
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide read-only, non-destructive, idempotent, and open-world hints, covering safety and behavior. The description adds valuable context by specifying the analysis method (SPF include chain) and target (takeover-vulnerable domains), which helps the agent understand the tool's focus beyond generic checks. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. Every part of the sentence contributes to understanding the tool's function, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the annotations cover behavioral traits (read-only, non-destructive, etc.) and the schema fully describes parameters, the description provides adequate context for a security analysis tool. However, without an output schema, it does not detail return values (e.g., risk levels or findings), leaving a minor gap in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters ('domain' and 'format'). The description does not add further semantic details about parameters beyond what the schema provides, such as explaining the impact of 'format' choices on output. Baseline 3 is appropriate given the schema's completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('Detect'), resource ('SubdoMailing risk'), and method ('analyzing SPF include chain for takeover-vulnerable domains'). It distinguishes itself from siblings like 'check_spf' or 'check_dnssec' by focusing on a specific security vulnerability rather than general DNS/email checks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (security analysis of SPF chains for takeover risks) but does not explicitly state when to use this tool versus alternatives like 'check_spf' or 'assess_spoofability'. No exclusions or prerequisites are mentioned, leaving the agent to infer appropriate scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_svcb_https (A) · Read-only · Idempotent
Validate HTTPS/SVCB records (RFC 9460) for modern transport capability advertisement.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
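For reference, RFC 9460 HTTPS/SVCB records use the presentation form `SvcPriority TargetName SvcParams`. A minimal parser sketch for that form, assuming well-formed input; the server's actual validation is certainly richer:

```python
def parse_https_rdata(rdata: str) -> dict:
    """Parse the presentation form of an HTTPS/SVCB record (RFC 9460):
    'SvcPriority TargetName SvcParam...'."""
    fields = rdata.split()
    priority, target = int(fields[0]), fields[1]
    params = {}
    for param in fields[2:]:
        key, _, value = param.partition("=")
        params[key] = value.strip('"')
    return {"priority": priority, "target": target, "params": params}

rr = '1 . alpn="h2,h3" ipv4hint=192.0.2.1'
print(parse_https_rdata(rr))
```

A target of `.` means the record applies to the owner name itself, which is the common case for apex HTTPS records.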
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds value by specifying the standard (RFC 9460) and the specific capability being validated (modern transport capability advertisement), which provides useful context beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that packs essential information: action, resource, standard, and purpose. Every word earns its place with zero redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only validation tool with comprehensive annotations and full schema coverage, the description provides adequate context. It specifies the standard (RFC 9460) and purpose, though it doesn't describe output format or potential limitations. The absence of an output schema means some uncertainty about return values remains.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters well-documented in the schema. The description doesn't add any parameter-specific information beyond what's already in the schema (domain format, format options with auto-detection). Baseline score of 3 is appropriate since the schema carries the full parameter documentation burden.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Validate') and target resource ('HTTPS/SVCB records (RFC 9460)'), with explicit mention of the purpose ('modern transport capability advertisement'). It distinguishes itself from sibling tools like check_ssl or check_tlsrpt by focusing specifically on SVCB/HTTPS records per RFC 9460.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (checking modern transport capabilities) but doesn't explicitly state when to use this tool versus alternatives like check_ssl, check_dane_https, or check_tlsrpt. No guidance is provided about prerequisites, dependencies, or exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_tlsrpt (A) · Read-only · Idempotent
Validate TLS-RPT SMTP failure reporting.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
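TLS-RPT policies (RFC 8460) are published as TXT records at `_smtp._tls.<domain>`. A minimal sketch of parsing and validating that record's tag=value syntax; the DNS lookup itself is omitted:

```python
def parse_tlsrpt(txt: str):
    """Parse a TLS-RPT policy record (RFC 8460), published as a TXT record
    at _smtp._tls.<domain>. Returns None if the record is not TLS-RPT."""
    tags = {}
    for part in txt.split(";"):
        key, _, value = part.strip().partition("=")
        if key:
            tags[key] = value
    if tags.get("v") != "TLSRPTv1":
        return None
    return tags

print(parse_tlsrpt("v=TLSRPTv1; rua=mailto:tls-reports@example.com"))
```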
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true, covering safety and idempotency. The description adds value by specifying the validation focus on 'SMTP failure reporting,' which provides context about what aspect of TLS-RPT is being checked. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core purpose and avoids unnecessary elaboration, making it easy for an agent to parse quickly while conveying essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only validation tool with good annotations and full schema coverage, the description is adequate but lacks output details (no output schema) and usage context. It covers the 'what' but not the 'why' or 'how to interpret results,' leaving gaps for an agent to infer from tool name and parameters alone.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (domain and format). The description doesn't add any parameter-specific information beyond what the schema provides, such as explaining domain validation rules or format implications. Baseline 3 is appropriate given the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Validate') and the exact resource ('TLS-RPT SMTP failure reporting'), distinguishing it from sibling tools like check_dmarc, check_dkim, or check_spf which focus on different email security protocols. It precisely communicates the tool's function without ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like check_mta_sts or other email security validation tools. It doesn't mention prerequisites, typical use cases, or scenarios where this validation is particularly relevant, leaving the agent to infer usage from context alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_txt_hygiene (B) · Read-only · Idempotent
Audit TXT records for stale entries and SaaS exposure.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
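A stale-TXT audit of this kind typically looks for SaaS ownership-verification tokens that linger after the service is gone. A sketch using an illustrative, deliberately incomplete prefix list (the server's actual detection set is not documented here):

```python
# Common SaaS ownership-verification prefixes; an illustrative subset only.
SAAS_PREFIXES = (
    "google-site-verification=",
    "MS=",
    "atlassian-domain-verification=",
    "docusign=",
)

def flag_saas_tokens(txt_records: list[str]) -> list[str]:
    """Return TXT records that look like SaaS verification tokens,
    i.e. candidates for the stale-entry review this tool performs."""
    return [r for r in txt_records if r.startswith(SAAS_PREFIXES)]

records = ["google-site-verification=abc123", "v=spf1 -all", "MS=ms98765"]
print(flag_saas_tokens(records))
```

Flagged tokens reveal which SaaS vendors a domain uses (the "SaaS exposure" in the description) and which entries may be safe to remove.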
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover key behavioral traits (read-only, open-world, idempotent, non-destructive), so the description doesn't need to repeat these. It adds context by specifying what gets audited ('stale entries and SaaS exposure'), which is useful beyond annotations. However, it doesn't describe output format, error handling, or rate limits, leaving some behavioral aspects unclear.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. Every part ('audit TXT records for stale entries and SaaS exposure') contributes directly to understanding the tool's function, making it appropriately concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that the annotations cover safety and behavioral traits and the schema fully documents parameters, the description adds value by specifying the audit focus. However, with no output schema, it doesn't explain return values or result format, which is a minor gap. Overall, it's mostly complete for a read-only audit tool in this context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for both parameters ('domain' and 'format'). The description doesn't add any parameter-specific details beyond what the schema provides, such as examples or constraints. Baseline score of 3 is appropriate since the schema handles parameter semantics adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('audit') and resources ('TXT records'), and specifies what it audits ('stale entries and SaaS exposure'). It doesn't explicitly differentiate from sibling tools like 'check_zone_hygiene' or 'check_dnssec', but the focus on TXT records is specific enough for most contexts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'check_zone_hygiene' or 'check_dnssec', nor does it mention prerequisites or exclusions. It implies usage for auditing TXT records but lacks explicit context for tool selection among the many DNS-related siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_zone_hygiene (A) · Read-only · Idempotent
Audit SOA propagation and sensitive subdomains.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
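One part of an SOA propagation audit is checking that every authoritative nameserver agrees on the zone serial. A minimal sketch of that comparison; the serials would come from per-nameserver SOA queries, which are omitted here:

```python
def soa_serials_consistent(serials_by_ns: dict[str, int]) -> bool:
    """True when every authoritative nameserver reports the same SOA serial,
    i.e. the zone has fully propagated."""
    return len(set(serials_by_ns.values())) <= 1

# Matching serials across ns1/ns2 indicate a propagated zone.
print(soa_serials_consistent({"ns1": 2024010101, "ns2": 2024010101}))
```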
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide strong hints (readOnly, openWorld, idempotent, non-destructive), so the bar is lower. The description adds valuable context by specifying what gets audited ('SOA propagation and sensitive subdomains'), which goes beyond annotations. It doesn't contradict annotations, and while it could mention more about output format or rate limits, it provides useful behavioral insight.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with a single, front-loaded sentence that wastes no words. Every part ('audit', 'SOA propagation', 'sensitive subdomains') directly contributes to understanding the tool's purpose without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (auditing DNS hygiene), rich annotations cover safety and behavior, and no output schema exists, the description is somewhat complete but lacks details on output structure or error handling. It adequately conveys the core function but could benefit from more context about results or limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters (domain and format). The description doesn't add any parameter-specific details beyond what's in the schema, such as explaining domain validation or format implications. This meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('audit') and resources ('SOA propagation and sensitive subdomains'), making it easy to understand what the tool does. However, it doesn't explicitly differentiate from sibling tools like 'check_dnssec' or 'check_subdomailing', which might have overlapping DNS-related functions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools focused on DNS and domain checks (e.g., 'check_dnssec', 'check_subdomailing'), there's no indication of specific contexts, prerequisites, or exclusions for this audit tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_baseline (B) · Read-only · Idempotent
Compare domain security against a policy baseline.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to scan and compare. | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| baseline | Yes | Policy baseline requirements. | |
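A baseline comparison of this kind can be sketched as a diff of scan results against required checks. The field names below are illustrative, not the server's actual baseline schema, which this listing does not document:

```python
def compare_to_baseline(scan: dict, baseline: dict) -> list[str]:
    """Return the checks where the scan falls short of the baseline.
    Both dicts map check name -> bool (passing / required).
    Field names are illustrative assumptions."""
    return [check for check, required in baseline.items()
            if required and not scan.get(check, False)]

scan = {"dmarc": True, "dnssec": False}
baseline = {"dmarc": True, "dnssec": True, "mta_sts": True}
print(compare_to_baseline(scan, baseline))  # → ['dnssec', 'mta_sts']
```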
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds no additional behavioral context such as rate limits, authentication needs, or what 'compare' entails operationally (e.g., scanning, analysis). No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and wastes no space, making it easy for an agent to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (3 parameters, nested object) and lack of output schema, the description is minimal. Annotations cover safety aspects, but the description doesn't address what the comparison outputs (e.g., pass/fail, detailed report) or how to interpret results, leaving gaps for a tool that likely returns meaningful security data.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the schema itself. The description mentions 'domain' and 'baseline' but adds no meaningful semantics beyond what the schema provides, such as explaining the expected structure of the baseline object or the impact of format choices.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('compare') and target ('domain security against a policy baseline'), which is specific and understandable. However, it doesn't explicitly differentiate from sibling tools like 'compare_domains' or 'scan_domain', which might have overlapping functionality in security assessment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'scan_domain' or 'compare_domains'. It lacks context about prerequisites, typical use cases, or exclusions, leaving the agent to infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_domains (A) · Read-only · Idempotent
Side-by-side security comparison of 2–5 domains. Shows scores, category gaps, and unique weaknesses.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output verbosity. Auto-detected if omitted. | |
| domains | Yes | Domains to compare (2–5 domains) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, open-world, idempotent, and non-destructive behavior. The description adds valuable context by specifying what the comparison shows (scores, category gaps, unique weaknesses), which helps the agent understand the output format and scope beyond the annotations. No contradictions with annotations are present.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and key outputs without unnecessary details. Every word contributes to understanding the tool's function, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (comparative analysis with 2 parameters), rich annotations, and 100% schema coverage, the description is mostly complete. It lacks an output schema, but the description compensates by specifying output elements (scores, gaps, weaknesses). However, it could benefit from more detail on usage scenarios or limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for both parameters. The description mentions '2–5 domains', aligning with the schema's 'domains' parameter, but does not add significant meaning beyond what the schema provides. The baseline score of 3 is appropriate as the schema adequately covers parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs a 'side-by-side security comparison' of domains, specifying the exact resource (domains) and verb (compare). It distinguishes itself from sibling tools by focusing on comparative analysis rather than individual domain scanning or configuration generation, as seen in tools like 'scan_domain' or 'generate_dmarc_record'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by mentioning 'security comparison' and specifying 2–5 domains, but it does not explicitly state when to use this tool versus alternatives like 'compare_baseline' or 'scan_domain'. It provides basic constraints (domain count) but lacks guidance on scenarios or prerequisites for choosing this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
cymru_asn (A) · Read-only · Idempotent
Map domain IPs to Autonomous System Numbers via Team Cymru DNS. Returns ASN, prefix, country, registry, and organization for each IP. Flags high-risk hosting ASNs.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
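Team Cymru's IP-to-ASN service answers TXT queries at `<reversed-octets>.origin.asn.cymru.com` with pipe-delimited fields. A sketch of building the query name and parsing the answer; the DNS lookup itself is omitted, and the sample answer text is illustrative:

```python
def cymru_origin_qname(ip: str) -> str:
    """Build the query name for Team Cymru's IP-to-ASN DNS service:
    a TXT lookup of <reversed-octets>.origin.asn.cymru.com."""
    octets = ip.split(".")
    return ".".join(reversed(octets)) + ".origin.asn.cymru.com"

def parse_cymru_txt(txt: str) -> dict:
    """Parse the pipe-delimited TXT answer:
    'AS | prefix | country | registry | allocated'."""
    asn, prefix, country, registry, allocated = [f.strip() for f in txt.split("|")]
    return {"asn": asn, "prefix": prefix, "country": country,
            "registry": registry, "allocated": allocated}

print(cymru_origin_qname("192.0.2.1"))  # → 1.2.0.192.origin.asn.cymru.com
print(parse_cymru_txt("15169 | 8.8.8.0/24 | US | arin | 1992-12-01"))
```

The parsed ASN is what the tool would then match against a list of high-risk hosting networks.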
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, openWorldHint=true, idempotentHint=true, and destructiveHint=false. The description adds valuable behavioral context beyond annotations by specifying the data source ('via Team Cymru DNS'), output format details, and the special feature of 'Flags high-risk hosting ASNs.' This enhances understanding of the tool's behavior without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each earn their place. The first sentence covers purpose, method, and output. The second sentence adds the unique high-risk flagging feature. No wasted words, and information is front-loaded appropriately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity, rich annotations (4 hints), and 100% schema coverage, the description provides good contextual completeness. It explains what the tool does, how it works, what it returns, and a special feature. The main gap is lack of output format details (no output schema exists), but the description mentions key return fields, which helps compensate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents both parameters. The description doesn't add parameter-specific semantics beyond what's in the schema (domain format example, format options). However, it implies the 'domain' parameter is used for ASN mapping, which aligns with schema documentation. Baseline 3 is appropriate when schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Map domain IPs to Autonomous System Numbers'), resource ('via Team Cymru DNS'), and output details ('Returns ASN, prefix, country, registry, and organization for each IP'). It distinguishes itself from sibling tools by focusing on ASN mapping rather than DNS security checks, domain validation, or other analysis functions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by mentioning 'high-risk hosting ASNs' and the mapping function, suggesting it's for security/threat intelligence analysis. However, it doesn't explicitly state when to use this tool versus alternatives like 'check_mx_reputation' or 'map_supply_chain', nor does it provide exclusion criteria or prerequisites for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
discover_subdomains (A) · Read-only · Idempotent
Find subdomains of a domain using Certificate Transparency logs. Reveals shadow IT, forgotten services, and unauthorized certificate issuance.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
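Certificate Transparency discovery of this kind typically queries a CT index such as crt.sh and deduplicates the names found in each entry. A sketch of the parsing step, assuming a crt.sh-style JSON response with `name_value` fields; the HTTP fetch is omitted:

```python
import json

def subdomains_from_ct(ct_json: str, domain: str) -> set[str]:
    """Extract unique subdomains from a crt.sh-style JSON response, where
    each entry's name_value may hold several newline-separated names."""
    names = set()
    for entry in json.loads(ct_json):
        for name in entry.get("name_value", "").splitlines():
            name = name.lstrip("*.").lower()  # drop wildcard prefixes
            if name.endswith("." + domain):
                names.add(name)
    return names

sample = json.dumps([
    {"name_value": "www.example.com\napi.example.com"},
    {"name_value": "*.dev.example.com"},
])
print(sorted(subdomains_from_ct(sample, "example.com")))
```

Names surfaced this way that are absent from the organization's DNS inventory are the "shadow IT" the description refers to.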
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide excellent coverage (readOnlyHint, openWorldHint, idempotentHint, destructiveHint), but the description adds valuable context about what the tool reveals ('shadow IT, forgotten services, unauthorized certificate issuance') that goes beyond the annotations. This helps the agent understand the investigative value and typical findings, though it doesn't mention rate limits or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each earn their place. The first sentence states the core functionality, and the second sentence explains the value/use cases. No wasted words, and the most important information (what the tool does) comes first.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity, excellent annotation coverage, and 100% schema coverage, the description provides good contextual completeness. It explains the investigative value and use cases well. The main gap is the lack of output schema, so the agent doesn't know what format the results will be in, but the description compensates reasonably well given the strong structured data support.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema already fully documents both parameters. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline of 3. The description's focus is on tool purpose rather than parameter details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verb ('Find') and resource ('subdomains of a domain'), and distinguishes it from siblings by specifying the method ('using Certificate Transparency logs'). It explicitly differentiates from tools like 'check_subdomailing' or 'scan_domain' by focusing on certificate-based discovery rather than general scanning.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use this tool ('Reveals shadow IT, forgotten services, and unauthorized certificate issuance'), which helps identify appropriate use cases. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the many sibling tools, though the purpose differentiation implies alternatives exist.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_finding · Grade B · Read-only · Idempotent
Explain a finding with impact and remediation.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output verbosity. Auto-detected if omitted. | |
| status | Yes | Finding severity or status. | |
| details | No | Additional detail from check result. | |
| checkType | Yes | Check type (e.g., 'SPF', 'DMARC'). | |
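A hypothetical sketch of the lookup a tool like this implies: mapping a (checkType, status) pair to impact and remediation text. The table contents and helper name are illustrative assumptions, not the server's actual logic.

```python
# Hypothetical mapping from (checkType, status) to impact/remediation text.
EXPLANATIONS = {
    ("SPF", "fail"): {
        "impact": "Mail from unauthorized servers is not flagged.",
        "remediation": "Publish or correct the domain's SPF TXT record.",
    },
    ("DMARC", "missing"): {
        "impact": "Spoofed mail using the domain cannot be rejected.",
        "remediation": "Publish a _dmarc TXT record with an explicit policy (p=).",
    },
}

def explain_finding(check_type, status, details=""):
    entry = EXPLANATIONS.get((check_type, status))
    if entry is None:
        # Unknown combination: fall back to a generic explanation
        return {"impact": "Unknown finding.", "remediation": "Review manually.", "details": details}
    return {**entry, "details": details}

print(explain_finding("SPF", "fail", details="~all missing")["remediation"])
```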
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds minimal behavioral context by mentioning 'impact and remediation', but doesn't elaborate on output format, error handling, or rate limits. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that states the purpose without unnecessary words. It's front-loaded with the core action, though it could be slightly more structured by explicitly separating impact and remediation aspects for clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the annotations cover safety and idempotency, and the schema fully documents parameters, the description provides a basic overview. However, with no output schema and many sibling tools, it lacks details on return values, error cases, or integration context, making it adequate but incomplete for optimal agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for all parameters (e.g., 'checkType' as the check type, 'status' as severity). The description doesn't add meaning beyond the schema, such as explaining how parameters interact or providing examples, so it meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Explain') and the resource ('a finding'), specifying what aspects to cover ('with impact and remediation'). However, it doesn't differentiate this tool from sibling tools like 'generate_fix_plan' or 'validate_fix' which might also involve remediation guidance, leaving some ambiguity about its unique role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'analyze_drift', 'assess_spoofability', and 'generate_fix_plan' that might overlap in function, there's no indication of context, prerequisites, or exclusions to help an agent choose appropriately.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_dkim_config · Grade A · Read-only · Idempotent
Generate DKIM setup instructions and DNS record.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| provider | No | Provider (e.g., "google"). Omit for generic. | |
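A minimal sketch of the kind of DNS record such a tool emits: a DKIM public key published as a TXT record at `<selector>._domainkey.<domain>` (per RFC 6376). The selector and key below are placeholders.

```python
# Sketch: build the DKIM TXT record name and value for a domain.
# The selector and base64 key are placeholder values.
def dkim_record(domain, selector, public_key_b64):
    name = f"{selector}._domainkey.{domain}"
    value = f"v=DKIM1; k=rsa; p={public_key_b64}"
    return name, value

name, value = dkim_record("example.com", "mail", "MIIBIjANBgPLACEHOLDER")
print(name)   # mail._domainkey.example.com
print(value)  # v=DKIM1; k=rsa; p=MIIBIjANBgPLACEHOLDER
```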
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate this is a read-only, non-destructive, idempotent operation with open-world data. The description adds valuable context by specifying what gets generated ('setup instructions and DNS record'), which goes beyond the annotations. However, it doesn't mention potential rate limits, authentication requirements, or output format details, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without any unnecessary words. It's front-loaded with the core action and resource, making it immediately understandable. Every word earns its place, achieving maximum clarity with minimal verbiage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, no output schema) and rich annotations, the description is adequate but has gaps. It clearly states what the tool does, but lacks guidance on usage context and doesn't explain what the generated output looks like (instructions format, DNS record details). The annotations help, but the description could better address the tool's role in the broader sibling ecosystem.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with each parameter clearly documented in the schema itself. The description doesn't add any additional meaning or clarification about the parameters beyond what the schema provides. According to the rules, when schema coverage is high (>80%), the baseline score is 3, which applies here as the description doesn't compensate or enhance parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Generate DKIM setup instructions and DNS record') with the resource (DKIM configuration). It distinguishes itself from sibling tools like 'check_dkim' (which verifies) and 'generate_dmarc_record' (which focuses on DMARC), making the purpose unambiguous and well-differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. While it's clear this generates DKIM configuration, there's no mention of prerequisites (e.g., needing domain ownership), when it's appropriate (e.g., initial setup vs. troubleshooting), or how it relates to siblings like 'generate_dmarc_record' or 'check_dkim'. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_dmarc_record · Grade B · Read-only · Idempotent
Generate DMARC record with configurable policy.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| policy | No | Policy (default "reject"). | |
| rua_email | No | Report email. Default: dmarc-reports@{domain}. | |
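A sketch of the record value this tool's parameters describe, assuming standard DMARC tag syntax (RFC 7489) and mirroring the table's defaults (policy "reject", rua `dmarc-reports@{domain}`):

```python
# Sketch: build a DMARC TXT record value (published at _dmarc.<domain>)
# with the defaults listed in the parameter table above.
def dmarc_record(domain, policy="reject", rua_email=None):
    rua = rua_email or f"dmarc-reports@{domain}"  # table default
    return f"v=DMARC1; p={policy}; rua=mailto:{rua}"

print(dmarc_record("example.com"))
# v=DMARC1; p=reject; rua=mailto:dmarc-reports@example.com
```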
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true, covering safety and idempotency. The description adds minimal behavioral context beyond this, mentioning 'configurable policy' which hints at customization but doesn't detail side effects, rate limits, or authentication needs. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero wasted words. It's front-loaded with the core purpose and includes a key feature ('configurable policy'). Every part of the description earns its place without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (4 parameters, no output schema), annotations provide good safety coverage, but the description lacks context on output format, error handling, or integration with sibling tools. It's minimally adequate but leaves gaps in guiding the agent on what to expect from the tool's behavior and results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the schema (e.g., domain, format, policy, rua_email). The description adds no additional parameter semantics beyond what's in the schema, such as explaining interactions between parameters or default behaviors. Baseline 3 is appropriate given the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate DMARC record with configurable policy.' It specifies the verb ('generate'), resource ('DMARC record'), and scope ('configurable policy'). However, it doesn't explicitly differentiate from sibling tools like 'generate_dkim_config' or 'generate_spf_record' beyond the DMARC focus.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, typical use cases, or how it relates to sibling tools like 'check_dmarc' or 'generate_fix_plan'. The agent must infer usage from the tool name and context alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_fix_plan · Grade B · Read-only · Idempotent
Generate prioritized remediation plan with effort estimates.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
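A hypothetical sketch of the prioritization such a plan implies: order findings by severity, breaking ties by estimated effort. The severity weights, field names, and sample findings are assumptions for illustration.

```python
# Hypothetical prioritization: lower rank = more severe; ties broken
# by lower estimated effort first.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def fix_plan(findings):
    """findings: list of {'check': str, 'severity': str, 'effort_hours': int}"""
    return sorted(findings, key=lambda f: (SEVERITY_RANK[f["severity"]], f["effort_hours"]))

plan = fix_plan([
    {"check": "DNSSEC", "severity": "medium", "effort_hours": 8},
    {"check": "DMARC", "severity": "critical", "effort_hours": 2},
    {"check": "SPF", "severity": "critical", "effort_hours": 1},
])
print([f["check"] for f in plan])  # ['SPF', 'DMARC', 'DNSSEC']
```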
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds minimal behavioral context about 'prioritized' and 'effort estimates' but doesn't explain what 'remediation' entails, what data sources it uses, or how prioritization works. It doesn't contradict annotations, but adds limited value beyond them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence: 'Generate prioritized remediation plan with effort estimates.' It's front-loaded with the core purpose and includes key features without unnecessary words. Every word earns its place, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (generating plans with prioritization and estimates), the annotations provide good safety coverage, but there's no output schema. The description doesn't explain what the remediation plan contains, how effort estimates are calculated, or what format the output takes. It's minimally adequate but leaves significant gaps about the tool's output and operational context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('domain' and 'format') well-documented in the schema. The description doesn't add any parameter-specific information beyond what the schema already states. According to guidelines, when schema coverage is high (>80%), the baseline score is 3 even when the description adds no parameter information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate prioritized remediation plan with effort estimates.' It specifies the action (generate), the output type (prioritized remediation plan), and additional features (effort estimates). However, it doesn't differentiate this from sibling tools like 'generate_rollout_plan' or 'validate_fix' which also generate plans or validate fixes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'analyze_drift', 'scan_domain', and 'validate_fix', there's no indication of whether this should be used after scanning, instead of other analysis tools, or as a final step. The description lacks any 'when' or 'when not' context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_mta_sts_policy · Grade B · Read-only · Idempotent
Generate MTA-STS record and policy file.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| mx_hosts | No | MX hosts. Omit to detect from DNS. | |
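A sketch of the two artifacts MTA-STS requires, assuming the RFC 8461 layout: a TXT record at `_mta-sts.<domain>` plus a policy file served at `https://mta-sts.<domain>/.well-known/mta-sts.txt`. The policy id and mode below are placeholder choices.

```python
# Sketch: build the MTA-STS TXT record and policy file body (RFC 8461).
# policy_id is a placeholder; real ids change whenever the policy does.
def mta_sts(domain, mx_hosts, policy_id="20240101000000"):
    record = (f"_mta-sts.{domain}", f"v=STSv1; id={policy_id}")
    policy = "version: STSv1\nmode: enforce\n"
    policy += "".join(f"mx: {mx}\n" for mx in mx_hosts)
    policy += "max_age: 86400\n"
    return record, policy

record, policy = mta_sts("example.com", ["mx1.example.com", "mx2.example.com"])
print(record[1])  # v=STSv1; id=20240101000000
print(policy)
```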
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide significant behavioral information: readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds minimal context beyond this - it mentions generating both a record and policy file, which gives some implementation detail. However, it doesn't describe what 'generate' entails (e.g., whether it creates actual files, returns content, or just provides configuration), nor does it mention any rate limits or authentication requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise at just one sentence with no wasted words. It's front-loaded with the core purpose and contains no unnecessary information. Every word earns its place in this minimal but complete statement of what the tool does.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (readOnly, idempotent, non-destructive) and complete schema coverage, the description provides adequate context for a read-only generation tool. However, with no output schema, the description doesn't explain what gets returned (e.g., the policy content, file locations, or format details). For a generation tool, knowing the output format would be helpful, though the annotations provide safety assurances.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already fully documents all three parameters. The description adds no additional parameter semantics beyond what's in the schema - it doesn't explain the relationship between the record and policy file generation, nor does it provide context about when to use different formats or MX host configurations. The baseline score of 3 is appropriate when the schema does all the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate MTA-STS record and policy file.' It specifies both the verb ('Generate') and the resources ('MTA-STS record and policy file'), making the purpose unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'check_mta_sts' or 'generate_dmarc_record', which would be needed for a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when this generation tool should be used instead of checking tools like 'check_mta_sts', nor does it provide any prerequisites or context for usage. The agent must infer usage from the tool name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_rollout_plan · Grade A · Read-only · Idempotent
Generate a phased DMARC enforcement timeline with exact DNS records per phase.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to generate rollout plan for | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| timeline | No | Rollout speed: aggressive, standard, conservative (default: standard) | |
| target_policy | No | Target DMARC policy (default: reject) | |
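A sketch of what a phased rollout with "exact DNS records per phase" could look like, assuming the common none, quarantine, reject progression toward the default target policy of reject. The phase durations per timeline are illustrative assumptions.

```python
# Hypothetical phase tables: (policy, weeks before advancing);
# None means the final, permanent phase.
PHASES = {
    "aggressive": [("none", 2), ("quarantine", 2), ("reject", None)],
    "standard": [("none", 4), ("quarantine", 4), ("reject", None)],
    "conservative": [("none", 8), ("quarantine", 8), ("reject", None)],
}

def rollout_plan(domain, timeline="standard"):
    plan = []
    for policy, weeks in PHASES[timeline]:
        # Each phase publishes a concrete DMARC record for the domain
        record = f"v=DMARC1; p={policy}; rua=mailto:dmarc-reports@{domain}"
        plan.append({"policy": policy, "weeks": weeks, "record": record})
    return plan

for phase in rollout_plan("example.com"):
    print(phase["policy"], phase["record"])
```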
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide key behavioral traits: readOnlyHint=true, destructiveHint=false, idempotentHint=true, and openWorldHint=true. The description adds context by specifying the output includes 'exact DNS records per phase,' which clarifies the tool's generative nature beyond just planning. However, it doesn't disclose additional details like rate limits, authentication needs, or error handling.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('Generate a phased DMARC enforcement timeline') and adds specific output details ('with exact DNS records per phase'). There is no wasted wording, and it directly communicates the tool's function without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (4 parameters, no output schema) and rich annotations, the description is mostly complete. It clearly states what the tool does and its output format. However, it could benefit from mentioning the lack of side effects (implied by annotations) or example use cases to fully guide an AI agent in contextual decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the input schema (e.g., domain, format, timeline, target_policy). The description doesn't add any parameter-specific semantics beyond what the schema provides, such as explaining interactions between parameters or default behaviors. Baseline 3 is appropriate given the comprehensive schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('generate'), the resource ('phased DMARC enforcement timeline'), and the output details ('exact DNS records per phase'). It distinguishes this tool from siblings like 'generate_dmarc_record' (which creates a single record) and 'generate_fix_plan' (which addresses issues rather than planning enforcement).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by specifying 'phased DMARC enforcement timeline,' suggesting it's for planning DMARC rollout rather than immediate implementation or analysis. However, it doesn't explicitly state when to use this tool versus alternatives like 'generate_fix_plan' or 'generate_dmarc_record,' nor does it mention prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_spf_record · Grade B · Read-only · Idempotent
Generate corrected SPF record from detected providers.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| include_providers | No | Providers to include (e.g., ["google"]). | |
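A minimal sketch of assembling an SPF record from provider include mechanisms (RFC 7208 syntax). The provider-to-include mapping below is a small assumed subset, not the server's detection logic.

```python
# Sketch: map provider names to their published SPF include mechanisms
# and assemble a record ending in a softfail qualifier.
PROVIDER_INCLUDES = {
    "google": "include:_spf.google.com",
    "microsoft": "include:spf.protection.outlook.com",
}

def spf_record(providers):
    mechanisms = [PROVIDER_INCLUDES[p] for p in providers]
    return "v=spf1 " + " ".join(mechanisms) + " ~all"

print(spf_record(["google"]))  # v=spf1 include:_spf.google.com ~all
```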
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover key behavioral traits: read-only, open-world, idempotent, and non-destructive. The description adds minimal context by mentioning 'corrected' and 'detected providers,' which hints at analysis-based generation. However, it lacks details on rate limits, authentication needs, or output format, leaving some behavioral aspects unclear despite annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary details. Every word earns its place, making it highly concise and well-structured for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (3 parameters, no output schema) and rich annotations, the description is minimally adequate. It states what the tool does but lacks details on output format, error handling, or integration with sibling tools, leaving gaps in completeness for effective agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing full parameter documentation. The description doesn't add meaning beyond the schema, as it doesn't explain parameter interactions or usage examples. With high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Generate corrected SPF record from detected providers.' It specifies the verb ('generate'), resource ('SPF record'), and source ('detected providers'), making the function unambiguous. However, it doesn't explicitly differentiate from sibling tools like 'check_spf' or 'resolve_spf_chain', which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'check_spf' (for analysis) or 'resolve_spf_chain' (for diagnostics), nor does it specify prerequisites such as needing prior domain scanning. Usage is implied but not explicitly stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_benchmark · Grade B · Read-only · Idempotent
Get score benchmarks: percentiles, mean, top failures.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output verbosity. Auto-detected if omitted. | |
| profile | No | Profile to benchmark (default "mail_enabled"). | |
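A sketch of the statistics the description names (percentiles and mean), assuming per-domain scores on a 0-100 scale; the sample scores are invented for illustration.

```python
# Sketch: compute quartiles and the mean over a cohort of scores.
from statistics import mean, quantiles

def benchmark(scores):
    p25, p50, p75 = quantiles(scores, n=4)  # quartile cut points
    return {"mean": mean(scores), "p25": p25, "p50": p50, "p75": p75}

print(benchmark([40, 55, 60, 70, 85, 90]))
```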
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover key behavioral traits (read-only, open-world, idempotent, non-destructive), so the description's burden is lower. It adds value by specifying output types ('percentiles, mean, top failures'), but does not disclose additional context such as rate limits, authentication needs, or data sources. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose ('Get score benchmarks') and lists key outputs without unnecessary words. Every part earns its place by clarifying what the tool returns, making it appropriately sized and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 parameters, no output schema) and rich annotations, the description is minimally complete. It states what the tool does but lacks details on output format, error handling, or integration with sibling tools. With annotations covering safety and behavior, it's adequate but has clear gaps in contextual guidance.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters ('format' and 'profile') well-documented in the schema. The description does not add any meaning beyond the schema, such as explaining the significance of 'top failures' or how 'profile' affects benchmarks, so it meets the baseline for high schema coverage without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with 'Get score benchmarks' followed by specific outputs ('percentiles, mean, top failures'), which is a specific verb+resource combination. However, it does not explicitly distinguish this tool from sibling tools like 'compare_baseline' or 'get_provider_insights', which might involve similar benchmarking concepts, so it misses full sibling differentiation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'compare_baseline' and 'get_provider_insights' that might overlap in benchmarking or analysis, there is no explicit mention of context, exclusions, or preferred scenarios for this tool, leaving usage unclear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_provider_insights (B) · Read-only · Idempotent
Get provider cohort benchmarks and common issues.
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Output verbosity. Auto-detected if omitted. | |
| profile | No | Profile (default "mail_enabled"). | |
| provider | Yes | Provider (e.g., "google workspace"). | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide strong behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true), covering safety and idempotency. The description adds minimal context about what 'insights' include (benchmarks and common issues), but doesn't elaborate on data sources, freshness, or limitations. No contradiction with annotations exists.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It's front-loaded with the core functionality and wastes no space on repetition or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (covering read-only, idempotent, non-destructive behavior) and complete schema documentation, the description provides adequate context for a read-only query tool. However, without an output schema, it doesn't detail the structure or format of returned insights, leaving a gap in understanding what 'benchmarks and common issues' entail.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with all parameters well-documented in the schema itself (e.g., 'provider' as the provider name, 'format' for output verbosity, 'profile' for provider type). The description adds no additional parameter semantics beyond what's in the schema, so it meets the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as retrieving 'provider cohort benchmarks and common issues,' which specifies both the action ('get') and the resource ('provider insights'). It distinguishes itself from most sibling tools focused on domain scanning or configuration generation, though it doesn't explicitly differentiate from 'get_benchmark', which might have overlapping functionality.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, appropriate contexts, or compare it to sibling tools like 'get_benchmark' or 'analyze_drift' that might serve related purposes. The agent must infer usage from the tool name and parameters alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
map_compliance (A) · Read-only · Idempotent
Map scan findings to compliance frameworks: NIST 800-177, PCI DSS 4.0, SOC 2, CIS Controls. Shows pass/fail/partial status per control.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations cover read-only, open-world, idempotent, and non-destructive traits. The description adds valuable context about output behavior ('shows pass/fail/partial status per control') and mentions specific compliance frameworks, which helps the agent understand what to expect beyond the basic safety profile.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently communicates the core functionality and output format. Every word serves a purpose, with no redundant information or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity, comprehensive annotations, and full parameter documentation, the description provides sufficient context for an agent to understand its purpose and basic behavior. The lack of output schema is partially compensated by the description mentioning the output format ('pass/fail/partial status'), though more detail on return structure would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the schema fully documents both parameters. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline expectation without providing extra semantic value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('map scan findings to compliance frameworks') and resources ('NIST 800-177, PCI DSS 4.0, SOC 2, CIS Controls'), and distinguishes it from siblings by focusing on compliance mapping rather than scanning, analysis, or configuration generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when compliance mapping is needed, but provides no explicit guidance on when to use this tool versus alternatives like 'get_benchmark' or 'assess_spoofability'. It lacks clear exclusions or prerequisites for effective tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
map_supply_chain (A) · Read-only · Idempotent
Map third-party service dependencies from DNS records. Correlates SPF, NS, TXT verifications, SRV services, and CAA to show who can send as you, control your DNS, and what services are integrated.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide important behavioral hints (readOnlyHint: true, destructiveHint: false, idempotentHint: true, openWorldHint: true). The description adds valuable context by explaining what the tool actually reveals: 'who can send as you, control your DNS, and what services are integrated.' This goes beyond the annotations to describe the tool's investigative nature and security implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise and front-loaded. The first sentence establishes the core purpose, and the second sentence elaborates on the specific correlations and outcomes. Every word earns its place with no redundancy or unnecessary elaboration.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (analyzing multiple DNS record types for security insights) and the absence of an output schema, the description provides good context about what the tool reveals. However, it doesn't specify the format or structure of the returned dependency mapping, which would be helpful since there's no output schema. The annotations provide good behavioral coverage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters clearly documented in the schema. The description doesn't add any parameter-specific information beyond what's already in the schema. The baseline score of 3 is appropriate since the schema does the heavy lifting for parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: mapping third-party service dependencies from DNS records. It specifies the exact DNS record types analyzed (SPF, NS, TXT, SRV, CAA) and the outcomes (showing who can send as you, control your DNS, and integrated services). This distinguishes it from sibling tools like check_spf or check_ns that focus on individual record types rather than comprehensive correlation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by stating it 'correlates' multiple DNS record types to show service dependencies, suggesting it should be used for comprehensive supply chain analysis rather than individual checks. However, it doesn't explicitly state when to use this tool versus alternatives like scan_domain or map_compliance, nor does it provide exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
rdap_lookup (A) · Read-only · Idempotent
Fetch domain registration data via RDAP (modern WHOIS replacement). Returns registrar, creation/expiration dates, EPP status, registrant info, and domain age.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds valuable context by specifying the return data (registrar, dates, status, registrant info, domain age) and noting RDAP as a modern replacement, which helps the agent understand the tool's scope and output format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the purpose and key return values without redundancy. Every phrase adds value, such as clarifying RDAP's role and enumerating output fields, making it appropriately sized for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the rich annotations (read-only, idempotent, non-destructive) and 100% schema coverage, the description is largely complete. It specifies the return data, which compensates for the lack of an output schema. However, it could mention rate limits or authentication needs for full transparency.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear parameter documentation. The description does not add meaning beyond the schema, as it mentions no parameter details. However, it implies the 'domain' parameter is central by listing return data types, aligning with the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Fetch domain registration data via RDAP') and resource ('domain registration data'), distinguishing it from sibling tools focused on DNS, security, or configuration checks. It explicitly positions RDAP as a 'modern WHOIS replacement' to clarify its domain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for retrieving domain registration metadata but provides no explicit guidance on when to use this tool versus alternatives like WHOIS or sibling tools (e.g., check_dnssec for DNS security). It lacks context on prerequisites or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
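The fields rdap_lookup returns (registrar, creation/expiration dates, EPP status, domain age) correspond to the standard RDAP JSON shape defined in RFC 9083. A minimal sketch of extracting them, using an inline sample response instead of a live query; the sample values and the `summarize` helper are illustrative assumptions, not the server's implementation. A real client would fetch the JSON from an RDAP bootstrap service and follow the redirect to the registry's endpoint.

```python
# Parse the parts of an RDAP domain response that rdap_lookup surfaces.
# The sample mimics the RFC 9083 JSON shape; values are illustrative.
from datetime import datetime, timezone

SAMPLE = {
    "ldhName": "example.com",
    "status": ["client transfer prohibited"],
    "events": [
        {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2026-08-13T04:00:00Z"},
    ],
    "entities": [
        {"roles": ["registrar"], "handle": "376"},
    ],
}

def summarize(rdap):
    # Index events by action, and find the entity carrying the registrar role.
    events = {e["eventAction"]: e["eventDate"] for e in rdap.get("events", [])}
    registrar = next(
        (e.get("handle") for e in rdap.get("entities", [])
         if "registrar" in e.get("roles", [])),
        None,
    )
    created = datetime.fromisoformat(events["registration"].replace("Z", "+00:00"))
    return {
        "domain": rdap["ldhName"],
        "registrar_handle": registrar,
        "created": events.get("registration"),
        "expires": events.get("expiration"),
        "epp_status": rdap.get("status", []),
        "age_days": (datetime.now(timezone.utc) - created).days,
    }

summary = summarize(SAMPLE)
```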
resolve_spf_chain (A) · Read-only · Idempotent
Trace the full SPF include chain for a domain. Recursively resolves all includes, shows lookup count, tree depth, and flags circular includes or exceeding the 10-lookup limit.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds valuable behavioral context beyond annotations: it discloses that the tool performs recursive resolution, counts lookups, tracks tree depth, and flags circular includes or exceeding the 10-lookup limit. This enhances transparency about the tool's operational behavior without contradicting annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the tool's purpose, key features (recursive resolution, lookup count, tree depth, circular include detection, lookup limit), and scope. It is front-loaded with the main action and avoids unnecessary words, making every part of the sentence earn its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (involving recursive SPF chain resolution) and the absence of an output schema, the description does a good job of outlining what the tool does and key behavioral aspects. However, it does not specify the output format or structure (e.g., whether it returns a tree, list, or summary), which could be helpful for an AI agent. Annotations cover safety and idempotency, but the description could be more complete regarding output details.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear descriptions for both parameters (domain and format). The description does not add significant meaning beyond the schema, as it focuses on the tool's functionality rather than parameter details. The baseline score of 3 is appropriate since the schema adequately documents parameters, and the description does not compensate or elaborate further.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Trace the full SPF include chain for a domain') and distinguishes it from sibling tools like 'check_spf' by specifying recursive resolution of includes, lookup counting, tree depth analysis, and detection of circular includes or lookup limit violations. It provides a verb+resource+scope combination that is precise and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (e.g., for SPF analysis or troubleshooting) by mentioning features like circular include detection and lookup limits, but it does not explicitly state when to use this tool versus alternatives like 'check_spf' or other sibling tools. It provides clear functional scope but lacks explicit when/when-not guidance or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
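The behavior resolve_spf_chain describes (recursive include resolution, lookup counting against the RFC 7208 limit of 10, and circular-include detection) can be sketched as follows. The record map, domain names, and `resolve_chain` helper are illustrative assumptions; a real resolver would issue live TXT queries rather than read a dictionary.

```python
# Sketch of the traversal resolve_spf_chain describes: follow include:
# mechanisms recursively, count DNS lookups against the RFC 7208 limit,
# track tree depth, and flag circular includes.
RECORDS = {
    "example.com": "v=spf1 include:_spf.mailer.test include:_spf.crm.test ~all",
    "_spf.mailer.test": "v=spf1 ip4:192.0.2.0/24 include:_spf.deep.test -all",
    "_spf.crm.test": "v=spf1 ip4:198.51.100.0/24 -all",
    "_spf.deep.test": "v=spf1 ip4:203.0.113.0/24 -all",
    "loop.test": "v=spf1 include:loop.test -all",
}

LOOKUP_LIMIT = 10  # RFC 7208 section 4.6.4

def resolve_chain(domain, path=None, depth=0, state=None):
    """Walk the include tree, accumulating lookups, max depth, and issues."""
    path = set() if path is None else path
    state = state or {"lookups": 0, "max_depth": 0, "issues": []}
    if domain in path:  # already on the current branch: a cycle
        state["issues"].append(f"circular include: {domain}")
        return state
    path.add(domain)
    state["max_depth"] = max(state["max_depth"], depth)
    for term in RECORDS.get(domain, "").split():
        if term.startswith("include:"):
            state["lookups"] += 1  # each include costs one DNS lookup
            if state["lookups"] > LOOKUP_LIMIT:
                state["issues"].append("exceeds 10-lookup limit (permerror)")
                break
            resolve_chain(term.split(":", 1)[1], path, depth + 1, state)
    path.discard(domain)  # leaving this branch
    return state

result = resolve_chain("example.com")  # 3 lookups, depth 2, no issues
```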
scan_domain (A) · Read-only · Idempotent
Look up any domain to get a full DNS and email security audit. Use this whenever a user mentions a domain name, asks to check/scan/lookup/analyze a domain, or wants to know about a domain's security posture. Returns score, grade, maturity stage, and prioritized findings. Start here for any domain-related question.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| profile | No | Scoring profile. Default "auto" detects. | |
| force_refresh | No | Bypass cache and run a fresh scan. Useful after DNS changes. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable context beyond annotations: it specifies the return format ('score, grade, maturity stage, and prioritized findings') and implies a comprehensive audit scope. Annotations cover safety (readOnlyHint=true, destructiveHint=false) and idempotency, so the bar is lower, but the description still enhances understanding of the tool's behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by usage guidelines and output details, all in four efficient sentences with zero wasted words. Each sentence earns its place by adding distinct value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (4 parameters, no output schema), the description provides strong purpose, usage, and output context. It compensates for the lack of output schema by detailing return values. However, it could briefly mention the optional parameters' roles to enhance completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents all parameters. The description does not add any parameter-specific details beyond what the schema provides, such as explaining the 'profile' enums or 'force_refresh' implications. This meets the baseline of 3 for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('look up', 'get a full DNS and email security audit') and resource ('domain'), and distinguishes it from siblings by positioning it as the starting point for any domain-related question, unlike the more specialized sibling tools like check_dmarc or check_spf.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides explicit guidance on when to use this tool ('whenever a user mentions a domain name, asks to check/scan/lookup/analyze a domain, or wants to know about a domain's security posture') and positions it as the primary entry point ('Start here for any domain-related question'), effectively distinguishing it from the many specialized sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
simulate_attack_paths (A) · Read-only · Idempotent
Analyze current DNS posture and enumerate specific attack paths an adversary could exploit, with severity, feasibility, steps, and mitigations.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check (e.g., example.com) | |
| format | No | Output verbosity. Auto-detected if omitted. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, and destructiveHint=false, indicating safe, repeatable operations. The description adds value by specifying the analysis scope ('current DNS posture') and output details (severity, feasibility, steps, mitigations), though it does not mention rate limits or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently conveys the tool's purpose and outputs without unnecessary details. It is front-loaded with key actions and leaves no waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of attack path simulation, the description is complete for a read-only tool with good annotations. It specifies the analysis scope and output details, though no output schema exists to clarify return values. Slightly more detail on behavioral aspects like performance could improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with clear documentation for 'domain' and 'format' parameters. The description does not add meaning beyond the schema, as it does not explain parameter usage or constraints. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('analyze', 'enumerate') and resources ('current DNS posture', 'attack paths'), including detailed outputs like severity, feasibility, steps, and mitigations. It distinguishes from siblings by focusing on attack path simulation rather than specific checks like DNS records or security scans.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for analyzing DNS posture and enumerating attack paths, but does not explicitly state when to use this tool versus alternatives like 'scan_domain' or 'check_dnssec'. It provides context but lacks specific exclusions or named alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
validate_fix (A) · Read-only · Idempotent
Re-check a specific control after applying a fix. Confirms whether the finding is resolved.
| Name | Required | Description | Default |
|---|---|---|---|
| check | Yes | Check name to re-run (e.g., "dmarc", "spf") | |
| domain | Yes | Domain to validate the fix for | |
| format | No | Output verbosity. Auto-detected if omitted. | |
| expected | No | Expected DNS record value to verify against | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds valuable context beyond annotations by specifying that this tool is for 're-checking' after fixes and 'confirms whether the finding is resolved.' While annotations already indicate this is a read-only, idempotent, non-destructive operation, the description clarifies the specific workflow context (post-fix validation) and purpose (confirmation of resolution), which helps the agent understand when this tool is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each serve distinct purposes: the first establishes the action and context, the second states the outcome. There's no wasted language, and the most important information (re-checking after fixes) is front-loaded. Every word contributes to understanding the tool's purpose and usage.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only validation tool with comprehensive annotations and schema coverage, the description provides sufficient context about when and why to use it. While there's no output schema, the description indicates what the tool confirms ('whether the finding is resolved'), which gives the agent reasonable expectations about the return value. The description could be slightly more complete by mentioning what format the confirmation takes, but it's adequate given the tool's straightforward nature.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage, the input schema already provides complete documentation for all parameters. The description doesn't add any additional parameter semantics beyond what's in the schema, but it does imply the relationship between parameters (domain and check being validated together after a fix). This meets the baseline expectation when schema coverage is comprehensive.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Re-check', 'Confirms') and resource ('a specific control', 'the finding'), distinguishing it from sibling tools like 'scan_domain' or 'check_dmarc' which perform initial assessments rather than post-fix validation. It explicitly mentions the context of 'after applying a fix', which sets it apart from other verification tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool: 'after applying a fix' to confirm resolution. This clearly distinguishes it from initial assessment tools like 'scan_domain' or specific check tools like 'check_dmarc', which would be used before fixes are applied. The context of re-checking implies this tool should be used as a follow-up to remediation actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
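The optional `expected` parameter implies a comparison between the observed DNS record and the value the fix should have produced. A minimal sketch of that comparison, assuming the common case of TXT records, which resolvers may return as multiple adjacent quoted strings with variable whitespace; both helper functions are illustrative, not the server's implementation.

```python
# Normalize TXT record values before comparing: join adjacent quoted
# strings and collapse whitespace, so cosmetic differences don't cause
# a false "fix not applied" result.
def normalize_txt(value):
    joined = "".join(part.strip('"') for part in value.split('" "'))
    return " ".join(joined.strip('"').split())

def fix_applied(observed, expected):
    return normalize_txt(observed) == normalize_txt(expected)

# A TXT answer split into two quoted strings still matches the intended value.
observed = '"v=spf1 include:_spf.mailer.test" " -all"'
expected = "v=spf1 include:_spf.mailer.test -all"
```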
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.