ContrastAPI
Server Details
55 tools, 7 Resources, Sigma rules, email SPF/DMARC, MITRE, CVE/KEV, risk_score. No key.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: UPinar/contrastapi
- GitHub Stars: 29
- Server Listing: contrastapi
Tool Definition Quality
Average 4.7/5 across 54 of 54 tools scored.
Most tools have clearly distinct purposes, with differences between lookup/search/scan/audit for each domain. However, some overlap exists (e.g., email_mx vs email_security_posture, scan_headers vs contrast_scan) which could cause occasional confusion. Overall, boundaries are well-defined.
Tool names follow a consistent verb_noun pattern (e.g., cve_lookup, check_headers, bulk_cve_lookup) with all lowercase underscores. Variations like kev_detail or ssl_check are minor and still predictable. No chaotic mixing of conventions.
54 tools is high but justified by the broad cybersecurity scope (CVE, ATLAS, D3FEND, Sigma, domain, email, IOC, scanning). Some redundancy exists (e.g., three email-related tools), but the count is not excessive given the API's comprehensive feature set.
The tool set thoroughly covers the threat intelligence and domain investigation lifecycle: CVE/KEV/exploit/CWE, ATLAS/D3FEND/Sigma, DNS/WHOIS/SSL/subdomains, email security, IOC enrichment, and active scanning. No significant gaps are apparent for the stated cybersecurity purpose.
Available Tools
55 tools
asn_lookup: ASN Lookup (A, Read-only, Idempotent)
Look up Autonomous System Number (ASN) for a domain or IP: AS number, organization, IPv4/IPv6 prefixes. Use to identify network operator and IP range ownership. Default returns first 50 prefixes per family — set include_full_prefixes=True for full list. Free: 30/hr, Pro: 500/hr. Returns {asn, asn_name, ipv4_prefixes, ipv6_prefixes, ipv4_count, ipv6_count}.
| Name | Required | Description | Default |
|---|---|---|---|
| target | Yes | Domain or IP address to look up ASN for (e.g. 'cloudflare.com', '8.8.8.8') | |
| include_full_prefixes | No | Return the full announced-prefixes list (default: False, returns first 50). ipv4_count and ipv6_count are always honest pre-truncation totals. Set True for network mapping or BGP route audits (for example, Cloudflare AS13335 announces 2500+ prefixes). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
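
The truncation behaviour described above can be checked on the client side. The following is a minimal sketch, assuming the server is called through the standard MCP JSON-RPC `tools/call` method and that the result carries the fields listed in the description; the helper names and the sample payload are hypothetical and exist only for illustration.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 'tools/call' request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def truncated_families(result):
    """Return the address families whose prefix list is shorter than its reported total."""
    families = []
    for family in ("ipv4", "ipv6"):
        listed = len(result.get(f"{family}_prefixes", []))
        total = result.get(f"{family}_count", listed)
        if listed < total:
            families.append(family)
    return families


request = build_tool_call("asn_lookup", {"target": "8.8.8.8"})
print(json.dumps(request))

# Hypothetical response payload (not real data): 50 IPv4 prefixes listed out of 120.
sample = {
    "asn": 64500,
    "asn_name": "EXAMPLE-NET",
    "ipv4_prefixes": [f"10.{i}.0.0/16" for i in range(50)],
    "ipv6_prefixes": ["2001:db8::/32"],
    "ipv4_count": 120,
    "ipv6_count": 1,
}

# A truncated family suggests a second call with include_full_prefixes=True.
print(truncated_families(sample))
```

If a family is reported as truncated, repeating the call with `include_full_prefixes` set to True should return the complete list, at the cost of a larger response.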
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark the tool as read-only and idempotent. The description adds valuable behavioral details: default prefix truncation (first 50 per family), the effect of include_full_prefixes, and that ipv4_count and ipv6_count are always honest pre-truncation totals. Rate limits are also disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences long, front-loaded with the core purpose, then specific details about default behavior and rate limits, and finally the output structure. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low parameter count (2) and simple output with an output schema, the description covers all necessary aspects: input examples, default behavior, rate limits, and output fields. No gaps are evident.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds meaningful context beyond the schema, especially for include_full_prefixes with an example (Cloudflare AS13335) and the default behavior of returning only the first 50 prefixes.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states what the tool does: 'Look up Autonomous System Number (ASN) for a domain or IP' and lists the key outputs (AS number, organization, prefixes). It uniquely identifies the tool's purpose among sibling tools like dns_lookup and whois_lookup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use it ('identify network operator and IP range ownership') and provides hints for using include_full_prefixes for network mapping or BGP audits. It also specifies rate limits. However, it does not explicitly contrast with alternative tools like whois_lookup or ip_lookup.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
atlas_case_study_lookup: ATLAS Case Study Lookup (A, Read-only, Idempotent)
Look up a MITRE ATLAS case study — a documented real-world AI/ML attack incident. Each case study links a sequence of ATLAS techniques (techniques_used) to the incident. Default response is SLIM (description truncated to 240 chars); pass include='full' for the verbose narrative. Use this after atlas_technique_search to find which incidents have exercised a given technique. Drill into the full techniques_used array via bulk_atlas_technique_lookup in a single call (next_calls emits exactly that hint). Returns 404 when the id is not in the synced catalog. Free: 30/hr, Pro: 500/hr. Returns {case_study_id, name, description, techniques_used, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| include | No | Detail level. Default (omit/empty) returns slim (description truncated to 240 chars). Pass 'full' for the verbose narrative — case-study descriptions can run 1-3KB. | |
| case_study_id | Yes | MITRE ATLAS case study id, format 'AML.CS####' (e.g. 'AML.CS0000', 'AML.CS0014'). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
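
Because an unknown id returns 404 and still counts against the hourly quota, it may be worth rejecting malformed ids before calling. A minimal sketch, assuming the 'AML.CS####' format means exactly four digits; the function name is hypothetical.

```python
import re

# Assumption: 'AML.CS####' means the literal prefix followed by exactly four digits.
CASE_STUDY_ID = re.compile(r"AML\.CS\d{4}")


def case_study_args(case_study_id, full=False):
    """Build the arguments for atlas_case_study_lookup, validating the id first."""
    if not CASE_STUDY_ID.fullmatch(case_study_id):
        raise ValueError(f"not an ATLAS case study id: {case_study_id!r}")
    args = {"case_study_id": case_study_id}
    if full:
        # Omitting 'include' keeps the slim response (description cut to 240 chars).
        args["include"] = "full"
    return args


print(case_study_args("AML.CS0014"))
print(case_study_args("AML.CS0000", full=True))
```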
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses default slim response vs. full via include parameter, 404 on missing id, and rate limits (30/hr Free, 500/hr Pro). Does not contradict annotations (readOnlyHint, idempotentHint, destructiveHint).
Is the description appropriately sized, front-loaded, and free of redundancy?
One dense paragraph, front-loaded with purpose, then behavioral details, then workflow hints, then rate limits, then return fields. Every sentence adds value; no redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists (not shown), but description covers key fields (case_study_id, name, description, techniques_used, next_calls), behavior (slim/full, 404), rate limits, and integration guidance. Complete for a lookup tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, but description adds context: default returns truncated description (240 chars), 'full' gives verbose narrative of 1-3KB. Clarifies case_study_id format (e.g., 'AML.CS0000'). Exceeds baseline.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'look up' and resource 'MITRE ATLAS case study', describes it as a documented real-world AI/ML attack incident with linked techniques. Distinguishes from sibling tools like atlas_case_study_search and atlas_technique_lookup by mentioning workflow integration.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises use after atlas_technique_search to find incidents for a technique. Suggests drilling into techniques_used via bulk_atlas_technique_lookup, including that next_calls provides that hint. Provides clear context for when to use this tool vs. alternatives.
atlas_case_study_search: ATLAS Case Study Search (A, Read-only, Idempotent)
Search ATLAS case studies (real-world AI/ML attack incidents) by keyword or referenced technique. Default response is SLIM (description truncated to 240 chars per row); pass include='full' for the verbose summary. Useful when the user has a technique in hand and wants to see incidents that exercised it. Drill via atlas_case_study_lookup for the full procedure list. Free: 30/hr, Pro: 500/hr. Returns {query, total, results [{case_study_id, name, description (truncated by default), techniques_used}], next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return. Range: 1-200. | |
| include | No | Detail level. Default ('') returns slim records (description truncated to 240 chars). Pass 'full' for full description on every row. | |
| keyword | No | Substring match against case study name + description (case-insensitive). Min 2 chars. Example: 'evasion', 'data poisoning'. Omit to list all. | |
| technique_id | No | Filter to case studies that include this ATLAS technique id, format 'AML.T####' or 'AML.T####.###' (e.g. 'AML.T0051'). Omit for any technique. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
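
The parameter constraints in the table (limit 1-200, keyword of at least 2 characters, technique id format) can be enforced before the request is sent. This is a sketch under those stated constraints only; the function name is hypothetical and the server may apply further checks.

```python
import re

# Assumption: '####' and '###' in the documented format stand for decimal digits.
TECHNIQUE_ID = re.compile(r"AML\.T\d{4}(?:\.\d{3})?")


def case_study_search_args(keyword=None, technique_id=None, limit=None, full=False):
    """Build arguments for atlas_case_study_search; omitted filters are left out."""
    args = {}
    if keyword is not None:
        if len(keyword) < 2:
            raise ValueError("keyword needs at least 2 characters")
        args["keyword"] = keyword
    if technique_id is not None:
        if not TECHNIQUE_ID.fullmatch(technique_id):
            raise ValueError(f"not an ATLAS technique id: {technique_id!r}")
        args["technique_id"] = technique_id
    if limit is not None:
        if not 1 <= limit <= 200:
            raise ValueError("limit must be between 1 and 200")
        args["limit"] = limit
    if full:
        args["include"] = "full"
    return args


print(case_study_search_args(technique_id="AML.T0051", limit=10))
```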
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe, idempotent read. Description adds rate limits (30/hr free, 500/hr Pro), default slim response behavior, and the effect of include='full', which are valuable beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise sentences, front-loaded with purpose and key details. Every sentence adds value without repetition. Well-structured for quick parsing.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, parameters, rate limits, return structure, and sibling differentiation. With output schema implied, the description is sufficiently complete. Minor gap: could mention the limit parameter's maximum (200) explicitly, but schema covers it.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description enhances parameter understanding with examples, format details (e.g., 'AML.T####'), and default behaviors, adding meaningful value.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches ATLAS case studies by keyword or technique, and distinguishes from the sibling tool atlas_case_study_lookup for drill-down details. It also differentiates from technique search tools implicitly.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides a clear use case ('when the user has a technique in hand and wants to see incidents') and refers to atlas_case_study_lookup for full procedure list. Lacks explicit when-not-to-use guidance, but the context is clear.
atlas_technique_lookup: ATLAS Technique Lookup (A, Read-only, Idempotent)
Look up a MITRE ATLAS technique — the AI/ML adversarial attack catalog. ATLAS catalogues TTPs targeting machine learning systems: prompt injection, model evasion, training data poisoning, model theft, etc. Roughly 80% of ATLAS techniques are AI/ML-specific (no ATT&CK bridge); 20% mirror an enterprise ATT&CK technique via attack_reference_id — use that to pivot to D3FEND defenses (d3fend_defense_for_attack) and CVE search. Sub-techniques inherit tactics from the parent (inherited_tactics=true flag) when ATLAS upstream leaves them empty. Use this tool when the user asks about AI/ML threats, LLM red-teaming, or adversarial ML; for multiple techniques in one call (e.g. drilling into a case study's techniques_used), prefer bulk_atlas_technique_lookup. Returns 404 when the id is not in the synced ATLAS catalog. Free: 30/hr, Pro: 500/hr. Returns {technique_id, name, description, tactics, inherited_tactics, maturity (demonstrated|feasible|realized), attack_reference_id, attack_reference_url, subtechnique_of, created_date, modified_date, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| technique_id | Yes | MITRE ATLAS technique id, format 'AML.T####' or 'AML.T####.###' for sub-techniques (e.g. 'AML.T0000', 'AML.T0051' LLM Prompt Injection, 'AML.T0000.000'). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
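
The pivot described above (use attack_reference_id to reach D3FEND when it is present) can be sketched as a small decision function. This is an illustration only: the records below are hypothetical, the parent fallback is an assumed convenience, and the function returns the tool name and id rather than full arguments because the parameter names of d3fend_defense_for_attack are not given here.

```python
def next_pivot(technique):
    """Suggest a follow-up tool and id for a technique record, or None."""
    attack_id = technique.get("attack_reference_id")
    if attack_id:
        # The record mirrors an enterprise ATT&CK technique: pivot to defenses.
        return ("d3fend_defense_for_attack", attack_id)
    parent = technique.get("subtechnique_of")
    if parent:
        # AI/ML-specific sub-technique: fall back to the parent record.
        return ("atlas_technique_lookup", parent)
    return None


# Hypothetical records; the ids are placeholders, not a real catalog mapping.
bridged = {"technique_id": "AML.T0000", "attack_reference_id": "T1566", "subtechnique_of": None}
sub_only = {"technique_id": "AML.T0000.000", "attack_reference_id": None, "subtechnique_of": "AML.T0000"}
ml_only = {"technique_id": "AML.T0051", "attack_reference_id": None, "subtechnique_of": None}

for record in (bridged, sub_only, ml_only):
    print(record["technique_id"], "->", next_pivot(record))
```

In practice the next_calls field returned by the tool may already carry these hints, in which case a function like this is only a fallback.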
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds valuable context: returns 404 if ID not found, rate limits (30/hr Free, 500/hr Pro), and the exact return fields (including inherited_tactics flag). No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose and includes all necessary detail. While it is fairly long, every sentence adds value (e.g., 80/20 split, rate limits, return fields). Could be slightly more concise by moving rate limits to annotations, but overall well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of ATLAS techniques (sub-techniques, inheritance, bridging to ATT&CK) and the presence of an output schema, the description is complete. It explains the 80/20 split, inherited_tactics flag, and rate limits, leaving no gaps for the agent.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one parameter. The description adds the format 'AML.T####' or 'AML.T####.###' with examples, which provides meaning beyond the schema's description field. It also explains sub-technique inheritance (inherited_tactics=true) relevant to the parameter.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool looks up MITRE ATLAS techniques, the AI/ML adversarial attack catalog, and lists examples like prompt injection and model evasion. It distinguishes from siblings by mentioning bulk_atlas_technique_lookup for multiple techniques and atlas_technique_search for searching, making the purpose very specific.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool (e.g., 'when the user asks about AI/ML threats, LLM red-teaming, or adversarial ML') and provides alternatives: 'for multiple techniques in one call, prefer bulk_atlas_technique_lookup.' It also explains how to pivot to D3FEND via attack_reference_id, giving clear context.
atlas_technique_search: ATLAS Technique Search (A, Read-only, Idempotent)
Search the MITRE ATLAS catalog of AI/ML attack techniques by keyword, tactic, or maturity. Default response is SLIM (description truncated to 240 chars per row); pass include='full' for the verbose record. Pass exclude_id when chaining from atlas_technique_lookup to skip self in sibling-tactic searches. Use this to discover techniques matching a threat-model question, e.g. 'what techniques target LLM serving infrastructure?'. Drill into atlas_technique_lookup with any returned technique_id for the full description, ATT&CK bridge, and pivot hints. For broader cross-referencing: when a result has attack_reference_id, that bridges to D3FEND mitigations via d3fend_defense_for_attack. Free: 30/hr, Pro: 500/hr. Returns {query (echoed filters), total, results [{technique_id, name, description (truncated by default), tactics, inherited_tactics, maturity, attack_reference_id, subtechnique_of}], next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return. Range: 1-200. | |
| tactic | No | Filter by ATLAS tactic id, format 'AML.TA####'. Examples: 'AML.TA0002' (Reconnaissance), 'AML.TA0007' (ML Attack Staging). Omit for all tactics. | |
| include | No | Detail level. Default ('') returns slim records (description truncated to 240 chars; drill via atlas_technique_lookup for full text). Pass 'full' for full description on every row — large catalogs (167 techniques) can return ~100KB at full. | |
| keyword | No | Substring match against technique name + description (case-insensitive). Min 2 chars. Example: 'prompt injection', 'model evasion', 'poisoning'. Omit to list all. | |
| maturity | No | Filter by maturity: 'demonstrated' (observed in real attacks), 'feasible' (theoretical), or 'realized' (newer ATLAS classification, treat similar to demonstrated). Omit for all. | |
| exclude_id | No | Optional ATLAS technique id to exclude from results, format 'AML.T####' or 'AML.T####.###'. Useful when chaining from atlas_technique_lookup to fetch siblings without echoing self in the same-tactic search. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
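
The exclude_id chaining pattern can be sketched as follows: given a record from atlas_technique_lookup, build one same-tactic search per tactic that leaves out the technique itself. This assumes the tactics field is a list of tactic ids in the 'AML.TA####' format; the record and function name are hypothetical.

```python
def sibling_search_args(technique, limit=20):
    """Build one atlas_technique_search argument set per tactic of the given technique."""
    return [
        {
            "tactic": tactic,
            # Skip the technique we started from so it is not echoed back.
            "exclude_id": technique["technique_id"],
            "limit": limit,
        }
        for tactic in technique.get("tactics", [])
    ]


# Hypothetical lookup result, for illustration only.
looked_up = {"technique_id": "AML.T0051", "tactics": ["AML.TA0002", "AML.TA0007"]}

for args in sibling_search_args(looked_up, limit=5):
    print(args)
```

Each argument set costs one call against the hourly quota, so a technique with several tactics consumes several requests.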
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds value by explaining the default slim response (truncation to 240 chars), the effect of 'include=full', rate limits (30/hr Free, 500/hr Pro), and the return structure. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficient and well-structured. It starts with the core purpose, then explains default behavior, chaining, cross-referencing, rate limits, and return format. Each sentence serves a purpose; there is no wasted text.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters, filtering, chaining, output schema exists), the description covers all key aspects: search modes, filtering options (keyword, tactic, maturity), detail level (slim/full), chaining with atlas_technique_lookup via exclude_id, cross-referencing to D3FEND, and rate limits. It also mentions that the output contains fields like technique_id, name, description, etc. This is comprehensive for a search tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds useful context beyond the schema: it explains the default behavior of the 'include' parameter (truncation), the purpose of 'exclude_id' for chaining, and provides examples like 'prompt injection' for keyword. It also mentions the maturity filter's behavior for 'realized' (treat similar to demonstrated). This extra context raises the score.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Search the MITRE ATLAS catalog of AI/ML attack techniques by keyword, tactic, or maturity.' This is a specific verb+resource combination that distinguishes it from sibling tools like atlas_technique_lookup (which drills into specific IDs) and atlas_case_study_search (which searches case studies).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use this to discover techniques matching a threat-model question' and provides an example. It further guides the agent to 'Drill into atlas_technique_lookup with any returned technique_id for the full description.' It also mentions chaining with exclude_id and cross-referencing with D3FEND. However, it does not explicitly state when not to use this tool (e.g., when a specific technique ID is known).
audit_domain: Audit Domain (A, Read-only, Idempotent)
Perform comprehensive domain audit: combines domain_report + live HTTP security headers + technology fingerprinting. By default report.dns.txt is filtered to security-relevant entries (SPF, DMARC, DKIM, MTA-STS, TLS-RPT) and report.dns.total_txt_records reports the honest pre-filter count; pass include_all_txt=true for the raw TXT list. Use when you need the full picture (recon + active checks); use domain_report for passive-only assessment. Response carries next_calls — chain with subdomain_enum (always emitted) and ssl_check (when an A record resolves) for the residual recon depth (tech_fingerprint already inline as technologies). Free: 30/hr (costs 6 tokens), Pro: 500/hr. Returns {domain, report, technologies, live_headers, summary, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to audit, without protocol or path (e.g. 'example.com', 'shopify.com') | |
| include_all_txt | No | Return every TXT record under report.dns.txt (default: False, only SPF/DMARC/DKIM/MTA-STS/TLS-RPT kept). report.dns.total_txt_records is always emitted with the honest pre-filter count. Default filter strips vendor verification strings (google-site-verification, ms=, facebook-domain-verification, etc.) that bloat the response without security signal. Set True only when you need the raw TXT inventory. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
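
To make the default TXT filtering concrete, here is a rough sketch of how such a filter could work. It is not the server's implementation: it assumes records are matched on their standard version tags (v=spf1, v=DMARC1, v=DKIM1, v=STSv1, v=TLSRPTv1), and the function name and sample records are hypothetical.

```python
# Version tags that open SPF, DMARC, DKIM, MTA-STS and TLS-RPT TXT records.
SECURITY_PREFIXES = ("v=spf1", "v=dmarc1", "v=dkim1", "v=stsv1", "v=tlsrptv1")


def filter_txt(records, include_all_txt=False):
    """Keep security-relevant TXT records and always report the pre-filter count."""
    if include_all_txt:
        kept = list(records)
    else:
        kept = [r for r in records if r.strip().lower().startswith(SECURITY_PREFIXES)]
    return {"txt": kept, "total_txt_records": len(records)}


# Hypothetical TXT records, for illustration only.
records = [
    "v=spf1 include:_spf.example.com ~all",
    "google-site-verification=abc123",
    "v=DMARC1; p=reject; rua=mailto:dmarc@example.com",
    "MS=ms12345678",
]

print(filter_txt(records))
print(filter_txt(records, include_all_txt=True)["total_txt_records"])
```

Comparing the length of the filtered list with total_txt_records shows how many vendor verification strings were dropped.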
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, idempotentHint, destructiveHint. Description adds rate limits, token cost, response structure, and default filtering behavior. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense but well-structured with key info first. Some redundancy (e.g., repeating filtering details), but overall efficient for the complexity. Minor improvement possible by trimming verbose explanations.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a complex tool: covers purpose, usage, parameters, rate limits, chaining hints, and response fields (domain, report, technologies, live_headers, summary, next_calls). Output schema exists, so return values are documented elsewhere.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters have schema descriptions (100% coverage). The tool description further clarifies the purpose of include_all_txt, explains default filtering logic, and advises when to set it to True, adding significant value beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'Perform comprehensive domain audit' and specifies it combines domain_report, live HTTP security headers, and technology fingerprinting. It clearly distinguishes from sibling domain_report, making the purpose unambiguous.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance: 'Use when you need the full picture; use domain_report for passive-only assessment.' Also details chaining with subdomain_enum and ssl_check, and includes rate limits (30/hr Free, 500/hr Pro).
brand_assets: Brand Assets (A, Read-only, Idempotent)
Scrape a domain's homepage <head> for public brand assets — favicon, og:image, theme-color, og:site_name, JSON-LD Organization.logo. Use to enrich CRM records, build company-card UIs, or correlate a lead's site to their visual identity (no manual screenshot required). Strictly homepage-only (path /); we do NOT crawl. Ethical floor: target's robots.txt is honoured — Disallow: / for ContrastAPI OR * returns 403 error.code = robots_txt_disallow and we DO NOT fetch. Cache-Control: no-store / private from the target is respected (response is built but NOT written to our cache; cache_respected=false flags this). Per-target eTLD+1 throttle (60 req/min) prevents weaponising via subdomain rotation. All URL fields are absolute and _untrusted (DO NOT execute or shell-out — the target controls these strings). Free: 30/hr, Pro: 500/hr. Returns {domain, fetched_url, status_code, favicon_url_untrusted, og_image_url_untrusted, theme_color, site_name_untrusted, logo_url_untrusted, cache_respected, summary}. Returns 502 on DNS/TCP/TLS failure; 403 robots_txt_disallow when the target opted out.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Registrable domain to scrape brand assets for (e.g. 'github.com', 'stripe.com'). No scheme, no path, no port. The bot fetches https://<domain>/ with HTTP fallback. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
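
Since the `_untrusted` fields are controlled by the target site, a consumer may want to check them before rendering or storing them. The sketch below is one possible client-side guard, assuming only absolute http(s) URLs are acceptable; the helper names, the field preference order and the sample result are hypothetical.

```python
from urllib.parse import urlparse


def safe_http_url(value):
    """Return the URL if it is an absolute http(s) URL with a host, otherwise None."""
    if not isinstance(value, str):
        return None
    candidate = value.strip()
    try:
        parsed = urlparse(candidate)
    except ValueError:
        # urlparse raises on some malformed inputs, e.g. broken IPv6 brackets.
        return None
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return candidate
    return None


def pick_image(result):
    """Pick the first usable image URL; the preference order here is an assumption."""
    for field in ("logo_url_untrusted", "og_image_url_untrusted", "favicon_url_untrusted"):
        url = safe_http_url(result.get(field))
        if url:
            return url
    return None


# Hypothetical tool result, for illustration only.
sample = {
    "logo_url_untrusted": "javascript:alert(1)",
    "og_image_url_untrusted": None,
    "favicon_url_untrusted": "https://example.com/favicon.ico",
}

print(pick_image(sample))
```

A scheme check like this does not make the URL safe to pass to a shell or to fetch from an internal network; it only filters out values that are not web URLs at all.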
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, openWorldHint, idempotentHint, destructiveHint=false. The description goes far beyond by detailing robots.txt handling (returns 403 with error code), Cache-Control respect, per-target rate limiting (60 req/min), and how untrusted URL fields should not be executed. It also specifies error codes (502, 403) and caching behavior.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main action and then proceeds logically through use cases, constraints, and output fields. It is slightly dense but every sentence adds value. Could be trimmed slightly but remains efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has only one required parameter, high schema coverage, and an output schema, the description fully explains input format, output fields, error handling, rate limiting, and ethical considerations. It is comprehensive without missing critical details.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter 'domain' with schema description. The tool description adds valuable clarifications beyond schema: 'No scheme, no path, no port. The bot fetches https://<domain>/ with HTTP fallback.' Since schema coverage is 100%, baseline is 3, but the extra context elevates to 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scrapes a domain's homepage <head> for specific public brand assets (favicon, og:image, etc.). It distinguishes itself from sibling tools by explicitly limiting to homepage-only and not crawling, and lists specific use cases like enriching CRM records or building company-card UIs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage scenarios (enrich CRM, build UIs, correlate visual identity) and clear boundaries (homepage-only, no crawl, respects robots.txt). It also states what not to do (no manual screenshot required, not a general scraper) and mentions ethical constraints.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bulk_atlas_technique_lookup · Bulk ATLAS Technique Lookup · A · Read-only · Idempotent
Bulk ATLAS technique lookup — retrieve full records for up to 50 techniques in a single request instead of N separate atlas_technique_lookup calls. Designed as the natural follow-up to atlas_case_study_lookup, whose techniques_used array can be passed directly. Each item is the same shape as atlas_technique_lookup, including parent-tactics inheritance for sub-techniques (inherited_tactics=true flag) and per-item next_calls (D3FEND bridge when attack_reference_id present, sibling-technique search by tactic, parent lookup for sub-techniques). Free: 30/hr (1 per item), Pro: 500/hr. Returns {results [{technique_id, status (ok|not_found|invalid_format), technique, error}], total, successful, failed, partial, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| technique_ids | Yes | List of MITRE ATLAS technique ids in format 'AML.T####' or 'AML.T####.###' (e.g. ['AML.T0051', 'AML.T0043', 'AML.T0000.000']). Up to 50 per call. Case-insensitive; normalized + de-duplicated server-side. Each id counts as 1 request toward the rate limit. |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
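A minimal client-side sketch of preparing the `technique_ids` argument, assuming the id format given in the parameter description. The server normalizes and de-duplicates on its own; the helper name `prepare_technique_ids` and the regular expression are illustrative assumptions, not part of the API.

```python
import re

# Assumed pattern for 'AML.T####' and 'AML.T####.###', derived from the parameter description.
ATLAS_ID = re.compile(r"^AML\.T\d{4}(\.\d{3})?$")
MAX_PER_CALL = 50  # cap stated in the tool description


def prepare_technique_ids(raw_ids):
    """Upper-case, de-duplicate (order-preserving), and batch ids in groups of at most 50.

    Returns (batches, rejected); rejected holds ids that do not match the assumed format.
    """
    seen, valid, rejected = set(), [], []
    for raw in raw_ids:
        tid = raw.strip().upper()
        if not ATLAS_ID.match(tid):
            rejected.append(raw)
        elif tid not in seen:
            seen.add(tid)
            valid.append(tid)
    batches = [valid[i:i + MAX_PER_CALL] for i in range(0, len(valid), MAX_PER_CALL)]
    return batches, rejected


batches, rejected = prepare_technique_ids(
    ["aml.t0051", "AML.T0051", "AML.T0000.000", "T1059"]
)
# Arguments for one bulk_atlas_technique_lookup call.
arguments = {"technique_ids": batches[0]}
```

Filtering locally keeps ids that would come back as `invalid_format` out of the request; whether such ids count toward the hourly quota is not stated in the description.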
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate a safe, idempotent read operation. The description adds rich behavioral context: return format with per-item status, error handling, case-insensitivity, normalization, deduplication, max 50 items, and detailed behavior of each returned record (parent-tactics inheritance, next_calls). This goes well beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: it starts with the core purpose, then usage context, then detailed behavior and return format. Every sentence adds value without redundancy. Despite length, it is well-organized and front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers all necessary aspects: input constraints, output format, error handling, rate limits, relationship to sibling tools, and even next-call suggestions. Given the tool's complexity, it is exceptionally complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a detailed parameter description. The description adds value by clarifying that each ID counts as 1 request toward the rate limit, and that IDs are case-insensitive, normalized, and deduplicated server-side. This extra context warrants a score above baseline 3.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function as a bulk lookup for up to 50 ATLAS techniques, distinguishing it from the singular atlas_technique_lookup. It specifies the action 'retrieve full records' and the resource 'ATLAS techniques,' with explicit context as a follow-up to atlas_case_study_lookup.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly positions the tool as an alternative to multiple atlas_technique_lookup calls, advises when to use it (when multiple technique IDs are in hand), and provides rate limit details (Free 30/hr, Pro 500/hr) with per-item counting. It also implies when not to use it (for single lookups).
bulk_cve_lookup · Bulk CVE Lookup · A · Read-only · Idempotent
Batch query multiple CVEs (up to 50 per call, same for Free and Pro): retrieve full CVE details for all in 1 request instead of N. By default each CVE's affected_products is truncated to the first 20 entries (total_products reports honest count) and references to the first 10 (total_references reports honest count); pass include_affected_products=true / include_full_references=true to return full lists. Pass include_reference_tags=true to receive references_full=[{url, tags, source}] per CVE in the batch. Pass include_severity_breakdown=true to receive severity_sources/consensus/disagreement per CVE. Use for dependency audits or bulk vulnerability enrichment; use cve_lookup for single CVE. Each successful item carries next_calls — chain with kev_detail (when kev.in_kev=true), cwe_lookup (when cwe_id is present), or exploit_lookup. Free: 30/hr (1 per item), Pro: 500/hr. Returns {results, total, successful, failed, timed_out, partial, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| cve_ids | Yes | List of CVE identifiers in format CVE-YYYY-NNNNN (e.g. ['CVE-2024-3094', 'CVE-2021-44228', 'CVE-2023-44487']). Maximum 50 per request (same cap for Free and Pro). | |
| include_reference_tags | No | Return structured references_full per CVE in the batch [{url, tags, source}]. Same shape as cve_lookup (default: True). Activates tag-first patch detection per item. Set False for legacy clients. | |
| include_full_references | No | Return the full references list for each CVE in the batch (default: True). total_references is always emitted. Set False to truncate each item to first 10 entries when payload-bound. | |
| include_affected_products | No | Return the full affected_products list for each CVE in the batch (default: False, each CVE returns first 20). Set True for bulk dependency audits. | |
| include_severity_breakdown | No | Return severity_sources/consensus/disagreement per CVE in batch. Same shape as cve_lookup (default: True). cvss_v2 and cvss_v2_vector are always emitted (additive non-opt-in). Set False to skip if downstream cannot tolerate the extra fields. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
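A short sketch of how a caller might build the arguments and pick follow-up tools from the chaining hints in the description. The helper names are hypothetical, and the result-item layout (a nested `kev` object with `in_kev`, a top-level `cwe_id`) is an assumption inferred from the field names quoted above.

```python
import re

# CVE ids are 'CVE-', a four-digit year, and a sequence number of four or more digits.
CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$", re.IGNORECASE)


def build_bulk_cve_arguments(cve_ids, audit_mode=False):
    """Validate ids locally and build the arguments for one bulk_cve_lookup call."""
    bad = [c for c in cve_ids if not CVE_ID.match(c)]
    if bad:
        raise ValueError(f"malformed CVE ids: {bad}")
    if len(cve_ids) > 50:
        raise ValueError("at most 50 CVE ids per call")
    return {
        "cve_ids": [c.upper() for c in cve_ids],
        # Full affected_products lists are opt-in; useful for dependency audits.
        "include_affected_products": audit_mode,
    }


def follow_up_tools(item):
    """Choose follow-up tools for one result item (item layout is assumed)."""
    tools = []
    if (item.get("kev") or {}).get("in_kev"):
        tools.append("kev_detail")
    if item.get("cwe_id"):
        tools.append("cwe_lookup")
    tools.append("exploit_lookup")  # listed without a condition in the description
    return tools


arguments = build_bulk_cve_arguments(["cve-2021-44228", "CVE-2024-3094"], audit_mode=True)
```

In practice the per-item `next_calls` field returned by the server is the authoritative chaining hint; the function above only mirrors the rule stated in prose.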
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds critical behavioral details: default truncation of affected_products (first 20) and references (first 10), flags to override, rate limits for Free/Pro, and the return envelope structure. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is dense but each sentence earns its place. Front-loaded with core behavior and constraints. Slightly long but not wasteful; could be trimmed minimally, but structure is logical.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (batch operation, multiple optional flags, rate limits, chaining), the description covers all essential aspects. Output schema exists, so return values are not described, which is correct. Includes batching limits, truncation defaults, flag semantics, rate limits, and cross-tool chaining hints. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3). Description significantly enriches every parameter: specifies max 50 for cve_ids and format, explains default values for booleans (e.g., include_reference_tags default True, include_affected_products default False) and practical use cases for setting them. This goes well beyond the schema descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it batch queries multiple CVEs (up to 50 per call) to retrieve full CVE details in one request. Explicitly distinguishes from sibling cve_lookup by stating 'use cve_lookup for single CVE.' Verb+resource+scope is specific and actionable.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use scenarios ('dependency audits or bulk vulnerability enrichment') and when-not-to ('use cve_lookup for single CVE'). Also offers chaining guidance with kev_detail, cwe_lookup, exploit_lookup based on returned fields.
bulk_ioc_lookup · Bulk IOC Lookup · A · Read-only · Idempotent
Batch query multiple IOCs (IP/domain/URL/hash, up to 50 per call, same for Free and Pro) in 1 request: auto-detects type + queries abuse.ch feeds per-indicator. Per-type source coverage matches ioc_lookup: hash → ThreatFox only; IP → ThreatFox + Feodo + URLhaus; domain / URL → ThreatFox + URLhaus. Each result item carries its own verdict.sources_queried / sources_unavailable so partial failures are visible per indicator. Use for SOC alert triage or batch enrichment; use ioc_lookup for single indicator. Free: 30/hr (1 per item), Pro: 500/hr. Returns {results, total, successful, failed, timed_out, partial, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| indicators | Yes | List of indicators of compromise: IP addresses, domains, URLs, or file hashes (e.g. ['8.8.8.8', 'evil.com', 'd41d8cd98f00b204e9800998ecf8427e']). Maximum 50 per request (same cap for Free and Pro). Each indicator type is auto-detected. |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
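To make the per-type source coverage concrete, here is a rough local approximation of indicator type detection paired with the source mapping from the description. The detection rules are assumptions for illustration; the server's actual auto-detection logic is not documented here.

```python
import ipaddress
import re

HEX = re.compile(r"^[0-9a-fA-F]+$")
HASH_LENGTHS = {32, 40, 64}  # hex digest lengths of MD5, SHA-1, SHA-256

# Source coverage per indicator type, as listed in the tool description.
SOURCES = {
    "hash": ["ThreatFox"],
    "ip": ["ThreatFox", "Feodo", "URLhaus"],
    "domain": ["ThreatFox", "URLhaus"],
    "url": ["ThreatFox", "URLhaus"],
}


def detect_type(indicator):
    """Guess the indicator type (assumed heuristics, not the server's rules)."""
    value = indicator.strip()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if "://" in value:
        return "url"
    if len(value) in HASH_LENGTHS and HEX.match(value):
        return "hash"
    return "domain"


indicators = ["8.8.8.8", "evil.com", "d41d8cd98f00b204e9800998ecf8427e"]
expected_sources = {i: SOURCES[detect_type(i)] for i in indicators}
```

Comparing `expected_sources` with each item's `verdict.sources_queried` and `verdict.sources_unavailable` is one way to spot partial failures per indicator.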
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds detailed behavioral context beyond annotations: per-type source coverage (hash→ThreatFox, IP→ThreatFox+Feodo+URLhaus, domain/URL→ThreatFox+URLhaus), partial failure visibility via verdict fields, and return structure. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
All sentences are substantive, front-loaded with core capability, no fluff. Efficiently covers key aspects in a single paragraph.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with multiple indicator types, different sources, and a complex response, the description covers usage, behavior, rate limits, and response composition. Output schema handles return fields.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameter details (max 50, auto-detection). Description adds per-type query behavior, which is helpful but not essential given the schema's completeness.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies 'batch query multiple IOCs' with a clear scope (up to 50, auto-detects type) and distinguishes from the single-indicator sibling `ioc_lookup`.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states use cases ('SOC alert triage or batch enrichment') and provides an alternative ('use ioc_lookup for single indicator'), plus rate limits per plan.
bulk_sigma_rule_lookup · Bulk Sigma Rule Lookup · A · Read-only · Idempotent
Bulk Sigma rule lookup — retrieve full records for up to 50 rule UUIDs in a single request instead of N separate sigma_rule_lookup calls. Designed for triage workflows where multiple rule ids are known (e.g., from a SIEM alert batch or a tagged detection bundle). Each item is the same shape as sigma_rule_lookup with status ok/not_found/invalid_format and an error field when applicable. Up to 50 rule ids per call (same cap for Free and Pro). Each rule_id consumes 1 unit of the hourly quota; ids beyond the caller's remaining quota land in skipped_due_to_rate_limit instead of failing the whole batch (parity with bulk_cve/ioc). Free: 30/hr, Pro: 500/hr. Returns {results [{rule_id, status, rule, error}], total, processed, skipped_due_to_rate_limit, successful, failed, partial, summary, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| rule_ids | Yes | List of Sigma rule UUIDs in RFC 4122 format. Up to 50 per call (same cap for Free and Pro). Each rule_id counts as 1 request toward the hourly quota. Per-item validation: invalid-format ids return status='invalid_format', unknown UUIDs return status='not_found' — the whole call does not fail. |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
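The partial-failure behaviour can be handled along these lines. This is a sketch only: the sample response is invented for illustration, the UUIDs are placeholders, and treating `skipped_due_to_rate_limit` as a list of rule ids is an assumption based on the wording of the description.

```python
import uuid


def is_canonical_uuid(value):
    """True if value is a hyphenated RFC 4122-style UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def triage_bulk_response(response):
    """Split a bulk response into found rules, per-item problems, and ids to retry."""
    rules, problems = {}, {}
    for item in response["results"]:
        if item["status"] == "ok":
            rules[item["rule_id"]] = item["rule"]
        else:
            problems[item["rule_id"]] = item["status"]
    retry_later = list(response.get("skipped_due_to_rate_limit", []))
    return rules, problems, retry_later


# Invented sample in the documented envelope shape (not real API output).
sample = {
    "results": [
        {"rule_id": "5f1abf38-8c1e-4a2b-9f0c-1d2e3f4a5b6c", "status": "ok",
         "rule": {"title": "placeholder"}, "error": None},
        {"rule_id": "not-a-uuid", "status": "invalid_format",
         "rule": None, "error": "invalid UUID"},
    ],
    "skipped_due_to_rate_limit": ["0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d"],
}
rules, problems, retry_later = triage_bulk_response(sample)
```

Because skipped ids do not fail the batch, a caller can queue `retry_later` for the next quota window instead of resending the whole list.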
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly, idempotent, non-destructive. The description adds detailed behavioral context: rate limits (Free 30/hr, Pro 500/hr), quota consumption per ID, partial failure handling via skipped_due_to_rate_limit, and item status codes (ok/not_found/invalid_format). No contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is informative but slightly verbose, including detailed return structure and quota mechanics. It front-loads the main purpose effectively, but could be trimmed slightly for conciseness without losing essential information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (bulk operation, rate limits, partial failures, output shape), the description comprehensively covers return fields, status codes, quota details, and edge-case behavior, making it fully actionable for an AI agent.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter rule_ids is fully documented in the schema (maxItems, format, description). The description adds value by explaining quota consumption per ID and per-item validation behavior, which goes beyond the schema's description.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'retrieve full records for up to 50 rule UUIDs' with a specific verb and resource, and explicitly distinguishes from sigma_rule_lookup by noting it replaces N separate calls.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit context: 'Designed for triage workflows where multiple rule ids are known (e.g., from a SIEM alert batch or a tagged detection bundle)' and contrasts with single-lookup alternatives.
calculate_risk_score · Calculate Risk Score · A · Read-only · Idempotent
Composite CVE risk score (0-100): fuses CVSS, EPSS, KEV, and PoC into a single agent-ready triage signal. Formula: CVSS * 0.20 + EPSS * 0.35 + KEV * 0.30 + PoC * 0.15 (each component rescaled to 0-100 before weighting). Multiplicative boosters applied in order: KEV+PoC combo (*1.15), critical-severity-with-high-EPSS (CVSS>=9 AND EPSS>0.7, *1.10), recently published (within last 7 days, *1.05). Final score clamped to [0, 100]. Label bands: CRITICAL>=90, HIGH>=70, MEDIUM>=40, LOW<40. Urgency text encodes patch SLA (immediate when KEV; 24h/72h/30d by label). Use to triage a single CVE without orchestrating cve_lookup + exploit_lookup separately. PoC signal here is the local ExploitDB mirror only; for full multi-source exploit detail (GitHub Advisory + Shodan refs + ExploitDB), call exploit_lookup separately. Methodology adapted from mukul975/cve-mcp-server (Apache-2.0): https://github.com/mukul975/cve-mcp-server. Free: 30/hr, Pro: 500/hr. Returns {cve_id, score (0-100), label (CRITICAL/HIGH/MEDIUM/LOW), urgency, has_public_poc, components (cvss_v3, epss_score, in_kev, has_public_poc, weighted_breakdown), boosters_applied, recommendation, summary, verdict, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier in format CVE-YYYY-NNNNN (e.g. 'CVE-2021-44228', 'CVE-2024-3094') |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
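The scoring rule can be reproduced locally as a rough sketch, reading the weights as 0.20 (CVSS), 0.35 (EPSS), 0.30 (KEV), and 0.15 (PoC). The rescaling to 0-100 is an assumption (CVSS 0-10 multiplied by 10, EPSS 0-1 multiplied by 100, booleans mapped to 0 or 100), the booster names are invented labels, and the server's rounding may differ.

```python
def risk_score(cvss_v3, epss_score, in_kev, has_public_poc, days_since_published):
    """Approximate the composite score described above; returns (score, label, boosters)."""
    # Rescale every component to 0-100 before weighting (assumed mapping).
    cvss_c = cvss_v3 * 10.0
    epss_c = epss_score * 100.0
    kev_c = 100.0 if in_kev else 0.0
    poc_c = 100.0 if has_public_poc else 0.0

    score = cvss_c * 0.20 + epss_c * 0.35 + kev_c * 0.30 + poc_c * 0.15

    # Multiplicative boosters, applied in the documented order.
    boosters = []
    if in_kev and has_public_poc:
        score *= 1.15
        boosters.append("kev_poc_combo")
    if cvss_v3 >= 9 and epss_score > 0.7:
        score *= 1.10
        boosters.append("critical_high_epss")
    if days_since_published <= 7:
        score *= 1.05
        boosters.append("recently_published")

    score = max(0.0, min(100.0, score))  # clamp to [0, 100]

    if score >= 90:
        label = "CRITICAL"
    elif score >= 70:
        label = "HIGH"
    elif score >= 40:
        label = "MEDIUM"
    else:
        label = "LOW"
    return score, label, boosters


# Illustrative inputs, not data fetched from the API.
worst_case = risk_score(10.0, 0.97, True, True, 900)
kev_only = risk_score(7.5, 0.50, True, False, 3)
```

With these inputs the first case starts at 98.95, is pushed past 100 by two boosters, and is clamped to 100. The second case scores 62.5 before the recency booster and 65.625 after it, which lands in the MEDIUM band.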
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true and idempotentHint=true, confirming safe read operation. The description goes beyond by detailing the scoring formula, multiplicative boosters, clamping, label bands, urgency text, and rate limits (30/hr free, 500/hr Pro). No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively long but all sentences add value: core purpose, formula, boosters, usage context, methodology attribution, rate limits, and output fields. It is front-loaded with the most critical information. Slight verbosity is justified by the tool's complexity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (composite score with formula), the description is exceptionally complete. It explains the formula, boosters, label bands, urgency, when to use alternatives, rate limits, and output structure. The output schema is mentioned. No gaps remain.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter (cve_id) has 100% schema coverage with a clear format description. The description adds no additional constraints or semantics beyond what the schema provides, so a baseline score of 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool calculates a composite risk score (0-100) fusing CVSS, EPSS, KEV, and PoC. It distinguishes from siblings like cve_lookup and exploit_lookup by offering a single-call triage alternative. The verb 'calculate' and resource 'risk score' are specific.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description clearly says 'Use to triage a single CVE without orchestrating cve_lookup + exploit_lookup separately.' It also provides a when-not-to-use: 'PoC signal here is the local ExploitDB mirror only — for full multi-source exploit detail ... call exploit_lookup separately.' This gives explicit context and alternatives.
check_dependencies · Check Dependencies · A · Read-only · Idempotent
Audit project dependencies (npm/PyPI/Maven/RubyGems/etc.) against CVE database: find known vulnerabilities in your package list. Bulk query up to 50 packages per call (same for Free and Pro). Use for dependency security scanning; use cve_lookup for single CVE. Free: 30/hr (1 per package), Pro: 500/hr. Returns {findings, total, by_severity, summary}. Each finding includes fixed_in (first patched version per NVD/MITRE version range) when a version range matched — omitted from wire when the range is open-ended or no input version was supplied; remediation copy then says 'Check if ... is affected ... and upgrade if so' instead of 'Upgrade to X.Y.Z or later'.
| Name | Required | Description | Default |
|---|---|---|---|
| packages | Yes | List of dependency packages to audit. Each item is an object with 'name' (required, max 200 chars, e.g. 'lodash', 'django', 'log4j-core') and optional 'version' (max 100 chars, e.g. '4.17.0', '2.14.1'). Only 'name' and 'version' fields are used; extra fields are ignored. Example: [{"name": "lodash", "version": "4.17.0"}, {"name": "django"}]. Maximum 50 per request (same cap for Free and Pro). |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
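As an illustration of the `packages` payload, the sketch below turns a requirements-style list into the expected objects and splits them into batches of 50. The parser handles only a small assumed subset of the syntax (bare names and `name==version` pins); anything else is sent without a version. Function names are hypothetical.

```python
import re

# Assumed subset of requirements syntax: "name" optionally followed by "==version".
PIN = re.compile(r"^([A-Za-z0-9._-]+)\s*(?:==\s*([A-Za-z0-9._+!-]+))?")
MAX_PER_CALL = 50  # cap stated in the tool description


def packages_from_requirements(text):
    """Build the list of {'name', 'version'} objects for check_dependencies."""
    packages = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):   # skip blanks and pip options
            continue
        match = PIN.match(line)
        if not match:
            continue
        name, version = match.groups()
        entry = {"name": name}
        if version:                            # 'version' is optional in the schema
            entry["version"] = version
        packages.append(entry)
    return packages


def batch_arguments(packages):
    """Split packages into argument dicts of at most 50 packages each."""
    return [
        {"packages": packages[i:i + MAX_PER_CALL]}
        for i in range(0, len(packages), MAX_PER_CALL)
    ]


requirements = "django==4.2.1\nrequests>=2.0  # range, sent without version\nflask\n"
calls = batch_arguments(packages_from_requirements(requirements))
```

Supplying an exact version matters: per the description, `fixed_in` is only reported when a version range matched, and it is omitted when no input version was given.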
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe read operation, but description adds crucial behavioral details: batch limit (50 per call), return structure, and edge case handling for fixed_in field and remediation copy. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is front-loaded with main purpose and usage, then details. All sentences are informative, but could be slightly more concise without losing clarity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all critical aspects: ecosystems, batch behavior, rate limits, return structure, fixed_in edge cases. With an output schema, the description is comprehensive enough for an agent to use effectively.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage, description adds extensive meaning: specifies required name field, optional version, constraints (200/100 chars), extra fields ignored, example format, and 50-package cap. This significantly enriches understanding beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it audits dependencies for CVEs, specifying the resource (dependencies) and action (find vulnerabilities). It distinguishes itself from the sibling cve_lookup, which handles a single CVE.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases (dependency security scanning) and alternatives (cve_lookup for single CVE), plus rate limits for Free and Pro tiers. Lacks explicit when-not-to-use, but context is clear.
check_headers · Check Headers · A · Read-only · Idempotent
Validate HTTP security headers you provide (JSON): CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Permissions-Policy, Referrer-Policy against best practices. Use to test header config before deployment or validate non-public servers; use scan_headers to fetch live. Free: 30/hr, Pro: 500/hr. By default header values are truncated to 500 chars; pass include='full' for the full raw value. Returns {total, by_severity, findings}. No external requests.
| Name | Required | Description | Default |
|---|---|---|---|
| headers | Yes | JSON string of HTTP header name-value pairs to validate. Example: '{"Strict-Transport-Security": "max-age=31536000", "X-Frame-Options": "DENY"}'. Include only security-relevant headers you want to analyze. | |
| include | No | Detail level. Default ('') returns slim findings: raw header values capped at 500 chars, with total_value_length carrying the honest pre-truncation length. Pass 'full' to restore the full raw value. Allowed: '' or 'full'. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
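Note that `headers` is a JSON string rather than an object, so the header map has to be serialized first. The sketch below shows that step, plus a local mirror of the slim truncation rule; the `value` key in `slim_value` is a hypothetical field name, and only `total_value_length` comes from the parameter description.

```python
import json

headers = {
    "Strict-Transport-Security": "max-age=31536000",
    "X-Frame-Options": "DENY",
}

# Arguments for one check_headers call; include="full" asks for untruncated values.
arguments = {"headers": json.dumps(headers), "include": "full"}


def slim_value(raw, limit=500):
    """Mirror the documented slim behaviour: cap the value, keep the original length."""
    return {"value": raw[:limit], "total_value_length": len(raw)}


long_csp = "default-src 'self'; " + "img-src https://cdn.example; " * 30
slim = slim_value(long_csp)
```

A long Content-Security-Policy is the typical case where the 500-character cap matters, which is when `include='full'` is worth passing.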
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare read-only, idempotent, non-destructive. Description adds that no external requests are made, details header truncation behavior (default 500 chars, include='full' for full value), and rate limits. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise 6-sentence description. All sentences are essential: purpose, usage, rate limits, truncation, return structure, external requests. No redundancy or fluff.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fully covers what the tool does, when to use it, parameter behavior, return value structure, and limitations (truncation, rate limits). Output schema exists, so return values are briefly but adequately described.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters. Description adds value: for 'headers' param it clarifies to include only security-relevant headers; for 'include' it explains the difference between default and 'full' and mentions total_value_length. Provides more context than schema alone.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool validates HTTP security headers (CSP, HSTS, etc.) against best practices using a provided JSON. Distinguishes itself from scan_headers which fetches headers live.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use (before deployment, for non-public servers) and when not (use scan_headers for live fetch). Also mentions rate limits differentiating free/pro tiers.
check_injection · Check Injection · A · Read-only · Idempotent
Scan source code for injection vulnerabilities: SQL injection, command injection, path traversal via unsafe string concatenation/unsanitized input. Supports Python, JavaScript, TypeScript, Java, Go, Ruby, Shell, Bash. Use to detect input-handling bugs; for secrets use check_secrets. Companion code-security tools: check_secrets (hard-coded credential detection), check_dependencies (known-CVE vulnerability audit), check_headers (live HTTP security-header validation), scan_headers (live HTTP scan via domain). Free: 30/hr, Pro: 500/hr. Returns {total, by_severity, findings}. No data stored.
| Name | Required | Description | Default |
|---|---|---|---|
| code | Yes | Source code string to scan for injection vulnerabilities (can be a single file or code snippet) | |
| language | No | Programming language of the code. Must be one of: python, javascript, typescript, java, go, ruby, shell, bash, generic. Use 'generic' if unsure. | generic |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
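As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` JSON-RPC request for check_injection. The helper name `build_check_injection_call`, the request `id`, the non-empty check on `code`, and the sample snippet are assumptions made for the example; only the tool name, argument names, and allowed `language` values come from the parameter table above.

```python
import json

# Allowed values for `language`, copied from the parameter table.
ALLOWED_LANGUAGES = {
    "python", "javascript", "typescript", "java",
    "go", "ruby", "shell", "bash", "generic",
}


def build_check_injection_call(code: str, language: str = "generic", request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for check_injection.

    Hypothetical helper: it only assembles the payload and does not send it.
    """
    if not code:
        # Client-side assumption: `code` is required, so reject empty input early.
        raise ValueError("code must be a non-empty string")
    if language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "check_injection",
            "arguments": {"code": code, "language": language},
        },
    }


# Sample input: user data concatenated straight into a SQL string.
SAMPLE = 'query = "SELECT * FROM users WHERE name = \'" + name + "\'"'

request = build_check_injection_call(SAMPLE, language="python")
print(json.dumps(request, indent=2))
```

Validating `language` locally keeps an obviously invalid call from consuming one of the hourly requests.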
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true, so the agent knows it's safe. Description adds that no data is stored, returns a specific structure, and supports multiple languages. This adds value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is compact, with no wasted words. It front-loads the purpose, then lists supported languages, usage guidance, companion tools, rate limits, and return format. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Because an output schema exists, the description does not need to detail return values. It covers purpose, supported languages, usage boundaries, rate limits, companion tools, and data storage policy, which is very complete for a tool of this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds extra guidance for the language parameter: "Use 'generic' if unsure", which helps selection. For the code parameter, it clarifies it can be a single file or snippet, slightly enhancing the schema description.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it scans for injection vulnerabilities (SQL, command, path traversal) in source code. It differentiates from sibling check_secrets by specifying 'for secrets use check_secrets', and lists companion tools. The verb 'scan' and resource 'source code' are specific.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'Use to detect input-handling bugs'. Provides an alternative tool: 'for secrets use check_secrets'. Also mentions rate limits (30/hr free, 500/hr Pro), which guides usage boundaries.
check_secrets · Check Secrets · A · Read-only · Idempotent
Scan source code (or snippet) for hardcoded secrets — cloud provider keys, API tokens, connection strings, private keys, passwords. Supports Python, JavaScript, TypeScript, Java, Go, Ruby, Shell, Bash. Use to detect leaked credentials before commit; for injection detection use check_injection. Free: 30/hr, Pro: 500/hr. Returns {total, by_severity, findings}. No data stored. The generic password-assignment rule is suppressed when a more-specific credential rule fires on the same line — one targeted finding per leaked secret, not two.
| Name | Required | Description | Default |
|---|---|---|---|
| code | Yes | Source code string to scan for secrets (can be a single file or code snippet) | |
| language | No | Programming language of the code. Must be one of: python, javascript, typescript, java, go, ruby, shell, bash, generic. Use 'generic' if unsure. | generic |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
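The suppression behaviour in the description (one targeted finding per leaked secret, not two) can be pictured with a small sketch. The finding fields `line` and `rule` and the rule id `generic-password` are hypothetical: the listing only says the response contains `findings`, not what each finding looks like. This illustrates the idea and is not the server's implementation.

```python
from collections import defaultdict

GENERIC_RULE = "generic-password"  # hypothetical rule id, not taken from the listing


def suppress_generic(findings: list) -> list:
    """Drop the generic password finding on lines where a more specific rule also fired."""
    rules_by_line = defaultdict(set)
    for finding in findings:
        rules_by_line[finding["line"]].add(finding["rule"])

    kept = []
    for finding in findings:
        has_specific = any(rule != GENERIC_RULE for rule in rules_by_line[finding["line"]])
        if finding["rule"] == GENERIC_RULE and has_specific:
            continue  # a targeted rule already covers this line
        kept.append(finding)
    return kept


# Made-up findings: line 3 triggers both a specific and the generic rule.
raw = [
    {"line": 3, "rule": "aws-access-key"},
    {"line": 3, "rule": GENERIC_RULE},
    {"line": 9, "rule": GENERIC_RULE},
]
deduped = suppress_generic(raw)
print(deduped)
```

The generic finding on line 9 survives because no specific rule fired there, so a plain hard-coded password is still reported.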
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds that 'No data stored' and explains the suppression rule for generic password rules when a more specific rule fires, providing extra context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is detailed but every sentence adds value. Slightly verbose with the suppression rule explanation, but it is relevant and tightly written.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage guidance, behavioral notes, output format ({total, by_severity, findings}), and rate limits. Given the presence of output schema and complete parameter descriptions, the description is fully adequate.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds usage hints like 'can be a single file or code snippet' for code and 'Use generic if unsure' for language, which adds value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans source code for hardcoded secrets, listing specific examples. It distinguishes from sibling tool check_injection by specifying use for credential detection vs injection detection.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('detect leaked credentials before commit'), provides alternative ('for injection detection use check_injection'), and includes rate limits (Free: 30/hr, Pro: 500/hr).
contrast_scan · Contrast Scan · A · Read-only · Idempotent
Active website security scan: runs the ContrastScan C engine (11 modules — HTTP security headers, SSL/TLS, DNS, redirect chain, information disclosure, cookie flags, DNSSEC, HTTP methods, CORS, HTML hygiene, deep CSP analysis) against the live site and enriches the raw result with severity-ranked vulnerability findings and a letter grade. Use for a hands-on misconfiguration scan; use audit_domain for passive recon (DNS/WHOIS/SSL/threat intel) and scan_headers for headers only. Active outbound fetch — a per-target eTLD+1 throttle (60 req/min) applies. Free: 30/hr (costs 6 tokens), Pro: 500/hr. Returns {domain, resolved_ip, total_score, max_score, grade, findings, findings_count, headers, ssl, dns, redirect, disclosure, cookies, dnssec, methods, cors, html, csp_analysis, enterprise, summary, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to scan, without protocol or path (e.g. 'example.com'). Bare IPs and private-resolving domains are rejected. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
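Since each scan draws on the hourly quota, a caller may want to reject obviously invalid `domain` values before sending them. The following is a minimal client-side sketch of the documented constraints (no protocol, no path, no bare IP). The function name `check_scan_domain` and the hostname pattern are assumptions. It does not check whether the name is a registrable root domain or resolves to a private address, both of which the server enforces on its side.

```python
import ipaddress
import re

# Conservative hostname pattern: dot-separated labels of letters, digits, hyphens.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
_DOMAIN_RE = re.compile(rf"^(?:{_LABEL}\.)+{_LABEL}$")


def check_scan_domain(value: str) -> str:
    """Return the lowercased domain, or raise ValueError if it is obviously invalid."""
    candidate = value.strip().lower()
    if "://" in candidate or "/" in candidate:
        raise ValueError("pass a bare domain without protocol or path")
    try:
        ipaddress.ip_address(candidate)
    except ValueError:
        pass  # not an IP literal, continue with the hostname check
    else:
        raise ValueError("bare IP addresses are rejected")
    if not _DOMAIN_RE.match(candidate):
        raise ValueError(f"not a valid domain name: {value!r}")
    return candidate


print(check_scan_domain(" Example.COM "))
```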
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds behavioral details beyond annotations: active outbound fetch, per-target throttle of 60 req/min, enrichment of raw results, and return structure. Annotations already declare readOnly and idempotent, and the description is consistent with these.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is fairly long but well-structured: it enumerates the modules in an inline list, then covers usage, limitations, and return fields. It could be slightly more concise, but it communicates all essential details without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (active scan with 11 modules, multiple return fields), the description is comprehensive. It covers what the tool does, how it works, limitations, and output structure. The output schema helps, but the description still provides complete contextual guidance on its own.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Although there is only one parameter (domain) and schema coverage is 100%, the description adds valuable semantic context: the required root domain format, prohibition of protocol/path, and rejection of bare IPs and private-resolving domains. This goes beyond the schema's basic description.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs an active website security scan using the ContrastScan C engine and lists 11 specific modules. It also distinguishes the tool from sibling tools like audit_domain and scan_headers by specifying their different purposes.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use this tool (hands-on misconfiguration scan) and when to use alternatives (audit_domain for passive recon, scan_headers for headers only). It also provides rate limits and pricing information, giving clear usage context.
cve_leading · CVE Leading · A · Read-only · Idempotent
List CVEs indexed from MITRE/GHSA BEFORE NVD publication (early-warning, freshest data). By default each result is slim (no description, no cvss_breakdown, no affected_products list, no references) — pass include='full' for the same payload shape as cve_lookup; for drill-down on a single CVE prefer cve_lookup. Use for threat intelligence on emerging CVEs; use cve_search for published NVD data. Verdict (sources_queried, falsifiable_fields, completeness, data_age) is at the response root — applies to the whole batch, not per-row. Response carries a global hint pointing at cve_lookup — drill into any returned cve_id for full detail and chained pivots (exploit_lookup, kev_detail, cwe_lookup). Free: 30/hr, Pro: 500/hr. Returns {count, total, truncated, offset, summary, results, next_offset, verdict, hint}.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum results to return. Range: 1-200. | |
| offset | No | Skip N results for pagination. | |
| include | No | Per-result detail level. Default ('') returns slim list items (cve_id, summary, severity, cvss_v3, cwe_id, epss, kev, total_products, published, modified, sources). Pass 'full' to also return description, cvss_breakdown, affected_products, references, first_seen_source, first_seen_at. Slim default avoids description/summary duplication that bloats 50-item leading lists. Verdict is at the response root, not per-row (deduplicated for ~40% payload savings). Allowed: '' or 'full'. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
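The `limit`, `offset`, and `next_offset` fields suggest a straightforward paging loop. The sketch below assumes that `next_offset` is null or absent on the last page, which the listing does not state explicitly, and uses a caller-supplied `fetch_page(offset, limit)` function (hypothetical) in place of a real tool call. The CVE ids in the stand-in data are placeholders.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int, int], dict], limit: int = 50) -> Iterator[dict]:
    """Yield every result row, following next_offset until it is missing or stops advancing."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be in the range 1-200")
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page.get("results", [])
        next_offset = page.get("next_offset")
        # Assumption: a missing/None next_offset marks the last page.
        if next_offset is None or next_offset <= offset:
            break
        offset = next_offset


def fake_fetch(offset: int, limit: int) -> dict:
    """Stand-in for a cve_leading call; returns placeholder rows, not real data."""
    data = [{"cve_id": f"CVE-2099-{n:05d}"} for n in range(5)]
    chunk = data[offset:offset + limit]
    end = offset + len(chunk)
    return {
        "results": chunk,
        "next_offset": end if end < len(data) else None,
        "verdict": {},  # batch-level; read once at the root, never per row
    }


ids = [row["cve_id"] for row in iter_results(fake_fetch, limit=2)]
print(ids)
```

The guard on `next_offset <= offset` is a defensive choice so that an unexpected response cannot cause an endless loop against a rate-limited endpoint.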
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety. The description adds value by explaining the response structure (verdict at root, hint pointing to cve_lookup), the payload savings from slim default, and deduplication. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is detailed but efficient, with key information front-loaded. Every sentence contributes to understanding the tool's behavior, usage, and response format. Minor room for improvement in brevity but overall well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (pagination, slim/full modes, verdict at root, hints), the description covers all essential aspects. It explains the response shape, deduplication, and relationships with sibling tools. The output schema likely complements this, but the description stands alone as complete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all three parameters. The description adds meaning beyond the schema by explaining the impact of the 'include' parameter on response verbosity and why the slim default is beneficial. This helps the agent understand trade-offs.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'List CVEs indexed from MITRE/GHSA BEFORE NVD publication (early-warning, freshest data).' It uses a specific verb ('list') and resource ('CVEs') with explicit source and timing distinctions, setting it apart from siblings like cve_search and cve_lookup.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use and when-not-to-use guidance: 'Use for threat intelligence on emerging CVEs; use cve_search for published NVD data.' It also explains the default slim response vs. 'full' and directs users to cve_lookup for single CVE drill-down. Rate limits are clearly stated.
cve_lookup · CVE Lookup · A · Read-only · Idempotent
Retrieve detailed CVE data by ID: description, CVSS v3.1 + vector, CVSS v2 (always emitted), EPSS score + percentile, CISA KEV status (expanded: due_date, required_action, ransomware flag, vendor_project, product, vulnerability_name, short_description, notes, cwes, date_removed when in_kev=true), NVD vulnerability_status (Analyzed/Modified/Awaiting Analysis/Deferred/Rejected/Withdrawn), cve_tags ('disputed' triggers [DISPUTED] summary prefix), affected products (CPE), references, patch availability, related CVEs. By default affected_products is truncated to the first 20 entries (total_products reports the honest count) and references to the first 10 (total_references reports the honest count). Pass include_affected_products=true and/or include_full_references=true for the complete lists. Pass include_reference_tags=true to receive structured references_full=[{url, tags, source}] (NVD upstream tags + source provenance) — also activates tag-first patch detection. Pass include_severity_breakdown=true to receive severity_sources/consensus/disagreement (multi-source view of NVD/MITRE/GHSA/OSV severity assessments). Use for single-CVE details; use cve_search for queries by product/severity. Response carries next_calls — chain with kev_detail when kev.in_kev=true, with cwe_lookup on each CWE in cwes (up to 3 pivots), and with exploit_lookup for public PoC availability. Free: 30/hr, Pro: 500/hr. 
Returns {cve_id, summary, description, severity, cvss_v3, cvss_v2, cvss_v2_vector, cvss_breakdown, cwe_id, cwes, vulnerability_status, cve_tags, published, modified, sources, first_seen_source, first_seen_at, epss, kev (in_kev, date_added, due_date, required_action, known_ransomware_use, vendor_project, product, vulnerability_name, short_description, notes, cwes, date_removed), affected_products (first 20 by default), total_products, references (first 10 by default), total_references, total_references_unique, references_full (only when include_reference_tags=true), severity_sources/severity_consensus/severity_disagreement (only when include_severity_breakdown=true), patch_available, related_cves, verdict, next_calls}.
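The chaining rules in the description (kev_detail when `kev.in_kev` is true, cwe_lookup for up to three CWEs, exploit_lookup for PoC availability) can be sketched as a small planner over a cve_lookup record. The response already carries `next_calls`, so this only illustrates the logic. It assumes that `cwes` is a list of CWE id strings and that kev_detail and exploit_lookup accept a `cve_id` argument, neither of which is shown in this section. The sample record is made up.

```python
import re

CVE_ID_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")
MAX_CWE_PIVOTS = 3  # "up to 3 pivots" per the description


def plan_follow_ups(record: dict) -> list:
    """Derive follow-up tool calls from a cve_lookup record (illustrative only)."""
    cve_id = record["cve_id"]
    if not CVE_ID_RE.match(cve_id):
        raise ValueError(f"unexpected CVE id format: {cve_id!r}")

    calls = []
    if (record.get("kev") or {}).get("in_kev"):
        # Argument name is an assumption; kev_detail's schema is not shown here.
        calls.append({"tool": "kev_detail", "arguments": {"cve_id": cve_id}})
    for cwe in (record.get("cwes") or [])[:MAX_CWE_PIVOTS]:
        calls.append({"tool": "cwe_lookup", "arguments": {"cwe_id": cwe}})
    # Argument name is an assumption; exploit_lookup's schema is not shown here.
    calls.append({"tool": "exploit_lookup", "arguments": {"cve_id": cve_id}})
    return calls


# Made-up record: the field values are for illustration, not real API output.
sample = {
    "cve_id": "CVE-2024-3094",
    "kev": {"in_kev": True},
    "cwes": ["CWE-79", "CWE-89", "CWE-78", "CWE-22"],
}
plan = plan_follow_ups(sample)
for call in plan:
    print(call["tool"], call["arguments"])
```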
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier in format CVE-YYYY-NNNNN (e.g. 'CVE-2024-3094', 'CVE-2023-44487') | |
| include_reference_tags | No | Return structured references_full field with [{url, tags, source}] objects (NVD reference tags + source provenance) (default: True). Inspects which references are vendor patches (tags=['Patch']) vs exploit PoCs (tags=['Exploit']) vs mailing list discussions. Patch URL detection is tag-first when refs_with_tags is populated; legacy cached rows fall back to regex. Set False to skip the structured shape for legacy clients. | |
| include_full_references | No | Return the full references list (default: True, returns all references). total_references is always emitted with the honest count; patch URL detection always runs against the full list, so patch_url/patch_available are unaffected. Set False to truncate to first 10 entries when bandwidth-bound. | |
| include_affected_products | No | Return the full affected_products list (default: False, returns first 20). Set True for bulk audits or dependency scanning of Log4j-class CVEs with 50+ products. | |
| include_severity_breakdown | No | Return severity_sources, severity_consensus, and severity_disagreement (multi-source severity breakdown) (default: True). Surfaces vendor disputes (e.g. CVE-2023-38545 NVD-CRITICAL vs GHSA-HIGH). cvss_v2 and cvss_v2_vector are always emitted (additive non-opt-in). Consensus uses majority-bucket vote with highest-severity tie-break (CRITICAL > HIGH > MEDIUM > LOW > NONE). Set False to skip if downstream cannot tolerate the extra fields. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
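The consensus rule stated for include_severity_breakdown (majority-bucket vote, ties broken toward the higher severity) is easy to misread, so here is a minimal sketch of one way it could work. It assumes `severity_sources` maps a source name to a severity label. That shape and the function name are hypothetical, and the server's actual implementation may differ.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# Lowest to highest, matching CRITICAL > HIGH > MEDIUM > LOW > NONE.
SEVERITY_ORDER = ["NONE", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def severity_consensus(severity_sources: Dict[str, str]) -> Tuple[Optional[str], bool]:
    """Return (consensus, disagreement) from per-source severity labels."""
    votes = Counter(
        label.upper()
        for label in severity_sources.values()
        if label and label.upper() in SEVERITY_ORDER
    )
    if not votes:
        return None, False
    top = max(votes.values())
    tied = [label for label, count in votes.items() if count == top]
    # Tie-break: among the most-voted buckets, pick the highest severity.
    consensus = max(tied, key=SEVERITY_ORDER.index)
    return consensus, len(votes) > 1


# The dispute cited in the parameter table: NVD says CRITICAL, GHSA says HIGH.
print(severity_consensus({"nvd": "CRITICAL", "ghsa": "HIGH"}))
```

With one vote each, the tie resolves to CRITICAL and the disagreement flag is set. A two-to-one majority for HIGH would win even if the dissenting source said CRITICAL.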
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations (readOnlyHint true, idempotentHint true, destructiveHint false) are consistent. Description adds extensive behavioral context: default truncation (first 20 affected_products, first 10 references), honest count reporting, tag-first patch detection logic, consensus majority-bucket vote for severity breakdown, and rate limits. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is verbose but well-structured with clear separation of default behavior, parameters, and response fields. Every sentence adds value, though length could be reduced for quicker parsing. However, given the tool's complexity, the structure is effective.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, and description thoroughly explains every field in the return value, including optional fields conditional on boolean parameters. Mentions always-emitted fields (cvss_v2) and next_calls for chaining. Covers all necessary context for an agent to use the tool correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and description enriches each parameter: cve_id format explained, include_reference_tags adds tag-first detection context, include_affected_products suggests use case, include_severity_breakdown explains consensus algorithm, include_full_references states patch detection unaffected. This goes well beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states 'Retrieve detailed CVE data by ID' and lists the specific data fields returned. It distinguishes from sibling 'cve_search' tool which is for queries by product/severity, and also references other sibling tools like 'kev_detail', 'cwe_lookup', 'exploit_lookup' for chaining.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Use for single-CVE details; use cve_search for queries by product/severity.' Provides detailed guidance on when to set boolean parameters like include_affected_products for bulk audits, include_severity_breakdown for vendor disputes, and include_reference_tags for tag-first patch detection. Also explains rate limits and the next_calls chaining mechanism.
cve_search · CVE Search · A · Read-only · Idempotent
Search CVE database with filters: product/vendor, severity, published date range, EPSS score, CWE, CVSS range, CISA KEV status. Default response is SLIM per-result (cve_id, summary, severity, cvss_v3, cwe_id, epss, kev, total_products, published, modified, sources) — pass include='full' for description, cvss_breakdown, affected_products, references, first_seen_*. Verdict (sources_queried, falsifiable_fields, completeness, data_age) is at the response root — applies to the whole batch, not per-row. Product/vendor filters are EXACT NVD-canonical-token matches (not the common name — e.g. nginx is 'nginx_open_source'/'nginx_plus', vendor 'f5'); a low/zero count for a well-known product means the token differs, so for dependency/package lists use check_dependencies and for a domain's whole stack tech_stack_cve_audit (both auto-normalize tokens). Use for vulnerability discovery by criteria; pass cwe_id (e.g. CWE-79) to enumerate every CVE in our database mapped to a weakness — pair with cwe_lookup for the category description and mitigations. Use cve_lookup for single CVE by ID, kev_detail when kev=true filtering and the agent needs federal patch deadlines per result. Response carries a global hint pointing at cve_lookup — drill into any returned cve_id for full detail and chained pivots (exploit_lookup, kev_detail, cwe_lookup). Free: 30/hr, Pro: 500/hr. Returns {count, total, truncated, offset, summary, results, query_echo, next_offset, verdict, hint}.
| Name | Required | Description | Default |
|---|---|---|---|
| kev | No | If true, return only CVEs in the CISA Known Exploited Vulnerabilities (KEV) catalog — these are actively exploited in the wild. | |
| sort | No | Sort order for results. Must be one of: published_desc (newest first), epss_desc (most exploitable first), cvss_desc (most severe first). Omit for newest first (default=published_desc). | |
| limit | No | Maximum results to return. Range: 1-200. | |
| cwe_id | No | Filter by CWE weakness ID. Exact match, case-insensitive. Common values: CWE-79 (XSS), CWE-89 (SQL injection), CWE-120 (buffer overflow), CWE-78 (command injection). Format: CWE-<number>. Omit to not filter by CWE. | |
| offset | No | Skip N results for pagination. Use with limit to page through results. | |
| vendor | No | Filter by vendor name (case-insensitive). When combined with product, both must match the same CPE row — prevents cross-row false matches. Example: vendor=apache, product=struts. | |
| include | No | Per-result detail level. Default (omit) returns slim list items (cve_id, summary, severity, cvss_v3, cwe_id, epss, kev, total_products, published, modified, sources). Pass 'full' to also return description, cvss_breakdown, affected_products, references, first_seen_source, first_seen_at — only do this when the user explicitly wants drill-down on every result. Even with 'full', per-result affected_products and references may be truncated (the per-result total_products/total_references report the honest counts); use cve_lookup for the guaranteed-complete per-CVE lists. For single-CVE detail prefer cve_lookup; slim default keeps token cost ~70% lower on Log4j-class queries. Note: verdict is at the response root, not per-row (was deduplicated to save ~40% payload). | |
| product | No | Product or vendor token to filter by. EXACT match (case-insensitive) against the NVD-canonical CPE product/vendor token — NOT substring/fuzzy, and NOT necessarily the common project name. Common names, vendor renames, and build-tool artifact ids often differ from the canonical token (e.g. modern nginx CVEs are under 'nginx_open_source'/'nginx_plus', vendor 'f5', not 'nginx'; Maven 'log4j-core' maps to 'log4j'). A low or zero count for a well-known product usually means the token differs — do NOT assume coverage is complete. For dependency/package lists prefer check_dependencies, and for a domain's whole tech stack tech_stack_cve_audit (both auto-normalize tokens). A product match means CVEs exist for that product, not that a specific running version is affected — verify the running version is within each CVE's affected range. Omit to search all products. | |
| cvss_max | No | Maximum CVSS v3 base score (0.0-10.0). Default 10.0 = no filter (sentinel, not applied). Set < 10.0 to filter — CVEs with null CVSS are excluded when active. Combine with cvss_min for a range. | |
| cvss_min | No | Minimum CVSS v3 base score (0.0-10.0). Default 0.0 = no filter (sentinel, not applied). Set > 0 to filter — CVEs with null CVSS are excluded when active. Use 7.0 for high+critical, 9.0 for critical only. | |
| epss_min | No | Minimum EPSS score filter (0.0-1.0). EPSS predicts exploitation probability. 0.5 = top ~5% most likely to be exploited. 0.0 = no filter. | |
| severity | No | CVSS severity level. Must be one of: CRITICAL, HIGH, MEDIUM, LOW. Omit for all severities. | |
| published_after | No | Inclusive lower bound on publish date as YYYY-MM-DD (UTC). Pick this when the user names a starting point, e.g. 'since 2015' → '2015-01-01', 'after March 2024' → '2024-03-01'. Omit to not bound the lower edge. Combine with published_before for ranges. | |
| published_before | No | Inclusive upper bound on publish date as YYYY-MM-DD (UTC). Pick this when the user names an ending point, e.g. 'before 2020' → '2019-12-31', 'up to 2023' → '2023-12-31'. Omit to not bound the upper edge. Combine with published_after for ranges. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
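Several of the filters above have sentinel defaults (cvss_min 0.0, cvss_max 10.0) and inclusive date bounds, which are easy to get wrong when translating a user request. The sketch below shows one hedged way to assemble the arguments: it omits anything left at its no-filter value and maps "since YEAR" and "before YEAR" to the documented date strings. The function name and the `since_year` / `before_year` parameters are hypothetical conveniences, not part of the tool.

```python
ALLOWED_SEVERITIES = {"CRITICAL", "HIGH", "MEDIUM", "LOW"}
ALLOWED_SORTS = {"published_desc", "epss_desc", "cvss_desc"}


def build_cve_search_args(*, product=None, vendor=None, severity=None, sort=None,
                          cvss_min=0.0, cvss_max=10.0,
                          since_year=None, before_year=None, limit=50):
    """Assemble cve_search arguments, leaving out filters that are not active."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be in the range 1-200")
    if not 0.0 <= cvss_min <= cvss_max <= 10.0:
        raise ValueError("expected 0.0 <= cvss_min <= cvss_max <= 10.0")
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity!r}")
    if sort is not None and sort not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {sort!r}")

    args = {"limit": limit}
    if product:
        args["product"] = product  # must be the exact NVD-canonical token
    if vendor:
        args["vendor"] = vendor
    if severity:
        args["severity"] = severity
    if sort:
        args["sort"] = sort
    if cvss_min > 0.0:
        args["cvss_min"] = cvss_min  # 0.0 is the "no filter" sentinel
    if cvss_max < 10.0:
        args["cvss_max"] = cvss_max  # 10.0 is the "no filter" sentinel
    if since_year is not None:
        args["published_after"] = f"{since_year:04d}-01-01"
    if before_year is not None:
        # The bound is inclusive, so "before 2020" ends on 2019-12-31.
        args["published_before"] = f"{before_year - 1:04d}-12-31"
    return args


args = build_cve_search_args(vendor="apache", product="struts", severity="CRITICAL",
                             cvss_min=9.0, since_year=2015, before_year=2020)
print(args)
```

Keeping sentinel values out of the request matters because an active CVSS filter excludes CVEs with a null CVSS score, per the cvss_min and cvss_max descriptions.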
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses beyond annotations: default slim response, 'full' include behavior, verdict at root, exact token matching, pagination, sentinel filter behavior for cvss_min/max, rate limits (30/hr free, 500/hr pro). Annotations already indicate read-only and idempotent, so description adds substantial context.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very long (over 500 words) and contains some repetition (token matching warning appears twice). While well-structured, it could be more concise to improve scannability for an AI agent. However, it front-loads the core purpose.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (14 parameters, many filters, pagination, rate limits) and the presence of an output schema, the description is remarkably complete. It covers use cases, limitations, alternative tools, response structure (including verdict and hint), and even token cost implications.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage, description adds significant value: explains exact token matching for product/vendor with examples (nginx → nginx_open_source), gives common CWE IDs, clarifies 'full' include trade-offs, and explains EPSS threshold meaning (0.5 = top ~5%). Every parameter benefits from additional context.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs a search on the CVE database with filters. It distinguishes from siblings by naming check_dependencies, tech_stack_cve_audit, cve_lookup, and kev_detail as alternatives for specific use cases.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit guidance on when to use this tool vs alternatives: for dependency lists use check_dependencies, for full stack use tech_stack_cve_audit, for single CVE use cve_lookup, for KEV deadlines use kev_detail. Also explains when to use 'full' include vs slim and warns about exact token matching.
cwe_lookup · CWE Lookup · A · Read-only · Idempotent
Look up MITRE CWE (Common Weakness Enumeration) catalog record from research view 1000. Default response is SLIM (first 3 mitigations, first 3 examples; extended_description is null) — pass include='full' for the verbose record (full mitigations + examples lists, populated extended_description). Returns description, abstract type (Pillar/Class/Base/Variant/Compound), status (Stable/Draft/Incomplete/Deprecated), exploit likelihood, recommended mitigations, observed example CVEs, parent_cwe (walk up the hierarchy), child_cwes (drill down to more specific weaknesses), and cve_count (LOWER BOUND — counts only CVEs whose primary CWE matches; CVEs with multiple CWEs may not be counted). Use after cve_lookup or kev_detail to understand the underlying weakness category; chain with cve_search(cwe_id=...) to enumerate all matching CVEs. Returns 404 when the CWE is not in research view 1000. Free: 30/hr, Pro: 500/hr. Returns {cwe_id, name, description, extended_description (null on slim, populated on include='full'), abstract_type, status, likelihood, mitigations (first 3 by default), total_mitigations, examples (first 3 by default), total_examples, parent_cwe, child_cwes, cve_count, updated_at, verdict, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| cwe_id | Yes | CWE identifier — accepts 'CWE-79', 'cwe-79', or bare '79'. Common values: CWE-79 (XSS), CWE-89 (SQL injection), CWE-78 (command injection), CWE-502 (deserialization), CWE-22 (path traversal), CWE-120 (buffer overflow). | |
| include | No | Detail level. Default ('') returns slim record (first 3 mitigations, first 3 examples; extended_description is null). total_mitigations / total_examples are always honest pre-truncation counts. Pass 'full' to populate extended_description and return the full mitigations + examples lists. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
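The accepted `cwe_id` spellings and the slim/full truncation rule can be sketched client-side. This is a minimal illustration, not the server's implementation: `normalize_cwe_id` and `slim_record` are hypothetical helper names, and the record dict only mirrors the field names listed in the description above.

```python
import re

# Accepts 'CWE-79', 'cwe-79', or bare '79' (the three spellings the table lists).
_CWE_RE = re.compile(r"(?:cwe-)?(\d+)", re.IGNORECASE)


def normalize_cwe_id(raw: str) -> str:
    """Return the canonical 'CWE-<n>' form, or raise ValueError."""
    match = _CWE_RE.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"not a CWE identifier: {raw!r}")
    return f"CWE-{int(match.group(1))}"


def slim_record(record: dict, keep: int = 3) -> dict:
    """Mimic the documented slim shape: truncated lists plus pre-truncation totals.

    Assumption: `record` is a full record with 'mitigations' and 'examples' lists.
    """
    mitigations = record.get("mitigations", [])
    examples = record.get("examples", [])
    return {
        **record,
        "extended_description": None,       # null on slim
        "mitigations": mitigations[:keep],   # first 3 by default
        "total_mitigations": len(mitigations),
        "examples": examples[:keep],
        "total_examples": len(examples),
    }
```

Comparing `len(mitigations)` with `total_mitigations` in a slim response is one way for a caller to decide whether a second call with include='full' is worth the tokens.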
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and idempotentHint. Description adds significant behavioral context: slim vs full response behavior, honest total counts, error on missing CWE, and rate limits. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured, starting with main function, then detailing slim/full, output fields, usage guidance, and rate limits. It is front-loaded and efficient, though slightly long; all sentences are informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 params, output schema, annotations), the description is complete. It covers parameter behavior, output fields, error handling, rate limits, and usage workflow. No gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds substantial value: cwe_id accepts multiple formats and lists common values; include parameter explains default slim vs full, with honest pre-truncation counts. This goes well beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool looks up MITRE CWE catalog records from research view 1000, with specific details on default and full responses. Differentiates from sibling tools by specifying use cases (after cve_lookup/kev_detail) and chaining with cve_search.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit context for when to use this tool (after cve_lookup or kev_detail) and how to chain with cve_search. Includes rate limits and error handling (404). Lacks explicit 'when not to use' instructions, but context is sufficient.
d3fend_attack_coverage (D3FEND Attack Coverage): A, Read-only, Idempotent
Batch coverage breakdown: given a list of ATT&CK T-codes, return distinct defense counts per D3FEND tactic + identify which techniques have NO D3FEND mapping (undefended_techniques). Use to assess the defensive posture of an entire attack campaign or threat model in one call. defended_techniques is the subset with at least one D3FEND defense; undefended_techniques are gaps worth flagging. Pair with cve_search per gap to identify exploit availability. Free: 30/hr, Pro: 500/hr. Returns {queried_techniques, coverage_by_tactic, defended_techniques, undefended_techniques, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| attack_technique_ids | Yes | List of ATT&CK technique ids (T#### or T####.###) to assess. Capped at 500; extra entries are dropped server-side. Example: ['T1059', 'T1550.001', 'T1190', 'T9999']. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
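Because entries past the 500 cap are dropped server-side, a caller may prefer to validate and trim the list before sending it. The sketch below assumes that workflow; `prepare_technique_ids` and `split_coverage` are hypothetical helpers, and the de-duplication step is an added convenience the source does not mention.

```python
import re

# 'T####' or 'T####.###', as stated in the parameter description.
T_CODE = re.compile(r"T\d{4}(?:\.\d{3})?")
MAX_IDS = 500  # cap stated in the parameter description


def prepare_technique_ids(raw_ids):
    """Validate, de-duplicate (order-preserving), and cap a list of T-codes."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        code = raw.strip().upper()
        if T_CODE.fullmatch(code) is None:
            raise ValueError(f"not an ATT&CK technique id: {raw!r}")
        if code not in seen:
            seen.add(code)
            cleaned.append(code)
    # Trim locally so nothing is dropped silently on the server.
    return cleaned[:MAX_IDS]


def split_coverage(technique_ids, defenses_by_technique):
    """Partition ids into (defended, undefended).

    Assumption: `defenses_by_technique` maps a T-code to a list of defenses;
    a missing key or an empty list counts as undefended.
    """
    defended = [t for t in technique_ids if defenses_by_technique.get(t)]
    undefended = [t for t in technique_ids if not defenses_by_technique.get(t)]
    return defended, undefended
```

`split_coverage` only illustrates the defended/undefended partition the response describes; the real mapping lives on the server.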
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds rate limits (30/hr free, 500/hr pro), behavior on overflow (extra entries dropped), and the detailed return structure including next_calls. This goes well beyond what annotations provide.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense but efficient, front-loading the main function. It packs purpose, return fields, use case pairing, and rate limits into a few sentences. Slightly dense but no unnecessary words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, good schema and annotations, presence of output schema), the description fully covers all needed context: input format, behavior, output structure, rate limits, and pairing with cve_search. No gaps for a batch coverage tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with an example and maxItems in the parameter description. The parameter description also states the id format and the 500-entry cap, adding practical context (e.g., 'extra entries are dropped server-side') that aids understanding beyond the bare schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: given ATT&CK T-codes, it returns defense counts per D3FEND tactic and identifies undefended techniques. It distinguishes itself from sibling tools like d3fend_defense_for_attack by explicitly targeting batch coverage for entire campaigns or threat models.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage context: assess defensive posture of attack campaigns, pair with cve_search for gaps. It mentions the 500-item cap and server-side dropping. However, it does not explicitly state when not to use or list alternatives, though the sibling set implies other tools for single-technique queries.
d3fend_defense_for_attack (D3FEND Defense for Attack): A, Read-only, Idempotent
Reverse lookup: given an ATT&CK T-code, return D3FEND defenses that mitigate it. This is the bridge from offensive intelligence (ATT&CK / ATLAS / CVE) to defensive playbook. Pair with cve_lookup or atlas_technique_lookup output — when those carry an ATT&CK id, call this tool to surface the mitigations. defenses is capped at limit (default 30) for token efficiency; total is the honest pre-truncation count and truncated=true flags when the cap was hit. coverage_by_tactic always aggregates the FULL set, not the slice. Default response is SLIM (drops uri from each row); pass include='full' for the verbose record. Pass exclude_id when drilling from d3fend_defense_lookup to skip self in the 'see also' list. Returns 200 with empty defenses list when the T-code has no D3FEND mapping (the gap is itself a signal). Free: 30/hr, Pro: 500/hr. Returns {attack_technique_id, total, truncated, defenses [{defense_id, label, uri (only when include=full), parent_label, tactic, artifact, attack_label, attack_tactic}], coverage_by_tactic, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Cap on `defenses` array length. Default 30; popular T-codes (T1059, T1078) map to 30-50+ defenses. `total` and `coverage_by_tactic` always reflect the honest pre-truncation count. | |
| include | No | Detail level. Default (omit/empty) returns slim rows (drops the deterministic ontology `uri` — popular T-codes with 15+ defenses save ~900 chars). Pass 'full' to get `uri` back on every row. | |
| exclude_id | No | Optional D3FEND defense slug to omit from the defenses list. Used when chaining from d3fend_defense_lookup so the originating defense is not echoed back in its own 'see also' results. | |
| attack_technique_id | Yes | ATT&CK technique id matching 'T####' or 'T####.###' (e.g. 'T1059', 'T1550.001'). Use this to bridge from CVE/ATLAS findings to D3FEND mitigations. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
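The interaction between `limit`, `total`, `truncated`, `coverage_by_tactic`, and the slim default is easier to see in code. The following is a sketch of those documented rules applied to an in-memory list; `shape_defenses` is a hypothetical name, and representing `coverage_by_tactic` as a plain tactic-to-count dict is an assumption.

```python
from collections import Counter


def shape_defenses(rows, limit=30, include=""):
    """Apply the documented cap and slim/full rules to a list of defense rows.

    Assumption: each row is a dict with at least 'tactic' and optionally 'uri'.
    """
    # Aggregated over the FULL set, not the truncated slice.
    coverage = Counter(row["tactic"] for row in rows)
    page = rows[:limit]
    if include != "full":
        # Slim default: drop the ontology uri from every row.
        page = [{k: v for k, v in row.items() if k != "uri"} for row in page]
    return {
        "total": len(rows),               # pre-truncation count
        "truncated": len(rows) > limit,   # True when the cap was hit
        "defenses": page,
        "coverage_by_tactic": dict(coverage),
    }
```

A caller that sees truncated=true can either raise `limit` or rely on `coverage_by_tactic`, which already reflects every defense.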
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds significant behavioral context beyond annotations: explains the cap on defenses at `limit`, truncation with `truncated=true`, that `coverage_by_tactic` always reflects the full set, default slim response vs. full, use of `exclude_id`, error handling (200 with empty list when no mapping), and rate limits (Free: 30/hr, Pro: 500/hr). No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized given the complexity of the tool. It is front-loaded with purpose, then explains usage, behavioral details, parameter specifics, rate limits, and response structure. Every sentence serves a purpose—no redundancy or fluff. It earns its length.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 parameters, output schema implied, 100% schema coverage), the description is complete. It covers return fields, edge cases (empty list, truncation), and rate limiting. No gaps identified for agent invocation. The output schema is described in the response structure, which suffices.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds substantial meaning beyond the schema. For `limit`, it explains why the cap exists and that popular T-codes map to many defenses. For `include`, it details the token efficiency benefit of the slim default. For `exclude_id`, it explains the chaining use case. For `attack_technique_id`, it specifies matching format (e.g., 'T1059', 'T1550.001'). This adds concrete guidance for agent decision-making.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Reverse lookup: given an ATT&CK T-code, return D3FEND defenses that mitigate it.' It uses a specific verb ('return') and resource ('D3FEND defenses'), and distinguishes from sibling tools by positioning itself as the bridge from offensive intelligence to defensive playbook, explicitly mentioning pairing with cve_lookup or atlas_technique_lookup.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit context for when to use: 'Pair with cve_lookup or atlas_technique_lookup output — when those carry an ATT&CK id, call this tool.' It implies when not to use (if no ATT&CK id is available) and names alternative approaches via sibling references in the context. A slight deduction because it does not explicitly say 'do not use if...' but the guidance is clear.
d3fend_defense_lookup (D3FEND Defense Lookup): A, Read-only, Idempotent
Look up a MITRE D3FEND defense technique. D3FEND is the canonical defensive counterpart to ATT&CK — each defense is classified into one of 7 tactics (Model/Harden/Detect/Isolate/Deceive/Evict/Restore) and may target a specific digital artifact (e.g. 'Access Token'). Response includes attack_techniques: the list of ATT&CK T-codes this defense mitigates. Use after d3fend_defense_search for the full record + ATT&CK chain. Returns 404 when the slug is not in the synced D3FEND catalog. Free: 30/hr, Pro: 500/hr. Returns {defense_id, label, uri, parent_label, description, tactic, artifact, attack_techniques, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| defense_id | Yes | D3FEND defense slug from the ontology URI fragment (CamelCase), e.g. 'TokenBinding', 'FileHashing', 'CertificatePinning'. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
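When drilling from a lookup result into d3fend_defense_for_attack, passing the originating slug as `exclude_id` keeps it out of its own 'see also' list. A rough sketch of building those follow-up calls is shown here; it assumes `attack_techniques` is a list of T-code strings, and the `{"tool", "arguments"}` dict is an illustrative shape rather than the server's `next_calls` format.

```python
def see_also_calls(record):
    """Build one d3fend_defense_for_attack call per mitigated T-code.

    Assumption: record["attack_techniques"] is a list of strings like 'T1550.001'.
    """
    return [
        {
            "tool": "d3fend_defense_for_attack",
            "arguments": {
                "attack_technique_id": technique_id,
                # Skip the defense we started from in the results.
                "exclude_id": record["defense_id"],
            },
        }
        for technique_id in record.get("attack_techniques", [])
    ]
```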
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, idempotentHint, destructiveHint. Description adds error behavior (404 for missing slug), rate limits, and response structure including attack_techniques. No contradiction, but idempotency is not explicitly reinforced.
Is the description appropriately sized, front-loaded, and free of redundancy?
Very concise: states purpose, explains D3FEND, provides usage guidance, error handling, rate limits, and response fields in a few sentences. No fluff.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage, error, rate limits, and response structure (including nested attack_techniques). With output schema present, description adds all necessary context for an AI agent to use the tool correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage for the single parameter defense_id. Description adds examples (e.g., 'TokenBinding', 'FileHashing') and clarifies it's a CamelCase slug from ontology URI, exceeding schema info.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Look up a MITRE D3FEND defense technique.' and explains D3FEND's role as a defensive counterpart to ATT&CK with 7 tactics. Distinguishes from sibling d3fend_defense_search by advising to use it after search for full record.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use after d3fend_defense_search for the full record + ATT&CK chain.' Also mentions the 404 error for a missing slug and the rate limits (Free: 30/hr, Pro: 500/hr), providing clear when-to-use guidance and constraints.
d3fend_defense_search (D3FEND Defense Search): A, Read-only, Idempotent
Search the MITRE D3FEND catalog of defensive techniques by keyword, tactic, or targeted artifact. Default response is SLIM (drops uri from each row — saves ~60 chars/row, ~30% on popular drills); pass include='full' for the verbose record. Pass exclude_id when chaining from d3fend_defense_lookup to skip self in sibling-artifact searches. Use to discover defenses applicable to a given threat model — e.g. 'what defenses harden access tokens?' (tactic=Harden + artifact='Access Token'). Drill into d3fend_defense_lookup with any returned defense_id for the ATT&CK technique mappings. Free: 30/hr, Pro: 500/hr. Returns {query, total, results [{defense_id, label, uri (only when include=full), parent_label, tactic, artifact}], next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max results to return. Range: 1-200. | |
| tactic | No | Filter by D3FEND tactic. One of: Model, Harden, Detect, Isolate, Deceive, Evict, Restore. Omit for all tactics. | |
| include | No | Detail level. Default (omit/empty) returns slim rows (drops the deterministic ontology `uri` field, ~60 chars/row saved). Pass 'full' to get `uri` back on every row. The slug `defense_id` is always returned and uniquely identifies the defense. | |
| keyword | No | Substring match against defense label, description, or parent_label (case-insensitive). Min 2 chars. Example: 'token', 'hashing', 'sandbox'. Omit to list all. | |
| artifact | No | Filter by exact targeted digital artifact (case-insensitive), e.g. 'Access Token', 'File', 'Process'. Omit for any artifact. | |
| exclude_id | No | Optional D3FEND defense slug (CamelCase, e.g. 'TokenBinding') to omit from results. Useful when chaining from d3fend_defense_lookup so the originating defense is not echoed back in its own siblings list. Omit when not needed. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
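The filter semantics in the parameter table (substring keyword, exact case-insensitive artifact, tactic enum, `exclude_id`, bounded `limit`) can be approximated over a local list. This is a sketch under assumptions: `search_defenses` is a hypothetical function, the default `limit` of 50 is invented because the source gives only the 1-200 range, and treating `total` as the pre-truncation count is a guess.

```python
TACTICS = {"Model", "Harden", "Detect", "Isolate", "Deceive", "Evict", "Restore"}
KEYWORD_FIELDS = ("label", "description", "parent_label")


def search_defenses(catalog, keyword=None, tactic=None, artifact=None,
                    exclude_id=None, limit=50):
    """Filter a list of defense dicts the way the parameter table describes."""
    if keyword is not None and len(keyword) < 2:
        raise ValueError("keyword needs at least 2 characters")
    if tactic is not None and tactic not in TACTICS:
        raise ValueError(f"unknown tactic: {tactic!r}")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be in the range 1-200")

    needle = keyword.lower() if keyword else None
    wanted_artifact = artifact.lower() if artifact else None
    matches = []
    for row in catalog:
        if exclude_id and row["defense_id"] == exclude_id:
            continue
        if tactic and row.get("tactic") != tactic:
            continue
        # Exact match on the artifact, ignoring case.
        if wanted_artifact and (row.get("artifact") or "").lower() != wanted_artifact:
            continue
        # Case-insensitive substring match against any of the three text fields.
        if needle and not any(needle in (row.get(f) or "").lower() for f in KEYWORD_FIELDS):
            continue
        matches.append(row)
    return {"total": len(matches), "results": matches[:limit]}
```

The description's own example, 'what defenses harden access tokens?', corresponds to `search_defenses(catalog, tactic="Harden", artifact="Access Token")`.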
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses default SLIM format with size savings, rate limits (30/hr free, 500/hr Pro), return structure, and behavior of exclude_id. Annotations already indicate read-only and idempotent; description adds valuable context without contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Dense single paragraph front-loads purpose and includes useful details like size savings and rate limits. Slightly verbose with precise character counts but remains efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all aspects: purpose, parameters, usage pattern (chaining with lookup), example, rate limits, and return field list. Sufficient for an agent to select and invoke correctly, given output schema references.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds meaning beyond schema: explains 'include' saves ~60 chars/row, 'exclude_id' for chaining, 'keyword' substring match min 2 chars, provides examples for artifact and tactic. Schema coverage is 100%, but description enriches each parameter.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool searches the MITRE D3FEND catalog by keyword, tactic, or artifact, and distinguishes from sibling tool d3fend_defense_lookup for drill-down.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit example, explains when to use (discover defenses for threat model), mentions alternative (d3fend_defense_lookup for detailed mappings), and describes chaining with exclude_id.
dns_lookup (DNS Lookup): A, Read-only, Idempotent
Query all DNS record types (A, AAAA, MX, NS, TXT, CNAME, SOA) for a domain. Use for mail routing inspection, nameserver verification, or SPF/DMARC checks; for full overview use domain_report. TXT records are returned raw (no filter) — total_txt_records always carries the honest count (use domain_report for the security-only filtered TXT view). Free: 30/hr, Pro: 500/hr. Returns {domain, records: {a, aaaa, mx, ns, txt, total_txt_records, cname, soa}, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to query, without protocol or path (e.g. 'example.com', 'cloudflare.com') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
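For readers calling the tool without an MCP client library, the request body is a JSON-RPC 2.0 `tools/call` message. The sketch below only builds that payload; the server URL is not given in this listing, so no network call is made, and `tools_call_request` is a hypothetical helper. A real Streamable HTTP session would normally start with an `initialize` exchange first.

```python
import json


def tools_call_request(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' request body for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# The only required argument is the root domain, without protocol or path.
payload = tools_call_request("dns_lookup", {"domain": "example.com"})
print(json.dumps(payload, indent=2))
```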
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnly, openWorld, idempotent. The description adds significant behavioral detail: rate limits (30/hr free, 500/hr Pro), raw TXT records with unfiltered count, and the return structure. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and front-loaded with the core action and use cases. It could be slightly more structured (e.g., separate sections), but it conveys the key information efficiently.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool with comprehensive schema, clear annotations, and an output schema (mentioned in description), the description covers purpose, usage, behavioral notes, rate limits, and return format. It is complete for effective agent use.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema already describes the 'domain' parameter well. The description adds context about querying all record types but doesn't provide additional parameter-level details beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states it queries all DNS record types for a domain, lists the types, and distinguishes from the sibling 'domain_report' for full overview. The verb 'query' and resource 'DNS record types' are specific.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear guidance on when to use this tool (mail routing inspection, nameserver verification, SPF/DMARC checks) and explicitly recommends 'domain_report' for a full overview or security-only filtered TXT view, though it doesn't list specific exclusions.
domain_report (Domain Report): A, Read-only, Idempotent
Query DNS, WHOIS, SSL, subdomains, and threat intel for a domain in one call. By default dns.txt is filtered to security-relevant entries (SPF, DMARC, DKIM, MTA-STS, TLS-RPT) and dns.total_txt_records reports the honest pre-filter count; pass include_all_txt=true for the raw TXT list. Use as a starting point for domain investigations; use audit_domain for live headers + tech stack. Response carries next_calls — chain with subdomain_enum (always emitted), ssl_check + tech_fingerprint (when an A record resolves) for the standard recon depth without re-prompting. Free: 30/hr, Pro: 500/hr. Returns domain report with DNS records, WHOIS data, SSL cert, risk score, email config, threat status, recommendation, and next_calls.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to analyze, without protocol or path (e.g. 'example.com', 'shopify.com') | |
| include_all_txt | No | Return every TXT record (default: False, only SPF/DMARC/DKIM/MTA-STS/TLS-RPT kept). dns.total_txt_records is always emitted with the honest pre-filter count. Default filter strips vendor verification strings (google-site-verification, ms=, facebook-domain-verification, etc.) that bloat the response without security signal. Set True only when you need the raw TXT inventory. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
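The default TXT filtering can be pictured as a prefix check on each record. How the server actually classifies records is not documented here, so the following is only a sketch: `filter_txt` is a hypothetical function, and matching on the standard version tags (`v=spf1`, `v=DMARC1`, `v=DKIM1`, `v=STSv1`, `v=TLSRPTv1`) is an assumption.

```python
# Version tags for SPF, DMARC, DKIM, MTA-STS, and TLS-RPT records (lowercased).
SECURITY_PREFIXES = ("v=spf1", "v=dmarc1", "v=dkim1", "v=stsv1", "v=tlsrptv1")


def filter_txt(records, include_all_txt=False):
    """Keep security-relevant TXT records unless the raw list is requested."""
    if include_all_txt:
        kept = list(records)
    else:
        kept = [
            record for record in records
            # Strip whitespace and surrounding quotes before the prefix test.
            if record.strip().strip('"').lower().startswith(SECURITY_PREFIXES)
        ]
    # The total always reflects the pre-filter count.
    return {"txt": kept, "total_txt_records": len(records)}
```

A gap between `len(txt)` and `total_txt_records` tells the caller that vendor verification strings were filtered out and that include_all_txt=true would return more.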
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds context beyond annotations: rate limits (30/hr Free, 500/hr Pro), next_calls for chaining, default TXT filtering behavior, and that total_txt_records is always emitted. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Packed with information in a few sentences, front-loaded with main purpose. Slightly long but each sentence adds value.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage, behavioral details, parameter semantics, and next steps. Given complexity, it's nearly complete. Output schema exists so return value explanation is covered.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (baseline 3). Description adds meaning: explains default filtering of TXT records, purpose of include_all_txt, and example domain format. Adds value beyond schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Specifically states 'Query DNS, WHOIS, SSL, subdomains, and threat intel for a domain in one call', clearly identifying the verb and resource. Distinguishes from sibling tools like audit_domain, ssl_check, subdomain_enum.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use as a starting point for domain investigations; use audit_domain for live headers + tech stack' and provides chaining recommendations with sibling tools for standard recon depth.
email_disposable (Email Disposable): A, Read-only, Idempotent
Check if email address uses a known disposable/temporary provider (Guerrilla Mail, Temp Mail, Mailinator, etc.). Use for input validation to detect throwaway signups; for domain reputation use threat_intel. Companion email-investigation tools: email_mx (deliverability + MX trust), domain_report on the email's domain (full recon), threat_intel (malware-distribution signal on the domain). Free: 30/hr, Pro: 500/hr. Returns {disposable, domain, provider}.
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | Full email address to check (e.g. 'user@tempmail.com', 'test@guerrillamail.com') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
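The check amounts to extracting the domain and looking it up in a provider list. A minimal sketch follows, assuming a tiny hand-written table: `DISPOSABLE_PROVIDERS` and `check_disposable` are hypothetical, and the three domain-to-provider pairs are illustrative stand-ins for whatever list the server maintains.

```python
# Illustrative entries only; the real provider list is maintained server-side.
DISPOSABLE_PROVIDERS = {
    "guerrillamail.com": "Guerrilla Mail",
    "mailinator.com": "Mailinator",
    "tempmail.com": "Temp Mail",
}


def check_disposable(email):
    """Return {disposable, domain, provider} for a full email address."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a full email address: {email!r}")
    domain = domain.lower()  # domains are case-insensitive
    provider = DISPOSABLE_PROVIDERS.get(domain)
    return {"disposable": provider is not None, "domain": domain, "provider": provider}
```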
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive. Description adds rate limits (30/hr free, 500/hr pro) and output structure, which are beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
A few concise sentences, front-loaded with purpose, followed by usage guidelines, rate limits, and output. No unnecessary words.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple single-parameter tool with full schema coverage and output described, the description is complete. It covers purpose, usage, behavior, and return value.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description for the 'email' parameter. The description does not add new parameter details beyond the schema, so baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool checks if an email uses a disposable provider, lists examples, and distinguishes from siblings like threat_intel and email_mx. It provides specific verb and resource.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly gives the use case (input validation for detecting throwaway signups) and names an alternative tool for domain reputation. Also references companion tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
email_mx · Email MX · A · Read-only · Idempotent
Analyze email security: MX records, SPF policy, DMARC policy, DKIM probe across common+date-based selectors, mail provider, grade. Use to verify email-auth setup and phishing risk; for full audit use domain_report. Free: 30/hr, Pro: 500/hr. email_security.dkim_status reports honest evidence: 'verified' iff at least one selector responded, else 'unverifiable' (custom selectors cannot be discovered without prior knowledge). Grade: when DKIM verified, A=SPF+DMARC+DKIM/B=2of3/C=1of3; when DKIM unverifiable, A=SPF+DMARC/B=one/F=neither — DKIM absence is NOT penalized because it is unprovable in DNS. Returns {mx_records, mail_provider, email_security:{spf, dmarc, dkim_selectors, dkim_status, grade, issues}, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to analyze email configuration for (e.g. 'example.com', 'google.com') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
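The grading rule quoted in the description can be sketched as a small function. This is a minimal illustration, assuming SPF, DMARC, and DKIM are each reduced to a boolean; `email_grade` is a hypothetical name and not part of ContrastAPI.

```python
def email_grade(spf: bool, dmarc: bool, dkim_verified: bool) -> str:
    """Hypothetical helper mirroring the grade rule in the email_mx description."""
    if dkim_verified:
        # DKIM verified: all three mechanisms count, and DKIM is already one pass.
        passed = 1 + int(spf) + int(dmarc)
        return {3: "A", 2: "B", 1: "C"}[passed]
    # DKIM unverifiable: grade on SPF and DMARC only, with no penalty for DKIM.
    passed = int(spf) + int(dmarc)
    return {2: "A", 1: "B", 0: "F"}[passed]


print(email_grade(spf=True, dmarc=True, dkim_verified=False))
```

Note that a domain with SPF and DMARC still earns an A when no DKIM selector responds, which matches the "not penalized" wording above.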
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds significant behavioral details beyond annotations: DKIM status logic (verified iff at least one selector responds), grading criteria including the rule that DKIM absence is not penalized, and disclosure that custom selectors cannot be discovered. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is relatively long but well-structured, front-loading purpose and then detailing grading logic and limitations. Could be slightly more concise without losing essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of email security analysis and excellent annotations, the description comprehensively covers outputs, grading logic, and limitations. Output schema exists, so return values need not be described.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a well-described domain parameter. Description includes example formats but doesn't add significant new meaning beyond the schema, which already provides adequate documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool analyzes email security (MX, SPF, DMARC, DKIM, mail provider, grade) and distinguishes it from domain_report for full audits. The verb 'analyze' is specific to the resource.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (verify email-auth setup and phishing risk) and when not to (use domain_report for full audit). Also specifies rate limits for Free and Pro tiers.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
email_security_posture · Email Security Posture · A · Read-only · Idempotent
Analyze domain email authentication posture: SPF, DMARC, DKIM with numeric score and findings. Dual-use: red-team (spoofing feasibility) + blue-team (posture audit). Score 0-100, grades A+ through F. DKIM probing tests common selectors + recent dates; custom selectors must be supplied. Passive DNS-only; no SMTP probe. Free: 30/hr, Pro: 500/hr.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to audit email authentication posture for (e.g. 'example.com') | |
| selectors | No | Optional comma-separated custom DKIM selectors to probe | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
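To make the `selectors` parameter concrete, here is a minimal sketch of how a comma-separated selector string maps to DNS names. The `<selector>._domainkey.<domain>` layout comes from the DKIM standard (RFC 6376); the parsing shown is an assumption, since the listing does not document how the server tokenizes the value, and `dkim_query_names` is a hypothetical helper.

```python
def dkim_query_names(domain: str, selectors: str) -> list[str]:
    """Build the TXT record names a DKIM probe would look up (illustrative only)."""
    names = []
    for raw in selectors.split(","):
        selector = raw.strip()
        if selector:  # skip empty entries such as a trailing comma
            names.append(f"{selector}._domainkey.{domain}")
    return names


print(dkim_query_names("example.com", "s1, mail"))
```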
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds significant behavioral context: DKIM probing uses common selectors and recent dates, passive DNS-only, no SMTP probe, and scoring details (0-100, A+ through F). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six short sentences, well-structured: purpose, dual-use, scoring, DKIM probing details, passive nature, rate limits. Front-loaded with purpose. Could be slightly more concise but is efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (not shown but indicated), the description covers essential aspects: what it tests, score range, method (passive DNS), and rate limits. It is adequate for an agent to understand usage without missing critical information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (both parameters have descriptions). The description adds value by clarifying that custom DKIM selectors must be supplied as comma-separated and that DKIM probing tests common selectors plus recent dates, providing context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool analyzes domain email authentication posture (SPF, DMARC, DKIM) with a numeric score and findings. It distinguishes itself from sibling tools like dns_lookup, domain_report, and email_mx by focusing specifically on email security posture and offering dual-use for red and blue teams.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly describes dual-use for red and blue teams, passive DNS-only nature (no SMTP probe), and rate limits (Free: 30/hr, Pro: 500/hr). While it doesn't name alternative sibling tools, it clearly indicates when to use the tool and what its limitations are.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
email_verify · Email Verify · A · Read-only · Idempotent
One-call email validation combining syntax + MX records + disposable check + role-address detection (admin@/info@/...) + free-provider classification (gmail/outlook/yahoo/...). Use BEFORE adding an email to a contact list, sending an outbound message, or auditing a lead-list dump — replaces 2-3 tool calls (email_mx + email_disposable + manual role parse) with one structured response. Deliberately does NOT do SMTP RCPT TO deliverability probing — Hunter.io / NeverBounce-style mailbox enumeration is an ethical grey area we declined; use those services if you need that specific signal. role_address=true on admin@, info@, noreply@, support@, etc. (Gmail-style +tag is stripped before classification). free_provider=true on consumer-mailbox domains (B2B detection signal — a 'work' email at @gmail.com likely isn't a corporate user). Free: 30/hr, Pro: 500/hr. Returns {email, domain, syntax_valid, mx_records, disposable, disposable_provider, role_address, role_type, free_provider, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | Full email address to verify (e.g. 'admin@example.com', 'user@gmail.com'). Must contain '@'. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
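The role-address and free-provider flags described above can be illustrated with a short sketch. The two lookup sets below are assumptions built from the examples in the description (the real lists are not published here and are presumably longer), and `classify_email` is a hypothetical name. MX and disposable checks are omitted because they need network access.

```python
# Assumed sample data taken from the examples in the description, not the real lists.
ROLE_LOCAL_PARTS = {"admin", "info", "noreply", "support"}
FREE_PROVIDER_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com"}


def classify_email(email: str) -> dict:
    """Offline subset of the checks: syntax, role address, free provider."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        return {"email": email, "syntax_valid": False}
    # Strip a Gmail-style +tag before classifying the local part.
    base_local = local.lower().split("+", 1)[0]
    domain = domain.lower()
    return {
        "email": email,
        "domain": domain,
        "syntax_valid": True,
        "role_address": base_local in ROLE_LOCAL_PARTS,
        "free_provider": domain in FREE_PROVIDER_DOMAINS,
    }


print(classify_email("admin+news@example.com"))
```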
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare the tool as read-only, idempotent, and non-destructive. The description adds important behavioral context: it does not perform SMTP probing, explains role_address/free_provider logic, notes Gmail +tag stripping, and includes rate limits. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single dense paragraph that conveys all necessary information without redundancy. However, a more structured format (e.g., bullet points) could improve readability. Still, it is efficiently front-loaded with purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple checks), the presence of an output schema (return fields listed), and the annotations, the description is fully complete. It explains logical details, limitations, and rate limits, enabling the agent to understand the tool's behavior thoroughly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the description reiterates the email parameter with examples and the '@' requirement, but adds no new semantic information beyond the schema. Baseline score of 3 is appropriate as the description does not compensate for missing schema details.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs one-call email validation combining syntax, MX records, disposable check, role-address detection, and free-provider classification. It distinguishes itself from sibling tools like email_mx and email_disposable by explicitly replacing 2-3 tool calls.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: before adding an email to a contact list, sending outbound messages, or auditing lead-list dumps. It also specifies what it does not do (SMTP deliverability probing) and suggests alternatives (Hunter.io / NeverBounce). This provides clear guidance for agent decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
exploit_lookup · Exploit Lookup · A · Read-only · Idempotent
Search public exploits/PoC for a specific CVE across three sources: (1) GitHub Advisory Database (sources.github.advisories[]), (2) Shodan CVEDB references (sources.shodan_refs.results[] — packetstorm/seclists/vendor URLs cited by Shodan; results capped at SHODAN_REFS_LIMIT default 200, truncated=true when capped, count is the honest upstream total), (3) ExploitDB CSV mirror (exploits[] array, with edb_id + author + verified flag — these are the actual ExploitDB entries). Use to assess if a vulnerability has weaponized exploits in the wild; run after cve_lookup to evaluate real-world risk. When the CVE is also in CISA KEV (kev.in_kev=true on cve_lookup), pair with kev_detail for federal patch deadline; pair with cwe_lookup on cwe_id for the underlying weakness category and mitigations. Response carries next_calls — single cve_lookup pivot for full context (KEV status, CWE chain, CVSS, EPSS); cve_lookup's own next_calls then surface kev_detail and cwe_lookup automatically (this endpoint has no in_kev/cwe_id schema, so blind emission of those pivots is intentionally avoided). Free: 30/hr, Pro: 500/hr. Returns {cve_id, exploits_found, has_public_exploit, sources: {github, shodan_refs: {found, count, truncated, results}}, exploits: [{edb_id, cve_id, date_published, author, type, platform, url, verified, description}], summary, verdict, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier in format CVE-YYYY-NNNNN (e.g. 'CVE-2024-3094', 'CVE-2023-44487') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
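The capping behavior described for `sources.shodan_refs` (results limited, `truncated` set, `count` kept as the upstream total) might look roughly like the sketch below. The field names follow the description; `cap_refs` and its internals are hypothetical and only illustrate the contract a client can rely on.

```python
SHODAN_REFS_LIMIT = 200  # default cap stated in the description


def cap_refs(refs: list[str], limit: int = SHODAN_REFS_LIMIT) -> dict:
    """Shape a reference list the way the shodan_refs block is described."""
    total = len(refs)
    return {
        "found": total > 0,
        "count": total,              # upstream total, not the capped length
        "truncated": total > limit,  # True only when results were cut
        "results": refs[:limit],
    }


block = cap_refs([f"https://example.invalid/ref/{i}" for i in range(250)])
print(block["count"], block["truncated"], len(block["results"]))
```

A client that needs every reference should therefore check `truncated` rather than comparing `count` against a hard-coded number.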
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds behavioral details: sources, rate limits (30/hr free, 500/hr Pro), truncation behavior for shodan_refs (max 200, truncated flag), and the structure of next_calls. It does not contradict annotations. Slightly less than 5 because some output schema details could be redundant.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense and front-loaded with the main purpose, but it includes extensive details about response structure and next_calls that could be streamlined. While informative, it is longer than necessary, reducing conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple sources, rate limits, truncation, integration with sibling tools) and the presence of an output schema, the description is thorough. It covers sources, limits, return fields, and how to proceed with related lookups, leaving no critical gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter cve_id, and its description in the schema is complete. The description adds an example format but no further semantics. With full schema coverage, a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it searches public exploits/PoC for a specific CVE across three named sources (GitHub Advisory Database, Shodan CVEDB, ExploitDB CSV mirror). It distinguishes from sibling tools like cve_lookup (which provides base CVE data) and kev_detail (CISA KEV details), making the tool's unique role evident.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises using this after cve_lookup to assess real-world risk and explains when to pair with kev_detail or cwe_lookup. It also notes that the endpoint lacks in_kev/cwe_id schema so those pivots are intentionally avoided, providing clear context on appropriate usage. An explicit 'when not to use' statement would elevate it to 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
geo_audit · Geo Audit · A · Read-only · Idempotent
Deterministic GEO / AI-visibility readiness audit of a domain's homepage with a 0-100 score + a missing_signals fix list. Answers "can AI assistants (ChatGPT, Claude, Perplexity, Google AI) discover, crawl, and recommend this site?" using STRUCTURAL signals ONLY — no LLM is queried, fully deterministic. 7 weighted rules: llms.txt present (15), AI-crawler robots.txt access — 9 crawlers incl. GPTBot/ClaudeBot/PerplexityBot/Google-Extended/CCBot (25 — the dominant signal; blocking = invisible to that AI surface), schema.org @type coverage Organization/Product/FAQPage (20), server-side rendering vs client-only SPA (15 — a JS-only SPA serves AI crawlers empty HTML), discovery signals og/canonical/sitemap (10), semantic headings single-H1 + H2 structure (10), competitor-comparison content (5). Use to triage why a brand is absent from AI recommendations, as a pre-flight before GEO/AEO content work, or to score a prospect's AI-readiness. Strictly homepage-only — we do NOT crawl. Ethical floor: target's robots.txt is honoured — Disallow: / for ContrastAPI returns 403 error.code = robots_txt_disallow and we DO NOT fetch. Cache-Control: no-store/private skips our cache write (cache_respected=false). Per-target eTLD+1 throttle (60 req/min). Free: 30/hr, Pro: 500/hr. Returns {domain, fetched_url, status_code, llms_txt_present, ai_crawlers_total, ai_crawlers_allowed, ai_crawlers_blocked, schema_types, client_side_rendered, render_framework, has_canonical, og_tag_count, sitemap_count, h1_count, h2_count, comparison_content, score, missing_signals, cache_respected, summary}. Returns 502 on DNS/TCP/TLS failure; 403 robots_txt_disallow when the target opted out.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Registrable domain to audit for AI-visibility / GEO readiness (e.g. 'example.com', 'shopify.com'). No scheme, no path, no port. Strictly homepage-only — the bot fetches https://<domain>/ with HTTP fallback (we do NOT crawl). | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
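The seven weights listed in the description sum to 100, so the score can be pictured as a weighted checklist. The sketch below assumes each rule is all-or-nothing, which the description does not confirm (partial credit is possible in the real tool); the rule keys and `geo_score` are hypothetical names.

```python
# Weights as listed in the geo_audit description; key names are illustrative.
GEO_WEIGHTS = {
    "llms_txt": 15,
    "ai_crawler_access": 25,
    "schema_types": 20,
    "server_side_rendering": 15,
    "discovery_signals": 10,
    "semantic_headings": 10,
    "comparison_content": 5,
}


def geo_score(passed: set[str]) -> tuple[int, list[str]]:
    """Return (score, missing rules), assuming binary pass/fail per rule."""
    unknown = passed - set(GEO_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown rules: {sorted(unknown)}")
    score = sum(weight for rule, weight in GEO_WEIGHTS.items() if rule in passed)
    missing = [rule for rule in GEO_WEIGHTS if rule not in passed]
    return score, missing


print(geo_score({"llms_txt", "ai_crawler_access"}))
```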
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description goes far beyond annotations by detailing deterministic nature, 7 weighted rules, robots.txt handling (403 with error code), cache behaviors, throttle limits, rate limits per plan, complete return fields, and error cases. This provides comprehensive understanding of tool behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense but well-structured, starting with the core function, then rules, usage, and technical details. Every part adds value, though for a first-time reader it might feel lengthy. Could be slightly trimmed but remains effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and single parameter, the description covers all necessary aspects: purpose, rules, return fields, error handling, ethical notes, rate limits, and caching behavior. Output fields are listed in the description itself, making it self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already describes the 'domain' parameter well (100% coverage). The tool description adds valuable constraints: no scheme/path/port, homepage-only, protocol fallback. This enhances understanding beyond the schema-description alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs a deterministic GEO/AI-visibility audit of a domain's homepage, providing a 0-100 score and fix list. It specifies the exact resource and action, and implicitly distinguishes from sibling tools like 'seo_audit' or 'audit_domain' by focusing solely on homepage structural signals for AI discoverability.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases: triaging missing AI recommendations, pre-flight for GEO/AEO work, and scoring prospect AI-readiness. While it doesn't directly compare to siblings, the context is clear; a brief mention of when to use alternatives would elevate it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_cvss_details · Get CVSS Details · A · Read-only · Idempotent
Parse a CVSS v3.x vector string into a per-metric breakdown plus a recomputed base score. Returns the canonicalized vector, version (3.0 or 3.1), base_score, base_severity (NONE/LOW/MEDIUM/HIGH/CRITICAL), and the eight base metrics: attack_vector (NETWORK/ADJACENT_NETWORK/LOCAL/PHYSICAL), attack_complexity (LOW/HIGH), privileges_required (NONE/LOW/HIGH), user_interaction (NONE/REQUIRED), scope (UNCHANGED/CHANGED), and the three impact metrics confidentiality_impact / integrity_impact / availability_impact (NONE/LOW/HIGH each). When temporal/environmental metrics are explicit in the vector, temporal_score and environmental_score are populated separately. Use to translate raw CVSS strings into agent-friendly attributes without re-parsing the vector grammar yourself, and to verify upstream NVD scoring against the recomputed value. v2 vectors (AV:N/AC:L/Au:N/...) are rejected with 400 — read cvss_v2_vector from cve_lookup if you need v2 detail. Free: 30/hr, Pro: 500/hr. Returns {version, vector, base_score, base_severity, metrics: {attack_vector, attack_complexity, privileges_required, user_interaction, scope, confidentiality_impact, integrity_impact, availability_impact}, temporal_score, environmental_score, summary, verdict}.
| Name | Required | Description | Default |
|---|---|---|---|
| vector | Yes | CVSS v3.0 or v3.1 vector string, e.g. 'CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H'. v2 vectors are rejected — use the cvss_v2_vector field on cve_lookup if you need v2. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
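For readers who want to see what "recomputed base score" involves, the following is a minimal sketch of the public CVSS v3.1 base-score equations. It is not ContrastAPI's implementation: the function names are hypothetical, temporal and environmental metrics are ignored, and the v3.1 rounding helper is applied to v3.0 vectors as well for brevity.

```python
import math

# Base-metric weights from the public CVSS v3.1 specification.
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, per the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    prefix, _, rest = vector.partition("/")
    if prefix not in ("CVSS:3.0", "CVSS:3.1"):
        # v2 vectors carry no CVSS:3.x prefix and are rejected here.
        raise ValueError("only CVSS v3.x vectors are supported")
    m = dict(part.split(":", 1) for part in rest.split("/"))
    scope = m["S"]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[scope][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


def severity(score: float) -> str:
    """Map a base score to the qualitative severity rating."""
    if score == 0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")
print(score, severity(score))  # 9.8 CRITICAL
```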
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive. Description adds disclosure that v2 vectors return 400, recomputes base score, and specifies rate limits. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Front-loaded with core purpose, then describes output fields. Packed with info but could be slightly more concise. Well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Thoroughly explains all return fields (version, base score, severity, metrics, temporal/environmental, summary, verdict). No missing context given the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter (vector) with 100% schema coverage. Description adds examples, clarifies v2 rejection, and explains expected format beyond schema. Reduces ambiguity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb and specific resource: 'Parse a CVSS v3.x vector string'. Distinguishes itself from sibling tools like cve_lookup by explicitly rejecting v2 vectors and pointing to cve_lookup's cvss_v2_vector field instead. No ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States when to use: 'translate raw CVSS strings into agent-friendly attributes' and 'verify upstream NVD scoring'. Explicitly says v2 vectors are rejected and where to get v2 detail. Also provides rate limits (Free: 30/hr, Pro: 500/hr).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
hash_lookup · Hash Lookup · A · Read-only · Idempotent
Query MalwareBazaar for file hash (MD5/SHA1/SHA256): malware family, file type, size, tags, first/last seen, download count. Use to check if file hash is known malware; use ioc_lookup for auto-detection of all IOC types. Companion malware-investigation tools: ioc_lookup (multi-source: ThreatFox + Feodo Tracker + URLhaus), threat_intel (domain-level URLhaus check), exploit_lookup (link a known CVE to PoC code if the hash maps to an exploit binary). Free: 30/hr, Pro: 500/hr. Returns {found, malware_family, file_type, file_size, tags, first_seen, last_seen, signature}.
| Name | Required | Description | Default |
|---|---|---|---|
| file_hash | Yes | File hash to look up. Accepts MD5 (32 chars), SHA-1 (40 chars), or SHA-256 (64 chars). Lowercase hex only, no spaces. Example: 'd41d8cd98f00b204e9800998ecf8427e' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. Description adds important behavioral context: rate limits ('Free: 30/hr, Pro: 500/hr'), the source (MalwareBazaar), and the structure of the response. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five short sentences pack purpose, output fields, usage guidelines, companion tools, and rate limits. Everything is front-loaded: the first sentence states the action and the data returned; the second covers when to use the tool and the alternative. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has an output schema (not shown but mentioned as returned fields), the description complements it well. It covers return structure, rate limits, and sibling differentiation. No gaps for a simple lookup tool with rich annotations and schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for the single parameter file_hash, including accepted formats and example. The description only mentions 'Query MalwareBazaar for file hash (MD5/SHA1/SHA256)', which adds no meaning beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states the tool queries MalwareBazaar for file hashes (MD5/SHA1/SHA256) and lists the returned fields (malware family, file type, etc.). It distinguishes from sibling tools like ioc_lookup, which auto-detects all IOC types, clarifying the specific resource and action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Clear when-to-use: 'Use to check if file hash is known malware'. Also provides explicit alternatives: 'use ioc_lookup for auto-detection of all IOC types' and lists companion tools for related investigations. This meets the highest standard for usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ioc_lookup · IOC Lookup · A · Read-only · Idempotent
Enrich Indicator of Compromise (IP/domain/URL/hash) by auto-detecting type and querying abuse.ch feeds. Per-type source coverage: hash → ThreatFox only (Feodo and URLhaus do not index hashes); IP → ThreatFox + Feodo Tracker + URLhaus; domain / URL → ThreatFox + URLhaus. verdict.sources_queried lists what actually ran; verdict.sources_unavailable lists what failed (timeout / upstream error). Use as primary IOC triage tool when type unknown; use threat_intel for domain-only, hash_lookup for richer MalwareBazaar hash data. Free: 30/hr, Pro: 500/hr. Returns {indicator, type, threat_level, sources, summary, verdict}.
| Name | Required | Description | Default |
|---|---|---|---|
| indicator | Yes | Indicator of Compromise: IP address, domain, full URL, or file hash in MD5/SHA1/SHA256 format (e.g. '8.8.8.8', 'evil.com', 'https://evil.com/malware.exe', 'd41d8cd98f00b204e9800998ecf8427e') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
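The per-type source coverage stated in the description can be written down as a lookup table. The type-detection heuristic below is an assumption for illustration only (the server's actual detection rules are not documented here), and both function names are hypothetical.

```python
import ipaddress
import re

# Source coverage exactly as stated in the ioc_lookup description.
SOURCES_BY_TYPE = {
    "hash": ["ThreatFox"],
    "ip": ["ThreatFox", "Feodo Tracker", "URLhaus"],
    "domain": ["ThreatFox", "URLhaus"],
    "url": ["ThreatFox", "URLhaus"],
}


def detect_ioc_type(indicator: str) -> str:
    """Guess the indicator type; a simplified stand-in for server-side detection."""
    value = indicator.strip()
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        pass
    if re.fullmatch(r"[0-9a-fA-F]+", value) and len(value) in (32, 40, 64):
        return "hash"
    if "://" in value:
        return "url"
    return "domain"


def expected_sources(indicator: str) -> list[str]:
    return SOURCES_BY_TYPE[detect_ioc_type(indicator)]


print(expected_sources("8.8.8.8"))
```

Comparing `expected_sources(...)` with `verdict.sources_queried` in a response is one way to notice when an upstream feed was skipped or unavailable.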
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses per-type source coverage (hash: ThreatFox only; IP: ThreatFox, Feodo Tracker, and URLhaus; domain/URL: ThreatFox and URLhaus), mentions failure modes (verdict.sources_unavailable), and describes output fields (verdict.sources_queried). Annotations (readOnlyHint, etc.) are consistent; no contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is detailed but efficient, front-loading purpose before covering sources, usage guidance, and return structure. Every sentence adds value, but the description could be slightly more concise (e.g., by combining the source lists).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With one well-documented parameter, output schema present (implied), and detailed coverage of behavior, alternatives, rate limits, and output fields, the description is fully adequate for an agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema fully describes the indicator parameter (100% coverage), so the description adds no new type/syntax details. It does reinforce auto-detection behavior, but baseline 3 is appropriate per the rubric.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it enriches IOCs (IP/domain/URL/hash) by auto-detecting type and querying abuse.ch feeds. It distinguishes from siblings like threat_intel (domain-only) and hash_lookup (MalwareBazaar), making the tool's unique purpose explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states use as primary IOC triage tool when type unknown, and directs to siblings for specific cases. Also includes rate limits (30/hr free, 500/hr Pro), providing clear context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ip_lookup · IP Lookup · A · Read-only · Idempotent
Query comprehensive IP intelligence: reverse DNS, ASN + holder name + country inline (RIPE Stat, Phase 1), open ports, hostnames, vulnerabilities (Shodan InternetDB enriched with severity + cvss_v3 from local cve.db — Phase 2 v1.16.0 BREAKING; vulns is now list[VulnInfo] {cve_id, severity, cvss_v3} dicts, pre-1.16 it was list[str] of CVE IDs; unknown CVEs emit severity='UNKNOWN' / cvss_v3=null — do NOT infer benign), cloud provider, Tor exit status, and reputation. cloud_provider uses two-tier detection: published cloud CIDR ranges (AWS/GCP/Cloudflare) first, then an ASN-to-provider fallback map for anycast/public-service IPs outside published ranges (e.g. 8.8.8.8 → AS15169 → 'Google'). Reputation: FireHOL level1 blocklist on Free tier; +AbuseIPDB + Shodan on Pro (Phase 4). Use for IP investigation; for orchestrated IP+reputation use threat_report. Response is null-explicit: every field is always present (cloud_provider=null when neither tier matches; tor_exit=false when not listed or upstream fetch failed — check verdict.sources_unavailable to disambiguate fetch failure from genuine absence). Response carries next_calls (conditional) — asn_lookup when ASN is populated, ioc_lookup when reputation is FireHOL-listed or AbuseIPDB confidence>50, threat_report on Pro tier for orchestrated profile. Free: 30/hr, Pro: 500/hr. Returns {ip, ptr, geo, asn, asn_name, country, ports, hostnames, vulns, cloud_provider, tor_exit, reputation, risk_score, verdict, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| ip | Yes | IPv4 or IPv6 address to investigate (e.g. '8.8.8.8', '2606:4700::1111') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
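The v1.16.0 change to `vulns` means a client may see either shape, depending on the server version. The sketch below assumes only what the description states (pre-1.16 `list[str]`, later `list[VulnInfo]` dicts with `cve_id`, `severity`, `cvss_v3`); `normalize_vulns` and `needs_manual_review` are hypothetical helper names.

```python
def normalize_vulns(vulns):
    """Accept pre-1.16 list[str] or v1.16.0+ list[dict]; return a list of dicts."""
    normalized = []
    for item in vulns or []:
        if isinstance(item, str):
            # Legacy shape: a bare CVE ID with no severity information.
            normalized.append({"cve_id": item, "severity": "UNKNOWN", "cvss_v3": None})
        else:
            normalized.append({
                "cve_id": item["cve_id"],
                "severity": item.get("severity", "UNKNOWN"),
                "cvss_v3": item.get("cvss_v3"),
            })
    return normalized


def needs_manual_review(vuln: dict) -> bool:
    """UNKNOWN severity means 'not scored locally', never 'benign'."""
    return vuln["severity"] == "UNKNOWN" or vuln["cvss_v3"] is None
```

Routing unknown entries to review instead of dropping them follows the description's warning not to infer that an unscored CVE is harmless.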
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds extensive behavioral details beyond annotations: null-explicit fields, version breaking changes, cloud provider detection logic, tier differences, and next_calls conventions. No contradiction with annotations (readOnlyHint, openWorldHint, etc.).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Very dense and detailed; front-loaded purpose but contains version notes and technical minutiae that could be streamlined. Every sentence adds value but structure could be improved.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and presence of output schema, the description is thorough: covers response fields, version changes, tier differences, detection logic. No gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter with 100% schema coverage (description already includes examples). Description adds no further semantic insight beyond schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it queries comprehensive IP intelligence and contrasts with sibling tools like threat_report. It explicitly says 'Use for IP investigation; for orchestrated IP+reputation use threat_report.' This distinguishes it well.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear usage context: for IP investigation, with rate limits (Free/Pro). Mentions alternative tools via next_calls and contrasts with threat_report, but lacks explicit when-not-to-use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
kev_detail · KEV Detail · A · Read-only · Idempotent
Look up CISA KEV (Known Exploited Vulnerabilities) full record for a CVE. Returns federal patch deadline (due_date), CISA-specified required_action remediation, known ransomware association, vendor/product, the CISA-given common name (e.g. 'Log4Shell'), CISA-reported CWE list, plus lifecycle metadata: date_updated (when CISA last revised the entry), date_removed (set when CISA removed the CVE from the catalog — null while still active), and updated_at (our DB sync freshness). Returns 404 when the CVE is not in the KEV catalog — use cve_lookup for non-KEV CVEs. Best follow-up after cve_lookup or cve_search(kev=true) when an in_kev=true CVE is identified; chain with cwe_lookup on each returned CWE to investigate the weakness category. Free: 30/hr, Pro: 500/hr. Returns {cve_id, vendor_project, product, vulnerability_name, date_added, due_date, required_action, known_ransomware_use, notes, cwes, date_updated, date_removed, updated_at, verdict, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| cve_id | Yes | CVE identifier in format CVE-YYYY-NNNNN (e.g. 'CVE-2021-44228', 'CVE-2024-3094') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
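As a rough illustration of how an agent might use the lifecycle fields, the sketch below validates the CVE ID format and classifies a returned record. It assumes `due_date` is an ISO `YYYY-MM-DD` string, which the description does not confirm; `is_valid_cve_id` and `kev_status` are hypothetical helpers.

```python
import re
from datetime import date

# Four-digit year, then four or more digits; the schema examples include
# both 'CVE-2021-44228' and the shorter 'CVE-2024-3094'.
CVE_ID_RE = re.compile(r"CVE-\d{4}-\d{4,}")


def is_valid_cve_id(cve_id: str) -> bool:
    return CVE_ID_RE.fullmatch(cve_id) is not None


def kev_status(record: dict, today: date) -> str:
    """Classify a KEV record as 'removed', 'overdue', or 'open'."""
    if record.get("date_removed") is not None:
        return "removed"  # CISA removed the CVE from the catalog
    due = date.fromisoformat(record["due_date"])  # assumption: ISO date string
    return "overdue" if today > due else "open"
```

A 404 from the tool is not an error to retry: per the description, it means the CVE is not in the KEV catalog, and `cve_lookup` is the right next call.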
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive. The description adds further behavioral context: returns 404 when CVE not in KEV catalog, and lists detailed return fields including lifecycle metadata. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is thorough and well-structured, with front-loaded purpose. It is slightly verbose but each sentence adds value. Could be slightly more concise, but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (single parameter, output schema exists), the description covers error handling, rate limits, chaining suggestions, and return fields. It is complete for a lookup tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and already includes description and example format. The description repeats the format but adds no significant meaning beyond the schema; baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's specific verb and resource: 'Look up CISA KEV full record for a CVE.' It distinguishes from sibling tools like cve_lookup (for non-KEV CVEs) and cve_search (as a prior step).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly provides when-to-use ('Best follow-up after cve_lookup or cve_search(kev=true) when an in_kev=true CVE is identified'), when-not-to-use ('use cve_lookup for non-KEV CVEs'), and alternatives (chain with cwe_lookup). Also mentions rate limits.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
password_check · Password Check · A · Read-only · Idempotent
Check if SHA-1 hash appears in Have I Been Pwned (HIBP) breach dataset using k-anonymity (5-char prefix only, full hash never leaves tool). Use for password breach audits; read-only, no data stored. Companion OSINT investigation tools: hash_lookup (file-hash malware family lookup, different namespace), email_disposable (throwaway-mail signal on associated accounts), username_lookup (social-platform exposure on associated handles). Free: 30/hr, Pro: 500/hr. Returns {found, count}.
| Name | Required | Description | Default |
|---|---|---|---|
| sha1_hash | Yes | Full SHA-1 hash of the password as 40 lowercase hexadecimal characters (e.g. '5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8' for 'password') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
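The k-anonymity scheme mentioned above can be illustrated locally. This sketch shows how a caller might produce the `sha1_hash` argument and how a 40-character hash splits into the 5-character prefix that is sent upstream and the 35-character suffix that is matched locally. The helper names are hypothetical, and the tool performs the actual HIBP query itself.

```python
import hashlib


def sha1_hex(password: str) -> str:
    """Lowercase 40-character SHA-1 hex digest, the format the tool expects."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def split_for_k_anonymity(sha1_hash: str) -> tuple[str, str]:
    """Return (prefix, suffix): only the 5-char prefix would leave the tool."""
    digest = sha1_hash.lower()
    if len(digest) != 40 or any(c not in "0123456789abcdef" for c in digest):
        raise ValueError("expected 40 hexadecimal characters")
    return digest[:5], digest[5:]
```

Hashing on the client means the plaintext password never needs to be passed to the agent or the tool at all.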
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, and non-destructive. The description adds key behavioral details: k-anonymity with 5-char prefix, that the full hash never leaves the tool, and the result format {found, count}. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five short sentences, front-loaded with the core purpose, covering sibling differentiation, rate limits, and output format. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (1 param, 100% schema coverage, annotations present, output schema exists), the description covers usage context, behavioral details, rate limits, and return value sufficiently.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema already provides a detailed description of the sha1_hash parameter (100% coverage). The tool description does not add further parameter-level information beyond mentioning the hash implicitly.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the action (check if SHA-1 hash appears in HIBP) and differentiates from sibling tools (hash_lookup, email_disposable, username_lookup) by explaining their different namespaces and purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use for password breach audits; read-only, no data stored.' and provides rate limits, but does not explicitly state when not to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
phishing_check · Phishing Check · A · Read-only · Idempotent
Query URLhaus for a specific URL and its host. is_malicious is True only when there is ACTIVE evidence — exact URL match with url_status='online' (or unknown) OR host has urls_online > 0. URLhaus retains historical records forever, so a host can have url_count > 0 with urls_online == 0; in that case is_malicious=False, is_stale=True, threat_level='low'. Use for URL-level threat assessment; use threat_intel for domain-level checks. Companion threat-investigation tools: ioc_lookup (multi-source IOC: ThreatFox + URLhaus + Feodo Tracker, auto-detect type), hash_lookup (file-hash malware family, MalwareBazaar), threat_intel (domain-level URLhaus only). Free: 30/hr, Pro: 500/hr. Returns {url, host, is_malicious, is_stale, urlhaus_host:{found,urls_online,url_count}, urlhaus_url:{found,threat,tags,status}, threat_level, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full URL to check, including protocol (e.g. 'https://suspicious-login.com/verify', 'http://evil.com/payload.exe') | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
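The active-versus-stale rule in the description can be written out as a small decision function. This is a sketch of the stated rule only, not the server's code: it assumes the `urlhaus_url` and `urlhaus_host` shapes listed in the return value, and it treats "(or unknown)" as the literal status string `'unknown'`, which is an interpretation.

```python
def classify(urlhaus_url: dict, urlhaus_host: dict) -> dict:
    """Apply the active-evidence rule from the description to URLhaus data."""
    # Exact URL match that is still online (or of unknown status).
    url_active = bool(urlhaus_url.get("found")) and urlhaus_url.get("status") in (
        "online",
        "unknown",  # assumption: 'unknown' is a literal status value
    )
    # The host currently serves at least one live malicious URL.
    host_active = urlhaus_host.get("urls_online", 0) > 0

    is_malicious = url_active or host_active
    # Historical records only: URLhaus keeps them forever.
    is_stale = (not is_malicious) and urlhaus_host.get("url_count", 0) > 0
    return {"is_malicious": is_malicious, "is_stale": is_stale}
```

The stale case matters for triage: a host with old records and nothing online should lower priority, not trigger a block.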
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe read-only behavior. The description adds rich behavioral details: how is_malicious is determined (exact match with active status or host urls_online>0), historical retention causing stale results, and the logic for threat_level. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single dense paragraph that front-loads the core action, then efficiently covers conditions, sibling comparisons, rate limits, and return format with no redundant sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fully covers the tool's behavior, including edge cases (historical records, stale state), and even specifies the return structure, complementing the existing output schema and annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with a descriptive parameter description. The tool description adds context beyond the schema by explaining that the URL is checked both as a direct URL and via its host, enriching parameter meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it queries URLhaus for a URL and its host, defines when is_malicious is True, and contrasts with sibling tools like threat_intel for domain-level checks, making the purpose distinct and specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises when to use this tool (URL-level assessment) vs alternatives (threat_intel for domain-level), lists companion tools with their purposes, and mentions rate limits, providing comprehensive usage guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
phone_lookup · Phone Lookup · A · Read-only · Idempotent
Validate and analyze phone number: country, region, carrier, line type (mobile/landline/VoIP), timezone, formatted versions. Use to verify phone legitimacy and detect fraud risks. Requires E.164 format (+1234567890). Companion OSINT identity-investigation tools: username_lookup (social-platform handle correlation), email_disposable (throwaway-mail signal on associated email). Free: 30/hr, Pro: 500/hr. Returns {valid, country, region, carrier, carrier_status, line_type, timezone, formats}. carrier is omitted from the wire when libphonenumber has no mapping for the region (US/CA/GB and other MNP-restricted regions); always read carrier_status — 'known' means carrier is present, 'unsupported_region' means we cannot identify the carrier (do not infer the number lacks one).
| Name | Required | Description | Default |
|---|---|---|---|
| number | Yes | Phone number in E.164 format: + followed by country code and number, no spaces or dashes. Examples: '+14155552671' (US), '+905551234567' (TR), '+442071234567' (UK). Wrong: '0555-123-4567', '(415) 555-2671' | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
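A caller can reject badly formatted numbers before spending a rate-limited request, and should read `carrier_status` before touching `carrier`. The sketch below uses the usual E.164 shape (a plus sign, a non-zero leading digit, at most 15 digits in total); `is_e164` and `describe_carrier` are hypothetical helpers, and the server's own validation may be stricter.

```python
import re

# '+' followed by 2 to 15 digits, the first of which is non-zero.
E164_RE = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    return E164_RE.fullmatch(number) is not None


def describe_carrier(result: dict) -> str:
    """Read carrier_status first; 'carrier' may be absent from the response."""
    status = result.get("carrier_status")
    if status == "known":
        return result["carrier"]
    if status == "unsupported_region":
        # The carrier cannot be identified here; the number may still have one.
        return "carrier not identifiable for this region"
    return "carrier status not reported"
```

Checking the status field rather than testing for a missing `carrier` key avoids mistaking an unsupported region for a number with no carrier.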
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description goes beyond annotations by disclosing that carrier is omitted in certain regions (MNP-restricted) and advises to read carrier_status field. It explains the meaning of carrier_status values. This adds significant behavioral nuance that annotations (readOnlyHint, idempotentHint) do not cover. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at ~100 words, well-structured with clear sections: purpose, usage, companion tools, rate limits, return fields, and behavioral note. Every sentence adds value, and the most critical information (purpose and format) is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one parameter, rich annotations, and an output schema, the description covers all necessary context: purpose, format requirements, rate limits, return fields, and critical carrier behavior. It addresses potential confusion about carrier omission, making it complete for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides comprehensive documentation for the single parameter 'number', including examples and wrong formats. The description's mention of E.164 format adds no new meaning beyond the schema. With 100% schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool validates and analyzes phone numbers, listing specific outputs like country, region, carrier, line type, timezone, formats. It distinguishes itself from sibling tools by mentioning companion OSINT tools like username_lookup and email_disposable, indicating its unique role in phone verification and fraud detection.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage context: 'Use to verify phone legitimacy and detect fraud risks.' It mentions companion tools for alternative investigations and specifies the required E.164 format with examples. It also includes rate limits. However, it does not explicitly state when not to use the tool or exclusions, lowering the score slightly.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
redirect_chain · Redirect Chain · A · Read-only · Idempotent
Walk an HTTP redirect chain hop-by-hop, returning per-hop {url, status_code, location, latency_ms}. Use to deobfuscate URL shorteners (bit.ly / t.co / lnkd.in), audit suspicious links from phishing investigations, or trace marketing tracking redirects. SSRF-guarded: each redirect target's resolved IP is re-validated before connecting (private IPs and non-HTTP schemes rejected). Up to 10 hops; loop_detected=true if a hop would revisit a previously-seen URL (we abort before the duplicate fetch); truncated=true if the chain still had a 30x at hop 10. Per-target eTLD+1 throttle (60 req/min) consumed once for the start host AND once per new host reached — a chain across 11 unrelated domains cannot bypass the cap. Free: 30/hr, Pro: 500/hr. Returns {start_url, final_url, hops, hop_count, final_status, loop_detected, truncated, summary}. Returns 502 ErrorResponse on hard fetch failure (timeout / TLS / connect); 429 with Retry-After if a hop's eTLD+1 throttle is exceeded mid-chain.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full URL whose redirect chain to walk, e.g. 'https://bit.ly/3xyz' or 'http://example.com/old-path'. Must start with http:// or https://. Pass the URL exactly as you'd `curl -L` it; the server handles encoding. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
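The loop-detection and truncation behavior can be sketched without any network access. In the illustration below, a plain dict stands in for HTTP 30x responses (`redirects[url]` is the `Location` target). The hop cap of 10 comes from the description; the function name, the dict-based simulation, and the order of the two checks are assumptions.

```python
MAX_HOPS = 10


def walk_chain(start_url: str, redirects: dict) -> dict:
    """Follow a simulated redirect map, mirroring the flags in the description."""
    seen = {start_url}
    hops = [start_url]
    current = start_url
    loop_detected = False
    truncated = False

    while current in redirects:  # the current hop answered with a redirect
        target = redirects[current]
        if target in seen:
            loop_detected = True  # abort before fetching the duplicate
            break
        if len(hops) >= MAX_HOPS:
            truncated = True  # still redirecting at the hop limit
            break
        seen.add(target)
        hops.append(target)
        current = target

    return {
        "start_url": start_url,
        "final_url": current,
        "hop_count": len(hops),
        "loop_detected": loop_detected,
        "truncated": truncated,
    }
```

When either flag is set, `final_url` is only the last URL reached, not the true destination, so an agent should not treat it as the resolved target.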
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds substantial behavioral context beyond annotations: SSRF guard with IP validation, loop detection, truncation at 10 hops, per-target eTLD+1 throttle, and error responses (502, 429). No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core action, followed by use cases and behavioral details. It is informative but somewhat lengthy; however, every sentence adds value, so it remains efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all key aspects: how it works, security measures (SSRF guard), rate limiting, error conditions, and return fields. Given the complexity and the presence of an output schema, the description is fully adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already provides a detailed description of the 'url' parameter with examples and constraints. The tool description does not add new information about the parameter, so the baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool walks an HTTP redirect chain hop-by-hop, listing per-hop details. This specific verb+resource distinguishes it from all sibling tools, which focus on other network or security tasks.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: deobfuscating URL shorteners, auditing suspicious links, tracing marketing redirects. It also mentions SSRF guarding and rate limits, giving context for when it's safe to use, though it doesn't explicitly state when not to use or name alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
robots_txt · Robots.txt · A · Read-only · Idempotent
Fetch + parse the target domain's robots.txt — sitemaps, per-User-agent allow/disallow rules, crawl-delay, Host directive. Use BEFORE crawling/scraping a target site (seo_audit, brand_assets, redirect_chain) to honour the site's published rules. status_code=404 means no robots.txt exists = implicit allow-all per RFC 9309 §2.4. ContrastAPI fetches with User-agent: ContrastAPI/<version> (+https://contrastcyber.com/bot) so site operators can identify + opt out via robots.txt; we honour Disallow: / for our UA in seo_audit and brand_assets. Per-target eTLD+1 throttle (60 req/min) prevents weaponising this endpoint against a single site; subdomain rotation collapses to the same bucket. Free: 30/hr, Pro: 500/hr. Returns {domain, fetched_url, status_code, sitemaps, user_agents:{ua:{allow,disallow,crawl_delay}}, host, truncated, summary}. Returns 502 ErrorResponse if the target rejected the connection (DNS/TCP/TLS failure); the agent should NOT assume "no robots" in that case — it's an upstream-failure signal.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Registrable domain to fetch robots.txt for (e.g. 'example.com', 'github.com'). No scheme, no path, no port. Subdomains accepted; the bot fetches https://<domain>/robots.txt with HTTP fallback. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
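To show how the returned rules might be consulted before crawling, the sketch below checks a path against the `user_agents` structure from the return value. It assumes the keys are user-agent tokens with a `*` fallback and that `allow` / `disallow` are lists of path prefixes; it applies longest-prefix matching with allow winning ties, and it ignores wildcard patterns. `is_allowed` is a hypothetical helper, not part of the API.

```python
def is_allowed(result: dict, user_agent: str, path: str) -> bool:
    """Simplified prefix check against a robots_txt tool result."""
    if result.get("status_code") == 404:
        return True  # no robots.txt: implicit allow-all

    agents = result.get("user_agents", {})
    rules = agents.get(user_agent) or agents.get("*") or {}

    best_len, allowed = -1, True  # no matching rule means allowed
    # Allow rules are scanned first so that they win equal-length ties.
    for verdict, key in ((True, "allow"), (False, "disallow")):
        for prefix in rules.get(key) or []:
            if prefix and path.startswith(prefix) and len(prefix) > best_len:
                best_len, allowed = len(prefix), verdict
    return allowed
```

A 502 ErrorResponse never reaches this function as a result object; per the description it signals an upstream failure, so the safe reading is "unknown", not "allowed".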
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Goes beyond annotations by detailing User-Agent string used, honoring disallow for ContrastAPI UA, per-domain throttling (60 req/min), rate limits (30/hr Free, 500/hr Pro), and error handling (502 on connection failure). No contradiction with annotations (readOnlyHint, openWorldHint, idempotentHint, destructiveHint).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with core purpose and includes all necessary details without redundancy. Slightly lengthy but every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter and comprehensive annotations, the description covers all relevant context: output fields, error conditions, rate limits, and behavioral nuances. The output schema exists, but the description still provides valuable summary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the description adds constraints: 'No scheme, no path, no port. Subdomains accepted; fetches https://<domain>/robots.txt with HTTP fallback.' This adds meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Fetch + parse the target domain's robots.txt' with specific resources (sitemaps, rules, crawl-delay, etc.). It distinguishes itself from sibling tools by focusing on robots.txt acquisition and parsing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Use BEFORE crawling/scraping a target site (seo_audit, brand_assets, redirect_chain)' and provides guidance on interpreting 404 vs. 502 responses, including when not to assume 'no robots'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
scan_headers · Scan Headers · A · Read-only · Idempotent
Perform live HTTP GET and analyze security headers: CSP, HSTS, X-Frame-Options, X-Content-Type-Options, Permissions-Policy, Referrer-Policy. Use to audit live website headers; use check_headers to validate headers you already have. Free: 30/hr, Pro: 500/hr. By default header values are truncated to 500 chars (CSP can exceed 4 KB on large sites); pass include='full' for the full raw value. Returns {headers_present, headers_missing, findings, total_score}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to scan live HTTP headers for (e.g. 'example.com', 'api.github.com') | |
| include | No | Detail level. Default ('') returns slim findings: raw header values capped at 500 chars with total_value_length carrying the honest pre-truncation length. Pass 'full' to restore the full raw value (useful for inspecting full CSP directives on sites like GitHub where the CSP header exceeds 4 KB). Allowed: '' or 'full'. | |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes | |
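The slim-versus-full behavior of `include` can be pictured as follows. The 500-character cap and the `total_value_length` field come from the parameter description; the function name, the `truncated` key, and the exact output shape are assumptions made for illustration.

```python
MAX_VALUE_CHARS = 500


def present_header(name: str, value: str, include: str = "") -> dict:
    """Cap a header value unless the caller asked for the full raw value."""
    if include not in ("", "full"):
        raise ValueError("include must be '' or 'full'")
    shown = value if include == "full" else value[:MAX_VALUE_CHARS]
    return {
        "name": name,
        "value": shown,
        "total_value_length": len(value),  # length before any truncation
        "truncated": len(shown) < len(value),  # hypothetical convenience flag
    }
```

An agent can compare `total_value_length` with the length of the returned value to decide whether a second call with `include='full'` is worth the request.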
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe, read-only, idempotent behavior. Description adds crucial details: default truncation at 500 chars, ability to get full values via 'include=full', and the output format. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five short sentences efficiently cover purpose, usage, rate limits, truncation behavior, and output. Information is front-loaded and every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of output schema, the description adequately covers usage, parameters, behavioral quirks (truncation), and result structure. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and description adds meaning: explains the 'include' parameter's effect, default behavior, and provides concrete example of CSP exceeding 4 KB to justify the 'full' option.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs a live HTTP GET to analyze specific security headers, and distinguishes itself from sibling 'check_headers' tool by specifying its use case.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use this tool ('audit live website headers') and when to use an alternative ('use check_headers to validate headers you already have'). Also mentions rate limits.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
seo_audit · SEO Audit · A · Read-only · Idempotent
One-shot SEO audit of a domain's homepage with a 0-100 composite score + a missing_signals list of concrete fixes. Use BEFORE pitching SEO work to a prospect, when triaging a lead's marketing maturity, or as a structured pre-flight before deeper auditing tools (Lighthouse / SEMrush). 10 audit rules each worth 10 pts: title present, title length 30-60 chars (Google SERP truncation window), meta description present, meta description length 50-160, exactly one H1, canonical link, >=3 OG tags, JSON-LD present, image alt-text coverage (proportional), HTTPS. Strictly homepage-only — we do NOT crawl the site. Ethical floor: target's robots.txt is honoured — Disallow: / for ContrastAPI OR * returns 403 error.code = robots_txt_disallow and we DO NOT fetch. Cache-Control: no-store/private skips our cache write (cache_respected=false in the response). Per-target eTLD+1 throttle (60 req/min) prevents weaponising via subdomain rotation. All target-derived strings/lists are _untrusted. Free: 30/hr, Pro: 500/hr. Returns {domain, fetched_url, status_code, title_untrusted, meta_description_untrusted, canonical_url, h1_untrusted, h1_count, h2_count, h3_count, images_total, images_missing_alt, internal_link_count, external_link_count, og_tags, json_ld_present, score, missing_signals, cache_respected, summary}. Returns 502 on DNS/TCP/TLS failure; 403 robots_txt_disallow when the target opted out.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Registrable domain to audit SEO for (e.g. 'example.com', 'shopify.com'). No scheme, no path, no port. Strictly homepage-only — the bot fetches https://<domain>/ with HTTP fallback and audits that single page (we do NOT crawl). |
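The ten-rule scoring scheme in the description can be sketched as below. This is an illustration under stated assumptions, not ContrastAPI's code: the `Homepage` type, the rule names, the rounding of the proportional alt-text rule, and the treatment of a page with no images as full coverage are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Homepage:
    title: str
    meta_description: str
    h1_count: int
    has_canonical: bool
    og_tag_count: int
    json_ld_present: bool
    images_total: int
    images_missing_alt: int
    is_https: bool


def seo_score(page: Homepage) -> tuple[int, list[str]]:
    """Score ten rules at 10 points each; return (score, missing_signals)."""
    # Nine pass/fail rules (rule names are hypothetical labels).
    checks = {
        "title_present": bool(page.title),
        "title_length_30_60": 30 <= len(page.title) <= 60,
        "meta_description_present": bool(page.meta_description),
        "meta_description_length_50_160": 50 <= len(page.meta_description) <= 160,
        "exactly_one_h1": page.h1_count == 1,
        "canonical_link": page.has_canonical,
        "og_tags_at_least_3": page.og_tag_count >= 3,
        "json_ld_present": page.json_ld_present,
        "https": page.is_https,
    }
    score = 10 * sum(checks.values())
    missing = [name for name, ok in checks.items() if not ok]

    # Tenth rule is proportional: share of images that carry alt text, times 10.
    if page.images_total:
        coverage = 1 - page.images_missing_alt / page.images_total
    else:
        coverage = 1.0  # assumption: no images counts as full coverage
    score += round(10 * coverage)
    if coverage < 1.0:
        missing.append("image_alt_coverage")
    return score, missing


page = Homepage(
    title="x" * 40, meta_description="y" * 100, h1_count=2,
    has_canonical=True, og_tag_count=3, json_ld_present=False,
    images_total=10, images_missing_alt=5, is_https=True,
)
score, missing = seo_score(page)
```

For this made-up page, seven pass/fail rules succeed (70 points) and half of the images have alt text (5 points), giving a score of 75 with three missing signals.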
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint, etc.), the description details critical behavioral traits: robots.txt and cache-control are respected, throttle per eTLD+1, free/Pro rate limits, error codes (502, 403), and that all target-derived strings are marked '_untrusted'. This adds significant value and matches annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is long but well-structured, front-loading the key output (score + missing_signals). Every sentence provides essential information, though some brevity could be achieved without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter, presence of an output schema, and complex behavioral rules (robots.txt, caching, throttle, error conditions), the description comprehensively covers all aspects, making it fully self-contained for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, 'domain', is fully described in the schema, including its syntax constraints ('No scheme, no path, no port'). The description reinforces the key usage constraint that the audit is strictly homepage-only, which goes beyond a basic parameter description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool performs a one-shot SEO audit of a domain's homepage, producing a composite score and a list of missing signals. It distinguishes itself from sibling tools like 'audit_domain' or 'geo_audit' by specifying the SEO focus and homepage-only scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool: before pitching SEO work, when triaging a lead's marketing maturity, or as a pre-flight before deeper auditing tools. It also clearly defines what not to do (strictly homepage-only, no crawling) and mentions alternatives (Lighthouse, SEMrush).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sigma_rule_lookup · Sigma Rule Lookup · A · Read-only · Idempotent
Look up a single Sigma detection rule by UUID from the SigmaHQ corpus (~3,200 rules, refreshed daily at 02:00 UTC). Returns the full rule with title, description, status (stable/test/experimental/deprecated/unsupported), level (informational/low/medium/high/critical), logsource (product/category/service), detection logic, tags (including attack.t#### ATT&CK technique refs and cve.YYYY-#### CVE refs), author, references, and modification date. Use to fetch a known rule for context (e.g., a SIEM detection that fired) or to inspect a rule discovered via REST sigma_rule_search. When a rule tags an ATT&CK technique or CVE, the response next_calls surfaces atlas_technique_lookup / cve_lookup as natural follow-ups. Free: 30/hr, Pro: 500/hr. Returns {rule, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| rule_id | Yes | Sigma rule UUID (RFC 4122, 36 chars, hyphenated). Example: '195e1b9d-bfc2-4ffa-ab4e-35aef69815f8'. Obtained from the REST sigma_rule_search endpoint or external SIEM correlation. |
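A client might validate the `rule_id` before calling the tool and then pull ATT&CK and CVE references out of the returned tags. The sketch below assumes the tag shapes named in the description (`attack.t####` and `cve.YYYY-####`); the helper names are hypothetical, and accepting a dot as well as a hyphen in CVE tags is an assumption made for tolerance.

```python
import re
import uuid

ATTACK_TAG = re.compile(r"^attack\.(t\d{4}(?:\.\d{3})?)$", re.IGNORECASE)
CVE_TAG = re.compile(r"^cve\.(\d{4})[-.](\d{4,})$", re.IGNORECASE)


def is_rule_id(value: str) -> bool:
    """True for a 36-character hyphenated UUID string."""
    try:
        return len(value) == 36 and str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def follow_ups(tags: list[str]) -> dict[str, list[str]]:
    """Split Sigma tags into ATT&CK technique IDs and CVE IDs."""
    out: dict[str, list[str]] = {"techniques": [], "cves": []}
    for tag in tags:
        if m := ATTACK_TAG.match(tag):
            out["techniques"].append(m.group(1).upper())
        elif m := CVE_TAG.match(tag):
            out["cves"].append(f"CVE-{m.group(1)}-{m.group(2)}")
    return out


refs = follow_ups(["attack.execution", "attack.t1059.001", "cve.2021-44228"])
```

Tactic tags such as `attack.execution` match neither pattern and are skipped, so only technique and CVE references remain as candidates for follow-up lookups.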
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description reveals behavioral traits beyond the annotations, such as the rule corpus size (~3,200 rules), daily refresh schedule (02:00 UTC), and that the response includes a 'next_calls' field suggesting follow-up tools. Annotations already indicate readOnly, idempotent, non-destructive, so the bar is lower, but the description adds meaningful context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded: the first sentence states the core purpose. Every subsequent sentence adds specific details (corpus size, refresh, fields returned, rate limits). No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple lookup tool with a single parameter, the description is complete. It covers purpose, parameter details, return fields (including tags, ATT&CK, CVE), rate limits, and even hints at natural follow-up tools via 'next_calls'. The listed output schema exposes only a single required 'result' field, but the description sufficiently documents the response structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There is only one parameter (rule_id) with 100% schema description coverage. The description adds a concrete example UUID and mentions the source (REST endpoint or SIEM correlation), which adds meaning beyond the schema's description. Baseline is 3 due to high coverage, but the extra context merits a 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Look up a single Sigma detection rule by UUID from the SigmaHQ corpus'. It specifies the resource (Sigma rule), action (look up), and scope. It also distinguishes from sibling tool sigma_rule_search by noting it's for known rules vs discovering via search.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage scenarios: 'Use to fetch a known rule for context (e.g., a SIEM detection that fired) or to inspect a rule discovered via REST sigma_rule_search.' It also mentions rate limits (Free 30/hr, Pro 500/hr) as practical guidance. While it doesn't explicitly state when not to use, the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ssl_check · SSL Check · A · Read-only · Idempotent
Analyze SSL/TLS certificate: grade (A/B/C/D/F), protocol version, cipher suite, chain, expiry, Subject Alternative Names, and structured validation findings. Invalid certs (expired, self-signed, hostname mismatch, untrusted root) are reported as findings via valid=false + validation_errors[] rather than as endpoint failures, so an unreachable cert still returns useful intel. Grade D = cert readable but invalid; F = expired, legacy TLS, or probe failure. Use to audit certificate validity and detect expiring certs; for full domain audit use audit_domain. Free: 30/hr, Pro: 500/hr. Returns {grade, valid, validation_errors, protocol, cipher, issuer, subject, not_before, not_after, days_remaining, chain, san, warnings}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check SSL/TLS certificate for (e.g. 'example.com', 'api.stripe.com') |
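Because invalid certificates come back as findings rather than errors, a caller has to inspect `valid` and `validation_errors` instead of relying on a failed call. A minimal triage sketch follows; the `triage_cert` helper, the 30-day warning threshold, and the error strings in the usage note are hypothetical, while the field names are taken from the description.

```python
def triage_cert(result: dict, warn_days: int = 30) -> str:
    """Turn an ssl_check-style result into a one-line triage verdict."""
    grade = result.get("grade", "?")
    if not result.get("valid", False):
        # Invalid certs are reported as findings, not as endpoint failures.
        errors = ", ".join(result.get("validation_errors", [])) or "unspecified"
        return f"INVALID (grade {grade}): {errors}"
    days = result.get("days_remaining")
    if days is not None and days <= warn_days:
        return f"EXPIRING in {days} days"
    return f"OK (grade {grade})"
```

For example, a result with `valid=False` and a single validation error of `"self-signed"` yields an `INVALID` verdict, while a valid certificate with 12 days remaining is flagged as expiring.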
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe read, but the description adds valuable details: how invalid certs are handled (valid=false + validation_errors[] rather than endpoint failure), grade definitions, and rate limits (30/hr free, 500/hr Pro). No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy but well-structured and front-loaded with purpose, then special behaviors, then usage guidance, then rate limits, then return fields. Almost every sentence adds value, though a minor trim could be possible.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (single parameter, rich output), the description covers behavior for invalid certs, grade interpretation, rate limits, and distinguishes from a sibling. Output schema is implied but not needed as description lists return fields. Very complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a decent parameter description for 'domain'. The tool description does not add significant extra meaning beyond the schema, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it analyzes SSL/TLS certificates, lists outputs (grade, protocol, etc.), and distinguishes from sibling audit_domain by specifying when to use each.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use to audit certificate validity and detect expiring certs; for full domain audit use audit_domain', providing clear context and an alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subdomain_enum · Subdomain Enum · A · Read-only · Idempotent
Discover subdomains using passive methods: Certificate Transparency logs + DNS brute-force (no active probing). Use to map an organization's attack surface; non-intrusive. Response carries next_calls, capped at 5 ssl_check hints (one for each of the first five subdomains) so triage scales to large enumerations without token bloat; pull tail entries by name when needed. Free: 30/hr, Pro: 500/hr. Returns {domain, count, subdomains, sources, found_via_wordlist, found_via_crtsh, crtsh_status, warnings, summary, next_calls}. Always check crtsh_status: 'ok' means the CT lookup completed (so a low count is real); 'timeout' / 'rate_limited' / 'unavailable' / 'error' means CT logs did not respond and the count is wordlist-only. In that case the actual attack surface is likely larger: retry later or surface the limitation to the user.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to enumerate subdomains for (e.g. 'example.com', 'tesla.com') |
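The `crtsh_status` guidance in the description can be encoded as a small check before the count is trusted. This is a sketch only: the `interpret_enum` helper and its return keys are hypothetical, while the status values are the ones listed in the description.

```python
# Status values the description lists as "CT logs did not respond".
DEGRADED_STATUSES = {"timeout", "rate_limited", "unavailable", "error"}


def interpret_enum(result: dict) -> dict:
    """Summarize how far a subdomain_enum-style result can be trusted."""
    status = result.get("crtsh_status")
    count = result.get("count", 0)
    if status == "ok":
        note = f"{count} subdomains; CT lookup completed, so the count is real"
    elif status in DEGRADED_STATUSES:
        note = f"{count} subdomains from wordlist only (crt.sh: {status}); retry later"
    else:
        note = f"{count} subdomains; unrecognized crtsh_status {status!r}"
    return {
        "complete": status == "ok",
        "retry": status in DEGRADED_STATUSES,
        "note": note,
    }
```

An agent could surface `note` to the user verbatim whenever `complete` is false, so that a wordlist-only result is not mistaken for the full attack surface.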
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnly, idempotent, non-destructive), the description details passive methods, response structure (next_calls), rate limits, output fields, and crtsh_status interpretation, significantly enhancing behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and covers all necessary aspects without being excessively verbose, though it could be slightly more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description, combined with the output schema and annotations, fully addresses usage context, limitations, error conditions, and return value interpretation, making it complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage for the single parameter (domain), the description adds minimal additional meaning beyond what the schema provides, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool discovers subdomains using passive methods (CT logs and DNS brute-force) and is non-intrusive. However, it does not explicitly differentiate this tool from sibling tools like dns_lookup or domain_report.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context for use (mapping attack surface) and notes non-intrusiveness, but lacks explicit when-to-use/when-not-to-use guidance or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tech_fingerprint · Tech Fingerprint · A · Read-only · Idempotent
Detect website technology stack: CMS, frameworks, CDN, analytics tools, web servers, languages (via HTTP headers + HTML analysis). Use for passive reconnaissance; for full audit use audit_domain. Free: 30/hr, Pro: 500/hr. Returns {technologies: [{name, category, confidence%, version}]}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to fingerprint (e.g. 'example.com', 'shopify.com') |
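Since each detection carries a confidence value, a caller may want to keep only the stronger matches. The sketch below assumes the field is exposed under a numeric `confidence` key holding a percentage (the description writes it as `confidence%`); the helper name, the 80% floor, and the sample data are hypothetical.

```python
def confident_technologies(result: dict, min_confidence: float = 80.0) -> list[str]:
    """List 'name version' strings for detections at or above a confidence floor."""
    picked = []
    for tech in result.get("technologies", []):
        if tech.get("confidence", 0) >= min_confidence:
            version = tech.get("version")
            picked.append(f"{tech['name']} {version}" if version else tech["name"])
    return picked


# Made-up sample in the shape the description gives, not real tool output.
sample = {
    "technologies": [
        {"name": "nginx", "category": "web server", "confidence": 100, "version": "1.25.3"},
        {"name": "WordPress", "category": "CMS", "confidence": 50, "version": None},
        {"name": "Cloudflare", "category": "CDN", "confidence": 90, "version": None},
    ]
}
strong = confident_technologies(sample)
```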
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, covering safety and idempotency. The description adds rate limit information (Free: 30/hr, Pro: 500/hr) and output structure (returns technologies with name, category, confidence%, version), which are valuable beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences: the first explains what it detects; the rest give the usage guideline, rate limits, and output format. No unnecessary words, front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity, the description covers purpose, usage guidelines, limitations (rate limits), and return format. Output schema exists and the description summarizes it. Annotations cover behavioral traits, making it complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter, 'domain', with schema description coverage of 100%. The schema's parameter description supplies example values ('example.com', 'shopify.com'), which enhance clarity; the tool description adds the detection method (HTTP headers + HTML analysis) as context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool detects website technology stack (CMS, frameworks, CDN, etc.) via HTTP headers and HTML analysis. The verb 'detect' and resource 'website technology stack' are specific, and the description distinguishes it from the sibling 'audit_domain' for full audits.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Use for passive reconnaissance; for full audit use audit_domain.' This provides clear guidance on when to use this tool and when not, with a named alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
tech_stack_cve_audit · Tech Stack CVE Audit · A · Read-only · Idempotent
Composite tech-stack + CVE audit (MCP-only, no REST endpoint). Detects technologies on the target domain, queries CVE database for known vulnerabilities per product, enriches top-10 CVE candidates with CISA KEV federal patch deadlines, and checks public exploit / PoC availability. Identical for every tier — all data is sourced from local DB mirrors (no Shodan/AbuseIPDB), so there is no tier gating. CVE candidate batch: 50. Cost: 10 tokens per call — Free 30/hr ≈ 3 audits, Pro 500/hr ≈ 50 audits. Returns {domain, technologies, cves_by_tech, kev_findings, exploit_findings, summary, next_calls}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Target domain to fingerprint and CVE-audit (e.g. 'example.com'). IPs and internal hostnames are rejected. |
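The per-tier audit counts in the description follow from integer division of the hourly budget by the per-call cost. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def calls_per_hour(hourly_budget: int, cost_per_call: int) -> int:
    """Whole calls that fit in an hourly token budget."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")
    return hourly_budget // cost_per_call


# Budgets and the 10-token cost are the figures quoted in the description.
for tier, budget in {"Free": 30, "Pro": 500}.items():
    print(f"{tier}: {calls_per_hour(budget, 10)} audits/hr")
```

This reproduces the stated figures of roughly 3 audits per hour on Free and 50 on Pro.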
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true. The description adds significant behavioral details: 'Identical for every tier', no Shodan/AbuseIPDB, local DB mirrors, returns structured object with specific fields, batch size, cost. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single dense paragraph, but every sentence provides essential information. It is front-loaded with the composite nature and then lists details. Could be slightly improved with bullet points, but it's concise enough for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of the tool (composite operations), the description covers all key aspects: inputs, outputs (listed fields), behavioral traits, cost, rate limits, tier behavior, and data sources. With annotations and output schema implied, the context is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter 'domain' with 100% schema coverage. The description reinforces that IPs and internal hostnames are rejected, and the tool detects technologies and audits CVEs on the target domain. This adds value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs a composite tech-stack fingerprint and CVE audit, specifying the verb 'audit' and the resource 'domain'. It distinguishes itself from sibling tools like tech_fingerprint and cve_lookup by combining both actions, along with enrichment steps (KEV deadlines, exploit checks).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides detailed usage context: it's MCP-only, no tier gating, cost per call, batch size (50 CVE candidates), and rate limits. However, it does not explicitly tell when to use this composite tool versus using separate tools like tech_fingerprint or cve_lookup individually.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
threat_intel · Threat Intel · A · Read-only · Idempotent
Check domain against abuse.ch URLhaus for known malware-distribution URLs (single source — for multi-feed correlation use ioc_lookup which adds ThreatFox and, for IPs, Feodo Tracker). Use for fast domain-level threat assessment; use phishing_check for specific URLs. Free: 30/hr, Pro: 500/hr. Returns {malware_urls, threat_tags, threat_status, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to check for threats (e.g. 'suspicious-site.com', 'example.com') |
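The tool-selection guidance in the description can be sketched as a simple router. The `pick_threat_tool` helper is hypothetical, and sending bare IP addresses to `ioc_lookup` is an assumption drawn from the note that it adds Feodo Tracker for IPs; the tool names themselves come from the description.

```python
import ipaddress


def pick_threat_tool(indicator: str, multi_feed: bool = False) -> str:
    """Choose a tool name for an indicator, following the description's guidance."""
    if "://" in indicator:
        return "phishing_check"  # specific URLs
    try:
        ipaddress.ip_address(indicator)
        return "ioc_lookup"  # assumption: threat_intel accepts domains only
    except ValueError:
        pass
    # Domains: single-source URLhaus check unless multi-feed correlation is wanted.
    return "ioc_lookup" if multi_feed else "threat_intel"
```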
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive. Description adds context about checking a single source and the return structure, without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise single paragraph with clear structure: main function, alternatives, rate limits, return format. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple single-parameter tool and presence of output schema, the description fully covers source, usage alternatives, rate limits, and return fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with description for domain. Description adds context about the threat feed but does not add new parameter semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool checks a domain against abuse.ch URLhaus for malware-distribution URLs, with specific verb and resource. Distinguished from siblings ioc_lookup and phishing_check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when to use this tool vs alternatives (multi-feed correlation, specific URLs) and includes rate limits for Free/Pro tiers.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
threat_report · Threat Report · A · Read-only · Idempotent
Query comprehensive threat profile for an IP: Shodan host data, AbuseIPDB reputation, ASN/geolocation, and open ports. Use for IP investigation and SOC alert triage; for domain data use domain_report. Note: the nested asn block returns at most 50 IPv4/IPv6 prefixes; call asn_lookup with include_full_prefixes=True for the full announced-prefixes list. enrichment.vulns is a severity-aware list[VulnInfo] (cve_id + severity + cvss_v3). This is a breaking change introduced in v1.16.0 (Phase 2); before 1.16 it was a list[str] of CVE IDs. Free: 30/hr (costs 6 tokens), Pro: 500/hr. Returns {ip, enrichment, abuseipdb, shodan, asn, threat_level}.
| Name | Required | Description | Default |
|---|---|---|---|
| ip | Yes | Public IPv4 or IPv6 address to investigate (e.g. '8.8.8.8', '1.1.1.1'). Private/reserved IPs are rejected. |
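Clients that may talk to servers on either side of the v1.16.0 change can normalize `enrichment.vulns` into one shape. The sketch below is an assumption-laden illustration: the `normalize_vulns` helper is hypothetical, VulnInfo entries are assumed to arrive as dicts with the three keys named in the description, and the severity and score values in the usage note are made up.

```python
def normalize_vulns(vulns: list) -> list[dict]:
    """Accept pre-1.16 list[str] or 1.16+ VulnInfo-style dicts; return dicts."""
    normalized = []
    for item in vulns:
        if isinstance(item, str):
            # Old shape: bare CVE ID with no severity data.
            normalized.append({"cve_id": item, "severity": None, "cvss_v3": None})
        else:
            # New shape: cve_id + severity + cvss_v3.
            normalized.append({
                "cve_id": item["cve_id"],
                "severity": item.get("severity"),
                "cvss_v3": item.get("cvss_v3"),
            })
    return normalized
```

With this in place, `normalize_vulns(["CVE-2021-44228"])` and `normalize_vulns([{"cve_id": "CVE-2021-44228", "severity": "critical", "cvss_v3": 10.0}])` both yield dicts keyed by `cve_id`, so downstream code never needs to branch on the server version.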
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds valuable behavioral context: limits on ASN prefixes (at most 50), a breaking change note for enrichment.vulns format (list of strings vs list of VulnInfo objects), and rate limits (30/hr free, 500/hr Pro). No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is front-loaded with the core purpose and data sources. It contains several technical details (ASN limit, breaking change, rate limits) that are necessary for correct usage. While somewhat long, every sentence adds value; a slight tightening would improve conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of the tool (multiple data sources, breaking changes, rate limits) and presence of an output schema, the description covers usage context, limitations, and important behavioral quirks comprehensively. No gaps for successful agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds meaningful detail beyond schema: 'Public IPv4 or IPv6 address' and 'Private/reserved IPs are rejected', which clarifies validation behavior not in schema. This justifies a 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a specific verb-resource pair 'Query comprehensive threat profile for an IP', listing distinct data sources (Shodan, AbuseIPDB, ASN/geolocation, open ports). It clearly distinguishes from the sibling tool 'domain_report' by stating its use case for IP investigation vs. domain data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'Use for IP investigation and SOC alert triage' and when not: 'for domain data use domain_report'. This provides clear decision guidance with a named alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
username_lookup · Username Lookup · A · Read-only · Idempotent
Search for username across 15+ social/dev platforms (GitHub, Reddit, X/Twitter, LinkedIn, Instagram, TikTok, Discord, YouTube, Keybase, HackerOne, etc.). Use for OSINT investigations and identity verification. Free: 30/hr, Pro: 500/hr. Returns {username, total_found, platforms: [{name, exists, url, status_code}]}.
| Name | Required | Description | Default |
|---|---|---|---|
| username | Yes | Username string to search across platforms, without @ prefix (e.g. 'torvalds', 'johndoe', 'elonmusk') |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
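Given the documented return shape `{username, total_found, platforms: [{name, exists, url, status_code}]}`, a caller will typically keep only the platforms where the profile exists. This is a small sketch under that assumption; the helper name `found_profiles` and the sample payload are illustrative, not real API output.

```python
def found_profiles(result: dict) -> list[str]:
    """Return 'name: url' strings for platforms where the username exists."""
    return [
        f"{p['name']}: {p['url']}"
        for p in result.get("platforms", [])
        if p.get("exists")
    ]


# Hypothetical response, shaped like the documented return value.
sample = {
    "username": "johndoe",
    "total_found": 1,
    "platforms": [
        {"name": "GitHub", "exists": True,
         "url": "https://github.com/johndoe", "status_code": 200},
        {"name": "Reddit", "exists": False,
         "url": "https://www.reddit.com/user/johndoe", "status_code": 404},
    ],
}

for line in found_profiles(sample):
    print(line)
```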
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds rate limit details (30/hr free, 500/hr Pro) and the return format, complementing the annotations (readOnlyHint, idempotentHint) with concrete behavioral context beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, front-loaded with the action, and includes essential details (platforms, use case, rate limits, output) without any wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple one-parameter tool with an output schema, the description covers all necessary aspects: purpose, usage context, behavioral constraints, and return value structure. No gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add new meaning beyond the schema's description of the username parameter. Baseline 3 is appropriate as the schema already documents the parameter adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Search for username' and lists multiple example platforms, distinguishing it from sibling tools like domain lookups or IP lookups. It explicitly defines the scope as 15+ social/dev platforms.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases: 'OSINT investigations and identity verification.' It does not mention when not to use or alternatives, but given the tool's specificity, this is adequate guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
wayback_lookup · Wayback Lookup · A · Read-only · Idempotent
Retrieve Wayback Machine snapshots for a domain: first capture, latest, total count, snapshot list. Use to investigate domain history and age; for full audit use domain_report. Free: 30/hr, Pro: 500/hr. status='ok' means the count is authoritative (even when 0 → confirmed no archives). status='unavailable' means CDX timed out/rate-limited/5xx — total_snapshots is OMITTED (unknown, NOT zero) and the agent should NOT report "no snapshots"; the warnings[] array carries the cdx_* error code (cdx_timeout/cdx_rate_limited/cdx_unavailable/cdx_error/cdx_parse_error/cdx_body_too_large). Heavy domains (kernel.org, microsoft.com, archive.org itself) frequently time out the CDX endpoint despite having millions of snapshots — fall back to archive_url for manual inspection. Returns {domain, status, total_snapshots, first_seen, last_seen, years_online, snapshots, archive_url, summary, warnings}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain to look up in web archives (e.g. 'example.com', 'archive.org') |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
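The distinction between `status='ok'` and `status='unavailable'` is easy to get wrong, so here is a minimal sketch of how a client might interpret the result. The function name `describe_snapshots` is hypothetical, and the sketch assumes `warnings` is a list of plain strings such as `cdx_timeout`.

```python
def describe_snapshots(result: dict) -> str:
    """Summarize a wayback_lookup result without confusing 'unknown' with 'zero'."""
    status = result.get("status")
    if status == "ok":
        # The count is authoritative here, including a confirmed zero.
        total = result.get("total_snapshots", 0)
        if total == 0:
            return "confirmed: no archived snapshots"
        return f"{total} snapshots, first seen {result.get('first_seen')}"
    if status == "unavailable":
        # total_snapshots is omitted: the count is unknown, NOT zero.
        codes = ", ".join(str(w) for w in result.get("warnings", []))
        return (
            f"snapshot count unknown ({codes or 'no warning code'}); "
            f"inspect {result.get('archive_url')} manually"
        )
    return f"unrecognized status: {status!r}"
```

The key design choice is that the `unavailable` branch never reads `total_snapshots`, so a timeout on a heavily archived domain cannot be reported as "no snapshots".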
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark the tool as readOnly, idempotent, and not destructive. The description adds critical behavioral details: status 'unavailable' means total_snapshots is omitted (not zero), warnings carry specific error codes, and heavy domains (kernel.org, etc.) frequently time out. It also advises fallback to archive_url. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is quite detailed but every sentence adds value: purpose, usage, rates, behavior, error handling, heavy domain advice, and return structure. It is front-loaded with the main action. Slightly dense but efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, the description still provides a full list of returned fields. It covers edge cases (status unavailable, heavy domains) and gives practical advice. For a single-parameter lookup tool, it is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, 'domain', is already sufficiently described by the schema. The description adds context about heavy domains but does not enhance parameter semantics beyond the schema. With 100% schema coverage, baseline is 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves Wayback Machine snapshots for a domain, listing specific items (first capture, latest, total count, snapshot list). It distinguishes itself from domain_report by noting 'for full audit use domain_report.' The verb 'Retrieve' and resource 'Wayback Machine snapshots' are precise.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use the tool (investigate domain history and age) and explicitly gives an alternative (domain_report for full audit). It also provides rate limits (30/hr free, 500/hr pro) and specific guidance on interpreting status, including handling of unavailable status and heavy domains.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
whois_lookup · WHOIS Lookup · A · Read-only · Idempotent
Retrieve WHOIS registration data: registrar, creation/expiry dates, nameservers, status. Use to verify domain ownership, age, expiration; for full audit use domain_report. Free: 30/hr, Pro: 500/hr. Returns {domain, whois: {registrar, creation_date, expiry_date, updated_date, name_servers, status, raw_length, error}, summary}.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Root domain to query WHOIS for (e.g. 'example.com', 'github.com') |
Output Schema
| Name | Required | Description |
|---|---|---|
| result | Yes |
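A common use of the `whois.expiry_date` field is checking how soon a domain expires. The sketch below assumes the date arrives as an ISO 8601 string beginning with `YYYY-MM-DD`, which the listing does not confirm; the helper name `days_until_expiry` is hypothetical. It returns `None` rather than guessing when the `error` field is set or the date cannot be parsed.

```python
from datetime import date
from typing import Optional


def days_until_expiry(result: dict, today: date) -> Optional[int]:
    """Days from `today` to the WHOIS expiry date, or None if unknown."""
    whois = result.get("whois") or {}
    if whois.get("error") or not whois.get("expiry_date"):
        return None
    try:
        # Assumption: the value starts with an ISO date (YYYY-MM-DD).
        expiry = date.fromisoformat(str(whois["expiry_date"])[:10])
    except ValueError:
        return None
    return (expiry - today).days


# Hypothetical response fragment, not real API output.
sample = {"domain": "example.com",
          "whois": {"expiry_date": "2030-01-31T00:00:00Z", "error": None}}
remaining = days_until_expiry(sample, today=date(2030, 1, 1))
```

Passing `today` explicitly keeps the function deterministic and easy to test.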
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and idempotentHint=true. Description adds rate limits and return structure, providing useful behavioral context beyond annotations without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, well-structured: purpose, usage guidance, rate limits, return fields. Every sentence is relevant and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, description explains return fields and structure. Also covers rate limits and alternatives, making it fully complete for a simple lookup tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds examples ('example.com', 'github.com') and clarifies 'root domain', adding value over the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'retrieve' and resource 'WHOIS registration data', lists specific fields (registrar, dates, nameservers, status), and distinguishes the tool from its sibling 'domain_report', the only alternative it names.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use (verify domain ownership, age, expiration) and when not (for full audit use domain_report). Also provides rate limit context (Free: 30/hr, Pro: 500/hr).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
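If the file is produced by a deployment script rather than written by hand, it can be generated from the structure shown above. This is a minimal sketch: the helper name `build_glama_json` is hypothetical, and how the result gets served at `/.well-known/glama.json` depends on your own web server.

```python
import json


def build_glama_json(emails: list[str]) -> str:
    """Render the glama.json body with one maintainer entry per email."""
    doc = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": e} for e in emails],
    }
    return json.dumps(doc, indent=2)


# Replace the placeholder with the email on your Glama account.
body = build_glama_json(["your-email@example.com"])
```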
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.