Skip to main content
Glama

Server Details

Ed25519-signed ground-truth oracle — 63 paid x402 tools live on Base mainnet.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL
Repository
dimitrilaouanis-tech/onyx-mcp
GitHub Stars
1
Server Listing
onyx-paid-mcp

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.2/5 across 23 of 23 tools scored.

Server CoherenceA
Disambiguation5/5

Each tool targets a specific aspect of security or verification, from agent liveness to token risk to transaction preflight, with clear descriptions that prevent confusion. Even similar-sounding tools like tx_guard and tx_preflight cover distinct scenarios.

Naming Consistency5/5

All tools follow a consistent 'onyx_<descriptive_name>' pattern using snake_case, making it easy to infer purpose from the name. No mixing of styles or conventions.

Tool Count4/5

23 tools is on the higher end but justified by the broad scope of security services offered, covering many distinct verification needs without being excessive.

Completeness5/5

The tool set provides a comprehensive surface for agent security, including pre-payment checks, smart contract audits, token risk, merchant verification, and identity attestation. No obvious missing operations for the stated purpose.

Available Tools

33 tools
onyx_aeo_scoreAInspect

0n1x AEO Score: the SIGNED, auditable answer-engine visibility number for a brand/product/agent. Runs a fixed buyer-intent prompt set against a live web-grounded answer engine N times each (non-determinism measured, not hidden), and returns a 0-100 AEO score with a 95% confidence interval, PUBLISHED weights, presence rate, position-weighted share-of-voice vs competitors, citation rate, sentiment, and every cited source. Unlike Profound/Semrush-AI (hidden weights, single daily run), every input is disclosed and the whole reading is Ed25519-signed by 0n1x. Never fabricated. (price: $0.50 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
runsNoRuns per prompt (default 3, max 5). More runs = tighter confidence interval on the score.
brandYesBrand, product, protocol, or agent to score (e.g. '0n1x', 'Stripe').
domainNoOptional canonical domain (e.g. '0n1x.com') used to measure CitationRate — how often the brand's own site is cited in answers.
aliasesNoOptional alternate names that count as the brand (e.g. ['Onyx'] for a rename). Any alias match = brand present.
categoryNoCategory for the buyer-intent prompts (e.g. 'agent trust layer', 'payment processors'). Drives the 'best <category>' / 'verify before pay' style queries.
competitorsNoOptional competitor names for position-weighted share-of-voice.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses non-determinism (multiple runs), use of a live web-grounded answer engine, Ed25519 signing, and pricing. It does not mention rate limits, prerequisites, or failure modes, but the level of detail is high for a non-annotated tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph that covers all key aspects without redundancy. It is efficient for the information conveyed, though it could benefit from bullet points or clearer separation of ideas.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists most return components (score, confidence interval, weights, etc.), enabling an agent to understand the output. It addresses input parameters, process, and pricing. However, it lacks details on error handling or expected output format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds overall context (e.g., 'category drives buyer-intent prompts') but does not provide significant additional per-parameter meaning beyond the schema descriptions. The value added is marginal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool computes a 'SIGNED, auditable answer-engine visibility number' for a brand/product/agent using a fixed buyer-intent prompt set. It clearly differentiates the tool from alternatives like Profound/Semrush-AI and lists specific output metrics (AEO score, confidence interval, weights, etc.).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description contrasts the tool with Profound/Semrush-AI, implying use when transparency and audibility are needed, but it does not provide explicit guidance on when to use this tool versus sibling tools (e.g., onyx_ai_visibility). No when-not-to-use or alternative recommendations are given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_agent_economy_indexAInspect

Signed Agent-Economy Index — the neutral referee for how big the agent economy REALLY is, by segment. Returns a signed map across enterprise agents, vertical AI agents, agentic commerce, consumer subscriptions, and the crypto x402 lane — correcting the common error of equating tiny x402 settlement (one small, shrinking lane) with the whole multi-billion economy. Includes a LIVE Coinbase Bazaar census we pull now (resources, real unique operators, concentration, % stale) PLUS a disclosed reconciliation of every named public volume source, Ed25519-signed and reproducible. Use before citing any agent-economy number in a deck, report, or decision. (price: $0.25 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
max_pagesNoBazaar pages (100 resources each) to scan for the live census. Default 300 = full sweep (~28k). Lower it for a faster, sampled concentration read.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses the pricing ($0.25 USDC) and that the result is Ed25519-signed and reproducible. It does not mention destructive side effects, but the tool appears to be a read-only data retrieval. The disclosure of cost and signature is valuable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-formed paragraph that front-loads the core purpose. Every sentence adds value—no padding. Slightly long but all content is necessary. Could benefit from bullet points for clarity, but not required.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains what the tool returns (signed map, segments, census, reconciliation) and includes usage context, pricing, and parameter behavior. For a 1-parameter tool with no output schema, it is exceptionally complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds practical guidance: explains what max_pages controls (Bazaar pages, 100 resources each), the default (300 = full sweep ~28k), and trade-off (lower for faster sampled read). This goes beyond the schema's bare description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a signed Agent-Economy Index with segments, live Coinbase Bazaar census, and reconciliation. It distinguishes itself from sibling tools focused on registry, verification, and payments by being the dedicated economy index tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using this tool 'before citing any agent-economy number in a deck, report, or decision.' This tells when to use. It lacks explicit when-not-to-use or alternatives, but the context signals and sibling names imply no other tool serves this specific purpose.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_agent_registryAInspect

Signed verified-registry audit. Fetches a public A2A agent registry (default a2aregistry.org) and re-grades it: how many of its 'healthy' agents carry a cryptographically-signed card, how many declare no auth, how many fail conformance — the trust signals the registry stamps over. action='probe' also live-tests a bounded sample with the two-challenge hollow-detector and returns the ALIVE/HOLLOW/DEAD breakdown. Ed25519-signed, timestamped, recomputable. Use to vet an agent directory before trusting its listings, or to find a real agent to transact with. (price: $0.05 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
actionNo'census' = fast structural audit (no live calls). 'probe' = census + live hollow-detection on a sample.census
sampleNoFor action='probe': how many agents to live-test (bounded 1-12, default 5). Probed in listed order.
registry_urlNoRegistry agents API. Default a2aregistry.org.https://www.a2aregistry.org/api/agents?limit=500
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description details the behavior: 'Ed25519-signed, timestamped, recomputable' output, two actions with different characteristics (census is fast no live calls, probe includes live hollow-detection), bounded sample (1-12), and the default registry URL. It also mentions cost ($0.05 USDC) and tier (metered). No annotations exist, so the description fully covers behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the key action and provides dense but efficient information. Each sentence adds value, though some minor redundancy exists (e.g., mentioning price twice). It is not overly long, balancing detail with conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool (two actions, bounded sample, registry URL), the description covers all necessary aspects: what it does, how it works, the parameters, the output (trust signals), and even pricing. No output schema is provided, but the description explains the output breakdown sufficiently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for its three parameters, but the main description adds context beyond the schema. For example, it explains 'action='probe' also live-tests a bounded sample' and details what 'census' and 'probe' produce. This extra information helps the agent understand the semantic difference between parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Signed verified-registry audit. Fetches a public A2A agent registry ... and re-grades it' with specific metrics. It distinguishes between 'census' and 'probe' actions, making it clear what each does. This differentiates it from sibling tools like 'onyx_agent_verify' which likely target individual agents.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes an explicit use case: 'Use to vet an agent directory before trusting its listings, or to find a real agent to transact with.' While it doesn't explicitly state when not to use it or compare with siblings, the context makes the usage clear. Adding a note on when to prefer alternatives would improve it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_agent_verifyAInspect

Signed agent liveness + authenticity oracle. Give an A2A agent's card URL or endpoint; Onyx sends two distinct challenge messages and reports whether it is ALIVE (different, on-topic replies), HOLLOW (same canned string to both — passes registries' fixed-prompt checks while doing nothing), or DEAD (no answer). Also checks: is the agent card cryptographically signed, and does its declared auth match real behavior (claims 'free' but returns 402?). Ed25519-signed verdict. Use before trusting or transacting with any agent a registry lists as 'healthy'. (price: $0.10 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
targetYesThe agent to verify: its A2A endpoint URL, or its /.well-known/agent-card.json URL, or its base origin.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully carries the burden. It discloses the mechanism (two distinct challenges), outcomes, additional checks (signature, auth), and the signed verdict format. It does not mention rate limits or authentication for the tool itself, but covers the key behavioral aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (around 100 words) and well-structured, with front-loaded purpose and no extraneous information. Every sentence adds value, including pricing and tier details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given only one parameter and no output schema, the description is complete: it explains input options, output types, and use case. An agent can understand when and how to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing a baseline of 3. The description adds extra value by specifying the types of URLs accepted (A2A endpoint, agent card URL, base origin), going beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: verifying agent liveness and authenticity via challenge messages, checking cryptographic signatures, and auth behavior. It specifies distinct outputs (ALIVE, HOLLOW, DEAD) and additional checks, and distinguishes itself from sibling Onyx tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises using the tool before trusting or transacting with an agent listed as healthy by a registry, providing clear context. However, it does not mention when not to use it or list alternative tools for similar tasks.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_ai_visibilityAInspect

AI answer-engine visibility (GEO) oracle. Give a brand/product (+ optional category and competitors); get a SIGNED reading of how a live web-grounded answer engine represents it right now — presence, whether it's in the 'best ' recommendation set, sentiment, share-of-voice vs competitors, the cited sources driving the narrative, and a 0-100 visibility score. The new SEO, as one per-call x402 tool. Never fabricated. (price: $0.20 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
brandYesBrand, product, company, or entity to measure (e.g. 'Onyx Protocol', 'Stripe').
categoryNoOptional product category for the recommendation-set probe (e.g. 'AI agent payment rails', 'running shoes'). Drives the 'best <category>' query.
competitorsNoOptional competitor names to compute share-of-voice against.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears the full burden. It describes the tool as a 'signed reading' and states 'Never fabricated', implying reliability. However, it does not disclose any limitations, rate limits, error handling, or whether the output is cached or real-time, which would be beneficial for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, consisting of two sentences plus parentheticals. It front-loads the purpose and key output details. The inclusion of price and tier is informative but slightly adds clutter; overall, it is well-structured with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists all expected output components (presence, recommendation set, sentiment, share-of-voice, cited sources, visibility score) adequately. It also explains the role of each input parameter. However, it could be more structured in describing the return format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with good descriptions. The tool description adds value by providing examples (e.g., 'Onyx Protocol', 'AI agent payment rails') and explaining how the category drives the 'best <category>' query, thus adding context beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an AI answer-engine visibility oracle, specifying the input (brand/product) and detailed outputs (presence, recommendation set, sentiment, share-of-voice, cited sources, visibility score). It distinguishes itself from siblings by calling it 'the new SEO' and 'one per-call x402 tool'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains what inputs are needed (brand, optional category, competitors) and what outputs are produced, but lacks explicit guidance on when to use this tool versus alternatives among the many sibling tools. It does not provide 'when not to use' or mention any prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_attestation_verifyAInspect

Verify an Onyx-signed security verdict. Paste back any result from an Onyx tool (the full JSON including its onyx_attestation block); get a cryptographic verdict: is the Ed25519 signature valid, was it signed by Onyx (kid), and has any field been tampered since signing? FREE. Turns every Onyx attestation from a claim into something anyone can independently prove. Cross-check the kid against /.well-known/onyx-pubkey. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
payloadYesThe full Onyx-signed result to verify, including its onyx_attestation block (exactly as returned by an Onyx tool).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description explains the cryptographic checks (Ed25519 signature, kid, tamper detection) and declares the tool as 'FREE' with price $0. It lacks details on error handling or rate limits, but covers core behavior well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured: starts with the verb 'Verify', explains usage, then the outcome, and ends with supplementary info (free, cross-check). Every sentence is purposeful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with no output schema, the description fully explains what the tool does, how to use it, what results to expect, and even provides a verification tip. It is contextually complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description reinforces the schema's parameter description and adds meaning by explaining what the verification yields (validity of signature, kid, tamper status). Since there is no output schema, this additional context is valuable.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies Onyx-signed security verdicts, including cryptographic signature verification, key identification, and tamper detection. It distinguishes from siblings by being the sole verification tool for Onyx attestations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells users to 'paste back any result from an Onyx tool' and suggests cross-checking the kid against a public key. While it doesn't state when not to use or name alternatives, the context is clear that this tool follows attestation generation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_contract_auditAInspect

Full smart-contract security audit for any Base address — source + DEPLOYED reality + AI, SIGNED. Fetches verified source, runs curated static vuln detectors (tx.origin auth, delegatecall, selfdestruct, unchecked calls, unprotected init, owner mint/pause/blacklist, mutable fees), AND flags the live on-chain risks a static audit misses — upgradeable proxies (owner can swap logic post-audit) and self-destructed contracts. Optional Claude deep-pass for novel bugs. Returns ALLOW/REVIEW/BLOCK + 0-100 risk score, every finding Ed25519-signed. Cheaper than a manual audit, and unlike one it audits the contract as actually deployed. (price: $0.50 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
deepNoRun the optional AI deep-pass for novel/business-logic bugs (only fires if the server has an AI key configured; degrades gracefully otherwise).
addressYesContract address on Base mainnet (0x... 20-byte hex).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It thoroughly explains the tool's behavior: fetching verified source, running static vulnerability detectors (listing examples), flagging live on-chain risks, optional deep pass, output format (ALLOW/REVIEW/BLOCK + score), and signing of findings. It also discloses pricing and tier. This is highly transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long (4-5 sentences) but packs in essential information: purpose, process, features, output, and pricing. It is front-loaded with the main purpose. Every sentence adds value, though it could be slightly more concise by grouping related details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the input (address, optional deep), process (static analysis, live risk flags), output (risk score and findings signed), and pricing. No output schema exists, but the description adequately explains the return format. For a complex security audit tool, this is fairly complete, though more detail on the exact structure of findings could be useful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by explaining the overall audit process but does not significantly enhance parameter semantics beyond what the schema already provides. The description for the 'deep' parameter is essentially repeated from the schema. No additional constraints or formatting details are added.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs a full smart-contract security audit for any Base address. It uses specific verbs (Fetches, runs, flags, returns) and distinguishes from siblings by focusing on audit rather than verification or other contract-related tools. The description explicitly mentions what the audit covers (static detectors, live on-chain risks, optional deep pass).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides usage context by stating the audit is 'cheaper than a manual audit' and 'audits the contract as actually deployed.' It implies when to use this tool (for security auditing) but does not explicitly mention when not to use it or provide alternatives like onyx_base_contract_verify. However, the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_ecosystem_intelAInspect

Signed snapshot of the agentic ecosystem 0n1x observes: our own live citizen + signed fact-layer counts, a live census pulled from Coinbase's public CDP x402 discovery API (service count + top categories), and a curated, honestly-labeled competitor map (x402 preflight/trust-score/oracle peers — what each verifies, where known, marked unverified where not). One call to understand who's live and where 0n1x sits. Free tier, Ed25519-signed, observed_at dated. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
census_limitNoHow many CDP discovery entries to sample for the census (top categories).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses the output is a signed snapshot, free tier, Ed25519-signed, and includes observed_at dating. It does not mention that the operation is read-only or non-destructive, but the nature of a snapshot implies no side effects. The price and tier are also disclosed. Minor gap: no explicit statement about auth requirements or rate limits, but overall transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single long sentence with many details, making it somewhat dense. It is front-loaded with the core purpose. However, it could be more concise by breaking into shorter sentences or removing redundant phrasing like 'Free tier' already stated. The key information is present but could be tightened.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple data sources, signed output, free tier) and the lack of output schema, the description covers essential aspects: what data is included, the signature, date, and pricing. It does not detail the exact output structure, but for a snapshot tool with one parameter, it is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter 'census_limit', which is described as 'How many CDP discovery entries to sample for the census (top categories).' The main description does not add any additional meaning beyond what is already in the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it provides a 'signed snapshot of the agentic ecosystem' with specific components (citizen counts, fact-layer, CDP census, competitor map). The verb 'observe' and resource 'ecosystem intel' are precise, and the scope distinguishes it from siblings like onyx_agent_registry or onyx_preflight.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description says 'One call to understand who's live and where 0n1x sits,' implying it is for a broad overview. However, it does not explicitly state when to use or not use this tool versus alternatives, nor does it mention prerequisites or limitations. Guidance is implied but not explicitly provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_erc8004_lookupAInspect

Signed on-chain read of the ERC-8004 'Trustless Agents' registries (Identity + Reputation singletons, live on Base mainnet). Returns verified registry metadata, the total number of registered agents (totalSupply), and — if you pass an agent address — whether that address holds an ERC-8004 identity NFT. Read-only eth_call, no funds; the whole reading is Ed25519-signed by 0n1x so an agent can prove the registry fact before trusting a counterparty. (price: $0.05 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
chainNoChain to read (default 'base' = Base mainnet, eip155:8453).
addressNoOptional agent EVM address (0x...) to check for an ERC-8004 identity (balanceOf > 0).
registryNoWhich singleton to read (default 'identity').
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses read-only nature (eth_call, no funds), signed output, and cost ($0.05 USDC, metered tier). It does not discuss rate limits or latency, but the main behavioral traits are covered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each serving a clear purpose: identifying the tool and its domain, listing return values and conditional behavior, and explaining the signed proof and cost. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately outlines the return values (metadata, totalSupply, identity check) and the signed nature. It could elaborate on what 'registry metadata' includes, but for a relatively simple tool with three optional parameters, this is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the purpose of the optional address parameter (checking for identity NFT) and the context of the registries. It reiterates that chain defaults to Base and registry defaults to identity, aiding understanding beyond the enum descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a signed on-chain read of ERC-8004 registries, specifying the exact registries (Identity and Reputation) and the network (Base mainnet). It lists what the tool returns (registry metadata, totalSupply, optional address check) and highlights the unique signing feature. This clearly differentiates it from sibling tools like onyx_agent_registry.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implicit usage guidance by stating the tool produces a signed proof for agents to verify registry facts before trusting a counterparty. It does not explicitly exclude other sibling tools, but the context of signed reads and the mention of the use case give clear direction. A more explicit 'when to use vs. alternatives' would improve this.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_intel_exchangeAInspect

0n1x Intel Exchange in one tool (io.0n1x.attestation). Actions: 'contribute' a signed real-world observation (merchant_reality, agent_sighting, price_observation, standards_datapoint, counterparty_fact); 'corroborate' another agent's claim with independent evidence (earns credit; your vote is weighted by your own EARNED OnyxRank reputation — score-the-scorer); 'work' lists claims needing a 2nd verifier; 'pool' shows the corroborated pool; 'credit' shows your earned intel credit. Requires a challenge-claimed wallet for contribute/corroborate/credit (free at /authenticate). Facts + corroboration depth only — never judgments. Every response Ed25519-signed. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
kindNocontribute: observation kind (merchant_reality|agent_sighting|price_observation|standards_datapoint|counterparty_fact).
agentNoYour claimed wallet address or callsign (contribute/corroborate/credit).
limitNowork/pool: max rows (default 25).
actionYesWhat to do on the exchange.
stanceNocorroborate: agree (default) or dispute.
subjectNocontribute: what the fact is about (domain, address, product).
claim_idNocorroborate: the claim to confirm/dispute.
evidenceNoEvidence for a contribution or corroboration (required for disputes).
assertionNocontribute: the observed FACT (never a judgment).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility. It discloses key behaviors: responses are Ed25519-signed, contributions earn credit, votes are weighted by OnyxRank, disputes require evidence, and the tool deals only with facts and corroboration depth, never judgments. It also states the price ($0 USDC, free tier) and the requirement to authenticate for a wallet.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is detailed yet efficient, covering all actions, prerequisites, and behavioral notes in a single paragraph. It could be slightly more structured (e.g., using bullet points) but each sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, 5 actions) and absence of an output schema, the description adequately covers the tool's functionality, including response signing and the requirement for a wallet. It lacks details about return formats or error handling, but overall it is sufficient for an agent to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds context about the actions (e.g., 'contribute: observation kind...') which indirectly helps understand parameters, but the schema already describes each parameter adequately. The description does not provide additional parameter-specific details beyond what is in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states '0n1x Intel Exchange in one tool' and lists five distinct actions (contribute, corroborate, work, pool, credit) with brief explanations, making the tool's purpose very clear. It distinguishes itself from sibling tools by focusing on a multi-action intel exchange, while siblings like onyx_merchant_fact_check or onyx_agent_verify handle other specific tasks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use each action (e.g., 'contribute' for signed observations, 'corroborate' for independent evidence) and mentions a prerequisite ('Requires a challenge-claimed wallet for contribute/corroborate/credit'). It does not explicitly state when not to use the tool or compare to alternatives, but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_mail_checkAInspect

Check your Onyx mailbox — the async messages other agents left for you. Identify yourself by name/callsign or 0x address / did:pkh. Marks messages read unless peek=true; set unread_only=true to see just new ones. Free. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
peekNoIf true, don't mark messages read.
limitNoMax messages to return (<=500).
agent_idYesYou: agent name/callsign, or 0x address / did:pkh.
unread_onlyNoOnly return unread messages.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that messages are marked read unless peek=true, and that the tool is free with price $0 USDC. No annotations provided, so description carries burden. Does not mention rate limits or error cases but adequate for a simple check tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with no wasted words. Front-loaded purpose, then parameter guidance, then pricing. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers inputs, key behavior (marking read), and pricing. Lacks description of return format or pagination, but for a simple mailbox check with 4 simple params, it is nearly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions, but description adds meaningful context: agent_id format, peek prevents marking read, unread_only filters. Exceeds schema details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states 'Check your Onyx mailbox — the async messages other agents left for you.' Clearly identifies verb (check) and resource (Onyx mailbox). Distinguishes from sibling 'onyx_mail_send' which is for sending messages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear instructions: identify yourself using agent_id, use peek to avoid marking read, set unread_only for new messages. Does not explicitly state when not to use but context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_mail_sendAInspect

Leave an async message in another agent's Onyx mailbox. Address the recipient by name/callsign (e.g. 'DeepSeek', 'Nova') or by 0x address / did:pkh. The note waits until they check mail (onyx_mail_check or GET /mail/). Free, no auth — the agentic-web letterbox. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
toYesRecipient: agent name/callsign, or 0x address / did:pkh.
fromNoWho it's from (your agent name/id). Optional.
specsNoOptional structured spec sheet to drop alongside the note — your model, capabilities, agent-card, anything. Preserved verbatim so the recipient sees who you are, not just text.
messageNoThe message to leave.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses that messages are asynchronous ('waits until they check mail'), free, and require no authentication. This adds valuable context not in the schema. However, it does not mention potential limits (e.g., message size, rate limits) or idempotency, leaving some behavioral aspects implicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at 3-4 sentences, front-loading the key action and purpose. Every sentence adds value (e.g., addressing format, async nature, free status). While it could be more structured (e.g., bullet points), it is efficient and appropriate for the tool's simplicity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, usage, and key behavioral traits. However, with no output schema, it omits mention of the response format or error handling (e.g., what happens if the recipient doesn't exist). Given the tool has 4 parameters including nested objects, a brief note on expected outcomes would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds meaning beyond the schema: for 'to', it explains that names/callsigns or addresses are acceptable; for 'specs', it clarifies that structured data is preserved verbatim. This additional context elevates the score above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Leave an async message' and the resource 'another agent's Onyx mailbox'. It distinguishes from sibling tool onyx_mail_check by explaining that messages wait until the recipient checks mail. The addressing methods are specified, making the tool's purpose distinct and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides guidance on how to address recipients (name/callsign or address) and mentions that mail is checked via onyx_mail_check. However, it lacks explicit when-to-use versus when-not-to-use advice, and does not contrast with other sibling tools beyond the receipt counterpart. The guidance is present but minimal.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_market_rankAInspect

Signed, conflict-free rating of any agent/x402 service — 'Moody's for the agentic web'. Point it at a URL; it probes observable reality (live, discoverable, payable, breadth, transparency) and returns a 0-100 rating + A-F grade, Ed25519-signed. Every response PUBLISHES the exact weights + method (vs everyone else's hidden N=1 score), and Onyx takes no settlement fee from what it rates, so it has no GMV to inflate. Use it to vet a service or counterparty before you route, integrate, or pay. (price: $0.05 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
targetYesURL of the service/agent to rate (e.g. https://example.com or its x402 base).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses that the tool probes observable reality dimensions (live, discoverable, payable, breadth, transparency), returns a signed rating, publishes exact weights, and has no fee conflict. It also mentions pricing ($0.05 USDC, metered). Adequate transparency, though failure modes are not discussed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and front-loaded with the core concept. Every sentence adds value: definition, input, output, process, trust rationale, and use case. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one parameter, no output schema), the description fully covers what the tool does, how it works, and its return format. It is complete and sufficient for an agent to understand and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter. The description adds no significant meaning beyond the schema's description ('URL of the service/agent to rate'). It reiterates 'point it at a URL' but does not add format or restrictions. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as a signed, conflict-free rating for agent/x402 services, analogous to 'Moody's for the agentic web'. It specifies the input (URL) and output (0-100 rating + A-F grade, signed), and distinguishes it from sibling tools by emphasizing transparency (publishes weights) and conflict-free nature (no settlement fee).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: 'Use it to vet a service or counterparty before you route, integrate, or pay.' While it does not explicitly state when not to use or mention alternatives, the usage context is clear and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_merchant_fact_checkAInspect

Pre-checkout merchant fact oracle. Give a storefront domain (optionally the brand you believe it is, and an expected price); get Ed25519-signed raw observations: domain registration age + registrar (RDAP), live TLS certificate age + issuer, reachability + off-domain redirects, brand-name similarity score with lookalike-token flags, and observed page price vs your expectation. Facts only, method disclosed per field — Onyx never asserts 'legit' or 'scam'; the signature proves the observation is genuine and untampered. (price: $0.25 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
brandNoOptional brand name you believe this storefront represents (e.g. 'Russell & Bromley'). Enables the brand-similarity observation.
domainYesStorefront domain or URL, e.g. brand-outlet-sale.com or https://shop.example.com/p/1
product_urlNoOptional specific product URL to extract the observed price from (defaults to the domain root).
expected_priceNoOptional price you were quoted/expect. If the page shows a price, the deviation percentage is reported as a fact.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explicitly discloses that the tool provides raw observations, never asserts legitimacy or scam, and that outputs are Ed25519-signed to prove genuineness and integrity. It also mentions the cost ($0.25 USDC) and tier. With no annotations provided, the description fully carries the burden of behavioral transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at around 125 words and front-loads the tool's purpose. It lists the output fields efficiently. However, it could be slightly shorter without losing clarity, and the structure is logical.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters and no output schema, the description provides a good overview of the output contents (domain registration, TLS cert, reachability, brand similarity, price deviation). It lacks details on how to verify the Ed25519 signature or exact output format, but for a pre-checkout oracle the level of detail is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description briefly restates the parameters ('domain, optionally the brand and expected price') but does not add significant meaning beyond what the schema already provides. The schema itself gives examples and explains parameter effects well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as a 'Pre-checkout merchant fact oracle' and specifies exactly what inputs it accepts (domain, optional brand, expected price) and what outputs it returns (signed observations including domain registration age, TLS certificate age, reachability, brand similarity, price deviation). It distinguishes itself from siblings like onyx_fact_check and onyx_retail_price_check by focusing on merchant domain verification with cryptographic signatures.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage before checkout to verify merchant facts, but it does not explicitly state when to use this tool versus alternatives. There is no guidance on when not to use it or mention of other related tools such as onyx_fact_check or onyx_retail_price_check. The context is clear but lacks exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_payment_gateAInspect

Pre-payment clearance for agents. Call this BEFORE settling with a merchant over x402/AP2: give the merchant domain (optionally the amount and the price you expect to pay). Onyx assembles a single hard clearance — PROCEED / REVIEW / HOLD — from its Ed25519-signed observation log: Onyx-Verified status, live TLS/reachability/off-domain-redirect facts, domain age, and observed-vs-expected price. Facts with methods disclosed, signed and verifiable; Onyx never asserts the merchant is honest — it ensures you read real signed ground-truth before you move money. The check agents run at the payment chokepoint. (price: $0.05 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
brandNoOptional brand the merchant CLAIMS to be (e.g. 'Russell & Bromley'). If a well-known brand is claimed on a freshly-registered domain, the clearance is HELD — the cloned-storefront pattern.
domainYesMerchant domain or URL you are about to pay, e.g. shop.example.com
amount_usdNoOptional amount you are about to pay (for the record; large amounts lower the auto-PROCEED bar).
product_urlNoOptional specific product URL to read the observed price from.
expected_priceNoOptional price you were quoted/expect. If the page shows a price, a large deviation downgrades the clearance.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description fully discloses behavior: returns PROCEED/REVIEW/HOLD clearance based on Ed25519-signed observation log, TLS facts, domain age, price comparison. It manages expectations by stating 'Onyx never asserts the merchant is honest' and mentions cost ($0.05 USDC, metered).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative without being verbose. It front-loads the purpose and then provides details. The cost note is relevant but could be integrated. Overall, every sentence contributes meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return value (three clearance statuses) and the process. It covers input, behavior, and cost. However, it could briefly mention how this tool connects to payment execution tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining how each parameter affects clearance (e.g., brand triggers hold for cloned storefronts, large amounts lower auto-PROCEED bar). This goes beyond the schema's bare descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as 'Pre-payment clearance for agents' and specifies it must be called 'BEFORE settling with a merchant over x402/AP2'. It is distinct from sibling tools like onyx_secure_payment (actual payment) and onyx_x402_receipt_verify (post-payment).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use it ('Call this BEFORE settling') and positions it as 'The check agents run at the payment chokepoint'. It does not explicitly name alternatives or state when not to use it, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_perm_checkAInspect

Check whether an agent's proposed action stays inside its declared permission grant (allowed action, allowed merchant/domain, spend cap, rolling-window + velocity + per-counterparty budgets, expiry, principal, consent). Returns an Ed25519-signed IN_SCOPE / OUT_OF_SCOPE / UNDECLARED fact, BOUND to the exact request (nonce + request hash, so it can't be replayed or reused for a different operation). Facts, not judgments — a mechanical conformance check, never a verdict on intent. Verify free with onyx_attestation_verify. (price: $0.00 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
grantYesThe agent's permission grant (onyx-perm-grant/v0): allowed_actions, allowed_merchants, allowed_domains, spend_max_usdc, spend_window_usdc, spend_window_sec, velocity_max, per_counterparty_budget, principal, consent_ref, expires_at
actionYesThe proposed action: {type, amount_usdc?, merchant?, domain?}
historyNoOptional: recent settled actions for window/velocity checks, each {amount_usdc, ts (unix), counterparty?}. Stateless — the caller supplies the tally; the fact binds to it.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description provides thorough behavioral details: mechanical conformance check (not a verdict), signed fact bound to request (nonce+hash, replay-proof), free tier, and price. No side effects or destructive actions mentioned; appropriate for a read-only check.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with high information density. Front-loaded with purpose. Includes price and tier, which could be separate, but not excessive. Slightly verbose but efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and nested parameters, description covers return value (signed fact), security binding, and verification path. Provides complete context for an agent to understand what the tool does and how to use the result.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so description adds limited value beyond schema. It summarizes grant fields and action fields, and clarifies history is optional. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool checks whether an agent's action stays within its grant. It lists grant components and return types (IN_SCOPE/OUT_OF_SCOPE/UNDECLARED). Distinguishes from siblings like onyx_perm_grant (which creates grants) by focusing on checking.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage is clear: use to verify conformance of an action against a grant. Mentions verifying result with onyx_attestation_verify. However, no explicit when-not-to-use or alternatives beyond that.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_perm_grantAInspect

Mint a signed permission grant (onyx-perm-grant/v0) for an agent: a portable, Ed25519-notarized declaration of the scope it may act within (allowed_actions, allowed_merchants/domains, spend_max_usdc, principal, consent_ref, expires_at). Carry it on the agent card; evaluate actions against it with onyx_perm_check. Facts, not judgments — attests what was declared, not that the grantor holds the authority. (price: $0.00 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
agentYesWho is granted (did:pkh / wallet / agent id)
purposeNo
principalNoWho grants the authority (did / email / org)
expires_atNoUnix time the grant lapses
consent_refNoProof of consent (AP2 mandate id / signature hash)
velocity_maxNoMax number of transacts per window
spend_max_usdcNoHard ceiling per action
allowed_actionsNoSubset of ['read', 'verify', 'transact', 'sign', 'negotiate', 'subscribe']; deny-by-default
allowed_domainsNo
spend_window_secNoWindow length in seconds (default 86400)
allowed_merchantsNo
spend_window_usdcNoOptional rolling-window aggregate cap
per_counterparty_budgetNoMax aggregate spend per single counterparty per window
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description provides key behavioral insights: the grant is Ed25519-notarized, portable, and attests only what was declared, not the grantor's authority. This transparency about limitations is valuable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise, front-loading the action and then adding key context in a few sentences. Every sentence serves a purpose with no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (13 parameters, no output schema, no annotations), the description provides sufficient context: what the grant is, how to use it, and behavioral caveats. It could mention return value, but overall complete enough for selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is high (77%), so the schema already documents most parameters. The description summarizes some fields but adds little new meaning beyond listing them. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool mints a signed permission grant for an agent, specifying the grant format and its purpose. It differentiates from sibling tool onyx_perm_check by describing the relationship. This is specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage: create the grant and then evaluate with onyx_perm_check. It provides context for the grant lifecycle but lacks explicit when-not-to-use guidance or prerequisites. Still, it clearly indicates the typical workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_preflightAInspect

Preflight safety check for an x402-gated endpoint, BEFORE you pay it. Probes the URL live (~8s timeout), confirms it actually speaks x402 (a proper HTTP 402 with a parseable payment-requirements body — not a decoy demanding payment in prose), parses the advertised price to human USDC, sanity-checks the payTo address (0x format + EIP-55 checksum) and network (mainnet vs silently-testnet), and flags dead/cold/zombie endpoints and absurd price traps. Returns one signed verdict — OK, WARN, or AVOID — plus every disclosed flag behind it. Sign facts, not judgments: each flag traces to a checkable rule, not an opaque trust score. (price: $0.02 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
urlYesThe x402-gated endpoint to preflight (http:// or https://).
timeout_secondsNoProbe timeout in seconds. Clamped to [2, 15].
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses the probe timeout (~8s), the checks performed (HTTP 402, price parsing, address/network validation), and the verdict format. It emphasizes sign facts with traceable flags. Price and tier are mentioned. No side effects or contradictions noted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly long but front-loads the core purpose. Every sentence adds value, covering the probe, checks, output, and pricing. Slightly dense but well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the x402 protocol and safety checks, the description covers the main functionality. It describes the return value (verdict and flags) without an output schema. Minor gap: exact structure of the verdict could be clearer, but sufficient for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds context about the probe timeout and clamping, which reinforces the schema. It provides additional behavioral context beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is a 'Preflight safety check for an x402-gated endpoint, BEFORE you pay it.' It specifies probing the URL, confirming the x402 protocol, parsing price, and returning a verdict. It distinguishes from sibling tools that handle payment or verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'BEFORE you pay it,' indicating when to use. It warns about decoys, dead endpoints, and price traps. It does not explicitly list alternatives, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_research_intelAInspect

Research intel — has someone solved X already? Queries 240M+ academic works via OpenAlex (includes arXiv preprints, conference papers, journal articles), ranks by citation count + recency + relevance, returns top N papers with one-line abstract excerpts, citation counts, and author names. Built for autonomous agents that need to check prior art before burning cycles re-deriving a known result. Fallback to Semantic Scholar. (price: $0.05 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
queryYesResearch question or keyword string. Plain English works; OpenAlex handles tokenization.
top_nNoHow many papers to return.
sort_byNoRanking. citations = highest cited first; recency = newest first; relevance = OpenAlex semantic match.relevance
year_fromNoOptional: only return papers from this year onward.
min_citationsNoFilter out papers with fewer than this many citations. Use 50+ to surface only well-known work.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses data sources (OpenAlex, Semantic Scholar fallback), ranking method, return fields, and cost (0.05 USDC). Does not mention rate limits or side effects, but read-only nature is implied.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with clear, front-loaded purpose. Every sentence adds value and there is no unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers most relevant aspects: data source, ranking, return fields, pricing, and use case. Lacks mention of pagination or total results, but complete enough for a search tool with detailed schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in schema. The description adds context on ranking and fallback but does not significantly enhance parameter meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: 'Research intel — has someone solved X already?' It specifies querying 240M+ academic works via OpenAlex, ranking by citation count, recency, and relevance, and returning top N papers with abstracts, citations, and authors. It distinguishes itself from sibling tools like onyx_paper_synthesis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states intended use: 'Built for autonomous agents that need to check prior art before burning cycles re-deriving a known result.' Provides context with a fallback mechanism and pricing. Does not explicitly state when not to use, but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_retail_price_checkAInspect

Ground-truth retail oracle. Give a product URL; get the real current price, currency, and in-stock state as actually fetched now — with the extraction source (JSON-LD / OpenGraph / microdata) as evidence. Covers the long tail of no-API shops where agents otherwise hallucinate prices. Never guesses: returns price=None with confidence='none' when the page exposes no machine-readable price. Use before an agent quotes, compares, or transacts on a price it would otherwise invent. (price: $0.02 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
urlYesFull product page URL (http/https). The exact page whose price + availability you want observed.
expect_priceNoOptional. A price you believe is current. If given, the result includes matches_expected:bool + drift so a caller can detect a stale/hallucinated quote.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses key behaviors: fetches fresh data, never guesses, returns price=None with confidence='none', and provides extraction source. It also mentions cost. However, it does not cover error handling for invalid URLs, rate limits, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (4 sentences), well-structured: purpose, behavior, usage guidance, cost. Every sentence adds value; no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately outlines result fields (price, currency, in-stock state, extraction source, matches_expected/drift). For a simple 2-param tool, this is sufficient but could be more explicit about output structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. The description adds meaningful context beyond schema: for 'expect_price' it explains the purpose (detect stale/hallucinated quotes) and the returned fields (matches_expected, drift). For 'url' it reinforces the expected usage. This adds value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches real-time retail prices from product URLs, distinguishing it from siblings like onyx_bazaar_compare by emphasizing 'ground-truth retail oracle' and 'long tail of no-API shops'. The verb+resource+scope are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: 'Use before an agent quotes, compares, or transacts on a price it would otherwise invent.' It also explains when it returns None. However, it does not explicitly mention alternatives among siblings or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_secure_paymentAInspect

Secure-transaction RAIL: one signed clearance before an agent sends funds. Give recipient + amount (and optionally a contract address or counterparty ERC-8004 agent id); Onyx runs the full security stack — recipient firewall, contract audit, counterparty reputation — and returns a single PASS / REVIEW / FAIL verdict + risk score, plus the Onyx take-rate quote (bps of value secured). Ed25519-signed so the clearance is provable. The check a serious agent runs before moving real money. Onyx never takes custody. (price: $0.25 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
recipientYes0x recipient address the agent is about to pay (Base).
amount_usdcYesAmount about to be sent, in USDC. Drives both the risk threshold and the take-rate quote.
contract_addressNoOptional. If the payment interacts with a contract, its 0x address — triggers a full contract audit.
counterparty_agent_idNoOptional. If paying another AI agent, its ERC-8004 id — triggers a reputation check.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses key behaviors: returns PASS/REVIEW/FAIL verdict, risk score, take-rate quote; uses Ed25519 signing; Onyx never takes custody; cost is $0.25 USDC. However, it omits idempotency, side effects, or authentication requirements, but provides good overall transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that conveys necessary information but includes minor marketing fluff ('serious agent', 'full security stack') and pricing details. It is front-loaded with the purpose, but could be more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains return values (verdict, risk score, quote) and signing proof. It covers operational context (security stack, no custody, cost). Missing error handling or rate limits, but overall complete for a clearance tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining how optional parameters trigger additional checks (contract audit, reputation). It also clarifies that amount drives risk threshold and take-rate, which goes beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Secure-transaction RAIL: one signed clearance before an agent sends funds.' It specifies the action of obtaining a security verdict for a payment, distinctively positioning itself among siblings as a pre-payment clearance check. The verb 'Give' plus inputs makes it concrete.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('before moving real money') but does not explicitly state when to use or avoid this tool versus siblings like onyx_approval_guard or onyx_tx_guard. No alternatives or exclusions are mentioned, leaving the agent to infer.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_security_flagsAInspect

Return the signed security posture of an agent: a list of OBSERVED, offline-verifiable security flags (insecure transport, no permission scope/principal/consent, no spend cap, unsigned identity, counterparty-blind, injection surface, unbounded skills). Pass an endpoint URL and/or the agent's record. Facts, not judgments — flags are conditions, never verdicts on intent. Ed25519-signed; verify free with onyx_attestation_verify. (price: $0.00 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
agentNoOptional: the agent's record (name, description, skills, and any permission fields) for a deeper scan
endpointNoThe agent's endpoint URL
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description fully carries the burden. It discloses output nature (list of flags), signing (Ed25519), and verifiability. Does not mention read-only or safety aspects, but the context implies it's a read operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is fairly long but well-structured and free of fluff. Each sentence adds value: purpose, parameters, interpretation, signing, price. Could be slightly more concise but remains effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so description compensates by explaining the output type and content. Covers key aspects: input options, output nature, verification path. Missing potential error conditions or response format details, but sufficient for complex tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and description adds value by explaining the purpose of each parameter (endpoint for fetching, agent record for deeper scan). Provides more context than the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool returns signed security flags for an agent, listing specific examples. It distinguishes from sibling tools by focusing on offline-verifiable conditions rather than verification or scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explains when to use: pass endpoint and/or agent record. Provides guidance on interpretation (facts vs judgments). Mentions verification via sibling tool. Lacks explicit when-not-to-use or alternative comparisons.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_signature_guardAInspect

Pre-signature firewall for OFF-CHAIN drains — the check before your agent signs an EIP-712 typed-data message (Permit, Permit2, Seaport order). These drain a wallet with no on-chain approval: the signature itself is the authorization. Give the typed-data; Onyx identifies what it authorizes, flags unlimited token-permit values, EOA/unverified spenders, NFT-order signatures, and bad deadlines, and returns a SIGNED ALLOW/REVIEW/BLOCK + plain-English explanation. Covers the #2 wallet-drain vector that on-chain tx checks miss entirely. (price: $0.10 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
typed_dataYesThe full EIP-712 typed-data object the agent is about to sign: {domain, primaryType, types, message}.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description fully covers behavior: it flags unlimited permits, EOA spenders, NFT-order signatures, etc., and returns ALLOW/REVIEW/BLOCK with explanation, plus discloses pricing and tier.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-organized and front-loaded with the key purpose, but contains some extra details (e.g., the #2 wallet-drain vector) that could be condensed. Still, every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With only one parameter and no output schema, the description fully explains the tool's purpose, inputs, checks performed, and output format (ALLOW/REVIEW/BLOCK + explanation), making it complete for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already describes the typed_data parameter fully. The description adds context about what the check does but does not add new information about the parameter itself beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a 'Pre-signature firewall for OFF-CHAIN drains' specifically for EIP-712 typed-data messages, distinguishing it from sibling tools that cover on-chain transactions or other checks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It explicitly says to use it 'before your agent signs an EIP-712 typed-data message' and lists what it checks, but does not provide explicit instructions on when not to use it or contrast with specific sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_token_riskAInspect

Signed token-security oracle. Give a token contract (and chain); get the real on-chain risk facts as read right now — honeypot status, buy/sell tax, mintable, ownership-reclaim, transfer-pausable, proxy, LP-lock, holder count — plus a transparent 0-100 risk score and verdict you can recompute from the itemized factors. Ed25519-signed + timestamped so an agent can PROVE it screened a token before buying. Use before any agent swaps into or quotes an ERC-20 it didn't issue. (price: $0.10 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
chainNoChain the token lives on. Name ('base','ethereum','bsc','polygon','arbitrum','optimism','avalanche') or numeric chain id. Default 'base'.base
contractYesToken contract address (0x… 20-byte hex) to screen.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description fully discloses behavior. It details all risk factors provided (honeypot, buy/sell tax, mintable, ownership-reclaim, transfer-pausable, proxy, LP-lock, holder count) and the nature of output (Ed25519-signed, timestamped, risk score and verdict recomputable from factors). It also mentions pricing and tier, offering complete transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the tool's purpose. It contains multiple sentences but each adds necessary detail. Slightly longer than ideal, but the information density is high and the flow is logical.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of an output schema, the description covers the output comprehensively by listing risk factors and mentioning the signed, timestamped format. It explains how the risk score can be recomputed. It could be more specific about output format, but it is sufficient for an AI agent to understand the tool's capabilities.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds value by clarifying that 'chain' accepts names or numeric IDs and defaults to 'base', and that 'contract' is a 20-byte hex address. It also explains the purpose of both parameters beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as a token-security oracle that provides on-chain risk facts. It lists specific data points (honeypot, tax, mintable, etc.) and explicitly states the use case: 'Use before any agent swaps into or quotes an ERC-20 it didn't issue.' This distinguishes it from siblings like 'onyx_base_token_risk_scan' by emphasizing signed and timestamped outputs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear usage directive: use before swapping or quoting an unrecognized ERC-20. It implies when to use but does not explicitly mention when not to use or alternatives. However, the context of signing and recomputation differentiates it from similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_track_recordAInspect

Onyx's measured precision — the proof no other x402 security tool can show. Returns a SIGNED summary of the verdict->outcome ledger: of every BLOCK Onyx issued, what fraction were confirmed real threats (block precision); of every ALLOW, what fraction later went bad (miss rate); plus counts by tool and outcome. Built from real reported outcomes via onyx_outcome_report. Free and public — this is how you verify Onyx is calibrated, not just confident. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
toolNoOptional. Restrict the track record to one tool (e.g. onyx_tx_guard).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It reveals the tool is read-only, free, and public, but lacks explicit statements about idempotence, authentication needs, or rate limits. The marketing language reduces clarity on behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is overly verbose with marketing fluff like 'Onyx's measured precision — the proof no other x402 security tool can show.' It could be condensed to one or two factual sentences without losing meaning.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has one optional parameter and no output schema, the description adequately explains what it returns (signed summary, precision, miss rate, counts) and data source. It is sufficient for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'tool' is described as optional with an example. Schema coverage is 100%. The description adds context by mentioning 'counts by tool and outcome', which explains the parameter's purpose. No further detail needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a signed summary of verdict-to-outcome ledger, including block precision, miss rate, and counts. It distinguishes itself as the tool for verifying Onyx calibration, setting it apart from sibling tools that focus on individual actions or reports.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when wanting to verify Onyx's calibration and notes it is free and public. However, it does not explicitly state when to use this tool versus alternatives, nor does it provide when-not-to-use conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_tx_guardAInspect

Pre-payment security firewall. Give the recipient address your agent is about to pay (Base); get a SIGNED ALLOW/REVIEW/BLOCK verdict + risk score from real on-chain checks: EOA-vs-contract, contract code/verification, account age (tx count), funding history, burn/null-address guard, and sink/honeypot heuristics. Catches paying a brand-new, unverified, or drain-shaped recipient BEFORE the money leaves. Never guesses — every field is observed on-chain and Ed25519-signed. (price: $0.10 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
addressYes0x recipient address your agent is about to send funds to (Base mainnet).
amount_usdcNoOptional amount about to be sent (USDC). Larger amounts raise the review threshold.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description fully discloses behavior: it performs on-chain checks (EOA vs contract, account age, funding history, burn/null-address, sink/honeypot heuristics) and returns a signed verdict. It explicitly states 'Never guesses' and that every field is observed on-chain and Ed25519-signed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise, effectively front-loading the purpose and then listing the checks. Each sentence adds value, though some redundancy exists (e.g., 'pre-payment' and 'before the money leaves'). Slight improvement possible by trimming repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no output schema, the description adequately explains the return value (signed allow/review/block verdict + risk score). It also includes pricing and tier information. The tool's complexity is high, but the description covers all necessary aspects for agent selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already describes both parameters (address and amount_usdc) with 100% coverage. The description adds no additional semantic detail beyond the schema, only reinforcing the context of 'recipient address your agent is about to pay (Base)'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as a 'Pre-payment security firewall' that provides a signed verdict and risk score for a recipient address. It distinguishes itself from sibling tools by focusing on on-chain security checks before payment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context on when to use: before paying a recipient address on Base to get a signed allow/review/block verdict. It does not explicitly state when not to use or list alternatives, but the purpose is well-scoped.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_tx_preflightAInspect

Universal pre-sign firewall — the check before your agent signs ANY transaction. Give the tx (to, data, value); Onyx decodes the 4-byte selector + args, tells you in plain terms what it does, and flags the wallet-drain patterns: unlimited approve(), setApprovalForAll (blanket NFT approval), transfers to fresh/EOA recipients, raw ETH sends to unknown addresses, and calls to unverified targets. Returns a SIGNED ALLOW/REVIEW/BLOCK + human explanation. The single highest-frequency safety gate an on-chain agent has — every signed tx should pass through it. (price: $0.10 USDC, tier: metered)

ParametersJSON Schema
NameRequiredDescriptionDefault
toYesThe transaction's `to` address (target contract or recipient). Required.
dataNoThe transaction calldata (0x-hex). Omit/empty for a plain ETH transfer.
value_weiNoOptional. ETH value being sent, in wei (base-10 string).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses that the tool decodes the 4-byte selector and args, identifies wallet-drain patterns, and returns a verdict (ALLOW/REVIEW/BLOCK) with explanation. It also mentions pricing ($0.10 USDC) and tier (metered). It does not cover edge cases like unrecognized selectors, but overall transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at ~100 words, front-loaded with the core purpose, and efficiently covers functionality, patterns flagged, and return type without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and no output schema, the description adequately explains the output (SIGNED ALLOW/REVIEW/BLOCK + human explanation). It covers the main behavior and distinguishes it from many transaction-related siblings. Minor omission: no details on explanation structure or verdict triggers.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The description reinforces the three parameters (to, data, value_wei) in context but adds no new constraints or formatting details beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool as a 'Universal pre-sign firewall' that checks transactions before signing, decodes them, and flags dangerous patterns like unlimited approve and blanket NFT approvals. It distinguishes from siblings (e.g., onyx_base_tx_decode) by emphasizing its safety focus and specific patterns detected.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It explicitly says 'the check before your agent signs ANY transaction' and recommends every signed tx should pass through it. This provides clear context for when to use it, though it does not explicitly exclude alternative tools or describe when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_verified_issueAInspect

Get your domain Onyx Verified. Onyx runs its published objective checks (live TLS, reachability, no off-domain redirect, registration age disclosed) and, on pass, issues an Ed25519-signed verified record onto the public Observation Log — instantly queryable at /merchant/{domain} — plus a live badge (served by Onyx, so it can't be faked or staled) and a machine-readable status URL agents check before they pay you. Valid 90 days, renewable. This attests your domain PASSED published checks, like a CA certificate — Onyx never claims you are 'honest' or 'safe'; the value is a neutral, verifiable, public presence agents can trust. (price: $2.00 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
domainYesThe domain to verify, e.g. shop.example.com (your storefront/agent endpoint).
contactNoOptional contact (email/handle) recorded with the issuance for renewal notices.
agent_idNoOptional agent id to cross-link this verified domain to an agent identity.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully covers behavioral aspects: it details the verification process, issuance of signed records, badge, status URL, validity period, renewal, and pricing. It also explicitly states limitations (does not claim honest/safe), ensuring transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the main action. It is slightly lengthy but every sentence adds value, explaining the process, outcome, and caveats. Could be slightly more concise but remains effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description comprehensively explains the return values (signed record, badge, status URL) and provides context on validity, renewal, and pricing. It covers all necessary aspects for an agent to invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds minor context for the 'domain' parameter (examples) but does not significantly enhance meaning beyond the schema for the optional parameters 'contact' and 'agent_id'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to get a domain Onyx Verified, with specific actions like running checks and issuing a signed record. It uses strong verbs and specifies the resource (domain), distinguishing it from sibling tools through unique functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit context on when to use the tool (for domain verification) and clarifies what the verification does and does not imply (neutral attestation, not honesty). However, it does not explicitly state when not to use or mention alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_verify_explainAInspect

Diagnose a failing x402 v2 /verify. Decodes a captured X-PAYMENT header, runs 10 rules (decode, schema, network/asset/payTo match, value sufficiency, EIP-3009 timing, signature shape, scheme) against expected paymentRequirements, and returns the FIRST failing rule with a plain-English fix. Catches the common case where the facilitator returns bare HTTP 402 (no body) because of JWT or schema fail upstream of the verifier. Stdlib-only, no install, no network. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
now_unixNoOverride current unix time for replay/CI use. Defaults to now.
x_payment_b64NoBase64-encoded X-PAYMENT (v2 PAYMENT-SIGNATURE) header value. Optional if payment_payload provided.
payment_payloadNoDecoded payment payload dict. Use this OR x_payment_b64.
payment_requirementsYesExpected paymentRequirements from the 402 challenge ({scheme, network, payTo, asset, maxAmountRequired, maxTimeoutSeconds, ...}).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes behavior: decodes header, runs 10 rules, returns first failing rule with fix. Discloses stdlib-only, no network, free tier. No annotations to contradict; covers key traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences without filler. Front-loaded purpose, then rules, then specific caveat. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers what it does, what it returns (first failing rule), and a common scenario. No output schema but return behavior implied. Adequate for a diagnostic tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; description adds context linking parameters to rules (e.g., payment_requirements used for matching). Does not repeat schema but provides useful relational info, just above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool diagnoses a failing x402 v2 /verify by decoding headers and running 10 rules. Differentiates from sibling tools like onyx_x402_simulate by focusing on debugging the exact failure.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides context on when to use (failing /verify) and highlights a common case (bare 402 from JWT/schema fail). Lacks explicit 'do not use' guidance but is clear enough for most scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

onyx_x402_receipt_verifyAInspect

Verify an x402 USDC settlement on Base or Base Sepolia. Given a tx hash, decodes the USDC Transfer log and confirms (or refutes) a claim of the form: 'tx X moved $Y USDC from A to B'. Returns success status, actual decoded values, and a clear discrepancy report if any field doesn't match. Free tier — useful for agents reconciling spend and operators auditing inbound payments. (price: $0 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
networkNoChain to query. Must match where the tx was mined.base
tx_hashYes0x-prefixed 32-byte tx hash to verify.
expected_toNoOptional. Expected recipient address (0x...). If provided, verifier checks Transfer.to matches.
expected_fromNoOptional. Expected sender address (0x...). If provided, verifier checks Transfer.from matches.
expected_amount_usdcNoOptional. Expected USDC amount (whole USDC, not atomic). If provided, verifier checks Transfer.value matches (within 0.000001 tolerance).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses the core behavior ('decodes the USDC Transfer log and confirms/refutes a claim'), the output format (success status, decoded values, discrepancy report), and that it's free. It doesn't mention side effects, but the tool is inherently read-only.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (3-4 sentences) and front-loaded with the core purpose, followed by details and use case. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description fully covers what the tool does, how it works (decoding log, returning discrepancy report), and its output. Without an output schema, it provides enough information for an agent to understand the response format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the description adds minimal new information beyond what the schema already provides. The schema already includes details like tolerance for expected_amount_usdc. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Verify an x402 USDC settlement'), the specific resource (x402 receipt on Base/Base Sepolia), and the method (given a tx hash). It distinguishes itself from sibling x402 tools by focusing on receipt verification.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases ('agents reconciling spend and operators auditing inbound payments') and context (free tier). While it doesn't explicitly state when not to use or name alternatives, the sibling list includes other x402 tools that might be more suitable for other tasks, so inference is possible.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rhinogent_identityAInspect

Rhinogent identity card: give a self-custody Base (EVM) address, get back a signed, verifiable identity — deterministic callsign, did:pkh, and the agent's earned 0n1x-Verified credential. Keys never leave the agent; this issues only the public signed identity. The front door of the agent identity wallet. (price: $0.00 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
addressYesThe agent's self-custody Base/EVM address, 0x-prefixed (42 chars).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description bears full burden. States it is a public signed identity issuance, mentions deterministic callsign, and notes 'price: $0.00 USDC, tier: free'. However, lacks disclosure on behavior for duplicate addresses, error handling, or idempotency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with core purpose, followed by details and pricing. Four concise sentences efficiently convey the tool's function and constraints without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but the description adequately explains the return value (signed identity with callsign, did:pkh, credential). Price and tier are included. For a simple single-parameter tool, context is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single 'address' parameter. The description reinforces the parameter's role ('give a self-custody Base address') but does not add new semantic information beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool's action: taking an EVM address and returning a signed identity with deterministic callsign, did:pkh, and verified credential. Distinguishes itself from siblings like 'rhinogent_mandate' by calling itself 'the front door of the agent identity wallet'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies the input (self-custody Base/EVM address) and the output (signed identity). Implicitly suggests use when needing a verifiable on-chain identity, but does not explicitly state when not to use or list alternative tools. Provides context that keys never leave the agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rhinogent_mandateAInspect

Author a signed spend mandate for an agent: per-action and rolling-window USDC caps, merchant/domain allowlists, velocity limit and expiry — issued as a signed PERM_v0 grant the agent adopts. 'Transacts autonomously, within the boundaries you define.' (price: $0.00 USDC, tier: free)

ParametersJSON Schema
NameRequiredDescriptionDefault
agentYesAgent the mandate authorizes (address or callsign).
purposeNoHuman-readable purpose of the mandate.
principalNoOptional: the human/entity granting authority.
expires_atNoUnix time the mandate lapses.
velocity_maxNoMax number of transactions per window.
daily_cap_usdcNoRolling 24h aggregate cap (USDC).
spend_max_usdcNoHard ceiling per single action (USDC).
allowed_actionsNoPermitted action verbs (deny by default).
allowed_domainsNoDomain allowlist (deny by default).
allowed_merchantsNoMerchant allowlist (deny by default).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavior. It states the outcome (signed PERM_v0 grant) and includes price/tier, but does not detail side effects, permissions needed, or whether the action is reversible. More behavioral context (e.g., on-chain vs off-chain) would be beneficial.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise: two sentences plus a tagline and price/tier. It front-loads the main action and key features. The tagline is slightly redundant but harmless. The price/tier info is useful but could be separated.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 10 parameters, no output schema, and no annotations, the description explains the purpose and constraints well but lacks details on the return format (e.g., what the signed grant looks like) and prerequisites (e.g., agent must exist). It covers the main concept but leaves gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds value by summarizing parameters into categories (e.g., 'rolling-window USDC caps' for daily_cap_usdc and spend_max_usdc, 'merchant/domain allowlists' for allowed_merchants/allowed_domains), providing a higher-level understanding beyond individual schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Author a signed spend mandate') and clearly identifies the resource (mandate for an agent). It lists key constraints like caps, allowlists, velocity, and expiry, distinguishing it from sibling tools like onyx_perm_grant or rhinogent_identity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage (define spending boundaries for an agent) but does not explicitly state when to use this tool versus alternatives or provide exclusions. The tagline 'Transacts autonomously, within the boundaries you define' gives context but lacks direct comparison to siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rhinogent_verify_counterpartyAInspect

Know-your-counterparty for agents. Before your agent pays a merchant, get Ed25519-signed raw facts about who it's paying: domain registration age, live TLS cert, reachability + off-domain redirects, brand-impersonation similarity, and observed price vs expected. Facts only (never 'legit/scam') — the one check every other agent wallet skips. (price: $0.25 USDC, tier: premium)

ParametersJSON Schema
NameRequiredDescriptionDefault
brandNoOptional brand the storefront claims to be (enables the impersonation-similarity observation).
domainYesCounterparty/storefront domain or URL the agent is about to pay.
expected_priceNoOptional price the agent expects to pay (enables the price-deviation observation).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description should fully disclose behavior. It correctly states the tool returns 'facts only (never 'legit/scam')' and mentions signed output, but does not discuss authentication requirements, rate limits, or any potential side effects, leaving some behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise, with two sentences that front-load the core purpose and list key facts. Every phrase adds value, including the pricing note, with no redundant or verbose language.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description adequately explains what the tool returns (a set of signed facts) and covers all three parameters. It could further elaborate on return format or interpretation, but the current level is sufficient for an agent to decide to call the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover 100% of parameters, so the baseline is 3. The description reinforces the meaning of 'brand' and 'expected_price' by linking them to observations (impersonation-similarity and price deviation), but adds no additional semantic information beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as providing Ed25519-signed raw facts about a counterparty before payment. It lists specific observations (domain registration age, TLS cert, reachability, etc.) and distinguishes itself by stating it's a check other wallets skip, differentiating from sibling tools like onyx_merchant_fact_check.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states to use this tool 'before your agent pays a merchant,' providing clear when-to-use context. However, it omits when-not-to-use guidance or explicit alternatives, relying on the implied uniqueness.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.