onyx-mcp
Server Details
Ed25519-signed ground-truth oracle — 63 paid x402 tools live on Base mainnet.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
- Repository
- dimitrilaouanis-tech/onyx-mcp
- GitHub Stars
- 0
- Server Listing
- onyx-paid-mcp
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.3/5 across 87 of 87 tools scored. Lowest: 3.7/5.
Each tool targets a distinct operation within its domain. For example, Base tools cover contract verification, DEX pairs, event logs, swap quotes, token risk, transaction decoding, explaining, and simulation — all clearly separate. Agent tools similarly differentiate identity, reputation, audit trail, budget tracking, and workflow. Even overlapping security tools (tx_guard vs tx_preflight) have distinct scopes.
All tools follow the consistent pattern 'onyx_<domain>_<action>' in snake_case (e.g., onyx_base_swap_quote, onyx_solana_token_metadata). There are no deviations or mixed conventions, making the tool set predictable and easy to navigate.
With 87 tools, the count is extremely high for a single server. While the tools are organized by domain, the sheer volume overburdens agents with selection complexity. Many tools are trivial utilities (URL parsing, hashing) that could be consolidated or omitted, making the set feel bloated.
The tool set covers a vast and coherent surface for x402 ecosystem operations: agent identity and reputation, blockchain interactions on Base and Solana, browser automation, security firewalls, domain/email utilities, and full x402 protocol support. There are no obvious gaps for the intended domain.
Available Tools
87 toolsonyx_agent_audit_trailAInspect
Full payment + action audit trail for any agent wallet on Base. Returns every USDC outflow with resolved x402 destination, tool name where known, timestamp, tx hash, cumulative spend, velocity, and behavioral risk flags. The audit log every agent operator needs — what has my agent actually been paying for and when. Powers compliance, ops review, anomaly detect. (price: $0.05 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max events to return (truncated newest-first). | |
| wallet | Yes | Agent wallet address on Base (0x... 20-byte hex). | |
| lookback_blocks | No | Block range to scan (Base is ~2s/block → 5000 blocks ≈ 2.8h, 10000 ≈ 5.5h, 50000 ≈ 28h). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description claims 'Returns every USDC outflow,' but the input schema includes a limit parameter (default 30, max 100) that truncates results, creating a potential contradiction. No annotations are provided, so the description carries full burden; it does not disclose the truncation behavior, rate limits, or whether the tool is read-only. The lack of transparency about limitations lowers the score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively concise at about 50 words, front-loading the main action and key outputs. It includes use cases and pricing, which adds value. A slightly tighter structure could remove marketing fluff, but it remains efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description lists the exact fields returned (destination, tool name, timestamp, tx hash, etc.), which partially compensates for the lack of an output schema. However, it does not specify the return format (e.g., array of objects) or how to interpret risk flags, leaving some ambiguity. Given the tool's complexity, a bit more structure would improve completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the schema already describes all three parameters (wallet, limit, lookback_blocks) with clear documentation. The description only adds context about the tool's overall output but does not enhance understanding of individual parameters beyond the schema. Thus, baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a full audit trail for any agent wallet on Base, listing specific data fields (USDC outflow, destination, tool name, etc.). The verb 'Returns' and resource 'audit trail' are unambiguous. It distinguishes itself from siblings like onyx_agent_budget_tracker by focusing on historical transaction records rather than budget management.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use the tool: for compliance, ops review, and anomaly detection. It says 'every agent operator needs' this audit log. However, it does not explicitly state when not to use it or mention alternative tools, which would improve guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_agent_budget_trackerAInspect
Per-wallet USDC spend tracker. Given a wallet address and direction (outflows / inflows / both), scans USDC Transfer events on Base + Sepolia and returns: total volume, settlement count, top recipients with cumulative spend, hourly histogram of recent activity, average ticket size. Free tier — extension of onyx_agent_id. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| direction | No | outflows = wallet as sender, inflows = wallet as recipient, both = aggregated. | outflows |
| wallet_address | Yes | 0x-prefixed EVM wallet to inspect. | |
| include_sepolia | No | Include Base Sepolia testnet activity. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It states it scans events (read-only) and returns specific data. However, it does not disclose limitations like historical depth, pagination, rate limits, or whether authentication is required. The 'Free tier' note adds context but omits behavioral details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise with two sentences: the first covers purpose and return data, the second adds price/tier info. No redundant words, and it's front-loaded with the essential action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description is fairly complete: it lists all return fields (volume, count, top recipients, histogram, ticket size) and covers key parameters. Missing details like pagination or historical range, but acceptable for a simple tracker.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description reinforces the parameter meanings (wallet address, direction enum, include_sepolia) but does not add new information beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it's a USDC spend tracker per wallet, scanning Transfer events on Base and Sepolia, and lists the specific return data (total volume, settlement count, top recipients, etc.). This distinguishes it from sibling tools that focus on token risk, transaction decoding, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage by stating 'Given a wallet address and direction', but lacks explicit guidance on when to use this tool versus alternatives like onyx_base_token_risk_scan or onyx_solana_wallet_activity. No when-not or alternative tool references are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_agent_idAInspect
Look up an agent (EVM wallet) and return a reputation card: x402-style USDC settlements in the last ~24h window (50k Base blocks), distinct recipients (paid-service operators), networks used, total volume, and 0-100 reputation score with reasoning. Reads Base + Base Sepolia public RPCs (no key). Free tier — useful for tools deciding rate limits, returning-customer discounts, or trust extension. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| wallet_address | Yes | 0x-prefixed EVM wallet address of the agent to look up. | |
| include_sepolia | No | Include Base Sepolia activity in the score (test traffic). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses key behaviors: reads public RPCs (Base + Base Sepolia), no API key needed, free tier, and returns a reputation card with specific metrics. It does not mention destructive actions, consistent with a read-only tool. Score slightly limited because it doesn't detail data freshness or caching behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the primary action, lists output components, and then provides technical details (RPCs, keyless) and use cases. Every sentence adds value; no wasted words. Length is appropriate for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return value (reputation card with USDC settlements, recipients, volume, score) and the source (Base RPCs). It also covers parameter semantics and use cases, making the tool fully understandable for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds minimal extra meaning: '0x-prefixed EVM wallet address' for wallet_address and 'Include Base Sepolia activity in the score (test traffic)' for include_sepolia, but these largely restate the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Look up an agent (EVM wallet) and return a reputation card'. It specifies key metrics (USDC settlements, recipients, volume, reputation score) and distinguishes itself from siblings like onyx_agent_audit_trail by focusing on reputation rather than history or budgets.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage context: 'useful for tools deciding rate limits, returning-customer discounts, or trust extension'. It implies when to use it but does not explicitly list when not to use or compare to alternatives, though the sibling list makes differentiation possible.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_agent_reputationAInspect
Vet another AI agent before you trust it — via the live ERC-8004 registries on Base. Give an agent's ERC-8004 id; get its on-chain identity (is it registered? owner), its verified receiving wallet, its AgentCard URI, and its reputation summary (feedback count + aggregate score) — Ed25519-signed. Onyx ATTESTS the on-chain facts; the TRUSTED / NEW / CAUTION / UNKNOWN rating + 0-100 score are Onyx's OPINION over those facts via a disclosed methodology (Moody's-style), not an objective ruling. The check an agent runs on a counterparty agent before paying, delegating, or accepting its output. Unregistered = unverifiable. (price: $0.25 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| agent_id | Yes | The agent's ERC-8004 identity id (the ERC-721 tokenId in the IdentityRegistry). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavioral traits: explains that on-chain facts are attested by Onyx and ratings are opinion (not objective), mentions Ed25519-signed output, and lists pricing ($0.25 USDC) and tier (metered). This is comprehensive and honest.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose and each sentence adds value: outputs, signing, opinion disclosure, usage context, unregistered warning, and pricing. No fluff or redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers all necessary context: input (agent_id), output (identity, wallet, AgentCard URI, reputation summary with rating and score), behavioral notes (signed, opinion), and usage guidance. Without an output schema, it fully explains the return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already describes agent_id as 'the ERC-8004 identity id (the ERC-721 tokenId in the IdentityRegistry).' The description adds no further detail beyond reinforcing the parameter's role, so it meets but does not exceed the baseline for 100% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Vet another AI agent before you trust it' via ERC-8004 registries. It lists specific outputs (on-chain identity, wallet, AgentCard URI, reputation summary) and distinguishes from siblings like onyx_agent_id by focusing on reputation assessment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use: 'before paying, delegating, or accepting its output.' It also notes 'Unregistered = unverifiable.' However, it does not mention alternatives or when not to use, missing a chance to differentiate from similar tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_agent_workflowAInspect
Run a multi-step workflow across Onyx tools in one paid call. Each step names a tool and its args; later steps can reference earlier outputs via {"$ref": "step_N.field"} or {"$prev": "field"}. Saves agents the round-trip + per-call gas of N separate x402 settles when they know the chain in advance — e.g. validate email → check domain DNS → solve captcha → submit form, all atomic. Stops on first step error and returns partial results. Cheaper than the unit-call sum because it bundles. (price: $0.020 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| steps | Yes | Ordered list of {tool, args} |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses key behaviors: stops on first error and returns partial results, cheaper than separate calls, atomic execution, pricing, and referencing syntax ($ref, $prev). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose, followed by syntax, benefits, example, error behavior, and pricing. Every sentence adds unique value, no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description covers input format, referencing, error handling, and pricing. It omits the success return structure, but given the complexity, it is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. The description adds value by explaining that steps can reference earlier outputs via $ref and $prev, and that it stops on first error, going beyond the schema's minimal description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('run'), resource ('multi-step workflow across Onyx tools'), and key feature ('in one paid call'). It distinguishes from sibling tools by emphasizing orchestration of multiple tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context for when to use: 'Saves agents the round-trip + per-call gas...when they know the chain in advance' and gives an example. Lacks explicit when-not-to-use or alternatives, but the guidance is specific.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_ai_visibilityAInspect
AI answer-engine visibility (GEO) oracle. Give a brand/product (+ optional category and competitors); get a SIGNED reading of how a live web-grounded answer engine represents it right now — presence, whether it's in the 'best ' recommendation set, sentiment, share-of-voice vs competitors, the cited sources driving the narrative, and a 0-100 visibility score. The new SEO, as one per-call x402 tool. Never fabricated. (price: $0.20 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| brand | Yes | Brand, product, company, or entity to measure (e.g. 'Onyx Protocol', 'Stripe'). | |
| category | No | Optional product category for the recommendation-set probe (e.g. 'AI agent payment rails', 'running shoes'). Drives the 'best <category>' query. | |
| competitors | No | Optional competitor names to compute share-of-voice against. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears the full burden. It describes the tool as a 'signed reading' and states 'Never fabricated', implying reliability. However, it does not disclose any limitations, rate limits, error handling, or whether the output is cached or real-time, which would be beneficial for a tool with no annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, consisting of two sentences plus parentheticals. It front-loads the purpose and key output details. The inclusion of price and tier is informative but slightly adds clutter; overall, it is well-structured with minimal waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description lists all expected output components (presence, recommendation set, sentiment, share-of-voice, cited sources, visibility score) adequately. It also explains the role of each input parameter. However, it could be more structured in describing the return format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with good descriptions. The tool description adds value by providing examples (e.g., 'Onyx Protocol', 'AI agent payment rails') and explaining how the category drives the 'best <category>' query, thus adding context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is an AI answer-engine visibility oracle, specifying the input (brand/product) and detailed outputs (presence, recommendation set, sentiment, share-of-voice, cited sources, visibility score). It distinguishes itself from siblings by calling it 'the new SEO' and 'one per-call x402 tool'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains what inputs are needed (brand, optional category, competitors) and what outputs are produced, but lacks explicit guidance on when to use this tool versus alternatives among the many sibling tools. It does not provide 'when not to use' or mention any prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_aml_screenAInspect
KYC/AML sanctions + risk screen for any EVM address. Returns OFAC sanctions hit (via Chainalysis on-chain oracle), 0-100 risk score, verdict (sanctioned/safe/caution/high_risk/blocked), and ranked risk_factors (address age, transaction throughput, contract status, mixer interaction). Designed for Permitted Payment Stablecoin Issuers, agent-payment platforms, and any compliance gate forced by the GENIUS Act (July 2026). Sub-second latency. Coinbase's PROJECT-IDEAS.md explicitly calls for this primitive. (price: $0.25 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| chain | No | Chain to screen on: 'base' (default) or 'ethereum'. Both query the same OFAC oracle; chain affects the risk-factor heuristics only. | base |
| address | Yes | 0x-prefixed EVM address to screen |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the data source (Chainalysis oracle), return fields (sanctions hit, risk score, verdict, risk factors), performance (sub-second latency), and pricing ($0.25). It does not cover error handling or rate limits, but for a read-only screen, this is adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (5 sentences) and front-loaded: first sentence states purpose, then lists returns, use cases, performance, and pricing. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 2-parameter tool with no output schema, the description covers purpose, return fields, use cases, performance, and pricing. It lacks error handling details but is otherwise sufficient for an agent to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters (chain and address). The description adds context about risk-factor heuristics but does not meaningfully extend parameter understanding beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs 'KYC/AML sanctions + risk screen for any EVM address', specifying the action (screen) and resource (EVM address). It distinguishes from siblings like token risk scans by explicitly targeting address-level compliance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description identifies target users (Payment Stablecoin Issuers, agent-payment platforms) and regulatory context (GENIUS Act, July 2026), providing clear usage context. However, it does not explicitly state when not to use or list alternatives, though no sibling tool offers similar functionality.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_approval_guardAInspect
Pre-approval firewall — the safety check before your agent signs a token approve(). Give the spender (and optionally the amount + token); get a SIGNED ALLOW/REVIEW/BLOCK verdict: is the amount unlimited (the #1 drain vector — we recommend a finite amount instead)? is the spender a plain EOA (almost always a drainer)? is it a verified, established contract? Catches the malicious/unlimited approval that empties a wallet BEFORE the agent signs it. Every verdict Ed25519-signed. The highest-frequency safety call an on-chain agent makes. (price: $0.10 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| token | No | Optional. The ERC-20 token address being approved, for context. | |
| spender | Yes | 0x address that will receive spend approval (the `spender` arg of approve()). Required. | |
| amount_raw | No | Optional. The raw approval amount as a base-10 integer string (the uint the agent is about to approve). Pass it to detect unlimited/max approvals. Omit to assess the spender only. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description fully discloses behavior: returns Ed25519-signed verdict (ALLOW/REVIEW/BLOCK), detects unlimited approvals, EOA spender, contract verification. Also mentions pricing and tier.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Descriptive and well-structured, but could be slightly more concise. Front-loaded with purpose and safety context. Earns its length by covering critical details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, yet description explains the return value format (signed verdict). Covers all necessary aspects for effective use, including cost and tier.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds significant meaning: explains spender as the recipient of approval, token as context, amount_raw to detect unlimited/max approvals. Provides rationale for each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb+resource: 'Pre-approval firewall' and 'safety check before signing approve()'. Distinguishes itself by focusing on token approval safety, not covered by siblings like onyx_tx_guard or onyx_signature_guard.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: before any approve() call. Describes what it checks (unlimited amount, EOA spender, verified contract). Lacks explicit alternatives but purpose is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_arb_finderAInspect
Price arbitrage between Onyx Actions and peer x402 services. For any capability (e.g. 'tx_explainer', 'captcha'), queries the full CDP discovery corpus, identifies matching peer endpoints, computes price delta vs the Onyx native tool, and produces a one-line competitive pitch ('Onyx is 50% cheaper than OATP at $0.05 vs $0.10'). Use for competitive intel, marketing copy, or pricing decisions. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | Optional network filter: 'base', 'solana', 'eip155:8453'. Empty = all. | |
| max_peers | No | ||
| capability | Yes | Capability keyword to compare. E.g. 'tx_explainer', 'token_risk', 'captcha', 'swap_quote'. | |
| onyx_price_usdc | No | Onyx's price for this capability. Used as the comparison anchor. Default 0 = treat Onyx as if-we-shipped-free. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully carries the burden. It explains the process (queries CDP corpus, computes delta, produces pitch) and includes pricing info. Lacks details on data freshness or error handling, but sufficient for most agents.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise, front-loaded with purpose, no filler. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description hints at return (one-line pitch). Parameter set is small and well-described in schema. Missing details on error cases or empty results, but overall satisfactory for the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 75%, so baseline is 3. Description provides example capability values but does not elaborate on parameter meaning beyond the schema. Adequate but not added value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly defines the tool's purpose: price arbitrage between Onyx Actions and peer x402 services, with specific examples like 'tx_explainer' and output format.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states use cases: competitive intel, marketing copy, or pricing decisions. Does not mention when not to use or alternative tools, but gives clear context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_attestation_verifyAInspect
Verify an Onyx-signed security verdict. Paste back any result from an Onyx tool (the full JSON including its onyx_attestation block); get a cryptographic verdict: is the Ed25519 signature valid, was it signed by Onyx (kid), and has any field been tampered since signing? FREE. Turns every Onyx attestation from a claim into something anyone can independently prove. Cross-check the kid against /.well-known/onyx-pubkey. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| payload | Yes | The full Onyx-signed result to verify, including its onyx_attestation block (exactly as returned by an Onyx tool). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description explains the cryptographic checks (Ed25519 signature, kid, tamper detection) and declares the tool as 'FREE' with price $0. It lacks details on error handling or rate limits, but covers core behavior well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: starts with the verb 'Verify', explains usage, then the outcome, and ends with supplementary info (free, cross-check). Every sentence is purposeful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool with no output schema, the description fully explains what the tool does, how to use it, what results to expect, and even provides a verification tip. It is contextually complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description reinforces the schema's parameter description and adds meaning by explaining what the verification yields (validity of signature, kid, tamper status). Since there is no output schema, this additional context is valuable.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool verifies Onyx-signed security verdicts, including cryptographic signature verification, key identification, and tamper detection. It distinguishes from siblings by being the sole verification tool for Onyx attestations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells users to 'paste back any result from an Onyx tool' and suggests cross-checking the kid against a public key. While it doesn't state when not to use or name alternatives, the context is clear that this tool follows attestation generation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_bridge_quoteAInspect
Cross-chain bridge quote starting from Base. Best-route across ~30 bridges (Across, Hop, Stargate, cBridge, Connext, Hyphen, Mayan, ...) via LI.FI aggregator. Returns toAmount, fee breakdown, gas cost, estimated bridge tool, approval address, ETA. Use when an agent on Base needs USDC/ETH/etc. on another chain. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| to_token | Yes | Destination token address on destination chain (0x...). | |
| from_token | Yes | Source token address on Base (0x...). Use 0x0000000000000000000000000000000000000000 for native ETH. | |
| from_amount | Yes | Atomic amount on source side (decimal string). E.g. 100 USDC = '100000000'. | |
| to_chain_id | Yes | Destination chain ID. 1=Ethereum, 10=Optimism, 42161=Arbitrum, 137=Polygon, 56=BSC, 43114=Avalanche, 250=Fantom, 8453=Base (same chain - use swap_quote instead). | |
| from_address | No | Optional: sender address for routes that need it. Default = 0x...0001 (LI.FI accepts any for quote). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It describes return values (toAmount, fee breakdown, gas cost, etc.) and mentions aggregator and bridges. However, it does not explicitly state that this is a read-only quote (no side effects) or disclose potential limitations (e.g., rate limits, authentication needs). Adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: two sentences plus pricing note. It front-loads the core purpose. The pricing information could be placed elsewhere, but does not detract significantly. Efficient use of words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains return values and the aggregator used. It covers the main purpose and usage scenario. Could mention that this is only a quote and not the actual bridge execution, but overall complete for a quote tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are already well-documented. The description adds no additional parameter-level semantics beyond what the schema provides. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides a cross-chain bridge quote starting from Base, using LI.FI aggregator across ~30 bridges. It distinguishes from sibling onyx_base_swap_quote by specifying 'starting from Base' and being a bridge quote, not a swap. The return values are enumerated, making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent on Base needs USDC/ETH/etc. on another chain.' This provides direct usage context. It does not explicitly state when NOT to use, but the sibling list includes onyx_base_swap_quote for same-chain swaps, implying the exclusion. Slightly lacking explicit alternative mention.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_contract_verifyAInspect
Contract verification + ABI metadata for any Base address. Returns is_verified, contract name, compiler version, language, optimization, ABI entry count, license, source code size. Auto-detects EIP-1967/OZ/UUPS proxies and resolves to the implementation contract. Backed by Blockscout (free, no auth). Use before any swap or interaction — unverified contracts are an instant red flag. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | Contract address on Base mainnet (0x... 20-byte hex). | |
| resolve_proxy | No | If true and address is a proxy, also fetch the implementation contract's metadata. | |
| include_source | No | If true, return full source code (can be 10s of KB). Otherwise just the byte length. | |
| include_full_abi | No | If true, return the full ABI array (can be large). Otherwise just the entry count + function/event name list. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It reveals important behaviors: auto-detects proxy types, resolves to implementation, backed by Blockscout (free, no auth), and pricing tier. It does not discuss rate limits or error handling, but core behavioral traits are covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each earning its place: first states output, second explains backend and proxy feature, third gives usage advice and cost. No fluff, front-loaded with purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, usage, features, pricing, and lists return fields despite no output schema. It omits error handling and doesn't detail the ABI entry count vs full ABI distinction, but overall completeness is high for a verification tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds value beyond schema by explaining proxy detection relevance (resolve_proxy), warning about source code size (include_source), and mentioning that include_full_abi can be large. This context aids parameter selection.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Contract verification + ABI metadata for any Base address.' It lists specific return fields and features like proxy detection, distinguishing it from sibling tools that handle trading, risk, or transactions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises using this tool 'before any swap or interaction' and highlights that unverified contracts are a red flag. It provides context on free access and pricing but does not explicitly mention when not to use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_dex_pair_lookupAInspect
Every DEX pair for a Base token: DEX name, price USD, 24h volume, liquidity USD, price-change percentages (5m/1h/6h/24h), pool fees. Sorted by liquidity. Backed by DexScreener (free). Use to find where a token is actually trading before routing a swap or assessing rug risk via volume/liquidity ratios. (price: $0.0015 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max pairs to return (sorted by liquidity desc). | |
| token_address | Yes | Token contract address on Base mainnet (0x...). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the data source (DexScreener, free), sorting, and exact fields returned, implying a read-only operation. It lacks explicit mention of rate limits or auth, but the tone and content are sufficient for typical usage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (four sentences), front-loaded with the core functionality and use cases. Every sentence adds value, including the pricing note which is non-essential but informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains return values and sorting. It lacks details on error handling or timeout behavior, but for a simple lookup tool this is acceptable and matches the complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The tool description does not add additional parameter semantics beyond what is already in the schema (token_address and limit descriptions are self-explanatory). No extra value from description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it retrieves DEX pairs for a Base token with specific fields (DEX name, price, volume, liquidity, price change, fees) sorted by liquidity. It distinguishes from siblings like onyx_base_swap_quote (for quotes) and onyx_base_token_risk_scan (for risk) by focusing on trading venue discovery.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises using this tool before routing a swap or assessing rug risk via volume/liquidity ratios, providing clear context. It does not, however, name specific alternatives or state when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_event_logsAInspect
Fetch contract event logs from Base mainnet via eth_getLogs. Returns structured logs with topics, raw data, block+tx info, plus optional event-signature decode for common ERC-20/721/1155 events (Transfer, Approval, OwnershipTransferred). Supports block range filter (default last 100 blocks) and topic-0 filter for narrowing to specific events. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Max log entries to return. | |
| topic0 | No | Optional event signature hash (32-byte hex) to filter on. E.g. Transfer = 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef. | |
| address | Yes | Contract address (0x... 20-byte hex) to fetch logs for. | |
| to_block | No | End block. Default = 'latest'. | |
| from_block | No | Start block: hex ('0x12345'), decimal ('1234567'), or 'latest'. Default = latest - 100. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description carries the burden. It discloses the read nature (fetch), return structure (logs with topics, raw data, block+tx info), and optional decoding for common ERC events. It also mentions pricing and tier. It does not mention rate limits or authentication, but overall is transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences plus a pricing note. It is front-loaded with the main action and provides essential details without fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains return structure and supported features. It could mention pagination (limit parameter) more explicitly, but the schema covers it. Overall, it is complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 5 parameters. The description adds value by explaining the default block range (last 100 blocks) and that topic-0 filters for specific events, and that decoding applies to common ERC-20/721/1155 events.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fetches contract event logs from Base mainnet using eth_getLogs, with specific features like block range and topic-0 filtering. It distinguishes itself from sibling tools as the only event log retrieval tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear guidance on when to use the tool (fetching event logs) and mentions features like block range and topic-0 filtering. However, it does not explicitly state when not to use it or compare to alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_swap_quoteAInspect
Best-route swap quote on Base across all major DEXes (Uniswap V2/V3, Aerodrome, BaseSwap, PancakeSwap, plus ~12 others) via KyberSwap aggregator. Returns amountOut, USD value, gas estimate, route hops, price impact. Same role as Jupiter on Solana — most agents need this before any on-chain swap. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| token_in | Yes | Address of input token (0x... on Base mainnet). Use 0x4200000000000000000000000000000000000006 for WETH. | |
| amount_in | Yes | Atomic input amount (decimal string). E.g. 1 ETH = '1000000000000000000'; 100 USDC = '100000000'. | |
| token_out | Yes | Address of output token (0x... on Base mainnet). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description must bear the burden. It mentions returning specific data (amountOut, etc.) and implicitly indicates read-only behavior via 'quote'. Could explicitly state no state changes, but the listed outputs and context suffice.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a pricing note. Front-loaded with core function and DEX list. Every sentence adds value; no filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description lists key return fields (amountOut, USD value, gas estimate, etc.), covers inputs, and notes pricing. Complete for its complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions. The tool description adds value with examples (WETH address, atomic amount format) and clarifies that parameters are tokens. This goes beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it provides best-route swap quotes on Base across multiple DEXes, listing expected outputs. It distinguishes itself from the sibling onyx_solana_jupiter_quote by specifying 'on Base' and referencing Jupiter's Solana role.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'most agents need this before any on-chain swap', indicating when to use. Does not specify when not to use or list alternatives (e.g., onyx_base_bridge_quote for bridging), but the context is clear for a swap quote tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_token_risk_scanAInspect
Risk-scan any ERC-20 token on Base mainnet. Returns ownership status (renounced or active owner address), mint authority (still mintable?), top-1 / top-10 holder concentration via balanceOf probes, contract age in days, basic honeypot signal (eth_call swapExactETHForTokens against Aerodrome to detect transfer blocks), and a 0-100 risk score with verdict (safe / caution / high_risk). Use before a trading agent buys a freshly minted token — saves blowing the entire position on a rug. Direct equivalent of OATP's Solana token_risk_scan ($0.50, 980 unique paying agents). Onyx ships at $0.25 — first on Base mainnet. (price: $0.25 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | 0x-prefixed ERC-20 contract address on Base |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description details the checks performed (ownership, mint, concentration, age, honeypot via eth_call, risk score). It also mentions price and tier. Some limitations (e.g., only Base ERC-20) are implied but not exhaustive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a dense paragraph containing multiple pieces of information. While not overly verbose, it could be structured more concisely (e.g., bullet points) for faster agent parsing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description adequately lists return values (ownership, mint, concentration, age, honeypot, risk score). It covers the tool's purpose and major capabilities, though exact output format is unspecified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description of the 'address' parameter (0x-prefixed, ERC-20, Base). The description adds no further parameter-specific details beyond the schema, so baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it risk-scans any ERC-20 token on Base mainnet, listing specific checks (ownership, mint, holder concentration, contract age, honeypot, risk score). It is distinct from all sibling tools, none of which perform risk scanning.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises use 'before a trading agent buys a freshly minted token' and mentions it saves losses. No explicit when-not-to-use or alternative tools, but the context is clear and the tool stands alone among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_decodeAInspect
Fetch a Base mainnet transaction by hash and return a human-readable summary: from/to, value (ETH + USD-est), gas used, status, block, input data length, and the function selector decoded if it matches a known signature. Use when a trading agent needs to inspect a tx before or after settlement — pairs with onyx_token_metadata for full context. Reads from Base's public RPC (no key needed). Demo mode returns a synthetic record. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| tx_hash | Yes | 0x-prefixed Base tx hash |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description fully covers behavioral traits: it reads from Base's public RPC (no key needed), mentions demo mode returns synthetic record, and discloses pricing tier. This is comprehensive for a read-only tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences plus a brief note on demo and pricing. It is front-loaded with the core purpose, efficient with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input (single hash), no output schema, and no annotations, the description fully explains what the tool returns (human-readable summary with list of fields) and additional context (RPC source, demo mode, pricing). It is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has one parameter with a description '0x-prefixed Base tx hash,' and the description text mentions 'tx_hash' but adds no new semantics. Schema coverage is 100%, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches a Base mainnet transaction and returns a human-readable summary with specific fields (from/to, value, gas, status, etc.). It distinguishes itself from siblings by noting it pairs with onyx_token_metadata for full context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use when a trading agent needs to inspect a tx before or after settlement' and mentions pairing with onyx_token_metadata, providing clear context. It lacks explicit when-not-to-use but is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_explainerAInspect
Decode a Base mainnet transaction into a human-readable summary. Returns a one-line plain-English description of what happened (token transfers, swaps, approvals, contract deploys), ERC-20 transfer events with symbol resolution, balance changes per address, the function selector + decoded method name where it matches a known signature, gas used, fee in ETH, and tx status. Use when a trading agent needs to verify what a tx actually did before/after settlement, or when a wallet agent needs to explain a tx to its user. Direct equivalent of OATP's Solana tx_explainer ($0.10, 1,350 unique paying agents) — Onyx is the first to ship this on Base mainnet at $0.05. (price: $0.05 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| tx_hash | Yes | 0x-prefixed Base tx hash |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description carries full burden. It discloses behavioral traits: returns plain-English summary, token events, balance changes, function signature, gas, fee, status. Also notes price and tier. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is four sentences, front-loaded with core purpose, then enumerates outputs, use cases, and a comparison. Each sentence is informative, though slightly verbose; still concise overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description thoroughly covers input, return types, and use cases. It explains all relevant aspects for an agent to decide and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Single parameter tx_hash with schema coverage 100%. Description adds context 'Base mainnet transaction' and indicates what output to expect, but does not add significant semantics beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it decodes a Base mainnet transaction into a human-readable summary, listing specific outputs like ERC-20 transfers, function selector, gas, and status. It distinguishes from sibling tools by focusing on human-readable explanation versus raw decoding.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: verifying transactions for trading agents and explaining transactions to users. Does not mention when not to use or differentiate from onyx_base_tx_decode, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_simulatorAInspect
Simulate a Base mainnet transaction before sending it. Returns success/revert prediction, the revert reason if any, decoded return data, and an estimated gas figure. Use as a pre-flight check inside a trading agent's tool-call dispatcher — agents should simulate before signing to avoid paying gas on a doomed tx. Direct equivalent of OATP's Solana tx_simulator ($0.20, 1,304 unique paying agents) — Onyx is the first to ship this on Base mainnet at $0.10. Read-only — never submits. (price: $0.10 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Hex-encoded calldata (default 0x) | |
| block | No | block tag (latest/pending/0x...) default latest | |
| value_wei | No | ETH wei to send (default 0) | |
| to_address | Yes | 0x-prefixed contract or wallet | |
| from_address | No | 0x-prefixed sender |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses that the tool is read-only and never submits a transaction. It also details the return values (success prediction, revert reason, decoded data, estimated gas) and mentions pricing, ensuring full behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences front-load the purpose, returns, and usage context, followed by a comparison and pricing note. Every sentence adds value, with no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters, no output schema, and no annotations, the description covers purpose, return values, usage guidance, and safety (read-only). It is complete and leaves no major gaps for an agent to understand what the tool does and when to use it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema coverage is 100%, so baseline is 3. The description does not add specific parameter meanings beyond the schema, but it does explain overall behavior and return types. No additional parameter-level elaboration is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool simulates a Base mainnet transaction before sending, specifies it returns success/revert prediction, revert reason, decoded return data, and estimated gas, and distinguishes it as a pre-flight check for trading agents, comparing it to a Solana counterpart.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using the tool before signing to avoid paying gas on failed transactions, providing clear context for when to use it. Does not explicitly mention when not to use or alternatives, but the guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_bazaar_blue_oceanAInspect
Find empty niches in the x402 paid-MCP market. Reads CDP discovery (1000+ live services), clusters by keyword, surfaces categories with 0-1 services. Use to position a new paid tool in an uncontested slot. Returns: empty_niches (no services), thin_niches (1-2 services), saturated (5+ services to avoid), plus a recommended build target. (price: $0.01 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | Filter CDP corpus by network. 'base' counts both eip155:8453 and 'base' string variants. | all |
| max_niches | No | ||
| seed_keywords | No | Optional list of candidate niche keywords to check. Empty = auto-mine from CDP data. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description carries full burden. It discloses that the tool reads CDP data (non-destructive), returns specific structures (empty_niches, thin_niches, saturated, recommended build target), and mentions pricing ($0.01 USDC, tier metered). It does not cover error conditions or rate limits, but key behavioral aspects are transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, each serving a distinct purpose: purpose, usage guidance, return structure. It is front-loaded with the action and resource, contains no redundant or extraneous text, and is highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has moderate complexity with 3 parameters and no output schema. The description covers purpose, usage, return details, and pricing. It lacks information on error handling or edge cases, but for a read-only market analysis tool, the provided context is largely sufficient for an agent to select and invoke it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 67% (two of three parameters described). The description adds meaning: for 'network' it clarifies variant handling, and for 'seed_keywords' it explains the auto-mine fallback. 'max_niches' lacks explicit description, but the tool description implies its role in limiting output. Overall, the description enhances schema information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Find empty niches in the x402 paid-MCP market.' It specifies the action (find), resource (empty niches), and methodology (reads CDP discovery, clusters by keyword). This clearly distinguishes it from sibling tools like onyx_bazaar_compare or onyx_bazaar_submit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises when to use: 'Use to position a new paid tool in an uncontested slot.' It provides clear context for usage but does not mention when not to use or list alternative tools, though the purpose is self-contained.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_bazaar_compareAInspect
Side-by-side comparison of paid agent tools across the x402 ecosystem. Filter by keyword (e.g. 'captcha', 'tx_explainer', 'aml', 'browser', 'oauth') and network ('Base' / 'Solana' / etc.), rank by price, 30-day call volume, or unique payer count, and get cheapest/most-used picks. Reads Coinbase Bazaar via the public Onyx mirror — refreshed every 15 minutes. Free tier. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| query | Yes | Keyword to match in domain, resource URL, or description (case-insensitive). Empty string = no filter. | |
| network | No | Filter by network: 'Base', 'Solana', 'Polygon', etc. Omit for all networks. | |
| sort_by | No | Ranking order. price_asc = cheapest first; volume_desc = most-called first; payers_desc = most unique buyers; freshness_desc = most-recently-called first. | volume_desc |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses the data source (Coinbase Bazaar via Onyx mirror), refresh frequency (15 minutes), and pricing (free). With no annotations, this is sufficient transparency for a read-only tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (two sentences plus a note) and front-loaded with purpose. Every sentence adds value, with no redundancy or wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a comparison tool without an output schema, the description covers inputs, filtering, sorting, and data source. It lacks details on output format but suffices for an agent to understand the tool's behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds examples and context for parameters beyond the schema (e.g., keyword examples like 'captcha', 'tx_explainer'). Schema coverage is high, but the description enriches understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states this tool performs a side-by-side comparison of paid agent tools, with filtering and ranking. It distinguishes from sibling bazaar tools like 'onyx_bazaar_blue_ocean' and 'onyx_bazaar_submit'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit instructions on how to filter by keyword and network, and rank by price, volume, or payer count. It mentions data refresh and free tier, though it does not explicitly state when to use alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_bazaar_submitAInspect
Indexability audit for any x402 service. Inspects the published surface (/openapi.json, /.well-known/x402.json, /manifest) against what Coinbase Bazaar's discovery crawler looks for, and returns a structured checklist: what's present, what's missing, what to fix. Crawler is poll-based — this tool documents the criteria, doesn't force-submit. Free tier. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| server_url | Yes | Base URL of the x402 service to audit (e.g. https://onyx-actions.onrender.com). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully convey behavior. It mentions 'Free tier' and price, and states it does not force-submit, implying no destructive action. It could be more explicit about being read-only, but overall it adequately describes the tool's non-destructive, informational nature.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, containing only essential information. It front-loads the purpose, then details the inspection targets, output, and pricing. Every sentence adds value, and there is no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given simple input (one string) and no output schema, the description is complete enough. It explains input, output type (structured checklist), and behavioral nuance. Minor gaps like error handling or response format details are acceptable for a diagnostic tool of this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The only parameter, server_url, has a schema description that already explains it. The tool description adds context about the tool's purpose but does not provide additional syntactic or semantic guidance for the parameter beyond what the schema offers. With 100% schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Indexability audit for any x402 service.' It specifies the resources inspected (openapi.json, .well-known/x402.json, /manifest) and the output (structured checklist). This distinguishes it from siblings like onyx_bazaar_compare by focusing on audit vs. comparison.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description clarifies that the tool is poll-based and 'doesn't force-submit', indicating it's for diagnostic use. However, it does not explicitly list alternative tools or scenarios where one might use this over others, but the context of 'audit' vs. 'compare' is implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_clickAInspect
Click the first visible button or link whose text matches the query (case-insensitive substring match). Returns whether a match was found and the matched element's text + href. Use after onyx_browser_extract to act on what the page advertised. Demo mode returns synthetic OK. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Substring of the element's visible text |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses substring matching, visibility preference, return data (match status, text, href), and demo mode behavior. Lacks details on scroll/waits/state modification, but sufficient for a simple click. No annotations to contradict.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a pricing note; every sentence carries essential information. Front-loaded with the core action and matching strategy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all necessary behavioral aspects: what it clicks, matching rule, return values, and demo mode. No output schema needed; description fully equips an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has one parameter with a basic description. The tool description adds matching semantics (case-insensitive substring, visible elements) beyond the schema, enriching parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool clicks the first visible button or link matching a query via case-insensitive substring match. Distinguishes itself from sibling browser tools by being the dedicated click action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly recommends using after onyx_browser_extract, providing clear context for when to invoke. No when-not-to-use guidance, but the single parameter and sibling set make it unambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_evalAInspect
Evaluate a JavaScript expression on the current CDP-controlled Chrome page and return the result by value. Use as a power-tool when the specific click/extract/type tools don't fit — pull deeply nested DOM data, dispatch synthetic events, read computed styles. Demo mode echoes the expression length without executing. (price: $0.004 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| expression | Yes | JavaScript expression. Last value is returned. | |
| await_promise | No | If the expression returns a Promise, wait for it. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description handles behavioral disclosure. It explains it returns by value and notes demo mode echoes expression length. However, doesn't warn that evaluation can cause side effects (e.g., DOM mutation), which is a minor gap for a power tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. First sentence states core purpose, second provides usage guidance and additional info (demo mode, pricing). Well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description mentions return 'by value'. Lacks details on error handling, return type structure, or limitations. However, for a power tool with good usage guidance, it's largely adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage of both parameters. The description adds 'Last value is returned' for expression, which is already in the schema description. No extra context for await_promise. Baseline 3 is appropriate as schema already provides meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it evaluates JavaScript on a Chrome page and returns the result by value. Also positions it as a power-tool for cases where simpler tools (click/extract/type) don't fit, distinguishing it from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: when specific tools don't fit, with examples like nested DOM, synthetic events, computed styles. Also mentions demo mode behavior. This provides clear context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_extractAInspect
Read the current CDP-controlled Chrome page and return the visible text content plus a structured summary of clickable elements: buttons, links (with hrefs), inputs (with names/placeholders/types). Use when an agent needs to plan its next action — list what's on the page without screenshotting + vision-modeling. Cheap, structured, deterministic. Demo mode returns a plausible synthetic page summary. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| max_chars | No | Cap on returned text length (max 80000) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description carries full burden. It discloses read-only nature, structured output, cheap/deterministic behavior, and demo mode behavior. Lacks rate limit or loading state info but sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, each sentence adds value. Starts with action, details, usage, extra info. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one parameter and no output schema, description covers return content (text + clickable elements with attributes), usage context, and demo mode. Missing explicit output format but is specific enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a single parameter. Description does not add meaning beyond the schema's own description. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it reads the page and returns visible text plus structured clickable elements. Distinguishes from sibling browser tools like click, navigate, screenshot.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent needs to plan its next action' and contrasts with screenshot+vision-modeling. Could be clearer on when not to use, but provides strong context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_screenshotAInspect
Capture a PNG screenshot of the current CDP-controlled Chrome page and return it as base64. Use to feed a vision-LLM (Claude / GPT-4V) for screen-understanding agents, or to archive an action's visual result. Returns also the page title, URL, and viewport dimensions. Cap of 1MB returned. Demo mode returns a synthetic 1×1 PNG; self-host with ONYX_CDP_URL for real captures. (price: $0.008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | png | |
| full_page | No | Capture full scrollable page or just viewport |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses return contents (title, URL, viewport), file size cap (1MB), demo mode behavior (synthetic 1×1 PNG), and pricing. Lacks details on error handling but sufficient for most use.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Efficiently packs purpose, use cases, return info, constraints, and pricing into a few sentences. Front-loaded with core action. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description lists return fields (title, URL, viewport) and the 1MB cap. Covers essential context for a screenshot tool. Missing only minor details like error scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 2 params with 50% description coverage. Description mentions 'PNG' (default format) but does not clarify the JPEG option or the format parameter in detail. Adds little beyond schema for full_page (already described in schema). Averages to baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Capture a PNG screenshot of the current CDP-controlled Chrome page and return it as base64', specifying the verb (capture), resource (screenshot), and format. Distinguishes from sibling browser tools like navigate, click, type, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: 'Use to feed a vision-LLM... or to archive an action's visual result'. Also gives context on demo vs real capture with ONYX_CDP_URL. Does not list when not to use, but sibling tools cover different actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_typeAInspect
Find an input/textarea/select on the current CDP page by its name, id, or visible label, set its value via the React-safe native setter, and fire input + change events so frameworks like React/Vue see the update. Use after onyx_browser_navigate when an agent fills a form. Returns the field selector matched and the final value. Demo mode returns synthetic OK. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Text to enter | |
| selector | Yes | Name, id (#foo), CSS selector, or visible label substring |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the React-safe setter, event firing for frameworks, return values (field selector and final value), and demo mode behavior. It does not cover potential error states or edge cases like missing fields, but is otherwise transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (3-4 sentences) with no wasted words. It front-loads the core action and immediately follows with usage context and return information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description adequately covers what the tool does, when to use it, and what it returns. It lacks information about error handling or behavior when the element is not found, but given the well-covered parameters and clear purpose, it is mostly complete for a form-filling tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description adds value by explaining that 'selector' can be a name, id (#foo), CSS selector, or visible label substring, which goes beyond the schema's concise descriptions. It also clarifies that 'value' is the text to enter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly and specifically states the tool's purpose: finding an input/textarea/select by name, id, or visible label, setting its value via a React-safe native setter, and firing input+change events. It distinguishes itself from siblings like onyx_browser_click and onyx_browser_navigate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use after onyx_browser_navigate when an agent fills a form,' providing clear context for when to use it. It lacks exclusion criteria or alternatives but offers actionable guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_capability_bundleAInspect
Bundle 3-5 Onyx tools into one paid call at a discount. Atomic delivery (all-or-none), one AR-1 receipt for the whole chain, single x402 settlement vs N separate. Predefined bundles for proven workflows: safety_check_base (verify + risk_scan + audit), tx_full_inspect (explainer + simulator + decode), swap_prep (risk + dex_pair + quote), cross_chain (swap + bridge + chain_picker), agent_kyc (id + kya + oai). (price: $0.02 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| args | Yes | Shared input args for all tools in the bundle. Required keys per bundle: safety_check_base=[address], tx_full_inspect=[tx_hash], swap_prep=[token_in,token_out,amount_in], cross_chain=[token_in,token_out,amount_in,to_chain_id], agent_kyc=[wallet_or_did]. | |
| bundle | Yes | Which predefined bundle to execute. | |
| stop_on_error | No | If true, halts on first tool error and returns partial results. If false, continues all tools. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries a heavy burden. It discloses atomic delivery (all-or-none), receipt behavior, pricing, and bundle composition. It does not detail error handling beyond the stop_on_error parameter or auth requirements, but provides significant behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a list of bundles with constituent tools. Front-loaded with purpose and key benefits. No wasted words, efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description does not explain what the agent receives after execution. While it mentions 'one AR-1 receipt for the whole chain', the output structure is unclear. Adequate for basic use but incomplete for a tool with complex output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. The description adds value by listing required args keys per bundle (e.g., safety_check_base=[address]), which is more concise than the schema's generic args description. This helps the agent understand parameter dependencies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it bundles 3-5 Onyx tools into one paid call at a discount, listing specific predefined workflows. This distinguishes it from individual sibling tools like onyx_base_token_risk_scan or onyx_base_tx_explainer.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when running a set of related tools together to save cost, and contrasts with individual calls via 'single x402 settlement vs N separate'. No explicit 'when not to use', but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_contract_auditAInspect
Full smart-contract security audit for any Base address — source + DEPLOYED reality + AI, SIGNED. Fetches verified source, runs curated static vuln detectors (tx.origin auth, delegatecall, selfdestruct, unchecked calls, unprotected init, owner mint/pause/blacklist, mutable fees), AND flags the live on-chain risks a static audit misses — upgradeable proxies (owner can swap logic post-audit) and self-destructed contracts. Optional Claude deep-pass for novel bugs. Returns ALLOW/REVIEW/BLOCK + 0-100 risk score, every finding Ed25519-signed. Cheaper than a manual audit, and unlike one it audits the contract as actually deployed. (price: $0.50 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| deep | No | Run the optional AI deep-pass for novel/business-logic bugs (only fires if the server has an AI key configured; degrades gracefully otherwise). | |
| address | Yes | Contract address on Base mainnet (0x... 20-byte hex). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It thoroughly explains the tool's behavior: fetching verified source, running static vulnerability detectors (listing examples), flagging live on-chain risks, optional deep pass, output format (ALLOW/REVIEW/BLOCK + score), and signing of findings. It also discloses pricing and tier. This is highly transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively long (4-5 sentences) but packs in essential information: purpose, process, features, output, and pricing. It is front-loaded with the main purpose. Every sentence adds value, though it could be slightly more concise by grouping related details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the input (address, optional deep), process (static analysis, live risk flags), output (risk score and findings signed), and pricing. No output schema exists, but the description adequately explains the return format. For a complex security audit tool, this is fairly complete, though more detail on the exact structure of findings could be useful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds value by explaining the overall audit process but does not significantly enhance parameter semantics beyond what the schema already provides. The description for the 'deep' parameter is essentially repeated from the schema. No additional constraints or formatting details are added.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs a full smart-contract security audit for any Base address. It uses specific verbs (Fetches, runs, flags, returns) and distinguishes from siblings by focusing on audit rather than verification or other contract-related tools. The description explicitly mentions what the audit covers (static detectors, live on-chain risks, optional deep pass).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides usage context by stating the audit is 'cheaper than a manual audit' and 'audits the contract as actually deployed.' It implies when to use this tool (for security auditing) but does not explicitly mention when not to use it or provide alternatives like onyx_base_contract_verify. However, the context is clear enough for an agent to decide.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_dns_lookupAInspect
Resolve a domain to its A/AAAA records, or reverse-resolve an IP to its hostname. Useful for validating a domain exists before scraping, checking if two domains share infrastructure, mapping CDN origins, or doing safety lookups before agents call third-party APIs. Returns IPv4, IPv6, canonical hostname, and resolution time. Powered by stdlib so results are whatever the host's DNS resolver returns — typically 20-100ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| host | Yes | Domain (example.com) or IP address |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that results come from stdlib's DNS resolver with 20-100ms latency and includes pricing (USDC cost and metered tier). It does not detail error handling (e.g., NXDOMAIN) or TTL behavior, but the disclosed traits are sufficient for typical use.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences: core action, use cases, and technical details. Every sentence adds value without redundancy. It is front-loaded with the essential purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter and no output schema, the description adequately explains the return information (IPv4, IPv6, hostname, resolution time) and the backend (stdlib). This is complete for a straightforward DNS lookup tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single host parameter is fully described in the schema (domain or IP). The description adds the critical semantic that the tool behaves differently based on input type—forward or reverse resolution—which the schema's 'description' field alone does not convey.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool resolves a domain to A/AAAA records or reverse-resolves an IP to its hostname. The specific verb-resource pair and dual functionality distinguish it from numerous sibling tools like whois, url_parse, or ip_geolocate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists concrete use cases (validating domain existence, infrastructure sharing, CDN mapping, safety lookups) that help an agent decide when to use this tool. It does not explicitly exclude alternatives or state when not to use it, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_email_validateAInspect
Validate an email address: RFC-5322 syntax check, domain DNS resolution (does the domain exist?), and disposable-provider detection (Mailinator, 10minutemail, GuerrillaMail, etc.). Returns a single confidence verdict plus the underlying signals so agents can decide whether to send. Use before mailing list signups, password-reset flows, or sales-lead capture to filter out trash addresses cheaply. ~30-80ms typical. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| Yes | Address to validate |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description thoroughly discloses all behavioral traits: three validation steps, typical latency (30-80ms), cost ($0.002 USDC), tier (metered), and return type (confidence verdict plus signals). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, front-loaded with purpose, followed by details, usage guidance, and performance/cost. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter, no output schema, and low complexity, the description covers all needed information: what it does, when to use, performance, cost, and return type. Complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (one parameter described as 'Address to validate'). The description adds context about validation steps but does not enhance parameter meaning beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool validates email addresses with three specific checks (RFC-5322 syntax, DNS resolution, disposable-provider detection) and returns a confidence verdict and signals. It stands out among siblings like whois or DNS lookup by focusing specifically on email validation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly suggests using before mailing list signups, password-reset flows, or sales-lead capture. While it doesn't list alternatives or when not to use, the context is clear and actionable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_ens_resolveAInspect
Resolve an ENS name to its current Ethereum mainnet address (or vice versa). Returns the canonical address, avatar URL if set, and the resolver contract that returned it. Use when an agent encounters a human-readable name like 'vitalik.eth' and needs to send funds or validate identity. Reads via the public ensideas API (no key, no rate-limit pain for typical agent traffic). ~200-500ms. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | ENS name (foo.eth) or 0x-address for reverse |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It discloses the API used (public ensideas), no key required, typical latency (200-500ms), cost ($0.002 USDC, tier metered), and that it returns canonical address, avatar, and resolver. It does not describe error cases or edge conditions, but given the simplicity of a lookup, it is sufficiently transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (two sentences plus parenthetical details) and front-loaded with the main purpose. Every sentence adds value: the first sentence states the action and returns, the second gives use case and API characteristics, and the parenthetical includes latency and cost without clutter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (one parameter, no output schema), the description covers purpose, usage context, return values, API source, latency, and cost. It does not explain possible errors or the reverse resolution format, but it is adequate for most agent usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The tool description does not add significant meaning beyond the schema; it only restates that the parameter accepts an ENS name or address. The schema already documents the parameter well.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'resolve', the resource 'ENS name', and the bidirectional capability (to address or vice versa). It specifies return values (canonical address, avatar URL, resolver). There are no sibling tools with similar functionality, so it is well-distinguished.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'when an agent encounters a human-readable name like vitalik.eth and needs to send funds or validate identity.' It also provides context on the API source and rate limits. However, it does not mention when not to use or alternative tools, though none exist in the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_facilitator_healthAInspect
Live up/down + latency probe for all 5 known x402 facilitators (Coinbase CDP, x402.org public, xpay.sh, cronos-x402, faremeter). Returns per-facilitator status, response time, capabilities probe, plus ecosystem-wide health score. Free tier — agents shouldn't hardcode a single facilitator; this lets them pick a live one. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| timeout_seconds | No | Per-facilitator HTTP timeout. Max 15. | |
| include_capabilities | No | Also probe each facilitator's /supported endpoint to enumerate networks + schemes. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must convey behavioral traits. It describes the tool as a non-destructive probe (up/down, latency, capabilities) and mentions 'Free tier', but does not disclose potential side effects, authentication needs, or rate limits. The transparency is adequate but not exhaustive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: it front-loads the primary action, lists facilitators, describes returns, provides usage guidance, and notes the cost. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (health check with 2 parameters) and no output schema, the description adequately covers what the tool does and what it returns (status, latency, capabilities, health score). It also includes usage context, making it sufficiently complete for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Both parameters (timeout_seconds and include_capabilities) are fully described in the input schema with 100% coverage. The description adds no significant meaning beyond the schema, only mentioning 'capabilities probe' which is already implied. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a live up/down and latency probe for all 5 known x402 facilitators, lists the facilitators explicitly, and specifies the return values (per-facilitator status, response time, capabilities probe, ecosystem health score). This distinguishes it from sibling tools like onyx_x402_facilitators.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises agents not to hardcode a single facilitator and to use this tool to pick a live one, providing clear context for when to use it. It does not explicitly state when not to use or list alternative tools, but the guidance is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_fact_checkAInspect
Fact-check any claim by fetching real-time web evidence. Returns supporting sources, contradicting sources, a 0-100 confidence score, and a short summary. Use for prediction-market resolvers, news-fact agents, journalist-bot pipelines, or any agent that needs to verify a statement before acting on it. Sub-second latency, no API key on the caller side. Coinbase PROJECT-IDEAS.md explicitly calls for this primitive. (price: $0.05 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| claim | Yes | The factual statement to verify. E.g. 'The 2026 G20 summit will be hosted in Cape Town' or 'USDC supply on Base mainnet exceeds $5B'. | |
| max_sources | No | Maximum number of sources to return (1-15, default 8) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description fully discloses behavioral traits: real-time web evidence fetching, return of supporting and contradicting sources, a 0-100 confidence score, and a summary. It also reveals pricing ($0.05 USDC, tier premium) and performance characteristics (sub-second latency). No contradictions exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact at four sentences, each adding essential information. It front-loads the core function and output, then lists use cases, then performance/pricing. No redundant or verbose language.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description adequately specifies return values (sources, confidence score, summary). It also covers performance, pricing, and origin (Coinbase PROJECT-IDEAS.md), providing a complete picture for an agent to decide whether and how to invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema coverage is 100%, so baseline is 3. The description adds value by providing concrete examples for the 'claim' parameter (e.g., 'The 2026 G20 summit will be hosted in Cape Town') and clarifying the default for 'max_sources' (8). This context goes beyond the schema's descriptions, making parameter usage clearer.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the primary function: 'Fact-check any claim by fetching real-time web evidence.' It lists specific outputs (supporting sources, contradicting sources, confidence score, summary) and concrete use cases (prediction-market resolvers, news-fact agents). The tool is clearly distinguished from its siblings, which include browser automation and URL utilities, none of which offer fact-checking.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases: 'Use for prediction-market resolvers, news-fact agents, journalist-bot pipelines, or any agent that needs to verify a statement before acting on it.' It also mentions sub-second latency and no API key requirement, aiding deployment decisions. However, it does not explicitly state when not to use the tool, which would further improve guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_fx_convertAInspect
Convert between any two fiat currencies (USD, EUR, GBP, JPY, BRL, USDC-equivalent, 160+ ISO-4217 codes) at the current mid-market rate. Returns both the rate and the converted amount, plus the rate's last update timestamp. Use when an agent needs to price a service in another currency, normalize multi-currency invoices, or convert x402 USDC amounts to local fiat for human-readable receipts. Powered by open.er-api.com (free tier, no key). ~150-400ms. Demo mode returns USD-EUR @ 0.92 for testing. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | ISO-4217 target currency code | |
| from | Yes | ISO-4217 source currency code | |
| amount | No |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Since no annotations are provided, the description carries full burden. It discloses the data source (open.er-api.com, free tier), approximate latency (150-400ms), demo mode behavior (returns USD-EUR @ 0.92), and pricing ($0.002 USDC, metered tier). This provides adequate transparency without contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that front-loads core functionality, then provides use cases, technical details, and pricing. Every sentence adds value: currency list, rate details, use cases, data source, latency, demo mode, pricing. It is structured logically but could be slightly more concise by removing redundant phrases like 'convert between any two fiat currencies' (implied by the title).
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 3 params (2 required), no output schema, and no annotations, the description provides sufficient context: what it does, when to use, technical details, demo behavior, and pricing. It covers the essential aspects for an agent to invoke it correctly. However, it does not specify whether the rate is real-time or cached, which might be relevant.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema describes two of three parameters ('from', 'to' with ISO-4217 descriptions), covering 67%. The description adds context by listing examples and specifying the source as 'current mid-market rate'. It clarifies that 'amount' defaults to 1, which is helpful. The description compensates for the missing schema description on 'amount' by implying its purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states conversion between fiat currencies, lists multiple examples (USD, EUR, GBP, JPY, BRL, 160+ ISO-4217 codes), and specifies the result includes rate, converted amount, and timestamp. It distinguishes itself from sibling tools by focusing specifically on fiat conversion, which no other sibling appears to cover.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly provides three use cases: pricing a service, normalizing multi-currency invoices, and converting x402 USDC for human-readable receipts. It also mentions demo mode for testing. However, it lacks explicit guidance on when not to use this tool, such as for cryptocurrency conversion (likely handled by other onyx tools).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_geo_verifyAInspect
Geo ground-truth oracle. Give a URL (optionally an expected price / currency / keyword); get what a real regional vantage actually sees — final URL after geo-redirects, region price + currency, geo-block / 'not available in your country' walls, page language — plus explicit divergence flags vs your expectation. Catches region-gating a datacenter-IP agent silently misses. Never guesses. (price: $0.25 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full URL (http/https) to observe from the in-region vantage. | |
| expect_keyword | No | Optional text you expect present on the page. If absent in the in-region view (e.g. geo-gated content), it's flagged. | |
| expect_currency | No | Optional ISO currency you expect (e.g. 'USD'). If the in-region view shows a different currency, it's flagged as divergence. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses that the tool uses a real regional vantage, never guesses, and includes pricing ($0.25 USDC, metered). It doesn't cover authorization or rate limits, but the key behavioral aspects are transparent. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (about 70 words), front-loaded with the core purpose, and structured efficiently. It lists inputs and outputs in a clear, scannable way without any unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description provides a comprehensive overview of what the tool returns (final URL, price, currency, geo-block, language, divergence flags). The tool's complexity is moderate, and the description is sufficient for understanding its capabilities and limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The tool description reinforces the purpose of each parameter but doesn't add significant new meaning beyond the schema. Baseline score of 3 is appropriate given the schema already explains them well.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as a 'Geo ground-truth oracle' and details its specific function: observe a URL from a real regional vantage to get final URL, price, currency, geo-block status, language, and divergence flags. This distinguishes it from any sibling tools, none of which serve a similar geo-verification purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use the tool (to detect region-gating missed by datacenter-IP agents) and what inputs are needed (URL with optional expectations). It implies it's for verifying geo-specific behavior but doesn't explicitly state when not to use it or list alternatives. However, the context is clear enough for an agent to decide.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_hash_computeAInspect
Compute md5, sha1, sha256, sha512, and sha3-256 of any text or base64-encoded bytes. Returns each digest as both hex and base64. Use for content-addressed lookups, dedupe keys, signature verification support, or fingerprinting. Stdlib-only — runs locally, never logs input. <2ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| b64 | No | Or: base64-encoded bytes | |
| text | No | UTF-8 string to hash |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the tool runs locally, never logs input, and executes in <2ms. Pricing and tier are provided. No side effects or destructive actions are relevant. This is transparent for a compute tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences plus a parenthetical with price and tier. Each sentence earns its place—first states the operation, second explains output and use cases, third gives operational context. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity, the description is complete. It lists all hash algorithms, input types, output format (hex and base64), use cases, performance, and cost. No output schema is provided, but the return value explanation suffices.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. The description adds context by clarifying that input can be text or base64-encoded bytes, and that both parameters are optional. This adds meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states the tool computes multiple hash algorithms (md5, sha1, sha256, sha512, sha3-256) on text or base64 bytes. This is a specific verb-resource combination that clearly differentiates from sibling tools, none of which are hashing-focused.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description provides explicit use cases: content-addressed lookups, dedupe keys, signature verification support, fingerprinting. Also includes performance characteristics (local, fast, cheap). However, it does not explicitly state when not to use or mention alternatives, leaving some room for ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_html_metaAInspect
Fetch a URL and extract OpenGraph + Twitter Card + standard meta tags: og:title, og:description, og:image, og:type, twitter:card, twitter:image, canonical link, favicon, JSON-LD blocks. Use when an agent needs to preview a link before sharing, build a citation card, or detect spam/ads via meta-tag fingerprints. Stripped of HTML noise. ~150-500ms typical. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to inspect |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but description discloses performance (150-500ms) and cost ($0.002 USDC). States the output is stripped of HTML noise. Adequate transparency for a read-only fetch tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise and well-structured: main action first, then specific tags, then use cases, then performance details. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is complete for the tool's purpose: it explains what it does, what it extracts, when to use, and performance/cost. No gaps given the single parameter and lack of output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter 'url' with schema description 'URL to inspect'. Schema coverage is 100%, so baseline is 3. The tool description does not add additional semantics beyond the schema, but the overall purpose is clear.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it fetches a URL and extracts specific meta tags (OpenGraph, Twitter Card, standard). It distinguishes from sibling tools by specifying the exact tags and use cases like previewing links or detecting spam.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly lists use cases: 'Use when an agent needs to preview a link before sharing, build a citation card, or detect spam/ads via meta-tag fingerprints.' No explicit when-not-to-use, but the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_ip_geolocateAInspect
Geolocate any public IPv4/IPv6 address — country, region, city, lat/lon, timezone, ISP, ASN, mobile/proxy/hosting flags. Useful for filtering traffic by country, detecting datacenter/VPN egress, fraud scoring, or deciding which regional endpoint to route an agent through. Backed by ip-api.com (free tier, ~1k requests/min). ~80-200ms typical. Demo mode returns a plausible US record so the payment loop can be tested without burning the upstream rate limit. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| ip | Yes | IPv4 or IPv6 address |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description fully compensates for missing annotations by disclosing the backend (ip-api.com), rate limit (~1k requests/min), typical latency (80-200ms), demo mode behavior, and pricing ($0.002 USDC). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense but well-structured, covering purpose, returns, use cases, backend, and pricing in four sentences. It is efficient without being overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple schema (one parameter, no output schema), the description is complete. It lists all returned fields and provides behavioral context (rate limits, demo mode, pricing), leaving no significant gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter. The description adds minimal extra meaning (e.g., 'public IPv4/IPv6' vs. schema's 'IPv4 or IPv6 address'), so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool geolocates public IP addresses and enumerates the specific data returned (country, region, city, lat/lon, timezone, ISP, ASN, flags). It distinguishes itself from sibling tools like `onyx_geo_verify` by focusing on IP-based geolocation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides specific use cases (filtering traffic, fraud scoring, routing) and mentions demo mode for testing. However, it does not explicitly state when not to use this tool or compare it to alternatives, which is a minor gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_jwt_decodeAInspect
Decode a JWT (header + payload) without verifying the signature. Returns the algorithm, key id, all claims (iss, sub, aud, exp, iat, nbf, custom), expiry status, and any structural anomalies. Use when an agent receives a token from an external API and needs to inspect it for routing, expiry, or audit logging. Stdlib-only — runs locally, never sends the token anywhere. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | JWT (three base64url segments separated by .) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, description adds important behavioral context: stdlib-only, runs locally, never sends token anywhere, and includes pricing/tier. Could mention error handling for malformed tokens.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no superfluous words. Efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter, description explains what is returned (header, payload, claims, validity) and includes price and tier. Sufficient for agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter with 100% schema coverage. Description does not add new meaning beyond schema; schema already adequately describes the parameter. Baseline score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool decodes a JWT without verifying signature, lists specific returned data (algorithm, key id, claims, expiry status, anomalies), and distinguishes from verification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: when inspecting a token from an external API for routing, expiry, or audit logging. Does not explicitly mention alternatives or when not to use, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_kya_verifyAInspect
Verify an Onyx Protocol KYA (Know Your Agent) credential. Pass a credential id (e.g. 'kya_01KSHZ...'); returns ok + scope + spend cap + issuer + revocation status. Use to gate paid tool access, audit agent operations, or compose with x402 settlement for trust-tier routing. Calls Onyx Protocol verifier. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| credential_id | Yes | KYA credential id (kya_*). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses that it calls the Onyx Protocol verifier, returns ok, scope, spend cap, issuer, revocation status, and mentions the price and tier. This is detailed and helps the agent understand side effects and dependencies.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each serving a distinct purpose: what it does, what to pass and what to expect, and when to use it. No filler or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one param, no output schema, no annotations), the description covers all necessary aspects: purpose, input example, output details, use cases, and even pricing. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a single parameter. Description adds value beyond schema by providing an example credential id and explaining what the function returns. This enhances understanding of the parameter's purpose.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool verifies an Onyx Protocol KYA credential, provides a concrete example, and lists the return fields. It distinguishes itself from sibling tools like onyx_agent_audit_trail or onyx_verify_explain by focusing on credential verification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states use cases: gating paid tool access, auditing agent operations, composing with x402 settlement. Lacks explicit exclusions or alternatives, but the context provided is sufficient for an AI to decide when to invoke.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_market_pulseAInspect
One-call market snapshot of the paid x402 MCP economy. Returns top services by CDP visibility, blue-ocean niches with zero peers, saturated niches (5+ peers), Onyx pricing audit (over/under market by tool), and per-network split. Bloomberg-terminal for the agentic economy. Use for competitive intel, pricing decisions, niche selection. (price: $0.02 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| blue_ocean_keywords | No | Optional candidate capability tokens to check for blue-ocean status. Default = curated baseline of ~30 high-frequency agent capabilities. | |
| compare_onyx_prices | No | If true, include the Onyx pricing audit (our prices vs market median for matching capabilities). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided. The description mentions the cost ('$0.02 USDC, tier: metered') but does not disclose read-only status, potential side effects, data freshness, or rate limits. For a snapshot tool, these omissions leave significant behavioral uncertainty.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured paragraph that front-loads the purpose. It is concise (no filler), with clear declarative sentences. The marketing tagline ('Bloomberg-terminal for the agentic economy') is acceptable for context.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description enumerates key outputs (top services, blue-ocean niches, saturated niches, pricing audit, per-network split). It is mostly complete for a snapshot tool, though the exact return format or data structure is not explained, leaving some ambiguity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Parameter descriptions in the input schema already cover semantics (e.g., 'Optional candidate capability tokens', 'If true, include the Onyx pricing audit'). The main description adds no new parameter-level detail beyond mentioning 'pricing audit' again. With 100% schema coverage, baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it's a 'One-call market snapshot of the paid x402 MCP economy' and enumerates specific outputs (CDP visibility, blue-ocean niches, saturated niches, pricing audit, per-network split). It distinguishes itself from siblings like 'onyx_bazaar_blue_ocean' and 'onyx_bazaar_compare' by offering a comprehensive overview.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states use cases: 'competitive intel, pricing decisions, niche selection.' However, it does not provide when-not-to-use guidance or mention alternative tools for more specific queries (e.g., onyx_bazaar_blue_ocean for focused blue-ocean analysis).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_catalog_diffAInspect
Side-by-side tool catalog diff between any two MCP servers. Fetches each server's /manifest (with /openapi.json fallback), normalizes the tool lists, and returns: only-in-A, only-in-B, same in both, price delta, schema delta. Free tier — useful for competitor analysis, regression detection, and migration planning. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| server_a | Yes | Base URL of the first MCP server (e.g. https://onyx-actions.onrender.com). | |
| server_b | Yes | Base URL of the second MCP server. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description must disclose behavior. It states the tool fetches from /manifest with /openapi.json fallback, normalizes, and returns deltas. It also notes it is free (price $0). However, it does not disclose potential authentication needs, rate limits, or whether it writes any data. The lack of annotations is mitigated by the clear read-only nature implied by 'fetches'.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that efficiently covers purpose, method, output, use cases, and pricing. Every clause adds value, with no redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description provides high-level sections of the return (only-in-A, only-in-B, etc.) but omits details on formats or structure. For a comparative tool, this is sufficient for an agent to understand the output, though specific formatting (e.g., JSON structure) is missing.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (both parameters have descriptions). The description adds no further detail about the parameters beyond 'Base URL of the first/second MCP server.' Therefore, it meets the baseline but does not exceed it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: performing a side-by-side diff of tool catalogs between two MCP servers. It specifies the endpoints fetched, the normalization process, and the exact return sections (only-in-A, only-in-B, same, deltas). This differentiates it from all sibling tools, which serve different functions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists use cases: competitor analysis, regression detection, and migration planning. It implies when to use this tool (any scenario requiring comparison of MCP server tool catalogs). However, it does not mention when not to use it or provide explicit alternatives, though given the unique capability, this is a minor gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_healthAInspect
Probe any public MCP / x402 server and return a structured health snapshot: endpoint latencies, content types, MCP discovery surface, x402 readiness, OAuth DCR advertisement, and a 0-100 composite reliability score. Stdlib-only. SSRF-hardened — refuses private, loopback, link-local, and reserved address ranges. Free tier, no key. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| server_url | No | Base URL of the MCP / x402 server (https://example.com). Must be public. | |
| extra_paths | No | Additional paths to probe beyond the standard MCP/x402 set. | |
| timeout_seconds | No | Per-request timeout. Defaults to 6. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
In the absence of annotations, the description discloses key behaviors: stdlib-only implementation, SSRF-hardened (refuses private/link-local), free tier no key required. It does not mention rate limiting or error handling, but the read-only nature is implied.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is information-dense yet efficient, front-loading the primary purpose and covering outputs, implementation, safety, and cost in a compact format with no redundant sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description enumerates all key output components (latencies, content types, etc.) and covers behavioral aspects like SSRF safety and cost. It provides sufficient context for a health probe tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema provides 100% description coverage, but the description adds value by explaining the SSRF-hardening rationale for the 'server_url' parameter and reinforcing the 'must be public' constraint beyond the schema's baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool probes any public MCP/x402 server and returns a structured health snapshot including specific metrics. It distinguishes itself from sibling tools, none of which serve the same purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Probe any public MCP / x402 server' and notes SSRF hardening (refuses private IPs), providing a clear when-to-use and when-not-to-use. It implicitly covers availability via 'Free tier, no key', though no explicit alternatives are given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_meta_callAInspect
Pre-flight inspector for ANY x402 tool call. Pass target URL + optional caller wallet, get back: live 402 price, recommended chain (via chain_picker logic), live facilitator health, caller reputation (via agent_id logic), and a green/yellow/red GO signal. Free tier — the universal preflight that lets agents decide before they sign. v2 (paid) will broker the actual settlement and proxy the response. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| target_url | Yes | Full URL of the x402 tool endpoint to inspect (e.g. https://other-service.com/v1/some_tool). | |
| caller_wallet | No | Optional. Caller's EVM wallet — used to look up reputation via agent_id logic. | |
| max_acceptable_usdc | No | Caller's max acceptable price for this call. If quoted price > this, GO signal is red. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It states the tool is a 'pre-flight inspector' and is free, implying a read-only operation, but does not explicitly declare it as non-destructive or disclose any side effects. The mention of a paid v2 adds context but does not fully clarify behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose and outputs, no unnecessary information. Every sentence earns its place, and the structure is easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return values (live price, chain, health, reputation, GO signal). It omits details on format or structure, but for a preflight utility, this is acceptable. Slight improvement possible by specifying the GO signal's criteria.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with each parameter described. The description adds context by explaining the purpose of the parameters (e.g., 'used to look up reputation via agent_id logic'), but does not significantly enhance meaning beyond the schema descriptions. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool is a 'Pre-flight inspector for ANY x402 tool call' and lists its specific outputs (live 402 price, recommended chain, facilitator health, caller reputation, GO signal). This distinguishes it from sibling tools as a meta-call evaluation tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use before signing an x402 call, and mentions a paid tier for actual settlement, providing some context. However, it lacks explicit guidance on when not to use it (e.g., for execution) or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_oauth_auditAInspect
OAuth 2.1 + RFC 7591 DCR compliance audit for any MCP server. Probes the 5 standard discovery + registration + token endpoints, validates each against the relevant RFC, returns a composite 0-100 score and remediation list. Free tier — useful for MCP operators preparing for ChatGPT custom-connector / Claude Managed Agents discovery. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| server_url | Yes | Base URL of the MCP server to audit. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that the tool probes 5 endpoints, validates against RFCs, and returns a score and remediation list. It also mentions it is free. However, it does not detail authorization requirements, rate limits, or potential side effects, though as an audit tool, destructive actions are unlikely.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three sentences, front-loading the core purpose, then detailing the audit scope and scoring, and ending with a usage hint and pricing. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one parameter, no output schema, and no annotations, the description provides sufficient context: it explains what the tool does, how it works (probing endpoints, validating RFCs), the output format (score and remediation list), and target users. This is complete for an agent to decide when to invoke it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema already describes the single parameter 'server_url' as the base URL. The description adds minimal additional meaning, explaining it is the URL of the MCP server to audit. With 100% schema coverage, the baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs an OAuth 2.1 + RFC 7591 compliance audit for MCP servers, specifying the verb (audit), resource (OAuth compliance), and scope (5 endpoints, 0-100 score). It is distinct from sibling tools, which cover different domains like browsing, tokens, and email.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies the tool is for MCP operators preparing for discovery by ChatGPT or Claude, but does not explicitly state when to use or avoid it, nor provides alternatives. The usage context is clear but lacks exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_registry_statusAInspect
Cross-registry listing audit for any MCP server. Checks Coinbase Bazaar (x402 discovery), Smithery, Glama, the official MCP Registry, and the awesome-mcp-servers list. Returns per-registry status + coverage score 0-100 + remediation suggestions for unlisted registries. Free tier. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| server_url | No | Base URL of the MCP server to audit (e.g. https://onyx-actions.onrender.com). Required for Bazaar lookup. | |
| github_repo | No | owner/repo slug if the server is open-source (e.g. 'dimitrilaouanis-tech/onyx-mcp'). Required for Smithery/Glama/awesome-mcp-servers lookups. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. It mentions returns but does not explicitly state it's read-only or disclose potential side effects, rate limits, or authentication needs. The free tier info is helpful but behavioral clarity is moderate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with four sentences, front-loading the core purpose and adding specifics efficiently. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains return values (status, coverage score, remediation). It misses clarifying whether at least one parameter is needed, but is otherwise complete for a simple audit tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so description complements it by clarifying which parameter is needed for which registry. This adds value beyond the schema descriptions, justifying a score above baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs a cross-registry listing audit for any MCP server, listing specific registries checked and what it returns. It effectively distinguishes from sibling tools like onyx_bazaar_compare and onyx_mcp_health.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not explicitly state when to use this tool versus alternatives. While it implies it's for checking registrations, it lacks guidance on when not to use it or when to prefer another tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_mcp_routerAInspect
FIRST MCP meta-router. Describe a capability in plain English ('Base tx explainer', 'captcha OCR', 'DEX swap quote'); the router queries the entire CDP x402 discovery corpus, scores every candidate by price + freshness + schema match + network preference, and returns the top N ranked routes with full call templates (URL, method, body schema, expected price, payTo, asset, network). The agent calls the top route directly. Onyx is the aggregator; every other paid MCP is the supply. (price: $0.01 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| top_n | No | Number of route candidates to return. | |
| capability | Yes | Plain-English description of what the agent needs. E.g. 'Base transaction explainer', 'swap quote Solana', 'captcha OCR'. | |
| max_price_usdc | No | Cap on per-call price. Omit for no cap. Use 0.01 for cheap-only. | |
| preferred_network | No | Preferred network: 'eip155:8453' (Base), 'eip155:84532' (Base Sepolia), 'solana', etc. Sorted higher when present. | |
| include_onyx_routes | No | If false, exclude Onyx Actions endpoints from the comparison (e.g. for independent third-party benchmark). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes scoring factors (price, freshness, schema match, network preference) and output structure (URL, method, body schema, expected price, payTo, asset, network), which is good given no annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Information-dense and front-loaded, but contains some marketing language ('FIRST MCP meta-router') that could be trimmed for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description fully explains the return value (top N ranked routes with full call templates), making it complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, description adds value by explaining the purpose of capability, max_price_usdc, preferred_network, and include_onyx_routes beyond their schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a meta-router that takes a plain-English capability description and returns ranked routes with call templates, distinguishing it from the many specific sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit usage example ('Describe a capability in plain English') and implies use when needing to find/compare MCP tools, but lacks explicit when-not-to-use or alternative guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_merchant_fact_checkAInspect
Pre-checkout merchant fact oracle. Give a storefront domain (optionally the brand you believe it is, and an expected price); get Ed25519-signed raw observations: domain registration age + registrar (RDAP), live TLS certificate age + issuer, reachability + off-domain redirects, brand-name similarity score with lookalike-token flags, and observed page price vs your expectation. Facts only, method disclosed per field — Onyx never asserts 'legit' or 'scam'; the signature proves the observation is genuine and untampered. (price: $0.25 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| brand | No | Optional brand name you believe this storefront represents (e.g. 'Russell & Bromley'). Enables the brand-similarity observation. | |
| domain | Yes | Storefront domain or URL, e.g. brand-outlet-sale.com or https://shop.example.com/p/1 | |
| product_url | No | Optional specific product URL to extract the observed price from (defaults to the domain root). | |
| expected_price | No | Optional price you were quoted/expect. If the page shows a price, the deviation percentage is reported as a fact. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description explicitly discloses that the tool provides raw observations, never asserts legitimacy or scam, and that outputs are Ed25519-signed to prove genuineness and integrity. It also mentions the cost ($0.25 USDC) and tier. With no annotations provided, the description fully carries the burden of behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at around 125 words and front-loads the tool's purpose. It lists the output fields efficiently. However, it could be slightly shorter without losing clarity, and the structure is logical.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 4 parameters and no output schema, the description provides a good overview of the output contents (domain registration, TLS cert, reachability, brand similarity, price deviation). It lacks details on how to verify the Ed25519 signature or exact output format, but for a pre-checkout oracle the level of detail is sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description briefly restates the parameters ('domain, optionally the brand and expected price') but does not add significant meaning beyond what the schema already provides. The schema itself gives examples and explains parameter effects well.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as a 'Pre-checkout merchant fact oracle' and specifies exactly what inputs it accepts (domain, optional brand, expected price) and what outputs it returns (signed observations including domain registration age, TLS certificate age, reachability, brand similarity, price deviation). It distinguishes itself from siblings like onyx_fact_check and onyx_retail_price_check by focusing on merchant domain verification with cryptographic signatures.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage before checkout to verify merchant facts, but it does not explicitly state when to use this tool versus alternatives. There is no guidance on when not to use it or mention of other related tools such as onyx_fact_check or onyx_retail_price_check. The context is clear but lacks exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_oai_lookupAInspect
Look up the Onyx Agentic Index (OAI) score for an agent identity. Input a DID (did:web:..., did:eth:0x...) or wallet address; returns composite 0-1000 score + per-signal breakdown + last-updated timestamp. Use for trust-tier gating, routing decisions, partnership vetting. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| identity | Yes | Agent DID (did:web:..., did:eth:0x...) or raw 0x... wallet address. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes return fields but does not mention read-only nature (no annotations), error handling, or data freshness. Since annotations are absent, more detail on behavior would be beneficial.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with action and key details. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Provides return structure, use cases, and pricing. Lacks explanation of error conditions or data recency, but overall sufficient for a simple lookup tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameter description. Description repeats schema content without adding new semantics beyond use context. Baseline 3 with high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it looks up the OAI score for an agent identity, specifying input types (DID or wallet address) and output components (composite score, breakdown, timestamp). It is distinct from sibling tools like onyx_agent_id or onyx_agent_audit_trail.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly suggests use cases: trust-tier gating, routing decisions, partnership vetting. Does not provide when-not-to-use or alternatives, which is acceptable given the focused purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_outcome_reportAInspect
Report what ACTUALLY happened after an Onyx verdict — closing the loop that turns signed opinions into a measured track record. Pass the original signed verdict (so we can verify Onyx really issued it) plus the real outcome: settled_clean, drained, reverted, reported_scam, false_positive, etc. We verify the Ed25519 signature, then log {verdict -> outcome} to the public ledger behind onyx_track_record. Free — the data is the point. This is the feedback that makes Onyx the only x402 security layer with proven precision. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| detail | No | Optional. One-line human note on what happened. | |
| outcome | Yes | What actually happened. One of: settled_clean, confirmed_safe, no_loss, verified_legit, drained, reverted, reported_scam, funds_lost, funds_unrecoverable, honeypot_confirmed, eoa_spender_confirmed, unverified_confirmed, false_positive. | |
| tx_hash | No | Optional. On-chain tx hash that evidences the outcome. | |
| signed_verdict | Yes | The full, unmodified signed result object a prior Onyx tool returned (must contain its onyx_attestation block). We verify it before logging. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses key behaviors: Ed25519 signature verification, logging verdict->outcome to a public ledger, and requiring the signed_verdict to contain an onyx_attestation block. It does not mention rate limits or error handling, but core behavior is covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the main action and includes necessary details, but has some marketing fluff (e.g., 'the only x402 security layer with proven precision'). Overall, it is well-structured and each sentence contributes value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description does not specify what the tool returns (e.g., confirmation, ledger hash). It mentions the ledger behind onyx_track_record as context, but missing return value and error handling information, which is a gap given no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds context: explaining the purpose of signed_verdict (must contain attestation), the meaning of outcome (actual result), and the optionality of detail and tx_hash. This enhances understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: reporting what actually happened after an Onyx verdict, closing the loop from signed opinions to measured track record. It specifies the verb 'report', the resource (outcome after verdict), and the process, distinguishing it from sibling tools which focus on other Onyx operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage after a verdict is issued, requiring a signed verdict and an outcome. While it does not explicitly exclude alternative tools, its unique function (feedback loop) is clear. The mention of being free is a minor guideline but not extensive.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_paper_synthesisAInspect
Structured synthesis across N academic papers. Input: 2-10 OpenAlex IDs or DOIs. Output: per-paper metadata (title, year, citations, abstract), thematic overlap (shared keywords across abstracts), citation co-graph (papers that cite multiple inputs), and an agent-actionable summary stating what's converged vs contested. Composes after onyx_research_intel. (price: $0.03 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| ids | Yes | OpenAlex work IDs (W123..., or full URL) or DOIs (10.xxxx/...). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided; description carries burden. It details the output structure and includes pricing info ($0.03 USDC) and tier, adding valuable behavioral context beyond the schema. Discloses synthesis behavior and output components.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: one sentence covers input, output, and composition order. Every word serves a purpose; no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, so description provides necessary return details (metadata, thematic overlap, citation graph, summary). Specifies constraints (2-10 papers). Lacks error handling details but is otherwise thorough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema description explains the 'ids' parameter. The tool description adds value by explaining the output structure and purpose, but the parameter itself is well-documented in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description specifies a clear verb ('Structured synthesis') and resource ('N academic papers'), distinctly differentiating it from sibling tools like onyx_research_intel by noting it composes after that tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Composes after onyx_research_intel', providing clear guidance on when to use this tool in a workflow. No explicit when-not-to-use or alternatives, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_partnership_checkAInspect
Where does Onyx plug into Company X's stack? Probes their domain + CDP discovery + awesome-x402/awesome-mcp + GitHub for MCP/x402 footprint. Returns gap analysis: which of Onyx's 64+ tools complement what they already ship. Plus a suggested integration angle and signal strength. Built for outbound partnership / merger / B2B sales conversations. (price: $0.02 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| company | Yes | Company name (e.g. 'Catena Labs') or root domain (e.g. 'catena.xyz'). | |
| github_org | No | Optional GitHub org slug. If known, narrows the repo probe. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It transparently lists the data sources probed (domain, CDP, awesome-mcp, GitHub) and the output (gap analysis, integration angle, signal strength). Also mentions price and tier. Lacks disclosure of side effects or error behavior, but it's a read-only probe, so acceptable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at 6 sentences, starting with a clear question that sets the context. It front-loads the purpose and output, then adds usage context and pricing. No unnecessary words, though the price information could be seen as secondary.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description provides a high-level overview of the return value (gap analysis, integration angle, signal strength). For a tool with multiple probes and a complex output, more detail on the structure or format of the analysis would improve completeness. Still adequate for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already documents both parameters. The description adds context by explaining how the 'company' parameter is used (name or domain) and that 'github_org' narrows the probe, but this largely mirrors the schema descriptions. The added context on probing behavior is helpful but not essential.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: probing a company's stack to find where Onyx fits, returning a gap analysis with integration angle and signal strength. It uses specific verbs ('probes', 'returns') and distinguishes itself from siblings like onyx_research_intel by focusing on partnership contexts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states the tool is built for 'outbound partnership / merger / B2B sales conversations,' providing clear context. However, it does not explicitly mention when not to use it or name specific alternatives, though the sibling list implies differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_password_strengthAInspect
Score password strength on a 0-100 scale. Returns Shannon entropy (bits), character-class diversity, length, common-pattern detection (sequences, repeats, dictionary-likeness), and a verdict (very_weak / weak / fair / strong / very_strong). Use when an agent generates passwords for accounts it creates, or when validating user-supplied credentials. Stdlib-only — runs locally, never sends the password anywhere. <5ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| password | Yes | Password to score |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility. It discloses that the tool is 'Stdlib-only — runs locally, never sends the password anywhere,' and gives performance ('<5ms') and pricing ('$0.001 USDC'). This fully informs the agent of behavioral constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (3-4 sentences), front-loads the purpose, and every sentence adds essential information: what it does, what it returns, when to use it, and how it behaves. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Although there is no output schema, the description fully describes the return values (Shannon entropy, character-class diversity, length, common-pattern detection, and a verdict). Combined with the single input parameter, it provides a complete picture of the tool's functionality.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with a single parameter described as 'Password to score'. The description adds no additional semantic meaning beyond the schema, so it meets the baseline but does not enhance understanding further.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Score password strength on a 0-100 scale' and enumerates the return values. It is highly specific and distinct from all sibling tools, which cover domains like blockchain, browser, email, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use when an agent generates passwords for accounts it creates, or when validating user-supplied credentials.' It provides clear context, though it does not mention when not to use or alternatives. However, no sibling tools perform password strength, so exclusion is unnecessary.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_pm_settlement_watchAInspect
Prediction-market state lookup — current odds, volume, liquidity, resolution state, and anomaly flags for any market on Polymarket or Manifold. Pass a market slug or full URL. Use for arb agents watching for mispriced events, copy-trading agents tracking whales, or settlement-resolver agents that pay only on a final outcome. Coinbase PROJECT-IDEAS.md explicitly calls for this primitive. (price: $0.005 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| venue | No | Force a specific venue: 'polymarket' or 'manifold'. Auto-detected if omitted. | |
| slug_or_url | Yes | Polymarket slug ('will-trump-win-2024'), Manifold slug, or full https URL to either platform |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It mentions data returned and pricing (non-destructive implied). However, it omits authentication requirements, rate limits, or error handling, which are important for autonomous agents.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, all essential: first states purpose and outputs, second explains parameter, third lists use cases. Front-loaded and no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description lists expected fields (odds, volume, liquidity, etc.) and pricing. Missing return format and error handling, but adequate for agent to decide when to invoke.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with both parameters described. Description adds minimal extra value beyond schema (e.g., 'Pass a market slug or full URL' matches schema description). Baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'lookup' and resource 'prediction-market state' with specific data fields (odds, volume, liquidity, resolution state, anomaly flags). Also lists use cases (arb agents, copy-trading, settlement-resolver), distinguishing it from unrelated sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells users to pass a market slug or full URL and describes ideal use cases (arb, copy-trading, settlement). Lacks explicit 'when not to use' but context is sufficiently clear for a unique tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_research_intelAInspect
Research intel — has someone solved X already? Queries 240M+ academic works via OpenAlex (includes arXiv preprints, conference papers, journal articles), ranks by citation count + recency + relevance, returns top N papers with one-line abstract excerpts, citation counts, and author names. Built for autonomous agents that need to check prior art before burning cycles re-deriving a known result. Fallback to Semantic Scholar. (price: $0.05 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Research question or keyword string. Plain English works; OpenAlex handles tokenization. | |
| top_n | No | How many papers to return. | |
| sort_by | No | Ranking. citations = highest cited first; recency = newest first; relevance = OpenAlex semantic match. | relevance |
| year_from | No | Optional: only return papers from this year onward. | |
| min_citations | No | Filter out papers with fewer than this many citations. Use 50+ to surface only well-known work. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses data sources (OpenAlex, Semantic Scholar fallback), ranking method, return fields, and cost (0.05 USDC). Does not mention rate limits or side effects, but read-only nature is implied.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with clear, front-loaded purpose. Every sentence adds value and there is no unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers most relevant aspects: data source, ranking, return fields, pricing, and use case. Lacks mention of pagination or total results, but complete enough for a search tool with detailed schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so parameters are well-documented in schema. The description adds context on ranking and fallback but does not significantly enhance parameter meaning beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's function: 'Research intel — has someone solved X already?' It specifies querying 240M+ academic works via OpenAlex, ranking by citation count, recency, and relevance, and returning top N papers with abstracts, citations, and authors. It distinguishes itself from sibling tools like onyx_paper_synthesis.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states intended use: 'Built for autonomous agents that need to check prior art before burning cycles re-deriving a known result.' Provides context with a fallback mechanism and pricing. Does not explicitly state when not to use, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_retail_price_checkAInspect
Ground-truth retail oracle. Give a product URL; get the real current price, currency, and in-stock state as actually fetched now — with the extraction source (JSON-LD / OpenGraph / microdata) as evidence. Covers the long tail of no-API shops where agents otherwise hallucinate prices. Never guesses: returns price=None with confidence='none' when the page exposes no machine-readable price. Use before an agent quotes, compares, or transacts on a price it would otherwise invent. (price: $0.02 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full product page URL (http/https). The exact page whose price + availability you want observed. | |
| expect_price | No | Optional. A price you believe is current. If given, the result includes matches_expected:bool + drift so a caller can detect a stale/hallucinated quote. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses key behaviors: fetches fresh data, never guesses, returns price=None with confidence='none', and provides extraction source. It also mentions cost. However, it does not cover error handling for invalid URLs, rate limits, or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (4 sentences), well-structured: purpose, behavior, usage guidance, cost. Every sentence adds value; no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately outlines result fields (price, currency, in-stock state, extraction source, matches_expected/drift). For a simple 2-param tool, this is sufficient but could be more explicit about output structure.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description adds meaningful context beyond schema: for 'expect_price' it explains the purpose (detect stale/hallucinated quotes) and the returned fields (matches_expected, drift). For 'url' it reinforces the expected usage. This adds value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches real-time retail prices from product URLs, distinguishing it from siblings like onyx_bazaar_compare by emphasizing 'ground-truth retail oracle' and 'long tail of no-API shops'. The verb+resource+scope are specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Use before an agent quotes, compares, or transacts on a price it would otherwise invent.' It also explains when it returns None. However, it does not explicitly mention alternatives among siblings or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_review_truthAInspect
Reputation ground-truth oracle. Give a product/business/service (+ optional aspect like 'shipping' or 'support'); get a SIGNED, web-grounded read of what real customers say right now — the aggregated public sentiment and the cited sources, as actually observed. Onyx attests WHAT WAS OBSERVED, not a trust ruling: the agent forms its own judgment from the signed evidence. For an agent about to recommend, buy from, or partner with an entity. Never fabricated. (price: $0.15 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| aspect | No | Optional specific aspect to focus on (e.g. 'shipping speed', 'customer support', 'refunds'). | |
| entity | Yes | Product, business, service, or seller to check the live reputation of. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses that the output is 'SIGNED, web-grounded', 'aggregated public sentiment and cited sources', and that the tool 'attests what was observed, not a trust ruling'. It also mentions pricing and that it is 'Never fabricated'. However, it lacks details on rate limits, error responses, or any destructive side effects, though these are likely not applicable for a read-only oracle.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at 4-5 sentences, each adding unique value. It is front-loaded with the main action and includes pricing and tier as relevant context. There is no wasted content, though it could be slightly more streamlined.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity and the absence of an output schema, the description sufficiently explains the output's nature (signed, web-grounded, aggregated sentiment, cited sources) and the tool's role. It covers input, output, use case, and philosophy. Minor gaps include lack of exact output format or error cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the baseline is 3. The description adds little beyond the schema: it repeats the parameter purposes with examples (e.g., 'shipping speed', 'customer support'). While helpful, it does not significantly enhance understanding beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the tool's verb ('get a signed, web-grounded read') and resource ('product/business/service' with optional aspect). It distinguishes itself from sibling tools like 'onyx_agent_reputation' by emphasizing that it provides observed evidence ('attests what was observed, not a trust ruling') and is a 'ground-truth oracle'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool: 'For an agent about to recommend, buy from, or partner with an entity.' It provides clear context but does not explicitly mention when not to use it or compare to alternatives like 'onyx_fact_check'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_robots_checkAInspect
Fetch a domain's robots.txt and report whether a given path is allowed for a given user-agent. Returns the raw robots.txt text, the matched rule, the crawl-delay if specified, and a clean allow/disallow verdict. Use when an agent does web scraping and wants to be polite — saves bans, saves CAPTCHAs, saves drama. ~50-200ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Any URL on the target domain | |
| user_agent | No | UA string to test | * |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes exactly what is returned: raw text, matched rule, crawl-delay, and verdict. Also provides latency and pricing details. No annotations exist, so the description fully carries the transparency burden. Missing potential edge cases like errors, but sufficient for this tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, each serving a distinct purpose: purpose, outputs, usage advice. No redundancy, front-loaded with key info.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity of the tool (2 params, no output schema), the description fully covers purpose, inputs, outputs, and usage context. The agent has all necessary information to invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions. The description adds no additional meaning beyond stating the purpose, so baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Fetch' and 'report' with specific resource 'domain's robots.txt' and outputs. It distinguishes itself from sibling tools by focusing on robots.txt checking, which is unique among the listed siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent does web scraping and wants to be polite' and provides motivation ('saves bans, saves CAPTCHAs, saves drama'). Also includes performance and cost, aiding decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_secure_paymentAInspect
Secure-transaction RAIL: one signed clearance before an agent sends funds. Give recipient + amount (and optionally a contract address or counterparty ERC-8004 agent id); Onyx runs the full security stack — recipient firewall, contract audit, counterparty reputation — and returns a single PASS / REVIEW / FAIL verdict + risk score, plus the Onyx take-rate quote (bps of value secured). Ed25519-signed so the clearance is provable. The check a serious agent runs before moving real money. Onyx never takes custody. (price: $0.25 USDC, tier: premium)
| Name | Required | Description | Default |
|---|---|---|---|
| recipient | Yes | 0x recipient address the agent is about to pay (Base). | |
| amount_usdc | Yes | Amount about to be sent, in USDC. Drives both the risk threshold and the take-rate quote. | |
| contract_address | No | Optional. If the payment interacts with a contract, its 0x address — triggers a full contract audit. | |
| counterparty_agent_id | No | Optional. If paying another AI agent, its ERC-8004 id — triggers a reputation check. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses key behaviors: returns PASS/REVIEW/FAIL verdict, risk score, take-rate quote; uses Ed25519 signing; Onyx never takes custody; cost is $0.25 USDC. However, it omits idempotency, side effects, or authentication requirements, but provides good overall transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that conveys necessary information but includes minor marketing fluff ('serious agent', 'full security stack') and pricing details. It is front-loaded with the purpose, but could be more concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains return values (verdict, risk score, quote) and signing proof. It covers operational context (security stack, no custody, cost). Missing error handling or rate limits, but overall complete for a clearance tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining how optional parameters trigger additional checks (contract audit, reputation). It also clarifies that amount drives risk threshold and take-rate, which goes beyond the schema's descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Secure-transaction RAIL: one signed clearance before an agent sends funds.' It specifies the action of obtaining a security verdict for a payment, distinctively positioning itself among siblings as a pre-payment clearance check. The verb 'Give' plus inputs makes it concrete.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('before moving real money') but does not explicitly state when to use or avoid this tool versus siblings like onyx_approval_guard or onyx_tx_guard. No alternatives or exclusions are mentioned, leaving the agent to infer.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_signature_guardAInspect
Pre-signature firewall for OFF-CHAIN drains — the check before your agent signs an EIP-712 typed-data message (Permit, Permit2, Seaport order). These drain a wallet with no on-chain approval: the signature itself is the authorization. Give the typed-data; Onyx identifies what it authorizes, flags unlimited token-permit values, EOA/unverified spenders, NFT-order signatures, and bad deadlines, and returns a SIGNED ALLOW/REVIEW/BLOCK + plain-English explanation. Covers the #2 wallet-drain vector that on-chain tx checks miss entirely. (price: $0.03 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| typed_data | Yes | The full EIP-712 typed-data object the agent is about to sign: {domain, primaryType, types, message}. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses behavior: it checks for unlimited token-permit values, EOA/unverified spenders, NFT-order signatures, and bad deadlines. It returns a signed ALLOW/REVIEW/BLOCK plus plain-English explanation. It also notes the cost ($0.03 USDC) and that it is a check, not a signing action. This is comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is moderately lengthy but efficient. It is front-loaded with the purpose and context, followed by details of what it checks and returns. Every sentence adds value, though minor trimming could improve conciseness. Overall it is well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description clearly explains the output format (ALLOW/REVIEW/BLOCK plus explanation). The single input parameter is well-documented in the schema. The description covers the problem domain and expected usage comprehensively, making the tool self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter 'typed_data', and the schema already provides a clear description of the expected object structure. The tool description does not add additional meaning beyond what the schema offers, so a baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a pre-signature firewall for EIP-712 typed-data messages, specifying types like Permit, Permit2, and Seaport orders. It explains the tool's function: identifying what a signature authorizes, flagging risks, and returning an ALLOW/REVIEW/BLOCK verdict. This is highly specific and distinguishes it from sibling tools such as onyx_tx_guard or onyx_approval_guard.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use: before signing an off-chain EIP-712 typed-data message. It contrasts with on-chain tx checks that miss this vector, making the context clear. It does not explicitly name alternatives among siblings, but the description effectively narrows the use case.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_skill_bundleAInspect
Plan a multi-tool agent workflow under one x402 budget cap. Given a list of tool endpoints (any x402 server) and a max-spend cap, returns: unified cost preview (sum of declared prices), per-step prerequisites, estimated total settlement count, and whether the bundle fits the cap. v1 = analysis card (free); v2 = actually brokers settlement. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| tools | Yes | List of tools to bundle. Each: {endpoint_url, description, depends_on (optional)}. | |
| max_spend_usdc | Yes | Bundle budget cap in USDC. Bundle is rejected if sum(prices) > cap. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries the full burden. It discloses the tool is free ($0 USDC, tier: free) and differentiates v1 (analysis) from v2 (settlement). However, it does not elaborate on what 'brokers settlement' entails (e.g., transactions, permissions), leaving some behavioral ambiguity.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, front-loading the core purpose and outputs. Every sentence adds essential information without redundancy. Parenthetical price/tier is efficiently appended.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, so the description must explain return values. It lists four output components (cost preview, prerequisites, settlement count, bundle fit) and versioning. However, it lacks specifics on output format (e.g., JSON structure), which could be clarified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds context beyond schema by explaining the output semantics (unified cost preview, prerequisites, etc.) and versioning, which helps the agent understand the tool's value beyond parameter definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Plan a multi-tool agent workflow under one x402 budget cap.' It specifies inputs (list of tool endpoints, max-spend cap) and outputs (unified cost preview, per-step prerequisites, estimated total settlement count, bundle fit). It also distinguishes between v1 and v2, making the function specific and distinct from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for budget-constrained workflow planning but does not explicitly state when to use this tool versus alternatives like onyx_x402_simulate or onyx_x402_facilitators. It mentions v1 (analysis card) and v2 (brokers settlement) but lacks clear 'when-not-to-use' or alternative guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solana_jupiter_quoteAInspect
Best-route swap quote on Solana via Jupiter aggregator. Pass inputMint + outputMint + amount (in input mint's smallest units) and get the best route across all Solana DEXes (Orca, Raydium, Meteora, Phoenix, Lifinity, etc.) with price impact, expected output, intermediate hops, and slippage. Use BEFORE every Solana swap to lock execution price. Cheaper than every alternative — Jupiter's API is free but requires no API key tracking; we charge $0.001 USDC per quote with no signup. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| amount | Yes | Amount in smallest units of input_mint (string to avoid float precision) | |
| input_mint | Yes | Input token SPL mint (base58) | |
| output_mint | Yes | Output token SPL mint (base58) | |
| slippage_bps | No | Max slippage in basis points (50 = 0.5%, default 50) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses the tool's nature (quote only, not execution), mentions price impact, expected output, intermediate hops, and slippage. Also explains pricing ($0.001 USDC). Does not cover rate limits or error behavior, but sufficient for a simple quote tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, efficiently conveying purpose, parameters, usage guidance, and pricing. Front-loaded and no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description mentions expected output (price impact, expected output, intermediate hops, slippage). Adequately covers the tool's functionality for an AI agent to use it correctly. Could detail error cases or response structure, but sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with all parameter descriptions. Description adds minimal extra: confirms amount is in smallest units and mentions slippage_bps default. Baseline 3 as description adds little beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides the best-route swap quote via Jupiter aggregator on Solana. It lists supported DEXes and distinguishes itself from sibling tools (e.g., onyx_base_tx_decode, onyx_solana_tx_explainer) by focusing specifically on quote retrieval.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use BEFORE every Solana swap to lock execution price', giving clear context. Mentions pricing comparison to alternatives, but does not explicitly state when not to use or name specific alternatives (aside from implied cheaper).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solana_token_metadataAInspect
Resolve name + symbol + decimals + total supply for any SPL token on Solana mainnet. Reads the SPL Mint account directly + derives the Metaplex metadata PDA for human-readable name/symbol. Pairs with onyx_solana_token_risk_scan for full pre-trade safety. Cheaper than OATP ($0.001) and Helius ($0.001 + API key) — Onyx uses free public RPC and bills only the agent's wallet. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| mint | Yes | base58-encoded SPL mint address |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It explains reading SPL Mint account directly, deriving Metaplex metadata PDA, and billing details. No side effects or contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences: purpose, method, and pricing/comparison. No wasted words, front-loaded with core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers purpose, method, pricing, and pairing. Complete enough for an agent to select and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Single parameter 'mint' described as 'base58-encoded SPL mint address' in schema. Description adds meaning by explaining it reads the Mint account and derives Metaplex metadata, enriching context beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Explicitly states 'Resolve name + symbol + decimals + total supply for any SPL token on Solana mainnet.' Verb 'resolve' and resource 'SPL token metadata' are clear. Distinguishes from sibling 'onyx_solana_token_risk_scan' by stating they pair for pre-trade safety.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context: 'Pairs with onyx_solana_token_risk_scan for full pre-trade safety.' Also compares pricing to alternatives. Lacks explicit when-not-to-use or alternative sibling suggestions, but context is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solana_token_risk_scanAInspect
Rug-vector risk scan for any SPL token on Solana mainnet. Checks mint authority (active = can mint unlimited supply), freeze authority (active = can freeze any holder's wallet), top-10 holder concentration (whale risk), supply rationality, and pump.fun bonded/unbonded state. Returns 0-100 risk score + verdict (safe/caution/high_risk/likely_rug) + ranked risk_factors. Designed for memecoin/sniper/MEV agents that need a sub-second pre-trade gate. OATP charges $0.50 for the same primitive — Onyx is half-price, no API key, USDC-direct. (price: $0.25 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| mint | Yes | base58-encoded SPL mint address |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations exist, so the description carries full burden. It discloses the specific checks performed and return structures (score, verdict, risk_factors). It highlights performance (sub-second) and pricing (no API key). Missing potential error modes or limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that efficiently conveys key information. It is not overly long, and each sentence adds value. Could be structured into bullets for readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description compensates for the missing output schema by explicitly listing outputs (risk score, verdict, risk factors). It covers the main checks. However, it lacks details on scoring methodology or potential failure modes, which would be useful for an agent to interpret results.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline 3. The description does not add additional meaning for the 'mint' parameter beyond the schema's base58-encoded address description. It provides context about usage but not param-specific semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it performs a rug-vector risk scan for SPL tokens on Solana, listing specific checks (mint authority, freeze authority, concentration, supply rationality, pump.fun state) and outputs (risk score, verdict, risk factors). This is distinct from sibling tools like onyx_solana_token_metadata.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly targets memecoin/sniper/MEV agents needing a sub-second pre-trade gate, providing clear context. However, it does not mention when not to use it or compare with other risk tools among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solana_tx_explainerAInspect
Decode a Solana mainnet transaction into a human-readable summary. Returns a one-line plain-English description (SPL transfers, swaps, stake ops, NFT moves), parsed token-balance pre/post per account, SOL-balance deltas, programs invoked, compute units used, and fee. Use when a trading agent needs to verify a Solana tx actually did what it claims, or when a wallet agent needs to explain an action to its user. Direct equivalent of OATP's $0.10 service (1,350+ unique paying agents) at half the price, no API key, x402-native. (price: $0.05 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| signature | Yes | base58-encoded Solana tx signature (~88 chars) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses the tool's behavior: returns a summary, parsed token balances, SOL deltas, programs, compute units, and fee. Also mentions pricing and network specificity. It doesn't explicitly state it's read-only, but the nature of summarizing implies non-destructive operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three parts: core function and output, use cases, and pricing comparison. It front-loads the purpose. Slightly longer due to pricing detail, but every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Without an output schema, the description thoroughly explains what the tool returns: one-line description, token balances, SOL deltas, programs, compute units, fee. Also includes pricing and tier. This is complete for a decode/summarize tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (single parameter 'signature' with description 'base58-encoded Solana tx signature (~88 chars)'). The description adds that the transaction must be from Solana mainnet, but the schema already specifies 'Solana tx signature'. Baseline 3 is appropriate as the description adds little beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool decodes a Solana mainnet transaction into a human-readable summary, listing specific output components (SPL transfers, swaps, token balances, etc.). It distinguishes from siblings by being Solana-specific and providing a summary, differentiating from other tx-related tools like onyx_base_tx_decode.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives clear use cases: 'when a trading agent needs to verify a Solana tx actually did what it claims, or when a wallet agent needs to explain an action to its user.' It provides context but does not explicitly exclude alternatives or mention when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solana_wallet_activityAInspect
Recent on-chain activity for any Solana wallet. Returns the last N signatures (default 25, max 100) with slot, block_time, status, fee, and best-effort program/action classification (swap, transfer, stake, NFT). Designed for whale-watching, copy-trading, and risk-monitoring agents that need a sub-second feed without managing their own RPC. Cheaper than Helius webhooks ($25/mo) and Birdeye wallet-portfolio ($0.002 + API key). x402-direct, no signup. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Number of recent signatures (1-100, default 25) | |
| wallet | Yes | base58-encoded Solana wallet address |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It explains what is returned, mentions 'best-effort classification', and notes performance (sub-second feed). No destructive actions implied. Sufficient for a read-only query.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is reasonably concise and front-loaded with purpose and output. Includes marketing comparisons which add some length but provide useful context. Could be slightly tighter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 2 params and no output schema, it covers purpose, output details, use cases, performance, and cost. Missing error handling or pagination but sufficient for typical use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description does not add significant meaning beyond the schema; it restates default and max for limit, which are already in the schema. No additional parameter context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Recent on-chain activity for any Solana wallet' and specifies returns signatures with slot, block_time, status, fee, and program/action classification. Distinguishes from siblings like token metadata or tx explainer.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says designed for whale-watching, copy-trading, risk-monitoring. Provides clear use cases but does not explicitly exclude alternatives or mention when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_token_metadataAInspect
ERC-20 token metadata lookup on Base mainnet: name, symbol, decimals, and total supply for any contract address. Use before transacting with a token agents discover at runtime — confirms the contract is a real ERC-20 and resolves human-readable identity. Reads via Base public RPC, ~150-300ms typical. Pairs with onyx_base_tx_decode for full token-flow context. No vendor key needed. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | 0x-prefixed ERC-20 contract address on Base |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that reads are via Base public RPC, typical latency of 150-300ms, no vendor key needed, and mentions pricing. It implicitly indicates read-only behavior through terms like 'lookup' and 'confirms,' providing sufficient transparency for safe invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: purpose first, then usage context, performance, pairing, and pricing. Every sentence serves a distinct purpose with no redundancy, making it highly concise and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool with no output schema, the description covers all necessary information: what it does, when to use it, performance characteristics, external dependencies, and pricing. It also references a companion tool, making the context complete for the agent's decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the single 'address' parameter already described as '0x-prefixed ERC-20 contract address on Base.' The description repeats this same information, adding no additional semantic value beyond what the schema provides, so the baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as an ERC-20 token metadata lookup on Base mainnet, listing specific fields (name, symbol, decimals, total supply) and the purpose: confirming a contract is a real ERC-20 and resolving human-readable identity. It distinguishes itself from siblings like onyx_base_tx_decode by noting its use before transacting and pairing context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends use before transacting with a token discovered at runtime and mentions pairing with onyx_base_tx_decode for full context. While it doesn't state when not to use it, the guidance is clear and practical for the intended scenario.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_track_recordAInspect
Onyx's measured precision — the proof no other x402 security tool can show. Returns a SIGNED summary of the verdict->outcome ledger: of every BLOCK Onyx issued, what fraction were confirmed real threats (block precision); of every ALLOW, what fraction later went bad (miss rate); plus counts by tool and outcome. Built from real reported outcomes via onyx_outcome_report. Free and public — this is how you verify Onyx is calibrated, not just confident. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| tool | No | Optional. Restrict the track record to one tool (e.g. onyx_tx_guard). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It reveals the tool is read-only, free, and public, but lacks explicit statements about idempotence, authentication needs, or rate limits. The marketing language reduces clarity on behavioral traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is overly verbose with marketing fluff like 'Onyx's measured precision — the proof no other x402 security tool can show.' It could be condensed to one or two factual sentences without losing meaning.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one optional parameter and no output schema, the description adequately explains what it returns (signed summary, precision, miss rate, counts) and data source. It is sufficient for an agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'tool' is described as optional with an example. Schema coverage is 100%. The description adds context by mentioning 'counts by tool and outcome', which explains the parameter's purpose. No further detail needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns a signed summary of verdict-to-outcome ledger, including block precision, miss rate, and counts. It distinguishes itself as the tool for verifying Onyx calibration, setting it apart from sibling tools that focus on individual actions or reports.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when wanting to verify Onyx's calibration and notes it is free and public. However, it does not explicitly state when to use this tool versus alternatives, nor does it provide when-not-to-use conditions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_tx_guardAInspect
Pre-payment security firewall. Give the recipient address your agent is about to pay (Base); get a SIGNED ALLOW/REVIEW/BLOCK verdict + risk score from real on-chain checks: EOA-vs-contract, contract code/verification, account age (tx count), funding history, burn/null-address guard, and sink/honeypot heuristics. Catches paying a brand-new, unverified, or drain-shaped recipient BEFORE the money leaves. Never guesses — every field is observed on-chain and Ed25519-signed. (price: $0.10 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | 0x recipient address your agent is about to send funds to (Base mainnet). | |
| amount_usdc | No | Optional amount about to be sent (USDC). Larger amounts raise the review threshold. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description fully discloses behavior: it performs on-chain checks (EOA vs contract, account age, funding history, burn/null-address, sink/honeypot heuristics) and returns a signed verdict. It explicitly states 'Never guesses' and that every field is observed on-chain and Ed25519-signed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is moderately concise, effectively front-loading the purpose and then listing the checks. Each sentence adds value, though some redundancy exists (e.g., 'pre-payment' and 'before the money leaves'). Slight improvement possible by trimming repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, the description adequately explains the return value (signed allow/review/block verdict + risk score). It also includes pricing and tier information. The tool's complexity is high, but the description covers all necessary aspects for agent selection and invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already describes both parameters (address and amount_usdc) with 100% coverage. The description adds no additional semantic detail beyond the schema, only reinforcing the context of 'recipient address your agent is about to pay (Base)'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose as a 'Pre-payment security firewall' that provides a signed verdict and risk score for a recipient address. It distinguishes itself from sibling tools by focusing on on-chain security checks before payment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on when to use: before paying a recipient address on Base to get a signed allow/review/block verdict. It does not explicitly state when not to use or list alternatives, but the purpose is well-scoped.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_tx_preflightAInspect
Universal pre-sign firewall — the check before your agent signs ANY transaction. Give the tx (to, data, value); Onyx decodes the 4-byte selector + args, tells you in plain terms what it does, and flags the wallet-drain patterns: unlimited approve(), setApprovalForAll (blanket NFT approval), transfers to fresh/EOA recipients, raw ETH sends to unknown addresses, and calls to unverified targets. Returns a SIGNED ALLOW/REVIEW/BLOCK + human explanation. The single highest-frequency safety gate an on-chain agent has — every signed tx should pass through it. (price: $0.10 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | The transaction's `to` address (target contract or recipient). Required. | |
| data | No | The transaction calldata (0x-hex). Omit/empty for a plain ETH transfer. | |
| value_wei | No | Optional. ETH value being sent, in wei (base-10 string). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden. It discloses that the tool decodes the 4-byte selector and args, identifies wallet-drain patterns, and returns a verdict (ALLOW/REVIEW/BLOCK) with explanation. It also mentions pricing ($0.10 USDC) and tier (metered). It does not cover edge cases like unrecognized selectors, but overall transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at ~100 words, front-loaded with the core purpose, and efficiently covers functionality, patterns flagged, and return type without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and no output schema, the description adequately explains the output (SIGNED ALLOW/REVIEW/BLOCK + human explanation). It covers the main behavior and distinguishes it from many transaction-related siblings. Minor omission: no details on explanation structure or verdict triggers.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear parameter descriptions. The description reinforces the three parameters (to, data, value_wei) in context but adds no new constraints or formatting details beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool as a 'Universal pre-sign firewall' that checks transactions before signing, decodes them, and flags dangerous patterns like unlimited approve and blanket NFT approvals. It distinguishes from siblings (e.g., onyx_base_tx_decode) by emphasizing its safety focus and specific patterns detected.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly says 'the check before your agent signs ANY transaction' and recommends every signed tx should pass through it. This provides clear context for when to use it, though it does not explicitly exclude alternative tools or describe when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_parseAInspect
Parse any URL into structured components: scheme, host, port, path, query params (as both raw and decoded list), fragment, userinfo. Use when an agent needs to inspect, modify, or validate a URL — change a query param, strip tracking, normalize for caching. Stdlib only, no network calls, <1ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to parse |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that it uses stdlib only, makes no network calls, and completes in <1ms, along with pricing and tier. No annotations were provided, so the description fully handles transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences: purpose, use cases, and behavioral/pricing info. Every sentence adds value, front-loaded with core function, no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one required parameter and no output schema, the description covers purpose, usage context, behavioral traits, and output structure adequately. It is fully complete for the agent's needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one param 'url' described as 'URL to parse'. The description adds value by specifying the output structure (scheme, host, port, etc.), which is not in the schema due to missing output schema. This exceeds the baseline of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specifically states 'Parse any URL into structured components' with a clear verb and resource, and lists the output components, distinguishing it from sibling tools like onyx_url_text and onyx_url_unshorten.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides use cases ('inspect, modify, or validate a URL — change a query param, strip tracking, normalize for caching'), guiding the agent on when to invoke. Does not explicitly state when not to use or list alternatives, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_textAInspect
Fetch any public URL and return the readable text content stripped of HTML/scripts/styles. Use when an agent needs to reason over a web page without rendering a browser — docs pages, articles, search-result snippets, GitHub READMEs. Returns plain text + page title + word count + final URL after redirects. Capped at 200kB output to keep token costs predictable. ~150-800ms typical depending on origin server. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | HTTPS URL to fetch | |
| max_chars | No | Truncate output (max 200000) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: returns plain text, page title, word count, final URL after redirects; output capped at 200kB; typical latency 150-800ms; price and tier mentioned. This covers key behavioral traits beyond what annotations would provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: first sentence states the core action, followed by usage context, return details, and performance/pricing. Every sentence adds value with no redundancy, making it easy to scan.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description covers core aspects: what it does, return values, limitations, performance, and pricing. It lacks explicit error handling details (e.g., unreachable URL), but for a simple fetch tool this is a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds minimal extra value by mentioning the output cap (200kB) in relation to max_chars, but the schema already describes max_chars as 'Truncate output (max 200000)'. No significant addition beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches any public URL and returns readable text content, specifying what is stripped (HTML/scripts/styles) and what is returned (plain text, page title, word count, final URL). It provides specific use cases (docs pages, articles, etc.), distinguishing it from siblings like onyx_url_unshorten.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description guides when to use the tool: 'Use when an agent needs to reason over a web page without rendering a browser' and lists example scenarios. It doesn't explicitly state when not to use it, but the sibling tool onyx_url_unshorten provides an implied alternative for URL expansion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_unshortenAInspect
Follow HTTP redirects on any URL and return the final destination + the full redirect chain. Use when an agent encounters a bit.ly/t.co/lnkd.in/ shortened link and needs to know where it actually goes before clicking. Returns each hop's status code, location, and final URL with status. Cap of 10 hops to prevent loops. ~100-400ms typical. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to unshorten |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description bears full responsibility. It discloses the redirect-following behavior, return details (each hop's status code, location, final URL), a cap of 10 hops, and typical latency. It does not mention error handling, but the transparency is high.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: two sentences plus performance/pricing info. No redundant words, and the main action is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple single-parameter tool and no output schema, the description fully covers purpose, usage, behavioral constraints (10 hops), return content (hops, final URL), and performance expectations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with one parameter. The description adds meaningful context beyond the schema's minimal 'URL to unshorten' by explaining the unshortening process and typical use case.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool follows HTTP redirects and returns the final destination with the full redirect chain. It distinguishes itself from sibling tools like onyx_url_parse by explicitly mentioning shortened links (bit.ly/t.co/lnkd.in) and the need to know the actual destination.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a clear use case: 'when an agent encounters a bit.ly/t.co/lnkd.in/ shortened link.' It implies when to use but does not explicitly state when not to use or mention alternative tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_usage_rightsAInspect
Mint a signed Output Usage-Rights Envelope for an agent-produced artifact: a portable, Ed25519-signed declaration of what the buyer may do with it (resale, redistribution, derivatives, model training, cache TTL). Bind it to the artifact by hash and optionally to an x402 payment. Verify free with onyx_attestation_verify. Rights travel with the data — any downstream holder can check the terms offline. (price: $0.01 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| output | No | Or: the artifact itself (object or string); it is hashed with sha256 over its JCS-canonical form | |
| rights | Yes | Rights grid. Keys: resale, redistribute, derivatives, retrain — each 'allow' | 'deny' | 'with-attribution' | 'contact-licensor'. Plus optional cache_ttl_seconds (int). | |
| licensee | No | Optional: who the rights are granted to (wallet/agent id); omit for bearer terms | |
| licensor | Yes | Issuer of the rights (name, domain, or wallet) | |
| expires_at | No | Optional: unix time the grant lapses | |
| output_hash | No | sha256:<hex> of the artifact the rights apply to | |
| payment_ref | No | Optional: x402 tx hash / receipt id binding rights to the purchase |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses the tool's behavior: it creates an Ed25519-signed envelope, binds to artifact hash and optional payment, and allows offline verification. It also reveals cost and tier, ensuring no hidden surprises.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured paragraph that front-loads the main action, then explains binding, verification, portability, and cost. Every sentence is informative and necessary, with no waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (7 parameters, nested object, no output schema), the description effectively conveys what the tool does and its key aspects. It lacks explicit output format details but the context of a portable signed envelope is clear.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds value by explaining the overall purpose and providing examples for the rights grid (allow/deny/with-attribution/contact-licensor), which enhances understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it mints a signed Output Usage-Rights Envelope for an agent-produced artifact. It uses specific verbs ('mint', 'bind', 'verify') and identifies the unique resource, distinguishing it from siblings like onyx_attestation_verify.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear guidance on when to use the tool (to create usage rights) and mentions a complementary tool (onyx_attestation_verify) for verification. It does not explicitly state when not to use, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_user_agent_parseAInspect
Parse any HTTP User-Agent string into a structured record: browser name/version, OS name/version, device type (desktop/mobile/tablet/bot), rendering engine. Use for analytics, fraud scoring, or routing logic based on client capabilities. Stdlib regex only — works offline, no external lookups. <2ms. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| user_agent | Yes | User-Agent header value |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description bears full responsibility. It discloses performance (<2ms), pricing ($0.001 USDC), and runtime behavior (regex-only, no external calls). It does not mention auth or side effects, but as a read-only parse, that is implied and acceptable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences: the first efficiently states the verb and object, the second provides use cases and technical details. No fluff, every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool with no output schema, the description covers purpose, internal mechanism (regex offline), performance, pricing, and use cases. No gaps remain for an agent to understand when and how to use this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema covers 100% of parameters with a single 'user_agent' field described as 'User-Agent header value'. The description adds no additional semantic detail beyond the schema, meeting the baseline for a tool with full schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description states a specific verb ('Parse') and resource (HTTP User-Agent string), lists output fields (browser, OS, device type, rendering engine), and gives explicit use cases (analytics, fraud scoring, routing). This clearly differentiates from sibling tools, which cover diverse domains like DNS, email, or blockchain.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for use ('analytics, fraud scoring, routing logic') and technical constraints ('Stdlib regex only — works offline, no external lookups'). It lacks explicit when-not-to-use or named alternative tools, but given the unique function among siblings, this is acceptable.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_verify_explainAInspect
Diagnose a failing x402 v2 /verify. Decodes a captured X-PAYMENT header, runs 10 rules (decode, schema, network/asset/payTo match, value sufficiency, EIP-3009 timing, signature shape, scheme) against expected paymentRequirements, and returns the FIRST failing rule with a plain-English fix. Catches the common case where the facilitator returns bare HTTP 402 (no body) because of JWT or schema fail upstream of the verifier. Stdlib-only, no install, no network. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| now_unix | No | Override current unix time for replay/CI use. Defaults to now. | |
| x_payment_b64 | No | Base64-encoded X-PAYMENT (v2 PAYMENT-SIGNATURE) header value. Optional if payment_payload provided. | |
| payment_payload | No | Decoded payment payload dict. Use this OR x_payment_b64. | |
| payment_requirements | Yes | Expected paymentRequirements from the 402 challenge ({scheme, network, payTo, asset, maxAmountRequired, maxTimeoutSeconds, ...}). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes behavior: decodes header, runs 10 rules, returns first failing rule with fix. Discloses stdlib-only, no network, free tier. No annotations to contradict; covers key traits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences without filler. Front-loaded purpose, then rules, then specific caveat. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers what it does, what it returns (first failing rule), and a common scenario. No output schema but return behavior implied. Adequate for a diagnostic tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; description adds context linking parameters to rules (e.g., payment_requirements used for matching). Does not repeat schema but provides useful relational info, just above baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool diagnoses a failing x402 v2 /verify by decoding headers and running 10 rules. Differentiates from sibling tools like onyx_x402_simulate by focusing on debugging the exact failure.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides context on when to use (failing /verify) and highlights a common case (bare 402 from JWT/schema fail). Lacks explicit 'do not use' guidance but is clear enough for most scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_whoisAInspect
Domain whois via the modern RDAP protocol — registrar, creation/expiry dates, nameservers, registrant country, status flags. Use to vet a domain before agents transact with it (newly registered = higher fraud risk), check trademark conflicts, or confirm ownership transfer. Powered by rdap.org (no API key, free tier). ~200-500ms typical. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain name, e.g. example.com |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses key behavioral traits: it uses the RDAP protocol via rdap.org (free tier, no API key), typical response time of 200-500ms, and cost of $0.001 USDC per call. This goes beyond the schema, though rate limits are not mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, well-structured, and front-loaded with the core function. Every sentence adds value (purpose, data fields, use cases, source, performance, cost) without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers all relevant aspects: what it does, what data it returns, when to use it, performance, cost, and external source. It is complete enough for an agent to decide to invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already fully describes the single 'domain' parameter with an example. The description adds context about usage but no additional parameter-level details, so it adds marginal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs domain whois via RDAP and lists the specific data returned (registrar, dates, nameservers, etc.). It distinguishes from sibling tools like onyx_dns_lookup by focusing on domain registration information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear use cases: vetting domains for fraud risk, checking trademark conflicts, and confirming ownership transfer. It does not explicitly compare to siblings but the distinct function implies when to use this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_chain_pickerAInspect
Pick the optimal chain for an x402 USDC payment. Given target amount + agent's available chains, ranks by facilitator support, live gas, finality time, and USDC contract maturity. Returns ranked list with explanations. Free tier — agents shouldn't hardcode 'base' when their wallet has L2 options. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| amount_usdc | Yes | Target USDC amount the agent wants to settle. | |
| production_only | No | If true, exclude testnets from ranking. | |
| available_chains | No | Chains the agent's wallet supports. Subset of: base, base-sepolia, optimism, arbitrum, polygon. Default = ['base', 'base-sepolia']. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description bears full responsibility for disclosure. It reveals that the tool ranks chains using specific criteria, returns a ranked list with explanations, and is free tier. It does not mention auth needs, rate limits, or side effects, but given the non-destructive, read-only nature inferred, the behavioral context is adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: two sentences that efficiently convey purpose, input, output, and a usage hint. It is front-loaded with the primary action, and every sentence adds essential information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 3 parameters, 100% schema coverage, and no output schema, the description sufficiently covers what an agent needs: input requirements (target amount, optional chains, testnet filter), behavior (ranking criteria), and output (ranked list with explanations). The free tier note and the hint about not hardcoding complete the context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so parameters are documented. The description adds value by explaining the tool's ranking logic and output (ranked list with explanations), which the schema does not cover. It also clarifies the default for available_chains and the role of production_only, going beyond the schema's basic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Pick the optimal chain for an x402 USDC payment.' It specifies the ranking criteria (facilitator support, gas, finality, USDC contract maturity) and output (ranked list with explanations). This distinguishes it from sibling tools like onyx_x402_facilitators or onyx_solana_jupiter_quote, which cover different aspects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context: 'Given target amount + agent's available chains' and includes a practical tip: 'agents shouldn't hardcode 'base' when their wallet has L2 options.' It implicitly tells when to use (when selecting a chain for x402 payment) but lacks explicit when-not-to-use or alternative tool references.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_demo_walletAInspect
Dev-sandbox wallet helper for x402 testing. Generates a deterministic ephemeral Sepolia wallet (or accepts your address), reports ETH + USDC Sepolia balances, points to the Circle USDC Sepolia faucet, and emits a copy-paste env config for x402 client SDKs. SANDBOX ONLY — generated keys are deterministic and MUST NOT receive real value. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| seed | No | String seed for deterministic sandbox address generation. Same seed always yields same address. NOT cryptographically secure; sandbox only. | |
| existing_address | No | Existing 0x-prefixed address to check. If omitted, generates an ephemeral sandbox address from `seed`. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It properly discloses that keys are deterministic and not secure, that it generates ephemeral wallets, and that it reports balances and emits config. No hidden side effects or destructive actions are mentioned, and the sandbox-only restriction is clear.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-structured sentence that front-loads the purpose and lists all key features. Every part adds value, with no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple demo tool with no output schema, the description covers inputs, behavior, limitations (not for real value), output type (env config), and cost. It is complete for the agent to understand and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema descriptions cover both parameters fully (100% coverage). The description adds behavioral context: seed determines address, and if omitted, generates one. This adds meaning beyond the schema, justifying a score above baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a 'Dev-sandbox wallet helper for x402 testing' and lists specific actions: generate deterministic wallet, report balances, point to faucet, emit env config. It distinguishes itself from siblings by focusing on wallet operations for testing, unlike other x402 tools like simulation or lookup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly warns 'SANDBOX ONLY — generated keys are deterministic and MUST NOT receive real value,' and mentions free cost. It implies usage for x402 testing, though it does not explicitly say 'use this when you need a sandbox wallet for x402 testing' or compare with alternatives among siblings, which are not direct competitors.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_error_explainAInspect
Plain-English explainer for any HTTP error in an x402 / MCP flow. Pass the status code + response body (or headers), get back a diagnosis with specific cause and actionable fix. Covers FastAPI validation, OAuth2 DCR failures, EIP-712 signature errors, x402 spec error codes, and Coinbase facilitator-specific responses. Free tier — local logic, no network calls. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| status_code | Yes | HTTP status code returned by the server (e.g. 402, 422, 401). | |
| response_body | No | Raw response body as text. JSON is auto-detected and parsed. | |
| request_summary | No | Optional. One-line summary of what the caller was trying to do. | |
| response_headers | No | Optional. HTTP response headers, useful for 402 challenges that carry payment-required header. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It mentions local logic, no network calls, and covers various error types. However, it does not disclose limitations (e.g., might not cover all possible errors), side effects, or confidentiality of passed data. With no annotations, more transparency would be beneficial.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences, front-loaded with purpose. Every sentence adds value: explains what it does, how to use it, what it covers, and pricing. No extraneous words, and it is well-structured for quick comprehension.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has no output schema, so the description should cover what the output looks like. It says 'diagnosis with specific cause and actionable fix,' which is good. It also covers the error domains. However, it does not specify the output format (e.g., plain text vs. structured JSON). For a tool with no output schema, this is a minor gap. Score 4.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds: 'Pass the status code + response body (or headers),' which clarifies usage pattern. It also mentions optional request_summary. However, it does not add significant depth beyond the schema's parameter descriptions, so a score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Plain-English explainer for any HTTP error in an x402 / MCP flow.' It specifies the action (explain), the resource (HTTP errors in x402/MCP flow), and covers specific areas like FastAPI validation, OAuth2 DCR failures, etc. This distinguishes it from sibling tools which are mostly unrelated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description says 'Pass the status code + response body (or headers), get back a diagnosis...' This clearly indicates when to use the tool (when encountering an HTTP error in x402/MCP flow). It also mentions 'Free tier — local logic, no network calls,' which helps set expectations. However, it does not explicitly state when not to use or list alternatives, though given the sibling tools are different, this is not a major gap.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_facilitatorsAInspect
Directory of known x402 facilitators (Coinbase CDP, x402.org public, xpay.sh, Cronos, Faremeter, others), each with live reachability probe, supported networks, payment auth scheme (JWT / open), and median latency. Agents use this to choose where to route /verify and /settle calls. Free tier — refreshes per call, no API key. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | Filter facilitators that support this network (CAIP-2 like 'eip155:8453', or short form 'base', 'base-sepolia', 'solana'). Omit for all. | |
| include_inactive | No | Include facilitators that didn't respond on probe. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses it's free, refreshes per call, requires no API key, and performs live reachability probes. Implicitly read-only. Could be more explicit about non-destructive nature but overall adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is one paragraph but includes all key info: data fields, pricing, use case. Could be slightly more structured (e.g., separate sentences), but no wasted words and front-loaded with core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description lists the data points returned (reachability, networks, auth scheme, latency) which is sufficient. Explains refresh behavior and cost. Completeness is adequate for a simple directory tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so schema already documents both parameters. Description adds context like 'filter facilitators that support this network' and 'include ones that didn't respond', but largely restates schema info. Meets baseline without adding significant new meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it's a directory of x402 facilitators with specific data fields (live reachability, networks, auth scheme, latency). Distinguishes from sibling tools like onyx_x402_simulate and onyx_x402_spec_lookup by focusing on discovery and routing.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states agents use this to choose routing for /verify and /settle calls. Mentions free tier and refresh behavior. Does not explicitly say when not to use, but purpose is clear enough for an AI agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_indexer_healthAInspect
Is your x402 endpoint actually discoverable? Probes 6 indexers agents use (CDP discovery, Bazaar mirror, awesome-x402 README, awesome-mcp-servers README, x402scan, Smithery) and returns per-indexer presence + a single recommended action. Free tier — every paid-MCP-builder hits the same invisible-launch problem and this is the missing observability tool. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Domain or full URL to check (e.g. 'onyx-actions.onrender.com' or 'https://api.oatp.cc'). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses it probes external indexers and returns results, but it does not detail potential side effects, rate limits, or authentication needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively concise and front-loaded with a hook question. It includes a marketing note but overall uses few words effectively.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one param, no output schema), the description covers what it does and why it's useful. It does not detail output format but is sufficient for the intended use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the parameter description is already clear. The tool description does not add extra semantic information beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it checks if an x402 endpoint is discoverable by probing 6 specific indexers and returns per-indexer presence and a recommended action. It distinguishes from sibling tools by focusing on x402 indexer health.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use (to check indexer discoverability for x402 endpoints) and highlights the problem it solves, but it does not explicitly mention when not to use or provide alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_receipt_verifyAInspect
Verify an x402 USDC settlement on Base or Base Sepolia. Given a tx hash, decodes the USDC Transfer log and confirms (or refutes) a claim of the form: 'tx X moved $Y USDC from A to B'. Returns success status, actual decoded values, and a clear discrepancy report if any field doesn't match. Free tier — useful for agents reconciling spend and operators auditing inbound payments. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| network | No | Chain to query. Must match where the tx was mined. | base |
| tx_hash | Yes | 0x-prefixed 32-byte tx hash to verify. | |
| expected_to | No | Optional. Expected recipient address (0x...). If provided, verifier checks Transfer.to matches. | |
| expected_from | No | Optional. Expected sender address (0x...). If provided, verifier checks Transfer.from matches. | |
| expected_amount_usdc | No | Optional. Expected USDC amount (whole USDC, not atomic). If provided, verifier checks Transfer.value matches (within 0.000001 tolerance). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses the core behavior ('decodes the USDC Transfer log and confirms/refutes a claim'), the output format (success status, decoded values, discrepancy report), and that it's free. It doesn't mention side effects, but the tool is inherently read-only.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (3-4 sentences) and front-loaded with the core purpose, followed by details and use case. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description fully covers what the tool does, how it works (decoding log, returning discrepancy report), and its output. Without an output schema, it provides enough information for an agent to understand the response format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, so the description adds minimal new information beyond what the schema already provides. The schema already includes details like tolerance for expected_amount_usdc. Baseline of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Verify an x402 USDC settlement'), the specific resource (x402 receipt on Base/Base Sepolia), and the method (given a tx hash). It distinguishes itself from sibling x402 tools by focusing on receipt verification.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases ('agents reconciling spend and operators auditing inbound payments') and context (free tier). While it doesn't explicitly state when not to use or name alternatives, the sibling list includes other x402 tools that might be more suitable for other tasks, so inference is possible.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_simulateAInspect
Simulate an x402 v2 payment flow against any paid endpoint. Fetches the 402 challenge (or introspection card), parses paymentRequirements, generates a template X-PAYMENT payload with the exact fields an agent would need to sign (EIP-3009 authorization shape, validBefore window, asset address, recipient), and returns next-step guidance. Pure simulation — no keys, no signing, no payment. SSRF-hardened. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| method | No | GET hits the introspection card (works even on unpaid endpoints); POST hits the live 402 challenge (works on every paid endpoint). | GET |
| endpoint_url | Yes | Full URL of the paid endpoint to simulate against (e.g. https://onyx-actions.onrender.com/v1/onyx_aml_screen). | |
| signer_address | No | Address that would sign the EIP-3009 authorization. Used to fill the 'from' field in the template payment payload. Optional — defaults to 0x0...01. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully describes the tool's behavior: it fetches challenges, parses requirements, generates a template payload, and returns next-step guidance. It also mentions SSRF-hardening and pricing, providing comprehensive behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (four sentences) and front-loaded with the core purpose. Every sentence adds value: action, process, simulation emphasis, and security/pricing details. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has no output schema, but the description explains it returns 'next-step guidance' and a template payload. While it does not detail the output structure, it provides enough context for an agent to understand the simulation output. Minor ambiguity about 'next-step guidance' prevents a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description does not add substantial meaning beyond the schema's own parameter descriptions. It reinforces the overall flow but does not introduce new details about the parameters themselves.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool simulates an x402 v2 payment flow against any paid endpoint, using specific verbs and resources. It distinguishes itself from sibling tools like onyx_x402_facilitators and onyx_x402_spec_lookup by focusing on simulation, not facilitation or spec lookup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states 'Pure simulation — no keys, no signing, no payment', clearly indicating when not to use it (for actual payments) and its safe, non-transactional nature. However, it does not name alternative tools for different tasks.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_x402_spec_lookupAInspect
Quick-reference into the x402 v2 protocol spec. Look up an error string, HTTP status, header name, payload field, or feature keyword and get the relevant spec snippet plus a plain-English fix. Replaces 30 minutes of spec-grepping with a single call. Free tier — embedded knowledge, no network. (price: $0 USDC, tier: free)
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | What to look up: error string ('invalid_exact_evm_payload_authorization_valid_before'), header name ('PAYMENT-SIGNATURE'), status code ('402'), payload field ('nonce'), or feature keyword ('EIP-3009', 'CAIP-2', 'facilitator', 'JWT'). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavior. It states the tool is free, uses embedded knowledge (no network), and returns a snippet and fix. However, it does not describe behavior for unrecognized queries, error handling, or output format specifics, leaving some behavioral gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded with the core purpose. It includes a marketing claim ('Replaces 30 minutes...') and free-tier note, which are slightly extraneous but do not significantly bloat the text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has one parameter, no output schema, and no annotations. The description sufficiently covers input types, output nature (snippet + fix), and context (free, no network). It does not detail output format but is otherwise complete for its simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a rich description for the single parameter 'query', listing valid input types. The tool description recapitulates those types but adds no new semantic information beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: looking up x402 v2 protocol spec terms (error strings, HTTP status, header names, payload fields, feature keywords) and returning a spec snippet with plain-English fix. It is distinct from sibling x402 tools (facilitators, simulate) which are not reference lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description positions the tool as a quick reference to replace manual spec-grepping, implying use when needing spec information. It does not explicitly state when not to use or list alternatives, but the context is clear given sibling tools differ in function.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!
Your Connectors
Sign in to create a connector for this server.