onyx-mcp
Server Details
12 paid agent tools over x402 USDC on Base — captcha, SMS OTP, HLR, URL text, DNS, email, more.
- Status: Healthy
- Transport: Streamable HTTP
- Repository: dimitrilaouanis-tech/onyx-mcp
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.3/5 across all 29 tools scored.
Most tools have clearly distinct purposes (e.g., browser tools each handle a specific action, crypto tools separate decode/explainer). However, onyx_base_tx_decode and onyx_base_tx_explainer overlap slightly, and some browser tools (extract, screenshot) could be confused without careful reading. Overall, high disambiguation with minor potential ambiguity.
All tools follow the 'onyx_<domain>_<action>' pattern using snake_case, which is consistent. Some names like 'agent_signup_kit' or 'html_meta' deviate from the verb-final pattern, but the overall naming is predictable and readable.
With 29 tools, the server is large and covers many disparate domains (browser, crypto, email, DNS, utilities). This feels excessive for a coherent tool set, bordering on a grab bag. The high count reduces focus and makes tool selection harder for agents.
The tool set covers multiple domains but has notable gaps. For example, browser automation lacks scroll/wait, crypto only supports Base mainnet, and there is no email sending or advanced captcha support. However, each domain's core operations are present, making it functional but not comprehensive.
Available Tools
31 tools

onyx_agent_signup_kit
One-call signup-flow pre-flight for autonomous agents. Validates an email (syntax + DNS + disposable check), confirms the target domain resolves, optionally solves a captcha image, and optionally fetches an SMS OTP via real carrier SIM. Returns the consolidated pass/fail per step plus the OTP and captcha answer. Replaces four separate calls with one $0.058 USDC payment — saves agents ~7% vs unit pricing and a full round-trip per step. Ideal for browser agents at signup walls. Demo mode returns synthetic captcha + OTP so the payment loop is testable for free. (price: $0.058 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | Email to validate | |
| domain | No | Service domain to confirm resolves | |
| phone_number | No | Optional phone for SMS OTP | |
| captcha_image_b64 | No | Optional captcha base64 | |
| captcha_image_url | No | Optional captcha image URL |
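For concreteness, here is a minimal sketch of calling this tool over Streamable HTTP with the official MCP Python SDK. The server URL is a placeholder (the listing above does not show it), the argument values are illustrative, and x402 payment negotiation is omitted; per the description, demo mode returns a synthetic captcha and OTP, so the loop is testable without paying.

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

SERVER_URL = "https://example.com/mcp"  # placeholder; the real URL is not shown above

async def main() -> None:
    async with streamablehttp_client(SERVER_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Only `email` is required; each optional field enables one extra step.
            result = await session.call_tool(
                "onyx_agent_signup_kit",
                {
                    "email": "agent@example.com",
                    "domain": "example.com",
                    "phone_number": "+15551234567",
                    "captcha_image_url": "https://example.com/captcha.png",
                },
            )
            for block in result.content:
                print(getattr(block, "text", block))

asyncio.run(main())
```

The later sketches in this listing assume a connected `session` exactly like this one.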
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses payment ($0.058 USDC), demo mode with synthetic captcha/OTP, and return format. It lacks details on error handling or failure scenarios, but overall provides substantial behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively long but every sentence adds value. It is front-loaded with the core purpose and includes pricing and demo mode. Minor redundancy in explaining replacement of four calls, but overall concise for the amount of information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately covers return format (consolidated pass/fail per step, OTP, captcha answer). It addresses pricing, demo mode, and the tool's role as a bundle. Completeness is high for a tool of moderate complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant meaning: email validation includes syntax, DNS, disposable check; domain must resolve; captcha can be provided as base64 or URL; SMS OTP uses real carrier SIM. This goes well beyond the schema's brief descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'One-call signup-flow pre-flight for autonomous agents.' It lists specific actions (email validation, domain confirmation, captcha solving, SMS OTP) and distinguishes itself from sibling tools by bundling four separate calls into one.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states the ideal use case ('browser agents at signup walls') and the benefit over unit pricing. However, it does not state when to avoid using this tool or explicitly mention alternatives, though sibling tools are implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_agent_workflow
Run a multi-step workflow across Onyx tools in one paid call. Each step names a tool and its args; later steps can reference earlier outputs via {"$ref": "step_N.field"} or {"$prev": "field"}. Saves agents the round-trip + per-call gas of N separate x402 settles when they know the chain in advance — e.g. validate email → check domain DNS → solve captcha → fetch SMS OTP, all atomic. Stops on first step error and returns partial results. Cheaper than the unit-call sum because it bundles. (price: $0.020 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| steps | Yes | Ordered list of {tool, args} |
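The `$ref`/`$prev` syntax is the non-obvious part, so a short sketch helps. It assumes a connected `session` as above; the output field name (`domain`) and the `step_N` numbering base are assumptions, since the listing does not publish output schemas.

```python
# Validate an email, then confirm its domain resolves, in one atomic paid call.
steps = [
    {"tool": "onyx_email_validate", "args": {"email": "agent@example.com"}},
    # {"$prev": "field"} pulls from the immediately preceding step;
    # {"$ref": "step_N.field"} addresses any earlier step by number.
    {"tool": "onyx_dns_lookup", "args": {"host": {"$prev": "domain"}}},
]
result = await session.call_tool("onyx_agent_workflow", {"steps": steps})
# Stops on the first step error and returns partial results, per the description.
```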
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses key behaviors: stops on first error, returns partial results, cheaper than separate calls. No annotations provided, so description carries full burden and does well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, no fluff, front-loaded with purpose and key behavior. Could be slightly more structured (e.g., bullet for reference syntax), but efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Explains behavior well despite no output schema: atomicity, error handling, cost. Covers typical concerns for a workflow tool. Leaves some details to schema (step constraints), but sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds significant meaning beyond the input schema: explains the inner structure of steps (tool, args) and crucial reference syntax ($ref, $prev). Schema coverage is 100%, but description enriches it.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it runs a multi-step workflow across Onyx tools in one paid call. Gives an explicit example chain and distinguishes from individual sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Describes when to use (when chain known in advance) and the benefit (lower cost, atomicity). Implicitly tells when not to use (if dynamic decisions needed), though no explicit exclusion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_token_risk_scan
Risk-scan any ERC-20 token on Base mainnet. Returns ownership status (renounced or active owner address), mint authority (still mintable?), top-1 / top-10 holder concentration via balanceOf probes, contract age in days, basic honeypot signal (eth_call swapExactETHForTokens against Aerodrome to detect transfer blocks), and a 0-100 risk score with verdict (safe / caution / high_risk). Use before a trading agent buys a freshly minted token — saves blowing the entire position on a rug. Direct equivalent of OATP's Solana token_risk_scan ($0.50, 980 unique paying agents). Onyx ships at $0.25 — first on Base mainnet. (price: $0.25 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | 0x-prefixed ERC-20 contract address on Base |
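A hedged sketch of the pre-buy gate the description recommends, assuming a connected `session` as above. The response field names (`verdict`, `risk_score`) are inferred from the description text, not from a published output schema.

```python
import json

result = await session.call_tool(
    "onyx_base_token_risk_scan",
    {"address": "0x0000000000000000000000000000000000000000"},  # illustrative address
)
report = json.loads(result.content[0].text)  # assumes a JSON text payload
if report.get("verdict") in ("caution", "high_risk"):
    print("skipping buy, risk score:", report.get("risk_score"))
```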
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description details the checks performed (ownership, mint, concentration, age, honeypot via eth_call, risk score). It also mentions price and tier. Some limitations (e.g., only Base ERC-20) are implied but not exhaustive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a dense paragraph containing multiple pieces of information. While not overly verbose, it could be structured more concisely (e.g., bullet points) for faster agent parsing.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description adequately lists return values (ownership, mint, concentration, age, honeypot, risk score). It covers the tool's purpose and major capabilities, though exact output format is unspecified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear description of the 'address' parameter (0x-prefixed, ERC-20, Base). The description adds no further parameter-specific details beyond the schema, so baseline 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it risk-scans any ERC-20 token on Base mainnet, listing specific checks (ownership, mint, holder concentration, contract age, honeypot, risk score). It is distinct from all sibling tools, none of which perform risk scanning.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises use 'before a trading agent buys a freshly minted token' and mentions it saves losses. No explicit when-not-to-use or alternative tools, but the context is clear and the tool stands alone among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_decode
Fetch a Base mainnet transaction by hash and return a human-readable summary: from/to, value (ETH + USD-est), gas used, status, block, input data length, and the function selector decoded if it matches a known signature. Use when a trading agent needs to inspect a tx before or after settlement — pairs with onyx_token_metadata for full context. Reads from Base's public RPC (no key needed). Demo mode returns a synthetic record. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| tx_hash | Yes | 0x-prefixed Base tx hash |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description fully covers behavioral traits: it reads from Base's public RPC (no key needed), mentions demo mode returns synthetic record, and discloses pricing tier. This is comprehensive for a read-only tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences plus a brief note on demo and pricing. It is front-loaded with the core purpose, efficient with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input (single hash), no output schema, and no annotations, the description fully explains what the tool returns (human-readable summary with list of fields) and additional context (RPC source, demo mode, pricing). It is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has one parameter with a description '0x-prefixed Base tx hash,' and the description text mentions 'tx_hash' but adds no new semantics. Schema coverage is 100%, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches a Base mainnet transaction and returns a human-readable summary with specific fields (from/to, value, gas, status, etc.). It distinguishes itself from siblings by noting it pairs with onyx_token_metadata for full context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use when a trading agent needs to inspect a tx before or after settlement' and mentions pairing with onyx_token_metadata, providing clear context. It lacks explicit when-not-to-use but is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_explainer
Decode a Base mainnet transaction into a human-readable summary. Returns a one-line plain-English description of what happened (token transfers, swaps, approvals, contract deploys), ERC-20 transfer events with symbol resolution, balance changes per address, the function selector + decoded method name where it matches a known signature, gas used, fee in ETH, and tx status. Use when a trading agent needs to verify what a tx actually did before/after settlement, or when a wallet agent needs to explain a tx to its user. Direct equivalent of OATP's Solana tx_explainer ($0.10, 1,350 unique paying agents) — Onyx is the first to ship this on Base mainnet at $0.05. (price: $0.05 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| tx_hash | Yes | 0x-prefixed Base tx hash |
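A minimal sketch, assuming a connected `session` as above. The comment reflects the listed pricing: onyx_base_tx_decode ($0.002) for a quick field-level summary, this tool ($0.05) when plain-English narration with transfer events and balance changes is worth the premium.

```python
# Prefer onyx_base_tx_decode for a cheap field-level look; use the explainer
# when the agent must narrate what the tx actually did.
result = await session.call_tool(
    "onyx_base_tx_explainer",
    {"tx_hash": "0x" + "ab" * 32},  # illustrative 32-byte tx hash
)
```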
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description carries full burden. It discloses behavioral traits: returns plain-English summary, token events, balance changes, function signature, gas, fee, status. Also notes price and tier. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is four sentences, front-loaded with core purpose, then enumerates outputs, use cases, and a comparison. Each sentence is informative, though slightly verbose; still concise overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, description thoroughly covers input, return types, and use cases. It explains all relevant aspects for an agent to decide and invoke the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Single parameter tx_hash with schema coverage 100%. Description adds context 'Base mainnet transaction' and indicates what output to expect, but does not add significant semantics beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it decodes a Base mainnet transaction into a human-readable summary, listing specific outputs like ERC-20 transfers, function selector, gas, and status. It distinguishes from sibling tools by focusing on human-readable explanation versus raw decoding.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: verifying transactions for trading agents and explaining transactions to users. Does not mention when not to use or differentiate from onyx_base_tx_decode, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_base_tx_simulator
Simulate a Base mainnet transaction before sending it. Returns success/revert prediction, the revert reason if any, decoded return data, and an estimated gas figure. Use as a pre-flight check inside a trading agent's tool-call dispatcher — agents should simulate before signing to avoid paying gas on a doomed tx. Direct equivalent of OATP's Solana tx_simulator ($0.20, 1,304 unique paying agents) — Onyx is the first to ship this on Base mainnet at $0.10. Read-only — never submits. (price: $0.10 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| data | No | Hex-encoded calldata | 0x |
| block | No | Block tag (latest/pending/0x...) | latest |
| value_wei | No | ETH wei to send | 0 |
| to_address | Yes | 0x-prefixed contract or wallet | |
| from_address | No | 0x-prefixed sender | |
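A sketch of the pre-flight check, assuming a connected `session` as above. `to_address` is the only required field; the others fall back to the defaults in the table. The integer type for `value_wei` is an assumption.

```python
result = await session.call_tool(
    "onyx_base_tx_simulator",
    {
        "to_address": "0x4200000000000000000000000000000000000006",  # WETH on Base
        "data": "0xd0e30db0",  # deposit() selector, illustrative calldata
        "value_wei": 1000000000000000,  # 0.001 ETH; integer wei assumed
    },
)
# Read-only: the description guarantees nothing is ever submitted on-chain.
```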
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses that the tool is read-only and never submits a transaction. It also details the return values (success prediction, revert reason, decoded data, estimated gas) and mentions pricing, ensuring full behavioral transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences front-load the purpose, returns, and usage context, followed by a comparison and pricing note. Every sentence adds value, with no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters, no output schema, and no annotations, the description covers purpose, return values, usage guidance, and safety (read-only). It is complete and leaves no major gaps for an agent to understand what the tool does and when to use it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema coverage is 100%, so baseline is 3. The description does not add specific parameter meanings beyond the schema, but it does explain overall behavior and return types. No additional parameter-level elaboration is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool simulates a Base mainnet transaction before sending, specifies it returns success/revert prediction, revert reason, decoded return data, and estimated gas, and distinguishes it as a pre-flight check for trading agents, comparing it to a Solana counterpart.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using the tool before signing to avoid paying gas on failed transactions, providing clear context for when to use it. Does not explicitly mention when not to use or alternatives, but the guidance is strong.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_click
Click the first visible button or link whose text matches the query (case-insensitive substring match). Returns whether a match was found and the matched element's text + href. Use after onyx_browser_extract to act on what the page advertised. Demo mode returns synthetic OK. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Substring of the element's visible text |
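The plan-then-act loop the description suggests, sketched under the same connected-`session` assumption as above.

```python
# Read the page first to see what is clickable, then click by visible text.
page = await session.call_tool("onyx_browser_extract", {"max_chars": 5000})
# ...inspect the returned buttons/links to pick a label, then:
clicked = await session.call_tool("onyx_browser_click", {"text": "Sign up"})
```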
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses substring matching, visibility preference, return data (match status, text, href), and demo mode behavior. Lacks details on scroll/waits/state modification, but sufficient for a simple click. No annotations to contradict.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a pricing note; every sentence carries essential information. Front-loaded with the core action and matching strategy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all necessary behavioral aspects: what it clicks, matching rule, return values, and demo mode. No output schema needed; description fully equips an agent to use it correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has one parameter with a basic description. The tool description adds matching semantics (case-insensitive substring, visible elements) beyond the schema, enriching parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool clicks the first visible button or link matching a query via case-insensitive substring match. Distinguishes itself from sibling browser tools by being the dedicated click action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly recommends using after onyx_browser_extract, providing clear context for when to invoke. No when-not-to-use guidance, but the single parameter and sibling set make it unambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_eval
Evaluate a JavaScript expression on the current CDP-controlled Chrome page and return the result by value. Use as a power-tool when the specific click/extract/type tools don't fit — pull deeply nested DOM data, dispatch synthetic events, read computed styles. Demo mode echoes the expression length without executing. (price: $0.004 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| expression | Yes | JavaScript expression. Last value is returned. | |
| await_promise | No | If the expression returns a Promise, wait for it. |
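A sketch assuming a connected `session` as above. `await_promise` only matters when the expression yields a Promise (for example, a `fetch` call); synchronous expressions return directly.

```python
result = await session.call_tool(
    "onyx_browser_eval",
    {
        "expression": "document.querySelectorAll('form input').length",
        "await_promise": False,  # set True for Promise-returning expressions
    },
)
```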
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description handles behavioral disclosure. It explains it returns by value and notes demo mode echoes expression length. However, doesn't warn that evaluation can cause side effects (e.g., DOM mutation), which is a minor gap for a power tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, no wasted words. First sentence states core purpose, second provides usage guidance and additional info (demo mode, pricing). Well front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but the description notes the result is returned 'by value'. It lacks details on error handling, return-type structure, and limitations. However, for a power tool with good usage guidance, it is largely adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage of both parameters. The description adds 'Last value is returned' for expression, which is already in the schema description. No extra context for await_promise. Baseline 3 is appropriate as schema already provides meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it evaluates JavaScript on a Chrome page and returns the result by value. Also positions it as a power-tool for cases where simpler tools (click/extract/type) don't fit, distinguishing it from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: when specific tools don't fit, with examples like nested DOM, synthetic events, computed styles. Also mentions demo mode behavior. This provides clear context for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_extract
Read the current CDP-controlled Chrome page and return the visible text content plus a structured summary of clickable elements: buttons, links (with hrefs), inputs (with names/placeholders/types). Use when an agent needs to plan its next action — list what's on the page without screenshotting + vision-modeling. Cheap, structured, deterministic. Demo mode returns a plausible synthetic page summary. (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| max_chars | No | Cap on returned text length (max 80000) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations, so description carries full burden. It discloses read-only nature, structured output, cheap/deterministic behavior, and demo mode behavior. Lacks rate limit or loading state info but sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, each sentence adds value. Starts with action, details, usage, extra info. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one parameter and no output schema, the description covers the return content (text plus clickable elements with attributes), usage context, and demo mode. An explicit output format is missing, but the description is specific enough.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a single parameter. Description does not add meaning beyond the schema's own description. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it reads the page and returns visible text plus structured clickable elements. Distinguishes from sibling browser tools like click, navigate, screenshot.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent needs to plan its next action' and contrasts with screenshot+vision-modeling. Could be clearer on when not to use, but provides strong context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_screenshot
Capture a PNG screenshot of the current CDP-controlled Chrome page and return it as base64. Use to feed a vision-LLM (Claude / GPT-4V) for screen-understanding agents, or to archive an action's visual result. Also returns the page title, URL, and viewport dimensions. The returned payload is capped at 1MB. Demo mode returns a synthetic 1×1 PNG; self-host with ONYX_CDP_URL for real captures. (price: $0.008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| format | No | Image format | png |
| full_page | No | Capture full scrollable page or just viewport |
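A sketch of saving a capture to disk, assuming a connected `session` as above. The response shape is not published; the JSON payload and the `image_b64` field name are assumptions for illustration.

```python
import base64
import json

shot = await session.call_tool(
    "onyx_browser_screenshot", {"format": "png", "full_page": False}
)
payload = json.loads(shot.content[0].text)  # assumed JSON text payload
with open("page.png", "wb") as f:
    f.write(base64.b64decode(payload["image_b64"]))  # field name assumed
```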
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses return contents (title, URL, viewport), file size cap (1MB), demo mode behavior (synthetic 1×1 PNG), and pricing. Lacks details on error handling but sufficient for most use.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Efficiently packs purpose, use cases, return info, constraints, and pricing into a few sentences. Front-loaded with core action. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description lists return fields (title, URL, viewport) and the 1MB cap. Covers essential context for a screenshot tool. Missing only minor details like error scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 2 params with 50% description coverage. Description mentions 'PNG' (default format) but does not clarify the JPEG option or the format parameter in detail. Adds little beyond schema for full_page (already described in schema). Averages to baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Capture a PNG screenshot of the current CDP-controlled Chrome page and return it as base64', specifying the verb (capture), resource (screenshot), and format. Distinguishes from sibling browser tools like navigate, click, type, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: 'Use to feed a vision-LLM... or to archive an action's visual result'. Also gives context on demo vs real capture with ONYX_CDP_URL. Does not list when not to use, but sibling tools cover different actions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_browser_type
Find an input/textarea/select on the current CDP page by its name, id, or visible label, set its value via the React-safe native setter, and fire input + change events so frameworks like React/Vue see the update. Use after onyx_browser_navigate when an agent fills a form. Returns the field selector matched and the final value. Demo mode returns synthetic OK. (price: $0.002 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| value | Yes | Text to enter | |
| selector | Yes | Name, id (#foo), CSS selector, or visible label substring |
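The selector accepts several forms, so here is a sketch showing two of them, assuming a connected `session` as above.

```python
# By id, as in the schema's "#foo" example:
await session.call_tool(
    "onyx_browser_type", {"selector": "#email", "value": "agent@example.com"}
)
# By visible-label substring, per the description:
await session.call_tool(
    "onyx_browser_type", {"selector": "Full name", "value": "Onyx Agent"}
)
```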
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It discloses the React-safe setter, event firing for frameworks, return values (field selector and final value), and demo mode behavior. It does not cover potential error states or edge cases like missing fields, but is otherwise transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (3-4 sentences) with no wasted words. It front-loads the core action and immediately follows with usage context and return information. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description adequately covers what the tool does, when to use it, and what it returns. It lacks information about error handling or behavior when the element is not found, but given the well-covered parameters and clear purpose, it is mostly complete for a form-filling tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description adds value by explaining that 'selector' can be a name, id (#foo), CSS selector, or visible label substring, which goes beyond the schema's concise descriptions. It also clarifies that 'value' is the text to enter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly and specifically states the tool's purpose: finding an input/textarea/select by name, id, or visible label, setting its value via a React-safe native setter, and firing input+change events. It distinguishes itself from siblings like onyx_browser_click and onyx_browser_navigate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use after onyx_browser_navigate when an agent fills a form,' providing clear context for when to use it. It lacks exclusion criteria or alternatives but offers actionable guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_dns_lookup
Resolve a domain to its A/AAAA records, or reverse-resolve an IP to its hostname. Useful for validating a domain exists before scraping, checking if two domains share infrastructure, mapping CDN origins, or doing safety lookups before agents call third-party APIs. Returns IPv4, IPv6, canonical hostname, and resolution time. Powered by stdlib so results are whatever the host's DNS resolver returns — typically 20-100ms. (price: $0.0005 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| host | Yes | Domain (example.com) or IP address |
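One parameter covers both directions, as a short sketch makes clear (same connected-`session` assumption as above).

```python
# A domain triggers A/AAAA resolution; an IP triggers a reverse lookup.
forward = await session.call_tool("onyx_dns_lookup", {"host": "example.com"})
reverse = await session.call_tool("onyx_dns_lookup", {"host": "93.184.216.34"})
```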
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the burden. It explains the tool returns IPv4, IPv6, canonical hostname, and resolution time, and notes the resolver behavior (stdlib, typical latency 20-100ms) and pricing. This is adequate for a simple lookup tool, though it could mention error responses.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences plus a short pricing note, with no redundant information. Every sentence adds value, making it highly efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and lack of output schema, the description covers return values, performance, and pricing. It doesn't discuss error handling or rate limits, but for a basic DNS lookup this is sufficient and leaves little ambiguity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema describes the 'host' parameter as 'Domain or IP address'. The description adds value by explaining that domains yield A/AAAA records and IPs yield reverse resolution. Schema coverage is 100%, so the description enhances understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool resolves domains to A/AAAA records and reverse-resolves IPs. It lists specific use cases (validating domains, checking infrastructure, CDN mapping, safety lookups), clearly distinguishing it from sibling tools like email validation or IP geolocation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear scenarios for when to use the tool (e.g., before scraping, infrastructure checks, safety lookups). While it doesn't explicitly state when not to use it, the context is sufficient for an agent to understand appropriate use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_email_validate
Validate an email address: RFC-5322 syntax check, domain DNS resolution (does the domain exist?), and disposable-provider detection (Mailinator, 10minutemail, GuerrillaMail, etc.). Returns a single confidence verdict plus the underlying signals so agents can decide whether to send. Use before mailing list signups, password-reset flows, or sales-lead capture to filter out trash addresses cheaply. ~30-80ms typical. (price: $0.0008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| email | Yes | Address to validate | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, but the description discloses the three validation steps, return of confidence verdict and signals, typical latency (30-80ms), and cost. Lacks error handling details but is transparent for a simple validation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with clear front-loading of purpose, followed by details on use cases, performance, and cost. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers purpose, checks, use cases, performance, cost, and return type. Fully adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has one parameter with a minimal description. The tool description adds meaning by explaining what validation entails (syntax, DNS, disposable checks), which is not evident from the schema alone.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool validates an email address with three specific checks (syntax, DNS, disposable), and lists use cases and output. It distinguishes from sibling tools which perform different utilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly recommends using before mailing list signups, password-reset flows, or sales-lead capture. Does not mention when not to use or alternatives, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_ens_resolve
Resolve an ENS name to its current Ethereum mainnet address (or vice versa). Returns the canonical address, avatar URL if set, and the resolver contract that returned it. Use when an agent encounters a human-readable name like 'vitalik.eth' and needs to send funds or validate identity. Reads via the public ensideas API (no key, no rate-limit pain for typical agent traffic). ~200-500ms. (price: $0.0008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | ENS name (foo.eth) or 0x-address for reverse |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses the API source (public ensideas API), no key/rate-limit issues, typical latency (200-500ms), and cost ($0.0008 USDC). Since no annotations are present, this carries the burden well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each with purpose: action, output, usage context, and operational details. No redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with one parameter and no output schema, the description fully covers what it does, when to use it, what it returns, and performance/cost, making it complete and actionable for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Single parameter 'name' is fully described in schema; description adds meaning by stating it resolves to Ethereum mainnet address and returns avatar URL and resolver contract, enriching beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it resolves ENS names to Ethereum addresses (and vice versa), lists return fields (address, avatar, resolver), and distinguishes it from sibling tools that focus on URLs, DNS, or other utilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly suggests using it when the agent encounters a human-readable name like 'vitalik.eth' and needs to send funds or validate identity, providing clear context. Could add when not to use, but the guidance is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_fx_convert
Convert between any two fiat currencies (USD, EUR, GBP, JPY, BRL, USDC-equivalent, 160+ ISO-4217 codes) at the current mid-market rate. Returns both the rate and the converted amount, plus the rate's last update timestamp. Use when an agent needs to price a service in another currency, normalize multi-currency invoices, or convert x402 USDC amounts to local fiat for human-readable receipts. Powered by open.er-api.com (free tier, no key). ~150-400ms. Demo mode returns USD-EUR @ 0.92 for testing. (price: $0.0008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| to | Yes | ISO-4217 target currency code | |
| from | Yes | ISO-4217 source currency code | |
| amount | No | Amount to convert | |
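A sketch assuming a connected `session` as above. The default for `amount` is not documented here; passing it explicitly sidesteps the question.

```python
result = await session.call_tool(
    "onyx_fx_convert",
    {"from": "USD", "to": "BRL", "amount": 42.50},
)
# Returns the mid-market rate, the converted amount, and the rate timestamp.
```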
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided; description compensates by stating external API (open.er-api.com), estimated latency (150-400ms), demo mode behavior, and price. It is a read-only conversion, so no destructive actions to note.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured with purpose first, then use cases, then technical details. Could trim demo mode and price, but overall efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description explains return values (rate, converted amount, timestamp). Covers key behaviors given no annotations, though latency and demo mode are extraneous.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 67% with 'from' and 'to' described; 'amount' has default but no description. Description adds that currencies are ISO-4217 and mentions specific codes, but does not add detail beyond schema for parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it converts between fiat currencies, lists specific examples (USD, EUR, etc.), and mentions returned data (rate, converted amount, timestamp). It is distinct from sibling tools with its domain (currency conversion).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Describes when to use: pricing services, normalizing invoices, converting x402 USDC. Includes demo mode for testing. Lacks explicit when-not-to-use or alternative tools, but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_hash_compute
Compute md5, sha1, sha256, sha512, and sha3-256 of any text or base64-encoded bytes. Returns each digest as both hex and base64. Use for content-addressed lookups, dedupe keys, signature verification support, or fingerprinting. Stdlib-only — runs locally, never logs input. <2ms. (price: $0.0003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| b64 | No | Or: base64-encoded bytes | |
| text | No | UTF-8 string to hash |
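Both parameters are optional and the "Or:" wording suggests they are alternatives, so here is a sketch showing each path (same connected-`session` assumption as above; the exclusivity itself is an assumption).

```python
# Same bytes, two input forms: raw UTF-8 text or base64-encoded bytes.
from_text = await session.call_tool("onyx_hash_compute", {"text": "hello"})
from_b64 = await session.call_tool("onyx_hash_compute", {"b64": "aGVsbG8="})
# Each response carries md5/sha1/sha256/sha512/sha3-256 as hex and base64.
```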
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations; description discloses it runs locally, never logs input, takes <2ms, and includes pricing/tier. Sufficient transparency for a pure computation tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences: action first, then use cases and properties. No redundancy, front-loaded with key info.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers inputs, outputs (hex/base64), behavior (local/no logging), performance, and use cases. Complete for a hash function tool despite no output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers both parameters (100%), description adds output format detail but doesn't significantly extend beyond schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it computes multiple hash algorithms (md5, sha1, sha256, sha512, sha3-256) for text or base64 bytes, returning each as hex and base64. Distinct from diverse siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Lists specific use cases (content-addressed lookups, dedupe keys, signature verification, fingerprinting). Mentions local execution and speed. No explicit exclusions but context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_hlr_lookup
HLR lookup for any E.164 phone number. Returns the live carrier, MCC/MNC, country, line type (mobile/landline/voip), and portability status. Use before sending an SMS to avoid burning credits on dead/landline/VoIP numbers; or to verify which carrier owns a number after a port. Onyx routes to the cheapest live HLR vendor (numberportabilitylookup, hlr-lookups, text2reach) under the hood and pockets the spread — you pay one flat $0.005 USDC. Demo mode (default in cloud) returns a plausible synthetic record after ~150ms so you can test the loop free; self-host with vendor keys for real lookups. (price: $0.005 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| phone_number | Yes | E.164 phone number, e.g. +5511987654321 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description fully discloses behavioral traits: it routes to the cheapest HLR vendor, charges $0.005 USDC per lookup, has a demo mode returning synthetic data, and requires self-hosting with vendor keys for real lookups. This exceeds typical transparency expectations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, each delivering critical information (what, when, pricing/behavior). It is front-loaded with the core operation. Minor redundancy exists (e.g., 'live' appears twice), but overall it is concise for the amount of detail.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single required parameter and no output schema, the description covers the tool's purpose, use cases, pricing, demo vs. real mode, and vendor routing. It does not detail the exact response shape, which is acceptable without an output schema, but the agent can infer it from the fields the description lists.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already describes the phone_number parameter with an example format. The description repeats 'E.164 phone number' but adds no new semantic information beyond what the schema provides, so it scores the baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs HLR lookup for E.164 numbers and lists specific return fields (carrier, MCC/MNC, country, line type, portability status). It is distinct from sibling tools which cover DNS, email, IP, etc., so the purpose is unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends using this tool before sending SMS to avoid wasted credits and to verify carrier after porting. It does not name alternative tools, but provides clear context for when to invoke it, earning a 4.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_html_meta
Fetch a URL and extract OpenGraph + Twitter Card + standard meta tags: og:title, og:description, og:image, og:type, twitter:card, twitter:image, canonical link, favicon, JSON-LD blocks. Use when an agent needs to preview a link before sharing, build a citation card, or detect spam/ads via meta-tag fingerprints. Stripped of HTML noise. ~150-500ms typical. (price: $0.0008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to inspect |
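For intuition, the core extraction the description outlines can be sketched with Python's stdlib html.parser. This is a rough local approximation, not the server's implementation, and it skips the canonical-link, favicon, and JSON-LD extras:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class MetaCollector(HTMLParser):
    """Collect og:/twitter: meta tags. A rough local approximation;
    canonical links, favicons, and JSON-LD are omitted."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        d = dict(attrs)
        key = d.get("property") or d.get("name")
        if key and (key.startswith("og:") or key.startswith("twitter:")):
            self.meta[key] = d.get("content", "")

html = urlopen("https://example.com").read().decode("utf-8", "replace")
parser = MetaCollector()
parser.feed(html)
print(parser.meta)
```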
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so the description carries full burden. It mentions stripping HTML noise and typical latency (150-500ms), plus pricing info. It does not detail error handling or auth requirements, but for a simple fetch-and-extract tool the behavior is well conveyed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact yet comprehensive—one sentence for what it does, one for when to use, and one for performance/pricing. No fluff, each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers extraction details, use cases, and performance. It does not specify the return format (likely JSON), but the context signals suggest no output schema, so the description is reasonably complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with the single parameter 'url' already described as 'URL to inspect'. The description adds no extra semantics beyond what the schema provides, meeting the baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool fetches a URL and extracts OpenGraph, Twitter Card, and standard meta tags, with concrete use cases like link previewing and spam detection. It clearly differentiates from siblings by specifying the meta extraction scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives clear use cases (previewing, citation, spam detection) but does not explicitly state when not to use it or mention alternatives among sibling tools, though the context signals show many similar URL-related tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_ip_geolocate
Geolocate any public IPv4/IPv6 address — country, region, city, lat/lon, timezone, ISP, ASN, mobile/proxy/hosting flags. Useful for filtering traffic by country, detecting datacenter/VPN egress, fraud scoring, or deciding which regional endpoint to route an agent through. Backed by ip-api.com (free tier, ~1k requests/min). ~80-200ms typical. Demo mode returns a plausible US record so the payment loop can be tested without burning the upstream rate limit. (price: $0.0008 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| ip | Yes | IPv4 or IPv6 address |
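Since the description names ip-api.com as the backend, a self-hosted agent could sketch the same lookup directly. This assumes ip-api.com's documented free JSON endpoint (plain HTTP on the free tier) and its documented field names:

```python
import json
from urllib.request import urlopen

FIELDS = ("status,country,regionName,city,lat,lon,timezone,"
          "isp,as,mobile,proxy,hosting")

def geolocate(ip: str) -> dict:
    # ip-api.com's free tier serves plain HTTP; flags such as
    # 'mobile', 'proxy', and 'hosting' must be requested explicitly.
    with urlopen(f"http://ip-api.com/json/{ip}?fields={FIELDS}", timeout=5) as resp:
        return json.load(resp)

print(geolocate("8.8.8.8")["isp"])
```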
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully carries the burden of transparency. It discloses the backend (ip-api.com), rate limits (~1k requests/min), typical latency (~80-200ms), demo mode behavior, and pricing ($0.0008 USDC). This provides comprehensive behavioral context for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact and upfront about the main function, then covers use cases, backend details, demo mode, and pricing. While it could be slightly more structured (e.g., bullet points), it efficiently conveys all necessary information without unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has one parameter and no output schema, the description covers all relevant aspects: purpose, usage context, behavioral details (rate limits, latency, demo mode), and return fields. It is complete enough for an AI agent to correctly select and invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'ip' is described in the schema as 'IPv4 or IPv6 address'. The description adds value by specifying 'public' and implicitly clarifying that private IPs are not supported. Additionally, it mentions demo mode behavior, which adds context beyond the schema. With 100% schema coverage, baseline is 3, but the extra context justifies a 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool geolocates any public IPv4/IPv6 address and lists all returned fields (country, region, city, lat/lon, timezone, ISP, ASN, mobile/proxy/hosting flags). It also gives concrete use cases like traffic filtering and fraud scoring, clearly distinguishing it from sibling tools like onyx_dns_lookup or onyx_email_validate.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context on when to use the tool, listing specific use cases such as filtering by country, detecting datacenter/VPN egress, fraud scoring, and routing decisions. It also mentions the demo mode for testing. However, it does not explicitly state when not to use this tool or suggest alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_jwt_decode
Decode a JWT (header + payload) without verifying the signature. Returns the algorithm, key id, all claims (iss, sub, aud, exp, iat, nbf, custom), expiry status, and any structural anomalies. Use when an agent receives a token from an external API and needs to inspect it for routing, expiry, or audit logging. Stdlib-only — runs locally, never sends the token anywhere. (price: $0.0003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| token | Yes | JWT (three base64url segments separated by .) |
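The decode-without-verify behavior is straightforward to illustrate in stdlib Python; a minimal sketch of the same unverified inspection:

```python
import base64
import json
import time

def decode_jwt(token: str) -> dict:
    """Decode header and payload without verifying the signature,
    the same unverified inspection the tool performs."""
    def b64url(segment: str) -> bytes:
        pad = "=" * (-len(segment) % 4)  # restore stripped padding
        return base64.urlsafe_b64decode(segment + pad)

    header_b64, payload_b64, _signature = token.split(".")
    header = json.loads(b64url(header_b64))
    payload = json.loads(b64url(payload_b64))
    expired = "exp" in payload and payload["exp"] < time.time()
    return {"header": header, "payload": payload, "expired": expired}
```

As the description notes, unverified claims are fine for routing, expiry checks, and audit logs, but must never be trusted for authentication decisions.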
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses that signature is not verified, decoding is local and never sends the token externally, and that anomalies are reported. Provides adequate behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with three sentences covering purpose, usage, and security context. It includes a price note which is somewhat extraneous but not detrimental. Front-loaded with core purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity (single parameter, no output schema), the description is complete. It explains the decoding process, return values (claims, expiry, anomalies), and usage context. No gaps are apparent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (single parameter token described). Description adds value by explaining what decoding entails (header+payload), what will be returned, and that signature is not verified, which exceeds the schema's minimal description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it decodes a JWT without verifying the signature, and lists the returned items (algorithm, key id, claims, expiry status, anomalies). It distinguishes itself from siblings as no other sibling tool handles JWT decoding.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly provides when to use: when an agent receives a token from an external API for routing, expiry, or audit logging. It also notes the local execution and security aspect. Lacks explicit exclusions but given no alternative JWT tools, this is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_password_strength
Score password strength on a 0-100 scale. Returns Shannon entropy (bits), character-class diversity, length, common-pattern detection (sequences, repeats, dictionary-likeness), and a verdict (very_weak / weak / fair / strong / very_strong). Use when an agent generates passwords for accounts it creates, or when validating user-supplied credentials. Stdlib-only — runs locally, never sends the password anywhere. <5ms. (price: $0.0003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| password | Yes | Password to score |
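The Shannon-entropy component the description mentions can be sketched in a few lines. This is a crude estimator for illustration, not the tool's scoring model:

```python
import math
from collections import Counter

def shannon_entropy_bits(password: str) -> float:
    """Total Shannon entropy in bits: per-character entropy of the
    observed distribution times the length. A crude estimator; the
    real tool also weighs sequences, repeats, and dictionary-likeness."""
    if not password:
        return 0.0
    counts = Counter(password)
    n = len(password)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n

print(round(shannon_entropy_bits("correct horse battery staple"), 1))
```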
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description discloses that it runs locally (Stdlib-only), never sends the password, takes <5ms, and has a cost. This informs the agent about safety, latency, and resource implications.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph that front-loads the main action and proceeds with details. It's efficient with no wasted words, though slightly dense.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the output components (entropy, diversity, length, patterns, verdict) despite no output schema. It also includes use cases and cost, making it fairly complete for a simple one-parameter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a simple description. The description adds a security note about not sending the password but doesn't provide deeper parameter semantics beyond what the schema already conveys.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it scores password strength on a 0-100 scale, listing specific factors (entropy, diversity, patterns) and verdicts. It distinguishes from all siblings, none of which are password-related.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent generates passwords for accounts it creates, or when validating user-supplied credentials.' This provides clear context, though it doesn't mention when not to use it (but no alternative tool for this task exists).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_robots_check
Fetch a domain's robots.txt and report whether a given path is allowed for a given user-agent. Returns the raw robots.txt text, the matched rule, the crawl-delay if specified, and a clean allow/disallow verdict. Use when an agent does web scraping and wants to be polite — saves bans, saves CAPTCHAs, saves drama. ~50-200ms. (price: $0.0005 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Any URL on the target domain | |
| user_agent | No | UA string to test | * |
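Python ships a robots.txt parser, so the allow/disallow verdict can be approximated locally; a sketch that omits the raw-text and crawl-delay extras the tool also returns:

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

def robots_allowed(url: str, user_agent: str = "*") -> bool:
    # Derive the robots.txt location from any URL on the domain,
    # mirroring how the tool takes a full URL rather than a path.
    parts = urlsplit(url)
    rp = RobotFileParser()
    rp.set_url(urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", "")))
    rp.read()
    # rp.crawl_delay(user_agent) would give the crawl-delay, if any.
    return rp.can_fetch(user_agent, url)

print(robots_allowed("https://example.com/some/path"))
```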
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses return values (raw text, rule, crawl-delay, verdict), performance (~50-200ms), and cost ($0.0005 USDC). It also implies non-destructive, read-only behavior. Missing details on error handling, but overall adequate for a simple tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a brief time/cost note; no wasted words. The primary action and returns are front-loaded, making it efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and no output schema, the description covers the key aspects: purpose, inputs, returns, and cost/performance. It lacks explicit error handling or edge cases (e.g., missing robots.txt), but this is likely acceptable for a straightforward check tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema already describes both parameters. The description adds usage context but does not clarify the relationship between 'url' and 'path'—the path is implied from the URL, which might confuse agents expecting a separate path parameter. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches robots.txt and checks if a path is allowed for a user-agent. It is distinct from sibling tools like browser navigation or DNS lookups, making its purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises using the tool when 'an agent does web scraping and wants to be polite,' providing clear context. However, it does not explicitly exclude cases or name alternatives, though siblings like onyx_browser_navigate might be misused without this guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_sms_verify
Receive an SMS OTP delivered to a physical phone on a real carrier SIM. No VoIP numbers, no virtual gateways, no Twilio — these get rejected by modern signup flows (WhatsApp, Telegram, banking, KYC). Onyx routes to a real Samsung handset on a carrier-issued SIM, polls the inbox via adb, extracts the OTP, returns it. Typical end-to-end 20-60s including network delivery. Demo mode (default) returns a synthetic 6-digit code after a 2s delay so you can test the full payment + tool-call loop without burning real SMS-minutes. Set ONYX_DEMO_MODE=0 in self-hosted mode to dispatch to your own phone fleet. (price: $0.050 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| service | No | Service name for context | |
| timeout_sec | No | ||
| phone_number | Yes | E.164 number |
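The tool returns the OTP already extracted. For agents that ever handle raw message bodies themselves, the extraction step is typically a short regex; a hedged heuristic sketch, since the server's actual extraction logic is not documented here:

```python
import re

def extract_otp(sms_body: str) -> str | None:
    """Pull a 4-8 digit code out of an SMS body. A heuristic only;
    the server-side extraction logic is not documented here."""
    match = re.search(r"\b(\d{4,8})\b", sms_body)
    return match.group(1) if match else None

print(extract_otp("Your WhatsApp code is 482913"))  # -> '482913'
```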
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden. It describes the real phone mechanism (adb, polling), typical timing (20-60s), demo mode behavior, and pricing. However, it lacks details on error handling, failure modes, and what happens if no SMS is received, which would be helpful for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively long but every sentence adds value. Key information (function, uniqueness, timing, demo mode) is front-loaded. Minor redundancy (e.g., mentioning no VoIP twice) could be tightened, but overall structure is efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool with no output schema, the description covers core aspects: real vs demo, timing, pricing. However, it does not specify the return format (simple string vs object) or handle edge cases (invalid number, timeout behavior). Gaps remain for an agent to fully understand all outcomes.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description does not explain any parameters beyond the schema. Schema covers 2 of 3 parameters with descriptions; timeout_sec lacks schema description and is not mentioned in the description. The description adds no additional semantic value to the parameters, failing to compensate for the 67% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: receiving an SMS OTP delivered to a physical phone on a real carrier SIM. It distinguishes itself from sibling tools (all non-SMS) by emphasizing the real carrier SIM aspect and rejecting VoIP numbers, making the unique function obvious.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly details when to use the tool (for signup flows rejecting VoIP numbers) and when to use demo mode for testing. It provides clear guidance on production versus testing, and indirectly implies not to use for virtual numbers. No explicit alternative is named, but the context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_solve_captcha
Solve an image-based text captcha and return the recognized text. Works on standard alphanumeric captchas (web signup forms, login walls, scraping checkpoints). OCR via ddddocr — typical p50 latency 30-80ms, 70-90% accuracy on common captcha fonts. Provide either an image URL we fetch on your behalf, or raw base64 image bytes if you already have them. Use when an agent encounters a captcha mid-task and needs to continue without human intervention. Cheaper and faster than 2captcha for simple image captchas; not designed for reCAPTCHA v2/v3 or hCaptcha (those are interaction-based). (price: $0.003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| image_b64 | No | Base64-encoded image bytes | |
| image_url | No | URL to fetch |
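Self-hosters can reproduce the OCR step with the ddddocr library the description names; a minimal sketch assuming ddddocr's standard classification API (`pip install ddddocr`):

```python
# Hedged sketch of the OCR step the description names (ddddocr),
# assuming its standard classification API.
import ddddocr

ocr = ddddocr.DdddOcr()
with open("captcha.png", "rb") as f:
    image_bytes = f.read()
text = ocr.classification(image_bytes)  # recognized captcha text
print(text)
```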
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, but the description discloses latency (30-80ms), accuracy (70-90%), the OCR engine (ddddocr), pricing ($0.003 USDC), and that it can fetch an image URL or accept raw base64. This fully informs the agent about behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact (5 sentences) and front-loaded with the core purpose. Every sentence adds value: purpose, examples, technical details, usage guidance, pricing. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains the return value (recognized text). It covers constraints (not for reCAPTCHA), alternatives, pricing, latency, accuracy, and input options. This is comprehensive for a captcha solver tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds context by explaining that image_url is fetched on the caller's behalf and image_b64 is for raw bytes the agent already holds, clarifying the two input options beyond the schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it solves image-based text captchas and returns recognized text. It specifies the type of captchas (standard alphanumeric) and provides examples (web signup forms, login walls, scraping checkpoints). It also distinguishes itself from reCAPTCHA and hCaptcha.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use when an agent encounters a captcha mid-task and needs to continue without human intervention.' Also contrasts with 2captcha (cheaper and faster for simple captchas) and states it is not designed for reCAPTCHA or hCaptcha.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_token_metadata
ERC-20 token metadata lookup on Base mainnet: name, symbol, decimals, and total supply for any contract address. Use before transacting with a token agents discover at runtime — confirms the contract is a real ERC-20 and resolves human-readable identity. Reads via Base public RPC, ~150-300ms typical. Pairs with onyx_base_tx_decode for full token-flow context. No vendor key needed. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| address | Yes | 0x-prefixed ERC-20 contract address on Base |
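The underlying reads are plain eth_call RPCs with the standard ERC-20 function selectors. A hedged sketch against an assumed public Base RPC endpoint; decoding is shown for decimals() only, since string returns need full ABI decoding:

```python
import json
from urllib.request import Request, urlopen

RPC = "https://mainnet.base.org"  # assumed public Base RPC endpoint

# 4-byte function selectors: first 4 bytes of keccak256(signature)
SELECTORS = {"name": "0x06fdde03", "symbol": "0x95d89b41",
             "decimals": "0x313ce567", "totalSupply": "0x18160ddd"}

def eth_call(to: str, data: str) -> str:
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "eth_call",
                       "params": [{"to": to, "data": data}, "latest"]}).encode()
    req = Request(RPC, data=body, headers={"Content-Type": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)["result"]

# decimals() returns a uint8, ABI-encoded as one 32-byte word.
usdc = "0x833589fCD6eDb6E08f4c7C32D4f71b54bdA02913"  # USDC on Base
print(int(eth_call(usdc, SELECTORS["decimals"]), 16))  # expect 6
```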
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It discloses that reads are via Base public RPC, typical latency of 150-300ms, no vendor key needed, and mentions pricing. It implicitly indicates read-only behavior through terms like 'lookup' and 'confirms,' providing sufficient transparency for safe invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: purpose first, then usage context, performance, pairing, and pricing. Every sentence serves a distinct purpose with no redundancy, making it highly concise and front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple one-parameter tool with no output schema, the description covers all necessary information: what it does, when to use it, performance characteristics, external dependencies, and pricing. It also references a companion tool, making the context complete for the agent's decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the single 'address' parameter already described as '0x-prefixed ERC-20 contract address on Base.' The description repeats this same information, adding no additional semantic value beyond what the schema provides, so the baseline score of 3 applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as an ERC-20 token metadata lookup on Base mainnet, listing specific fields (name, symbol, decimals, total supply) and the purpose: confirming a contract is a real ERC-20 and resolving human-readable identity. It distinguishes itself from siblings like onyx_base_tx_decode by noting its use before transacting and pairing context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly recommends use before transacting with a token discovered at runtime and mentions pairing with onyx_base_tx_decode for full context. While it doesn't state when not to use it, the guidance is clear and practical for the intended scenario.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_parse
Parse any URL into structured components: scheme, host, port, path, query params (as both raw and decoded list), fragment, userinfo. Use when an agent needs to inspect, modify, or validate a URL — change a query param, strip tracking, normalize for caching. Stdlib only, no network calls, <1ms. (price: $0.0003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to parse |
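The structured breakdown maps closely onto Python's urllib.parse; an illustrative local equivalent, with field names that are mine rather than the tool's schema:

```python
from urllib.parse import urlsplit, parse_qsl

def parse_url(url: str) -> dict:
    """Stdlib version of the structured breakdown the tool returns.
    Field names here are illustrative, not the tool's exact schema."""
    p = urlsplit(url)
    return {
        "scheme": p.scheme, "host": p.hostname, "port": p.port,
        "path": p.path, "fragment": p.fragment,
        "query_raw": p.query, "query": parse_qsl(p.query),
        "userinfo": p.username,
    }

print(parse_url("https://user@example.com:8443/a/b?x=1&y=2#frag"))
```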
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses important behavioral traits: 'Stdlib only, no network calls, <1ms' and cost. No annotations are provided, so the description effectively communicates safety and performance. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first defines purpose and output, second adds usage and performance. No unnecessary words, well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple parse tool with one parameter and no output schema, the description fully explains what it does, when to use it, and the behavioral characteristics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a simple 'url' parameter. The description adds context on what the output includes but does not add further semantics to the parameter beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a clear verb (Parse) and resource (URL), and enumerates the structured components returned. This distinguishes it from sibling tools like onyx_url_unshorten or onyx_url_text.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases: inspect, modify, validate a URL, change query param, strip tracking, normalize for caching. No explicit exclusions or alternatives, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_text
Fetch any public URL and return the readable text content stripped of HTML/scripts/styles. Use when an agent needs to reason over a web page without rendering a browser — docs pages, articles, search-result snippets, GitHub READMEs. Returns plain text + page title + word count + final URL after redirects. Capped at 200kB output to keep token costs predictable. ~150-800ms typical depending on origin server. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | HTTPS URL to fetch | |
| max_chars | No | Truncate output (max 200000) |
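The strip-to-text behavior can be approximated with stdlib html.parser; a rough sketch that drops script/style content and mirrors the 200kB cap:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class TextExtractor(HTMLParser):
    """Drop script/style content, keep visible text. The real tool
    also returns title, word count, and the post-redirect URL."""
    def __init__(self):
        super().__init__()
        self.skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if not self.skip and data.strip():
            self.chunks.append(data.strip())

resp = urlopen("https://example.com")
extractor = TextExtractor()
extractor.feed(resp.read().decode("utf-8", "replace"))
text = " ".join(extractor.chunks)[:200_000]  # cap, chars as a proxy for bytes
print(text[:200])
```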
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: returns plain text, page title, word count, final URL after redirects; output capped at 200kB; typical latency 150-800ms; price and tier mentioned. This covers key behavioral traits beyond what annotations would provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and well-structured: first sentence states the core action, followed by usage context, return details, and performance/pricing. Every sentence adds value with no redundancy, making it easy to scan.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and no annotations, the description covers core aspects: what it does, return values, limitations, performance, and pricing. It lacks explicit error handling details (e.g., unreachable URL), but for a simple fetch tool this is a minor gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds minimal extra value by mentioning the output cap (200kB) in relation to max_chars, but the schema already describes max_chars as 'Truncate output (max 200000)'. No significant addition beyond what the schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool fetches any public URL and returns readable text content, specifying what is stripped (HTML/scripts/styles) and what is returned (plain text, page title, word count, final URL). It provides specific use cases (docs pages, articles, etc.), distinguishing it from siblings like onyx_url_unshorten.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description guides when to use the tool: 'Use when an agent needs to reason over a web page without rendering a browser' and lists example scenarios. It doesn't explicitly state when not to use it, but the sibling tool onyx_url_unshorten provides an implied alternative for URL expansion.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_url_unshorten
Follow HTTP redirects on any URL and return the final destination + the full redirect chain. Use when an agent encounters a bit.ly, t.co, or lnkd.in shortened link and needs to know where it actually goes before clicking. Returns each hop's status code and Location, plus the final URL and status. Cap of 10 hops to prevent loops. ~100-400ms typical. (price: $0.0005 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to unshorten |
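The hop-by-hop behavior with a loop cap is easy to picture in stdlib Python; a sketch that records each redirect instead of following silently (illustrative, not the server's implementation):

```python
import urllib.error
import urllib.parse
import urllib.request

class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, *args, **kwargs):
        return None  # surface redirects as HTTPError instead of following

def unshorten(url: str, max_hops: int = 10) -> list:
    opener = urllib.request.build_opener(NoRedirect)
    chain = []
    for _ in range(max_hops):
        try:
            resp = opener.open(url, timeout=5)
            chain.append((resp.status, url))
            return chain  # 2xx: final destination reached
        except urllib.error.HTTPError as err:
            if err.code in (301, 302, 303, 307, 308) and "Location" in err.headers:
                chain.append((err.code, url))
                url = urllib.parse.urljoin(url, err.headers["Location"])
            else:
                raise
    raise RuntimeError("exceeded hop cap, possible redirect loop")

print(unshorten("https://bit.ly/example"))  # hypothetical short link
```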
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the burden. It discloses a 10-hop cap, typical 100-400ms latency, and cost ($0.0005 USDC). Does not cover error handling or auth, but for a simple lookup tool this is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise two-sentence description: first sentence defines action and output, second provides usage guidance and technical details. Efficiently structured with no redundant words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single input, no output schema), the description fully covers purpose, usage, behavioral details, and constraints. It explains return values (redirect chain with status/location) and limits, making it complete for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The sole parameter 'url' has schema description 'URL to unshorten'. The tool description adds context about what the tool does but no additional detail about the parameter format, validation, or constraints beyond schema. Schema coverage is 100% so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool follows HTTP redirects to find the final destination and redirect chain, using specific verb+resource. It is distinct from sibling tools which cover different network utilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly recommends use when encountering shortened links (bit.ly, t.co, lnkd.in) and implies not to use for non-shortened URLs. Does not explicitly mention alternatives but siblings are clearly different.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_user_agent_parse
Parse any HTTP User-Agent string into a structured record: browser name/version, OS name/version, device type (desktop/mobile/tablet/bot), rendering engine. Use for analytics, fraud scoring, or routing logic based on client capabilities. Stdlib regex only — works offline, no external lookups. <2ms. (price: $0.0003 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| user_agent | Yes | User-Agent header value |
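A regex-only classifier is necessarily heuristic; a tiny sketch of just the device-type dimension, to show the flavor of stdlib-regex parsing:

```python
import re

def classify_device(user_agent: str) -> str:
    """Crude device-type classification, one sliver of what the tool
    returns (it also parses browser, OS, and engine versions)."""
    ua = user_agent.lower()
    if re.search(r"bot|crawler|spider|slurp", ua):
        return "bot"
    if "tablet" in ua or "ipad" in ua:
        return "tablet"
    if "mobi" in ua or "android" in ua:
        return "mobile"
    return "desktop"

print(classify_device("Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Mobile"))
```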
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses that it uses 'Stdlib regex only — works offline, no external lookups. <2ms.' and includes the price. This provides good insight into behavior, though edge cases or error responses are not mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core function, and contains no fluff. Every sentence adds value: purpose, output fields, use cases, performance, and cost.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input (one string) and no output schema, the description covers purpose, output structure (browser, OS, device, engine), performance, and cost. It could be more explicit about the return format, but 'structured record' suffices.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the parameter 'user_agent' described as 'User-Agent header value'. The description adds context about the tool's purpose but does not add semantics beyond the schema. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool parses HTTP User-Agent strings into structured components (browser, OS, device type, engine), and lists use cases like analytics and fraud scoring. This verb+resource combination is specific and distinguishes from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly mentions when to use (analytics, fraud scoring, routing logic) and highlights offline capability and speed. However, it does not state when not to use or suggest alternatives for similar tasks.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
onyx_whois
Domain whois via the modern RDAP protocol — registrar, creation/expiry dates, nameservers, registrant country, status flags. Use to vet a domain before agents transact with it (newly registered = higher fraud risk), check trademark conflicts, or confirm ownership transfer. Powered by rdap.org (no API key, free tier). ~200-500ms typical. (price: $0.001 USDC, tier: metered)
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain name, e.g. example.com |
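Since the backend is the public rdap.org aggregator, the lookup can be sketched directly against it; field paths follow the standard RDAP JSON response (RFC 9083):

```python
import json
from urllib.request import urlopen

def rdap_domain(domain: str) -> dict:
    """Query the rdap.org aggregator named in the description; it
    redirects to the authoritative registry RDAP server for the TLD."""
    with urlopen(f"https://rdap.org/domain/{domain}", timeout=10) as resp:
        record = json.load(resp)
    events = {e["eventAction"]: e["eventDate"] for e in record.get("events", [])}
    return {
        "status": record.get("status", []),
        "registered": events.get("registration"),
        "expires": events.get("expiration"),
        "nameservers": [ns.get("ldhName") for ns in record.get("nameservers", [])],
    }

print(rdap_domain("example.com"))
```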
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses key behavioral traits: it uses the RDAP protocol via rdap.org (free tier, no API key), typical response time of 200-500ms, and cost of $0.001 USDC per call. This goes beyond the schema, though rate limits are not mentioned.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, well-structured, and front-loaded with the core function. Every sentence adds value (purpose, data fields, use cases, source, performance, cost) without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers all relevant aspects: what it does, what data it returns, when to use it, performance, cost, and external source. It is complete enough for an agent to decide to invoke the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already fully describes the single 'domain' parameter with an example. The description adds context about usage but no additional parameter-level details, so it adds marginal value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs domain whois via RDAP and lists the specific data returned (registrar, dates, nameservers, etc.). It distinguishes from sibling tools like onyx_dns_lookup by focusing on domain registration information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear use cases: vetting domains for fraud risk, checking trademark conflicts, and confirming ownership transfer. It does not explicitly compare to siblings but the distinct function implies when to use this tool over others.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.