Urlhaus

Name: Urlhaus
Author: pipeworx-io

by io.github.pipeworx-io

Server Details

URLhaus MCP — wraps abuse.ch URLhaus malware URL database (free, no auth)

Status: Healthy
Last Tested: 2026-05-31 16:32
Transport: Streamable HTTP
URL
Repository: pipeworx-io/mcp-urlhaus
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A3.6/5.0

Tool DescriptionsA

Average 4.3/5 across 19 of 23 tools scored. Lowest: 3.6/5.

Server CoherenceC

Disambiguation3/5

Tools fall into several distinct subdomains (URLhaus malware, Pipeworx data, Polymarket betting, memory, etc.), but within each subdomain they are fairly distinct. However, the overall mix requires an agent to understand multiple independent contexts, causing potential confusion about which tool to use for a given task.

Naming Consistency2/5

Naming is inconsistent: some tools use verb_noun (lookup_url, get_recent), some use descriptive phrases (validate_claim, generate_llms_txt), and others mix styles (ai_visibility_check, bet_research, polymarket_kalshi_spread). No uniform pattern is followed.

Tool Count3/5

23 tools is near the upper end of 'borderline' (16-25). The count feels inflated because the tools span multiple unrelated domains, making the set feel heavier than necessary for any single purpose.

Completeness2/5

The server's name suggests a focus on URLhaus malware, but only 4 tools serve that purpose (get_recent, lookup_host, lookup_payload, lookup_url). The remaining tools cover unrelated areas, leaving significant gaps in the apparent domain. The set is severely incomplete for URLhaus and only randomly covers other topics.

Available Tools

23 tools

ai_visibility_checkA

Read-onlyIdempotent

Inspect

Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.

ParametersJSON Schema

Name	Required	Description
`entity`	Yes	The thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing".
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com.
`context`	No	Optional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only and idempotent. Description adds return structure (per-model scores, raw_response), explains free default model and BYO key for Anthropic, and mentions combined view. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences front-loaded with main action and purpose. No wasted words; each sentence adds distinct value (action, return structure, use cases).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, description details return fields (score, confidence, signals, raw_response). Parameters are fully documented. Minor lack of scoring algorithm or signal details, but adequate for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers all 4 parameters (100% coverage) with descriptions. Description adds meaningful context: default model, Anthropic key usage, and context disambiguation, adding value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool probes LLMs and returns visibility scores. It specifies the verb 'probe' and resource 'brand/product/topic' with scoring. It distinguishes from siblings like 'scan_competitor_ai_presence' by emphasizing multi-model probing and scoring (0-100).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides use cases (AI-marketing audits, pre-launch checks) and explains when to use _apiKey. It does not explicitly exclude alternatives or state when not to use, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworxA

Read-onlyIdempotent

Inspect

PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 2,902 tools across 633 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1".

ParametersJSON Schema

Name	Required	Description	Default
`question`	Yes	Your question or request in natural language

Tool Definition Quality

A4.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the burden. It states the tool picks the right tool and fills arguments, indicating automated action. However, it does not disclose potential side effects, rate limits, or what happens if the question cannot be answered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with three sentences: first defines the action, second explains the process, third gives examples. No unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one required parameter, no output schema, no nested objects), the description is nearly complete. It explains input, process, and provides examples. Missing details about output format or error handling are minor.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for a single parameter, and the description adds value by explaining how the question parameter is used (natural language, examples provided). This goes beyond the schema's generic 'Your question or request in natural language'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool takes a natural language question and returns an answer from the best data source, using other tools as needed. It distinguishes itself from sibling tools by acting as a central query interface.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool: 'No need to browse tools or learn schemas — just describe what you need.' It provides examples of appropriate questions, implying this is the primary tool for answering queries without specifying alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bet_researchA

Read-onlyIdempotent

Inspect

Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred + kalshi_macro + federal_register; Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires; result.evidence is keyed by source. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets return status:"market_closed_or_inactive" and skip fan-out. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note.

ParametersJSON Schema

Name	Required	Description
`depth`	No	quick = 2-3 evidence sources, thorough = full fan-out. Default thorough.
`market`	Yes	Polymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?")
`include_raw`	No	Default false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false, aligning with the research nature. The description adds detail on fan-out behavior per bet class and what the return includes (evidence packet, comparison), adding value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but structured: it explains what the tool does, how to call it, what it returns, and use cases. Every sentence serves a purpose without being overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and absence of output schema, the description sufficiently covers inputs, actions, and outputs. It explains the return (evidence packet + comparison) and fan-out logic, providing complete context for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers both parameters with 100% coverage. The description adds context: depth enum explained ('quick'=2-3 sources, 'thorough'=full fan-out) and market parameter example formats (slug, URL, question text), enhancing the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it researches Polymarket bets by pulling Pipeworx data, resolving the market, classifying the bet, fanning out to packs, and returning evidence. It uses specific verbs and identifies the resources, distinguishing it from siblings like ask_pipeworx or compare_entities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists use cases: 'should I bet on X?', 'what does the data say?', 'is there edge?'. It implies the tool is for bet-related research but doesn't explicitly state when not to use it or compare with alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_entitiesA

Read-onlyIdempotent

Inspect

Compare 2-5 companies (or drugs) side by side in one call. Use for "compare X and Y", "X vs Y", "which is bigger", or rank-by-metric questions. type="company" — pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (post-Run-6 fix: returns the actual most-recent FY filing per concept, not arbitrarily-old data; off-calendar fiscal years like AAPL Sep, NVDA Jan handled correctly). type="drug" — pulls adverse-event report counts from FAERS, FDA approval counts, active trial counts. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8-15 sequential lookups; results are sorted by the primary metric (revenue for company, adverse events for drug) so "largest" / "most" reads off the top of the response.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`values`	Yes	For company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]).

Tool Definition Quality

A4.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description mentions data sources (SEC EDGAR, FDA) and output format (paired data + URIs). No annotations provided, so description carries full burden. Lacks details on auth, rate limits, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with clear front-loading of purpose. No wasted words; effectively uses colon structure for type-specific details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple parameter structure and no output schema, description fully covers purpose, parameter semantics, and output behavior. Agent can confidently decide to use and fill parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters. Description adds meaning with examples for 'values' and specifies returned data per enum value for 'type', going beyond schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Compare 2–5 entities side by side' with specific types (company/drug) and data fields. Distinguishes from siblings like lookup_host/lookup_payload by noting it replaces 8–15 sequential calls.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes when to use (side-by-side comparison, replacing sequential calls) and provides input constraints (2-5 entities, type choices). Does not state exclusions, but context of siblings implies when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_toolsA

Read-onlyIdempotent

Inspect

Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Maximum number of tools to return (default 20, max 50)
`query`	Yes	Natural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries")

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description mentions it returns 'names and descriptions' of relevant tools, which is useful behavioral context. Since no annotations are provided, the description carries the burden; it does well by stating the function clearly, though it could add info about performance or limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no wasted words. It front-loads the core action and then provides context. Slightly more structure could help, but it is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (2 params, no output schema, no nested objects), the description is complete enough. It explains the tool's role in the workflow and the query format. Minor gap: no mention of whether results are ranked or how relevance works, but acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the 'query' parameter should be a natural language description and gives examples, and notes defaults/max for 'limit'. This goes beyond the schema's terse descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to search the tool catalog by describing what you need, and returns the most relevant tools. It distinguishes itself from siblings by emphasizing discovery of tools when there are 500+ available.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Call this FIRST when you have 500+ tools available and need to find the right ones for your task,' giving clear when-to-use guidance and implying it is a preliminary step before using other tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

entity_profileA

Read-onlyIdempotent

Inspect

Get everything about a US public company in one call. Use when a user asks "tell me about X", "research Acme", "brief me on Tesla", or you'd otherwise call 10+ pack tools across SEC EDGAR, XBRL, USPTO, news, GLEIF. Returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC — Run 6 fix landed real FY2025 numbers, not stale FY2022); patents (USPTO PatentsView API was sunset May 2025; pack soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first).

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type. Only "company" supported today; person/place coming soon.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name.

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full burden. Discloses it returns pipeworx:// URIs and replaces many calls, implying performance characteristics. However, does not explicitly state read-only nature or potential delays.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences, front-loaded with purpose. No superfluous words; each sentence provides distinct information (purpose, data types, alternative usage).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description covers return values (pipeworx:// URIs per data type). Missing error conditions or search failure handling, but adequate for core function.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). Description adds value by specifying value can be ticker or CIK, and explicitly warns names not supported, plus clarifies type enum limitation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it fetches a full entity profile across multiple data packs, listing specific data types for 'company'. Differentiates from sequential calls and mentions direct alternative for federal contracts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (full profile) and when not to (federal contracts, use usa_recipient_profile directly). Advises using resolve_entity first if only a name is available, providing clear decision guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forgetA

DestructiveIdempotent

Inspect

Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key to delete

Tool Definition Quality

A3.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It states 'Delete a stored memory by key' which implies a destructive action, but it does not clarify irreversibility, whether the operation is idempotent, or what happens if the key does not exist. More detail would be beneficial.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It is concise and front-loaded with the action. Could be slightly more structured but adequate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one required parameter, and the description covers the core purpose. However, without annotations or behavioral details (e.g., idempotency, error handling), it feels slightly incomplete for a destructive operation. Output schema is absent, but for a delete operation, the return value might be trivial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents the 'key' parameter. The description adds no additional meaning beyond 'by key', which is already clear from the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Delete') and resource ('stored memory by key'), clearly distinguishing it from sibling tools like 'remember' (store) and 'recall' (retrieve).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use: when a specific memory needs to be permanently removed. It does not explicitly state when not to use or mention alternatives, but the verb 'delete' makes the intent clear. Sibling names like 'remember' and 'recall' provide contrast.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_llms_txtA

Read-onlyIdempotent

Inspect

Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.

ParametersJSON Schema

Name	Required	Description	Default
`url`	Yes	Full URL of the site to summarize, e.g. "https://example.com" or a specific landing page.
`max_links`	No	Maximum number of link entries to include (default 25, max 50).

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds behavioral context beyond annotations: fetches page, extracts title/description/key links, emits standard markdown. No contradiction with readOnlyHint=true and idempotentHint=true.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with a bulleted list of use cases. Front-loaded with action and purpose; no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers output format and use cases. No output schema, but description explains the result is a text blob for site-root/llms.txt. Missing error handling or performance notes, but acceptable for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description does not add parameter-specific details beyond what schema provides, but indirectly references key links which relates to max_links.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool generates an llms.txt file for a given URL, describing the extraction process and output format. Differentiates from siblings like 'scan_competitor_ai_presence' by focusing on file generation, not scanning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly lists three use cases (client indexing, own project, competitor auditing), giving clear context. Lacks explicit when-not-to-use or alternatives, but the scenarios are specific and helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_recentA

Read-onlyIdempotent

Inspect

Get a list of recently submitted malware URLs from URLhaus. Useful for monitoring the latest threats.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Number of recent URLs to return (default 10, max 1000).

Output Schema

ParametersJSON Schema

Name	Required	Description
`urls`	Yes	List of recently submitted malware URLs
`count`	Yes	Number of recent URLs returned
`query_status`	Yes	Status of the query

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It correctly indicates a read-only operation and mentions the source (URLhaus). It could add more details about ordering, but the description is sufficient given the simple nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences that front-load the main purpose and then provide context. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple parameters, no output schema, and a straightforward purpose, the description is nearly complete. Could optionally mention return format or default behavior, but not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with one parameter (limit) fully described in schema. Description adds no extra meaning beyond schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'list of recently submitted malware URLs from URLhaus', with a specific use case 'monitoring the latest threats'. It is distinct from siblings which focus on lookups, discovery, or memory operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for monitoring latest threats but does not explicitly state when to use this tool versus siblings like lookup_url or search tools. No exclusions or alternatives are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_hostA

Read-onlyIdempotent

Inspect

Look up a hostname or IP address in the URLhaus database to find associated malware URLs. Returns all known malicious URLs hosted on that host.

ParametersJSON Schema

Name	Required	Description	Default
`host`	Yes	Hostname or IP address to look up (e.g. "example.com" or "192.168.1.1").

Output Schema

ParametersJSON Schema

Name	Required	Description
`urls`	Yes	List of malware URLs on this host
`url_count`	Yes	Number of malware URLs found on this host
`blacklists`	Yes	Blacklist detection status by various services
`query_status`	Yes	Status of the query
`urlhaus_reference`	Yes	Reference URL on URLhaus for this host

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full burden. It discloses that it returns 'all known malicious URLs hosted on that host', which implies read-only behavior. However, no info on rate limits, data freshness, or whether it modifies anything. Adequate but could add more context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first states purpose, second clarifies what is returned. Efficient and front-loaded. Slight redundancy between 'malware URLs' and 'malicious URLs' but overall concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given only one parameter and no output schema, description is mostly complete for a lookup tool. However, no mention of result format or pagination, which might be needed for many results. Adequate but not outstanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a good description for the host parameter. Description adds no additional meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Look up', specific resource 'hostname or IP address in the URLhaus database', and outcome 'find associated malware URLs'. Distinguishes from siblings like lookup_url and lookup_payload by specifying host-level lookup and malware URLs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for looking up hosts to find malware URLs, but does not explicitly state when to use vs alternatives (e.g., lookup_url, lookup_payload). No when-not-to-use or prerequisites provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_payloadA

Read-onlyIdempotent

Inspect

Look up a malware payload file by its MD5 or SHA256 hash in the URLhaus database. Returns file type, size, first/last seen dates, and associated delivery URLs.

ParametersJSON Schema

Name	Required	Description	Default
`md5_hash`	No	MD5 hash of the payload to look up (32 hex characters).
`sha256_hash`	No	SHA256 hash of the payload to look up (64 hex characters).

Output Schema

ParametersJSON Schema

Name	Required	Description
`md5_hash`	Yes	MD5 hash of the payload
`file_type`	Yes	File type of the payload
`last_seen`	Yes	Date when payload was last seen
`signature`	Yes	Malware signature name if available
`url_count`	Yes	Number of URLs serving this payload
`first_seen`	Yes	Date when payload was first seen
`sha256_hash`	Yes	SHA256 hash of the payload
`query_status`	Yes	Status of the query
`delivery_urls`	Yes	URLs associated with this payload delivery
`file_size_bytes`	Yes	Size of the payload file in bytes

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description indicates the tool is a read-only query, but since annotations are empty, it does not explicitly state no side effects. It lists returned data, which adds value, but no info on rate limits or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single concise sentence providing essential information. Could be slightly improved by separating the returned fields for readability, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and simple parameters, the description covers the purpose, input, and output well. Missing information about possible error states or empty results, but overall sufficient for a lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents both parameters with descriptions. The description adds context about MD5 and SHA256 usage, but does not clarify which parameter is preferred or if both can be used together.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the verb 'Look up', the resource 'malware payload file', and the method 'by its MD5 or SHA256 hash'. Also specifies the database 'URLhaus' and lists returned fields, distinguishing it from siblings like lookup_host and lookup_url.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies that the tool is for looking up known hashes, but no explicit guidance on when to use this versus other lookup tools, or what to do if the hash is not found.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

lookup_urlA

Read-onlyIdempotent

Inspect

Look up a URL in the URLhaus malware database to check if it is known to host or distribute malware. Returns threat category, status, blacklist status, and tags.

ParametersJSON Schema

Name	Required	Description	Default
`url`	Yes	The full URL to look up (e.g. "http://example.com/malware.exe").

Output Schema

ParametersJSON Schema

Name	Required	Description
`id`	Yes	URLhaus ID for this URL
`tags`	Yes	Associated tags for the URL
`threat`	Yes	Threat category (e.g. 'malware', 'phishing')
`blacklists`	Yes	Blacklist detection status by various services
`date_added`	Yes	Date when URL was added to database
`url_status`	Yes	Status of the URL (e.g. 'online', 'offline')
`query_status`	Yes	Status of the query (e.g. 'ok', 'no_results')

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly states the tool is a lookup/read operation (not destructive), and mentions the return includes threat category, status, blacklist status, and tags. This adequately informs the agent of the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently conveys the tool's purpose and the type of information returned. No unnecessary words. Front-loaded with the action and resource.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has only one parameter, no output schema, and no annotations, the description is reasonably complete. It covers the purpose, the input, and the output fields. It could mention that the output format is not specified, but the description lists the types of information returned, which is sufficient for an agent to decide.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents the 'url' parameter. The description adds context that the URL should be a full URL (e.g., example), which is helpful but does not go beyond the schema's description. Thus, baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to look up a URL in the URLhaus malware database. It specifies the action ('Look up'), the resource ('URL in the URLhaus malware database'), and the goal ('check if it is known to host or distribute malware'). This distinguishes it from sibling tools like lookup_host and lookup_payload.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains what the tool does but provides no guidance on when to use it versus alternatives. For example, it doesn't mention that this tool is specifically for URLs (as opposed to hosts or payloads), which could be inferred from sibling names but is not explicit. No when-not-to-use or alternative suggestions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_feedbackAInspect

Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	bug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else.
`context`	No	Optional structured context: which tool, pack, or vertical this relates to.
`message`	Yes	Your feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max.

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses the tool's purpose, rate limit (5 per identifier per day), and the kind of content expected (describe Pipeworx tools/data). Without annotations, the description provides sufficient behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two succinct sentences with no redundancy. The most critical information (purpose, usage guidelines, rate limit) is front-loaded and efficiently communicated.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple feedback tool with no output schema, the description covers purpose, usage, parameters (via schema), rate limit, and content guidelines, leaving no major gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema covers all parameters with descriptions, so baseline is 3. Description adds value by clarifying the style of feedback (describe tools/data, no verbatim prompts) and the rate limit, enhancing understanding beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Send feedback' and the resource 'Pipeworx team', and specifies distinct use cases (bug reports, feature requests, missing data, praise), making it unambiguous and distinct from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (feedback) and what not to include (end-user prompt verbatim), plus mentions rate limit. Does not contrast with siblings, but their functions are clearly different.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_trendingA

Read-onlyIdempotent

Inspect

What other AI agents are calling on Pipeworx right now. Returns the top tools, top packs, and total call volume over a recent window (24h, 7d, or 30d). Useful for: (1) discovering what data sources are hot for current events, (2) confirming a popular tool is the canonical choice before asking your own question, (3) seeing whether your use case aligns with what most agents need. Self-aggregating signal — derived from CF analytics-engine, no PII, just (pack, tool, count). Cached 5min-1h depending on window.

ParametersJSON Schema

Name	Required	Description	Default
`window`	No	24h (default) \| 7d \| 30d. Shorter windows surface what's hot right now; longer windows show steady-state demand.

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. The description adds context: data source (CF analytics-engine), no PII, caching behavior (5min-1h). No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences front-load purpose, then list uses, then technical details. No fluff; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only tool with one parameter and no output schema, the description covers input, output, caching, privacy, and motivations. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The sole 'window' parameter has an enum with descriptions. The main description adds nuance: 'Shorter windows surface what's hot right now; longer windows show steady-state demand.' This goes beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'top tools, top packs, and total call volume' over recent windows, with three specific use cases. It distinguishes itself implicitly by focusing on aggregate AI agent usage trends.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides three explicit use cases (discovering hot data sources, confirming canonical tools, aligning use case with agent demand). Though it doesn't list when not to use or alternatives, the context is clear and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_arbitrageA

Read-onlyIdempotent

Inspect

Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. TWO MODES: (1) event — pass a single Polymarket event slug; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). (2) topic — pass a seed question ("Strait of Hormuz traffic returns to normal"); searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response carries opportunities[] (gap_pp, suggested_trade, reasoning) plus partition_check when in event mode (with placeholders_filtered count).

ParametersJSON Schema

Name	Required	Description	Default
`event`	No	Single-event mode: Polymarket event slug (e.g. "when-will-bitcoin-hit-150k") or full URL.
`topic`	No	Cross-event mode: a topic or seed question. Tool searches Polymarket for related markets across separate events and checks monotonicity across them. E.g. "Strait of Hormuz traffic returns to normal".

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Details the internal logic (monotonicity violations), search and grouping behavior, and output format (ranked opportunities with reasoning). Complements annotations (readOnly, openWorld) by clarifying operations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise yet thorough: one-sentence purpose, then clearly labeled modes with examples. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all necessary aspects: two modes, input examples, output description. No output schema but description adequately explains return value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions are clear; tool description adds examples and behavioral context for each parameter. Slight overlap but adds value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it finds arbitrage opportunities via monotonicity checks, with two distinct modes (event and topic). Differentiates from siblings like polymarket_edges by specifying cross-event capability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes two modes with concrete examples and when each should be used. Explains why topic mode catches cases event mode misses, providing clear usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_edgesA

Read-onlyIdempotent

Inspect

Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥85% AND ≥2 longshots ≤5% AND portfolio return ≥50:1; rare-by-design. EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. Cached 1h at the KV level keyed on all knobs. fed_rate bets are scanned but EXCLUDED from ranking (1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data); see fed_rate_context for raw spread.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Top N edges to return after ranking. Default 10, max 25.
`window`	No	Polymarket volume window to filter markets. Default 1wk.
`min_kelly`	No	Minimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large.
`min_edge_pp`	No	Minimum \|edge\| in percentage points to include (default 0.5). Edge is evaluated NET of slippage.
`slippage_pp`	No	Assumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw \|edge\| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model.
`max_spread_pp`	No	Tradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges.
`min_liquidity`	No	Tradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven.
`category_filter`	No	Comma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all.
`min_partition_leg_kelly`	No	Minimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds valuable behavioral details: it scans top markets, groups by asset, fetches price history once, computes model probability, and ranks by edge. It also notes the tool is V1 and covers only crypto-price bets, which is transparent about limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph that efficiently covers purpose, methodology, scope, and output. While it could be slightly more concise, every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description adequately explains what is returned (top N edges with suggested trade direction). It covers the model sources, the grouping/fetching logic, and the ranking criteria. The context signals confirm no required parameters and full schema coverage, so no gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with descriptions for all three parameters. The description reiterates some parameter information but also integrates them into the workflow (e.g., 'scans top markets' from window, 'returns top N' from limit). This adds semantic context beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool's purpose: scanning Polymarket markets to find where Pipeworx model probabilities disagree with market prices, specifically for crypto-price bets. It distinguishes itself from siblings like polymarket_arbitrage by detailing its methodology and use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly frames the tool for the 'what should I bet on today' question, providing clear context for when to use it. It does not explicitly state when not to use it, but the context is sufficient for an agent to infer appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_kalshi_spreadA

Read-onlyIdempotent

Inspect

Cross-venue spread between Kalshi and Polymarket for the same resolving question. Kalshi and Polymarket frequently price the same event 2-25pp apart because the venues have different participant pools — that delta is a real arb signal. TWO MODES: (1) topic — pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope") that auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. Returns: each venue's leg-by-leg prices (in raw probability, 0-1), and where a leg from each side maps to the same outcome, the spread (Kalshi − Polymarket) in percentage points.

ParametersJSON Schema

Name	Required	Description
`topic`	No	Pre-mapped: fed \| btc \| cpi \| gdp \| sp500 \| recession \| next_pope \| next_uk_pm \| next_israel_pm \| 2028_president
`kalshi_event_ticker`	No	Explicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side.
`polymarket_event_slug`	No	Explicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, etc. The description adds non-obvious behavior: delta range (2-25pp), leg-by-leg prices, spread in percentage points. No contradiction with annotations, but could detail auth needs or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with TWO MODES highlighted and clear bullet points. While slightly verbose (4 sentences), every sentence adds value. Could be slightly tightened but still efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so the description must cover return values, which it does: leg-by-leg prices and spread. It covers both modes and explains the delta. Missing error scenarios or fallback behavior, but adequate for the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, but the description adds meaning beyond the schema: explains how topic auto-maps, that explicit overrides, and what each mode returns. This provides useful context for parameter selection.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it calculates cross-venue spread between Kalshi and Polymarket, explains why the delta exists (different participant pools), and describes two modes (topic and explicit). It is specific and distinguishes itself from siblings like polymarket_arbitrage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly maps when to use topic vs explicit pairing and explains the delta is an arb signal. However, it does not explicitly state when not to use this tool or mention alternatives, lacking full exclusion guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recallA

Read-onlyIdempotent

Inspect

Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	No	Memory key to retrieve (omit to list all keys)

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It explains the two modes (retrieve by key, list all) and states it retrieves context saved in current or previous sessions. Does not cover edge cases (e.g., missing key), but sufficient for basic use.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with action, no wasted words. Concise and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (1 optional parameter, no output schema, no nested objects), the description is complete. It covers both retrieval and listing modes. Could mention return format but not critical.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter 'key' described in schema. Description adds the behavior of omitting key to list all, which is a useful nuance beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it retrieves a memory by key or lists all if key omitted. The verb 'retrieve' and resource 'memory' are specific, and the listing behavior distinguishes it from siblings like forget or remember.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description says when to use (retrieve context saved earlier) and implies alternative (omit key to list all). No explicit when-not-to-use or comparison with siblings, but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recent_changesA

Read-onlyIdempotent

Inspect

What's new with a company in the last N days/months? Use for "what's happening with X", "updates on Y", "news on Apple this month", or change-monitoring. Fans out in parallel to: SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Entity type. Only "company" supported today.
`since`	Yes	Window start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193").

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since annotations are absent, the description carries the full burden. It discloses key behavioral traits: parallel fan-out to three sources, accepted date formats, and return structure (changes, count, URIs). However, it omits details like authorization, rate limits, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: a clear first sentence states the core purpose, followed by specific details about entity type, parameters, and return values. Every sentence adds information without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description adequately explains return values (structured changes, total_changes count, URIs). It covers key aspects for the tool's simple 3-parameter interface, though it lacks mentions of pagination or error conditions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by providing concrete examples for 'since' (e.g., '7d', '30d') and 'value' (ticker or CIK) beyond the schema's descriptions. This clarifies acceptable input formats.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: finding what's new about an entity since a point in time. It specifies the entity type 'company' and details the fan-out to SEC EDGAR, GDELT, and USPTO. This distinguishes it from siblings like 'get_recent' by focusing on structured change monitoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly provides usage scenarios: 'brief me on what happened with X' or change-monitoring workflows. It does not mention when not to use the tool or name alternatives, but the given context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rememberA

Idempotent

Inspect

Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key (e.g., "subject_property", "target_ticker", "user_preference")
`value`	Yes	Value to store (any text — findings, addresses, preferences, notes)

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are provided, the description bears full responsibility for behavioral transparency. It discloses that the tool stores data, explains the persistence difference between authenticated and anonymous sessions, and implies it is a write operation. It could mention if overwriting an existing key is allowed or if there are size limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at two sentences, with all critical information front-loaded. Every sentence adds value: the first states the core action and storage location, the second provides use cases and persistence details. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has only two simple parameters with full schema coverage, no output schema, and no nested objects, the description is quite complete. It covers purpose, use cases, and persistence behavior. A minor gap is not mentioning whether the tool overwrites existing keys or if there are size constraints on key or value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already provides 100% coverage with descriptions for both parameters. The description adds context about the types of values to store (findings, addresses, preferences) and example keys, which goes beyond the schema. This adds meaningful guidance for the agent on how to use the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool stores a key-value pair in session memory and distinguishes it from siblings like recall (which likely retrieves) and forget (which likely deletes). It specifies the purpose as saving intermediate findings, user preferences, or context across tool calls.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on when to use this tool (to save findings, preferences, or context across calls) and includes details about memory persistence (persistent for authenticated users, 24-hour for anonymous). However, it does not explicitly mention when not to use it or alternative tools like forget or recall for other operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resolve_entityA

Read-onlyIdempotent

Inspect

Resolve a user-spoken name to the canonical/official identifiers other tools require as input. Use FIRST when you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`value`	Yes	For company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin").

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations present, so description fully shoulders transparency. Discloses v1 limitation to type='company' and return fields (ticker, CIK, name, URIs). Missing error handling or edge cases.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, zero wasted words. First sentence defines purpose, second adds operational details. Ideal conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description fully specifies return fields. With 2 parameters and clear usage examples, the description equips the agent with all needed information for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes both parameters (type enum, value string). Description adds critical context: type only supports 'company', value can be ticker/CIK/name with concrete examples and v1 scope.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'resolve', resource 'entity', and specific outcome 'canonical IDs across Pipeworx data sources'. Distinguishes from siblings by noting it replaces multiple lookup calls.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit input examples (ticker, CIK, name) and explains efficiency gain. Lacks explicit when-not-to-use or alternative tools, but context implies it's the go-to for entity resolution.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_competitor_ai_presenceA

Read-onlyIdempotent

Inspect

Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.

ParametersJSON Schema

Name	Required	Description
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe.
`context`	No	Optional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names.
`entities`	Yes	Array of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors.

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, openWorld, idempotent, non-destructive. The description adds behavioral detail: probes each entity with ai_visibility_check, ranks by score, and returns score, confidence, signal density. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences plus a clarifying sentence. Front-loaded with the main purpose. Every sentence adds value: purpose, process, example. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description compensates by stating the return format (ranked list with score, confidence, signal density). Also explains constraints (2-8 entities, first is subject). Covers all necessary context for an agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, baseline is 3. The description adds value: explains first entity is treated as subject, default model is workers-ai, _apiKey only needed for anthropic, context disambiguates. This goes beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares AI visibility across multiple entities side-by-side, with specific verbs and resource. It distinguishes from sibling 'ai_visibility_check' by explicitly referencing it as a sub-probe and adding ranking functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear use case ('competitive AI-marketing audits') and an example question. It implies when to use but does not explicitly state when not to use or list alternatives, though the context signals help.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_claimA

Read-onlyIdempotent

Inspect

Fact-check, verify, validate, or confirm/refute a natural-language factual claim or statement against authoritative sources. Use when an agent needs to check whether something a user said is true ("Is it true that…?", "Was X really…?", "Verify the claim that…", "Validate this statement…"). v1 supports company-financial claims (revenue, net income, cash position for public US companies) via SEC EDGAR + XBRL. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), extracted structured form, actual value with pipeworx:// citation, and percent delta. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → numeric comparison).

ParametersJSON Schema

Name	Required	Description	Default
`claim`	Yes	Natural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year".

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses return values (verdict, structured form, actual value with citation, percent delta) and notes it replaces multiple agent calls. Lacks details on error handling or safety, but overall sufficient for a read-like operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences front-load purpose, then domain details, then outputs and benefit. No wasted words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a single-parameter tool with no output schema, description covers inputs, process, outputs, and benefits. It explains scope limitation (v1, only financial claims) but lacks error handling guidance. Adequate for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single parameter 'claim' has schema description matching usage. Schema description coverage is 100%, so baseline 3. Description does not add extra parameter semantics beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool fact-checks natural-language claims, specifies the supported domain (company-financial claims for public US companies), and lists output types. It distinguishes itself by noting it replaces 4-6 sequential agent calls, providing clear purpose and resource.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explicitly states the domain and data sources (SEC EDGAR + XBRL for company-financial claims), indicating when to use. It implies limitations via 'v1 supports', but does not explicitly state when not to use or offer alternatives, though sibling tools are different in nature.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Urlhaus

Server Details

Tool Definition Quality

Available Tools

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Discussions

Your Connectors

Resources