radio
Server Details
Radio MCP — wraps Radio Browser API (free, no auth)
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
- Repository
- pipeworx-io/mcp-radio
- GitHub Stars
- 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.1/5 across 21 of 23 tools scored. Lowest: 2.9/5.
Most tools have clearly distinct purposes due to detailed descriptions. However, the large number of data-query tools (e.g., ask_pipeworx, bet_research, compare_entities, entity_profile, recent_changes, validate_claim) could cause confusion if descriptions are not read carefully, but each targets a specific use case.
Tool names follow a mostly consistent verb_noun pattern with underscores (e.g., list_countries, search_stations, compare_entities). A few names like bet_research and entity_profile are less obviously verb-first but still readable. No mixing of cases.
With 23 tools, the count is at the high end of the ideal range. Given the server claims to be about 'radio' but includes only 4 radio-specific tools, the number feels excessive for the name. However, as a general utility it is borderline acceptable.
For the stated 'radio' purpose, coverage is very weak—only a few station search tools. However, the actual tool set covers AI visibility, entity data, betting analytics, and memory thoroughly, which seems to be the real domain. Significant radio-specific gaps exist.
Available Tools
23 toolsai_visibility_checkARead-onlyIdempotentInspect
Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.
| Name | Required | Description | Default |
|---|---|---|---|
| entity | Yes | The thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing". | |
| models | No | Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai. | |
| _apiKey | No | Optional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com. | |
| context | No | Optional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses key behavioral traits beyond annotations: it explains the default model (free Workers AI Llama-3.3-70b), that using Anthropic requires a personal API key and direct payment, and the return format (per-model {score, confidence, signals, raw_response} + combined view). Annotations already indicate read-only, open-world, idempotent, non-destructive, and the description adds value without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences, each serving a purpose: first sentence states purpose and default, second explains optional API key, third describes output and use cases. It is front-loaded with the core function, avoids fluff, and all information is relevant.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return structure (per-model fields and combined view). It covers all parameters (entity, models, apiKey, context) with their roles and defaults. For a probing tool with moderate complexity, the description provides sufficient context for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description adds meaningful context: it clarifies the default model for the 'models' parameter, the purpose of '_apiKey' (BYO key for Anthropic, paid directly), and gives example values for 'entity' and 'context'. This extra information helps the agent understand parameter semantics beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('probe', 'score') and resources ('LLMs', 'visibility'), clearly stating the tool's function: probing one or more LLMs for knowledge about an entity and returning a visibility score (0-100). It distinguishes from sibling tools by focusing on multi-model probing and scoring, and explicitly mentions use cases like AI-marketing audits and competitive monitoring.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage guidance: when to use (probing LLM knowledge about a brand/product/topic), default model (free workers-ai), and how to use with additional models (Anthropic requires BYO API key). It gives examples of entity values. While it doesn't explicitly state when NOT to use or list alternatives, the context is sufficient for typical use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ask_pipeworxARead-onlyIdempotentInspect
PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 2,902 tools across 633 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1".
| Name | Required | Description | Default |
|---|---|---|---|
| question | Yes | Your question or request in natural language |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes the tool's behavior: 'Pipeworx picks the right tool, fills the arguments, and returns the result,' which adds context about automation and result delivery. However, it lacks details on error handling, response format, rate limits, or authentication needs, leaving gaps in behavioral understanding for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core functionality in the first sentence, followed by explanatory details and examples. Every sentence earns its place by clarifying usage, differentiating from alternatives, and providing concrete examples. It is appropriately sized, avoiding unnecessary verbosity while being informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (natural language querying with automated tool selection), the description is mostly complete. It explains the purpose, usage, and behavior adequately. However, without an output schema, it does not detail the return values or potential error responses, which could be helpful for an agent. The lack of annotations also means some behavioral aspects are underspecified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the 'question' parameter well-documented in the schema. The description adds value by emphasizing 'plain English' and 'natural language,' clarifying the expected input style beyond the schema's technical description. It also provides examples that illustrate parameter usage, enhancing semantic understanding without redundancy.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Ask a question in plain English and get an answer from the best available data source.' It specifies the verb ('ask'), resource ('data source'), and distinguishes itself from sibling tools by emphasizing natural language processing rather than browsing specific tools or schemas. The examples further clarify the scope and differentiate it from tools like 'search_stations' or 'list_countries'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'No need to browse tools or learn schemas — just describe what you need.' It contrasts with alternatives by implying that other tools might require browsing or schema knowledge, and provides clear examples of appropriate use cases like factual queries or data lookups, guiding users away from using it for unrelated tasks like station searches or country listings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bet_researchARead-onlyIdempotentInspect
Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred + kalshi_macro + federal_register; Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires; result.evidence is keyed by source. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets return status:"market_closed_or_inactive" and skip fan-out. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note.
| Name | Required | Description | Default |
|---|---|---|---|
| depth | No | quick = 2-3 evidence sources, thorough = full fan-out. Default thorough. | |
| market | Yes | Polymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?") | |
| include_raw | No | Default false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and openWorldHint=true. The description adds significant behavioral context: it resolves markets, classifies bets, fans out to appropriate data packs, and returns an evidence packet with a market-vs-model comparison. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but not overly verbose; every sentence contributes meaningful information. It front-loads the key action and purpose. Could be slightly more structured, but is efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description provides a reasonable overview of the return (evidence packet + market-vs-model comparison). It covers input formats, classification, and fan-out behavior. Lacks explicit detail on output structure, but openWorldHint justifies some variability.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline is 3. The description adds modest value: it clarifies that the 'market' parameter accepts slug, URL, or question text, and implicitly mentions 'depth' via 'quick = 2-3 evidence sources, thorough = full fan-out' but doesn't add meaning beyond the schema's own description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool researches a Polymarket bet by pulling Pipeworx data, specifying the action ('research') and resource ('bet' via slug, URL, or question text). It distinguishes itself from sibling tools by claiming it is the core demo product that converts better than having agents discover packs themselves.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists use cases: 'should I bet on X?', 'what does the data say about this Polymarket market?', 'is there edge in this bet?'. It does not explicitly mention when not to use it or compare to specific siblings, but the guidance is clear enough for the intended scenarios.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_entitiesARead-onlyIdempotentInspect
Compare 2-5 companies (or drugs) side by side in one call. Use for "compare X and Y", "X vs Y", "which is bigger", or rank-by-metric questions. type="company" — pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (post-Run-6 fix: returns the actual most-recent FY filing per concept, not arbitrarily-old data; off-calendar fiscal years like AAPL Sep, NVDA Jan handled correctly). type="drug" — pulls adverse-event report counts from FAERS, FDA approval counts, active trial counts. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8-15 sequential lookups; results are sorted by the primary metric (revenue for company, adverse events for drug) so "largest" / "most" reads off the top of the response.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type: "company" or "drug". | |
| values | Yes | For company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses data sources (SEC EDGAR, FDA) and output includes paired data and URIs. However, it does not mention error handling, rate limits, or what happens if an entity is not found, which is a moderate gap.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with the main purpose, no wasted words. Efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and two parameters, the description covers input semantics and output highlights (paired data, URIs). It could be more detailed on output structure but is sufficient for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds value by explaining 'type' parameter behavior and giving concrete examples for 'values' (tickers/CIKs for company, names for drug), which goes beyond the schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool compares 2–5 entities side by side, specifying two types (company and drug) with distinct data fields. It distinguishes itself from sibling tools like resolve_entity (single entity) and search_stations (different domain).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use: when comparing multiple entities in one call, and notes it replaces 8–15 sequential calls. However, it does not explicitly state when not to use or list alternatives, but the context is clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
discover_toolsARead-onlyIdempotentInspect
Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of tools to return (default 20, max 50) | |
| query | Yes | Natural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries") |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It describes the search functionality and return format (tools with names and descriptions), but doesn't mention performance characteristics, error conditions, or authentication requirements. The description adds some context about when to use it, but lacks details on rate limits, pagination, or what happens with no matches.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is perfectly concise with two sentences that each earn their place. The first sentence explains what the tool does, and the second provides crucial usage guidance. There's zero waste or redundancy, and the most important information (when to use it) is appropriately front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (search functionality with 2 parameters) and 100% schema coverage, the description is reasonably complete. It explains the purpose and provides excellent usage guidance. However, with no output schema and no annotations, it could benefit from mentioning what the return structure looks like or any limitations of the search algorithm.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. It mentions the general purpose ('search by describing what you need') but provides no additional syntax, format, or constraint details for the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('search', 'returns') and resources ('Pipeworx tool catalog', 'most relevant tools with names and descriptions'). It distinguishes from sibling tools by focusing on tool discovery rather than listing specific entities like stations, countries, or tags.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance: 'Call this FIRST when you have 500+ tools available and need to find the right ones for your task.' This gives clear conditions for when to use this tool versus alternatives, including the threshold (500+ tools) and the specific scenario (finding the right tools for a task).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
entity_profileARead-onlyIdempotentInspect
Get everything about a US public company in one call. Use when a user asks "tell me about X", "research Acme", "brief me on Tesla", or you'd otherwise call 10+ pack tools across SEC EDGAR, XBRL, USPTO, news, GLEIF. Returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC — Run 6 fix landed real FY2025 numbers, not stale FY2022); patents (USPTO PatentsView API was sunset May 2025; pack soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first).
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type. Only "company" supported today; person/place coming soon. | |
| value | Yes | Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavior. It explains the tool is a read-only aggregation returning citation URIs and specific data types. It does not mention auth or rate limits, but the safety profile is clear: no destructive actions, and the output format is partially described.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single focused paragraph that front-loads the main purpose. It lists data types concisely and includes actionable guidance. A slight improvement could be bullet points, but it is efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains what is returned (specific data types and citation URIs). It covers key use cases and limitations (only company type, name resolution requirement). No pagination or formatting details, but sufficient for an agent to select the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (2 parameters described in schema). The description adds critical context: type is limited to 'company', value is ticker or CIK and not a name, and provides examples. This exceeds the baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a full entity profile across relevant Pipeworx packs, listing specific data types for companies (SEC filings, revenue, patents, news, LEI) and notes it replaces 10-15 sequential calls. It distinguishes from alternatives like usa_recipient_profile for federal contracts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance: use when a comprehensive profile is needed, avoid for federal contracts (pointing to usa_recipient_profile), and use resolve_entity if only a name is given. However, it does not compare to other siblings like compare_entities.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forgetCDestructiveIdempotentInspect
Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | Memory key to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries full burden but only states the action ('Delete') without disclosing behavioral traits like whether deletion is permanent, requires specific permissions, or has side effects. It adds minimal context beyond the basic operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence with zero waste. It is front-loaded and appropriately sized for a simple tool, making it easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a deletion tool with no annotations and no output schema, the description is incomplete. It lacks details on behavioral implications (e.g., permanence, error handling) and doesn't compensate for the absence of structured data, leaving gaps in understanding.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the single parameter ('key'). The description adds no additional meaning beyond what the schema provides, such as format examples or constraints, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Delete') and resource ('a stored memory by key'), making the purpose immediately understandable. It doesn't explicitly differentiate from sibling tools like 'recall' or 'remember', but the action is distinct enough to infer separation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The description lacks context about prerequisites (e.g., needing an existing memory key) or exclusions, leaving the agent to infer usage from the action alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_llms_txtARead-onlyIdempotentInspect
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full URL of the site to summarize, e.g. "https://example.com" or a specific landing page. | |
| max_links | No | Maximum number of link entries to include (default 25, max 50). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes the process (fetch, extract, emit) and output format. Annotations already indicate read-only, idempotent, non-destructive; the description adds valuable context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Efficient single paragraph with clear purpose, process, and use cases. No redundant or unnecessary sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and complete schema coverage, the description fully explains what the tool does, how it works, and what it produces.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description does not add additional semantics beyond what the schema already provides for the parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates an llms.txt file for any URL, specifying the action and resource. It distinguishes itself from siblings, none of which perform similar generation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit use cases (e.g., getting client site indexed, drafting own project, auditing competitors) but lacks explicit when-not-to-use or alternatives. Context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_top_stationsBRead-onlyIdempotentInspect
Get trending radio stations ranked by popularity, optionally filtered by country (e.g., 'US', 'GB', 'DE'). Returns station URLs, genres, and vote counts.
| Name | Required | Description | Default |
|---|---|---|---|
| count | No | Number of stations to return. Defaults to 10. | |
| country | No | Filter by country name (e.g. "Germany", "United States"). Omit for global results. |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of stations returned |
| stations | Yes | List of top-ranked radio stations |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions the tool retrieves data ('Get') but doesn't specify if it's read-only, requires authentication, has rate limits, or describes the return format (e.g., list structure, pagination). For a tool with no annotation coverage, this leaves significant behavioral traits unexplained.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose and includes the optional filter. There is no wasted language, and every part earns its place by conveying essential information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (2 parameters, no output schema, no annotations), the description is adequate but incomplete. It covers the basic purpose and optional filtering, but without annotations or output schema, it lacks details on behavioral traits (e.g., safety, response format) that would help an agent use it correctly. This meets minimum viability with clear gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, so the input schema already fully documents the parameters ('count' and 'country'). The description adds minimal value beyond the schema by hinting at the optional country filter, but it doesn't provide additional semantics like format examples beyond 'Germany' or clarify how popularity is calculated. Baseline 3 is appropriate as the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get the most popular radio stations by vote count' specifies the action (get) and resource (radio stations) with a popularity metric (vote count). It distinguishes from siblings like 'search_stations' by focusing on popularity ranking rather than general search. However, it doesn't explicitly contrast with 'list_countries' or 'list_tags', keeping it from a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage through the optional country filter ('optionally filtered by country'), suggesting it can be used for global or country-specific queries. It doesn't explicitly state when to use this tool versus alternatives like 'search_stations' for non-popularity-based searches, nor does it mention prerequisites or exclusions, leaving some guidance gaps.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_countriesBRead-onlyIdempotentInspect
Browse available countries with radio stations. Returns country names and station counts to help target your search geographically.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Total number of countries |
| countries | Yes | Countries with radio stations, sorted by station count |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions that the tool lists countries with station counts, but it doesn't describe key behavioral traits such as whether the list is paginated, sorted, or limited in scope, or if there are any rate limits or authentication requirements. This is a significant gap for a tool with zero annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence: 'List countries that have radio stations, with station counts.' It is front-loaded with the core purpose and includes no unnecessary words, making it highly concise and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (simple listing with no parameters) and the lack of annotations and output schema, the description is minimally adequate. It states what the tool does but misses details like output format, pagination, or error handling. With no output schema, the description should ideally explain return values more fully, but it provides a basic overview.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0 parameters with 100% coverage, meaning no parameters are documented in the schema. The description adds value by implying the output includes station counts, which is useful semantic information beyond the empty schema. However, it doesn't detail any optional parameters or filtering options, so it's not a perfect score.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'List countries that have radio stations, with station counts.' It specifies the verb ('list'), resource ('countries'), and includes the additional detail of providing station counts. However, it doesn't explicitly differentiate from sibling tools like 'search_stations' or 'list_tags,' which prevents a perfect score.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'search_stations' (which might filter stations) or 'list_tags' (which might list tags instead of countries), nor does it specify any context or exclusions for usage. This leaves the agent without clear direction on tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_tagsBRead-onlyIdempotentInspect
Discover radio genres and tags ranked by station count. Use to explore what categories are available before searching.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of tags to return. Defaults to 20. |
Output Schema
| Name | Required | Description |
|---|---|---|
| tags | Yes | Radio genres/tags sorted by station count |
| count | Yes | Number of tags returned |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes a read-only listing operation, which is clear, but lacks details on permissions, rate limits, pagination, or error handling. For a tool with zero annotation coverage, this is a significant gap in transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the core purpose without unnecessary words. It's appropriately sized for a simple listing tool and earns its place by clearly stating what the tool does.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (one optional parameter, no output schema, no annotations), the description is minimally adequate. It covers the basic purpose but lacks guidance on usage versus siblings and behavioral details like output format or error cases, which would be helpful for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'limit' parameter fully documented in the schema. The description doesn't add any parameter-specific details beyond what the schema provides, such as format constraints or examples. Baseline 3 is appropriate when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'List the most common radio station genres and tags by station count.' It specifies the verb ('List'), resource ('radio station genres and tags'), and scope ('by station count'). However, it doesn't explicitly differentiate from sibling tools like 'get_top_stations' or 'search_stations', which might also involve station data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention sibling tools like 'get_top_stations' (which might list stations directly) or 'search_stations' (which might filter stations), leaving the agent to infer usage based on tool names alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pipeworx_feedbackAInspect
Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | bug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else. | |
| context | No | Optional structured context: which tool, pack, or vertical this relates to. | |
| message | Yes | Your feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. Discloses rate limit (5 per day per identifier) and constraint on content. Additional details like whether feedback is anonymous would be helpful, but current info is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise, front-loaded sentences. No redundancy. Each sentence adds distinct information: purpose, use cases, content guidance, rate limit.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema and low complexity (3 params), description covers all necessary aspects for an agent to use correctly. Includes parameter guidance, usage rules, and constraints.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% but description adds value beyond schema: clarifies usage for each enum value, recommends message length and specificity. Nested context object described but could benefit from more examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description explicitly states 'Send feedback to the Pipeworx team' and lists specific use cases (bug reports, feature requests, missing data, praise). It clearly distinguishes from sibling tools which are data-oriented.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use including types of feedback. Also warns not to include user's prompt verbatim and mentions rate limit. Lacks explicit alternatives, but sibling context makes it obvious this is the only feedback tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pipeworx_trendingARead-onlyIdempotentInspect
What other AI agents are calling on Pipeworx right now. Returns the top tools, top packs, and total call volume over a recent window (24h, 7d, or 30d). Useful for: (1) discovering what data sources are hot for current events, (2) confirming a popular tool is the canonical choice before asking your own question, (3) seeing whether your use case aligns with what most agents need. Self-aggregating signal — derived from CF analytics-engine, no PII, just (pack, tool, count). Cached 5min-1h depending on window.
| Name | Required | Description | Default |
|---|---|---|---|
| window | No | 24h (default) | 7d | 30d. Shorter windows surface what's hot right now; longer windows show steady-state demand. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate safe, idempotent read access. The description adds valuable context: caching behavior (5min-1h), data source (CF analytics-engine), privacy (no PII), and window trade-offs, going well beyond the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with front-loaded purpose and bulleted use cases. It is slightly verbose but every sentence adds value, and the structure aids readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read-only tool with one parameter and no output schema, the description fully covers purpose, usage, behavioral traits, and parameter details, leaving no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema covers the only parameter (window) with enum and description. The description mostly repeats the schema's guidance on window selection, adding minimal new semantic value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns trending tools, packs, and call volumes, distinguishing it from sibling tools like discover_tools or entity_profile by focusing on aggregate usage signals.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Three explicit use cases are listed, providing clear context for when to use the tool. However, no exclusions or alternatives are mentioned, though the sibling list provides implicit alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_arbitrageARead-onlyIdempotentInspect
Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. TWO MODES: (1) event — pass a single Polymarket event slug; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). (2) topic — pass a seed question ("Strait of Hormuz traffic returns to normal"); searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response carries opportunities[] (gap_pp, suggested_trade, reasoning) plus partition_check when in event mode (with placeholders_filtered count).
| Name | Required | Description | Default |
|---|---|---|---|
| event | No | Single-event mode: Polymarket event slug (e.g. "when-will-bitcoin-hit-150k") or full URL. | |
| topic | No | Cross-event mode: a topic or seed question. Tool searches Polymarket for related markets across separate events and checks monotonicity across them. E.g. "Strait of Hormuz traffic returns to normal". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description goes significantly beyond annotations by explaining the internal logic (monotonicity checks), the difference between modes, and the output format (ranked opportunities with reasoning). It does not contradict any annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single well-structured paragraph that front-loads the core purpose and then explains modes. Every sentence is informative and necessary, with no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (two modes, no output schema), the description fully explains what it does, how to use it, and what to expect as output. It is complete for an AI agent to decide and invoke correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema coverage, the description adds value by explaining that parameters are mutually exclusive, providing example values, and clarifying they correspond to distinct modes. This adds meaningful context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool finds arbitrage opportunities by checking monotonicity violations on Polymarket. It gives a specific verb, resource, and method, and distinguishes between two modes, setting it apart from siblings like polymarket_edges.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly details two modes ('event' and 'topic') with examples and explains when to use each, including why cross-event mode is necessary for cases where single-event mode fails. This provides clear guidance for tool selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_edgesARead-onlyIdempotentInspect
Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥85% AND ≥2 longshots ≤5% AND portfolio return ≥50:1; rare-by-design. EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. Cached 1h at the KV level keyed on all knobs. fed_rate bets are scanned but EXCLUDED from ranking (1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data); see fed_rate_context for raw spread.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Top N edges to return after ranking. Default 10, max 25. | |
| window | No | Polymarket volume window to filter markets. Default 1wk. | |
| min_kelly | No | Minimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large. | |
| min_edge_pp | No | Minimum |edge| in percentage points to include (default 0.5). Edge is evaluated NET of slippage. | |
| slippage_pp | No | Assumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw |edge| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model. | |
| max_spread_pp | No | Tradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges. | |
| min_liquidity | No | Tradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven. | |
| category_filter | No | Comma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all. | |
| min_partition_leg_kelly | No | Minimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true, openWorldHint=true, destructiveHint=false. The description adds valuable context: it uses external data (FRED, coinpaprika), computes model probability, ranks by edge, and returns top N with suggested direction. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, front-loaded with the core purpose and algorithm. Every sentence adds value; no wasted words. It is well-structured and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no output schema, the description explains the return format ('top N ranked by edge magnitude with suggested trade direction'). It covers input, algorithm, and output comprehensively for a read-only discovery tool with supportive annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and each parameter has a clear description. The description adds minimal extra meaning beyond the schema (e.g., 'after ranking' for limit). Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans high-volume Polymarket markets, computes disagreement with Pipeworx data using a specific model, and returns top edges with trade direction. It distinguishes from siblings like polymarket_arbitrage and bet_research by focusing on opportunity discovery.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states it answers the 'what should I bet on today' question, helping users discover opportunities without manual browsing. It implicitly differentiates from other tools like bet_research (specific market analysis) and polymarket_arbitrage (arbitrage), but lacks explicit when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_kalshi_spreadARead-onlyIdempotentInspect
Cross-venue spread between Kalshi and Polymarket for the same resolving question. Kalshi and Polymarket frequently price the same event 2-25pp apart because the venues have different participant pools — that delta is a real arb signal. TWO MODES: (1) topic — pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope") that auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. Returns: each venue's leg-by-leg prices (in raw probability, 0-1), and where a leg from each side maps to the same outcome, the spread (Kalshi − Polymarket) in percentage points.
| Name | Required | Description | Default |
|---|---|---|---|
| topic | No | Pre-mapped: fed | btc | cpi | gdp | sp500 | recession | next_pope | next_uk_pm | next_israel_pm | 2028_president | |
| kalshi_event_ticker | No | Explicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side. | |
| polymarket_event_slug | No | Explicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark it read-only, idempotent. Description adds beyond by detailing output format (leg-by-leg prices, spread in percentage points) and modes. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is relatively concise given the complexity of modes and output. It front-loads purpose, then details modes, then output. No unnecessary words; each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, so description covers return values adequately (prices and spread). Explains both modes and parameter interactions. Missing edge case handling, but overall complete for typical usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all params have descriptions). Description adds meaning: explains topic is pre-mapped shortcuts, explicit params override topic, provides examples. Exceeds baseline of 3 by clarifying inter-parameter relationships.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it computes cross-venue spread between Kalshi and Polymarket for the same event. It distinguishes itself from siblings like polymarket_arbitrage by focusing on cross-venue comparison. Specific verb 'cross-venue spread' and resource named.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explains two usage modes (topic shortcuts or explicit tickers) and gives examples. Implies use for arbitrage signal detection. Lacks explicit when-not-to-use guidance but provides sufficient context for appropriate invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recallARead-onlyIdempotentInspect
Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | Memory key to retrieve (omit to list all keys) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses that memories can be retrieved from current or previous sessions, which is useful behavioral context. However, it doesn't mention potential limitations like memory persistence, retrieval failures, or performance characteristics. The description doesn't contradict any annotations (none exist).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with zero waste. The first sentence states the purpose and parameter behavior, the second provides usage context. Every word earns its place, and the information is front-loaded appropriately.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 1 parameter with full schema coverage and no output schema, the description adequately covers the tool's purpose and basic usage. However, as a data retrieval tool with no annotations, it could benefit from more behavioral context about what 'memories' contain, format of returned data, or error conditions. It's minimally complete but has room for improvement.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents the key parameter. The description adds value by explaining the semantic behavior: 'omit to list all keys' clarifies what happens when the parameter is omitted. With 1 parameter and high schema coverage, this earns above the baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Retrieve a previously stored memory by key, or list all stored memories (omit key).' It specifies the verb ('retrieve'/'list') and resource ('memory'), but doesn't explicitly differentiate from sibling tools like 'remember' or 'forget' beyond mentioning retrieval vs. saving context.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context for when to use the tool: 'Use this to retrieve context you saved earlier in the session or in previous sessions.' It also explains the key parameter behavior ('omit key' to list all). However, it doesn't explicitly state when NOT to use it or mention alternatives among siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recent_changesARead-onlyIdempotentInspect
What's new with a company in the last N days/months? Use for "what's happening with X", "updates on Y", "news on Apple this month", or change-monitoring. Fans out in parallel to: SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type. Only "company" supported today. | |
| since | Yes | Window start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring. | |
| value | Yes | Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses parallel fan-out, supported date formats, and return structure (structured changes, total_changes count, pipeworx:// URIs). With no annotations, this is transparent; could add error handling or limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, each sentence adds value. Extremely efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers key aspects: entity type, date formats, parallel sources, return fields. No output schema, so return description is helpful. Could mention pagination or sorting but adequate for most uses.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers all parameters with descriptions; the description adds usage examples ('use "30d" or "1m" for typical monitoring') and clarifies format (ISO date or relative). Enriches beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'what's new about an entity since a given point in time' with specific sources (SEC EDGAR, GDELT, USPTO). Distinguishes from sibling tools like 'entity_profile' by focusing on changes and time windows.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use for "brief me on what happened with X" or change-monitoring workflows.' Provides clear use cases, though doesn't explicitly mention when not to use it or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
rememberAIdempotentInspect
Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | Memory key (e.g., "subject_property", "target_ticker", "user_preference") | |
| value | Yes | Value to store (any text — findings, addresses, preferences, notes) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly describes the tool's behavior as a storage operation, specifies persistence rules (authenticated vs. anonymous sessions), and implies it's a write operation. However, it doesn't mention potential limitations like storage size, rate limits, or error conditions, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded, with the first sentence stating the core purpose and the second adding crucial context about persistence. Every sentence earns its place by providing essential information without redundancy or fluff, making it highly efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (a write operation with persistence rules), no annotations, and no output schema, the description does well by covering purpose, usage, and key behavioral traits. However, it lacks details on return values (e.g., confirmation message or error handling) and doesn't fully address all potential edge cases, leaving minor gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already documents both parameters ('key' and 'value') with good descriptions. The description adds minimal value beyond the schema by implying the parameters are used for storage but doesn't provide additional syntax, format, or usage details. This meets the baseline of 3 when the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Store a key-value pair') and resource ('in your session memory'), distinguishing it from sibling tools like 'recall' (likely for retrieval) and 'forget' (likely for deletion). It provides concrete examples of what can be stored ('intermediate findings, user preferences, or context across tool calls'), making the purpose unambiguous and differentiated.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('to save intermediate findings, user preferences, or context across tool calls') and provides clear context about persistence differences ('Authenticated users get persistent memory; anonymous sessions last 24 hours'), which helps the agent decide when to use it versus alternatives like 'recall' for retrieval. It effectively guides usage without being misleading.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_entityARead-onlyIdempotentInspect
Resolve a user-spoken name to the canonical/official identifiers other tools require as input. Use FIRST when you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type: "company" or "drug". | |
| value | Yes | For company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin"). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description carries the full burden. It discloses input formats and output fields (ticker, CIK, company name, pipeworx:// URIs), and highlights it's a single call. It doesn't mention authentication requirements or error handling, but for a read-only lookup, the disclosure is reasonably transparent and sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of two short sentences plus a third line with examples. It front-loads the purpose and immediately gives actionable details. Every sentence adds value, and there is no unnecessary verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's low complexity (2 required parameters, simple return structure) and absence of an output schema, the description adequately covers the purpose, input formats, and output fields. It doesn't need to explain pagination or error handling as it's a straightforward lookup. The description is complete for the tool's scope.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already describes both parameters with 100% coverage. The description adds value by providing concrete examples (e.g., 'AAPL', '0000320193', 'Apple') and clarifying that v1 only supports 'company', which goes beyond the schema's enum description. This helps the agent understand acceptable formats.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool resolves entities to canonical IDs across Pipeworx data sources. It specifies the verb 'resolve' and the resource 'entity to canonical IDs', and distinguishes itself by claiming it replaces 2-3 lookup calls, which differentiates it from potential sibling tools that might do similar lookups individually.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear guidance on when to use the tool by giving examples of accepted input types (ticker, CIK, name) and specifying it's for v1 type='company'. It also implies efficiency advantages over multiple lookups. However, it doesn't explicitly state when not to use it or provide alternatives, though no direct sibling seems to serve the same purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
scan_competitor_ai_presenceARead-onlyIdempotentInspect
Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.
| Name | Required | Description | Default |
|---|---|---|---|
| models | No | Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai. | |
| _apiKey | No | Optional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe. | |
| context | No | Optional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names. | |
| entities | Yes | Array of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint, idempotentHint, openWorldHint, destructiveHint=false. Description adds that it probes with ai_visibility_check internally and returns ranked list with score, confidence, signal density, which is consistent and adds useful context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, front-loaded with key action and purpose. Some extra wording could be trimmed, but overall efficient and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description lists return fields (score, confidence, signal density). Lacks details on error handling or pagination, but covers the main outputs adequately for a comparison tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, baseline 3. Description adds meaning by explaining entities treat first entry as subject, ranks by score, and clarifies model/_apiKey usage, going beyond schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it compares AI visibility across multiple entities side-by-side, using specific verbs like 'compare', 'probes', 'ranks', 'surfaces'. It distinguishes itself from siblings like ai_visibility_check (single entity) and compare_entities by focusing on competitive audits.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'for competitive AI-marketing audits' and gives an example question. Does not explicitly list when not to use or alternatives, but the context and sibling names imply the distinction.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_stationsBRead-onlyIdempotentInspect
Search radio stations by name. Returns station name, URL, country, genres, and popularity vote count. Use when looking for a specific station or browsing by keyword.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return. Defaults to 10. | |
| query | Yes | Station name to search for. |
Output Schema
| Name | Required | Description |
|---|---|---|
| count | Yes | Number of stations returned |
| stations | Yes | List of radio stations matching the query |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It adds useful context beyond the input schema: it specifies that 'Results are ordered by votes (popularity),' which informs the agent about sorting behavior. However, it lacks details on other behavioral traits such as rate limits, authentication needs, error handling, or what the output looks like (e.g., format, pagination). For a search tool with no annotations, this is a moderate but incomplete disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded: it's two concise sentences that directly state the purpose and key behavioral trait (ordering by votes). Every sentence earns its place by providing essential information without redundancy or fluff, making it efficient and easy to parse for an AI agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (a search function with 2 parameters), no annotations, and no output schema, the description is partially complete. It covers the purpose and sorting behavior but misses details like output format, error cases, or usage context relative to siblings. Without annotations or output schema, more information would be helpful for the agent to fully understand the tool's behavior and integration.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, meaning the input schema fully documents both parameters ('query' and 'limit') with descriptions. The description adds no additional meaning beyond the schema; it doesn't explain parameter interactions, constraints, or usage examples. Since the schema does the heavy lifting, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Search for radio stations by name.' It specifies the verb ('search') and resource ('radio stations'), and distinguishes it from siblings like 'get_top_stations' (which likely returns top stations without search) and 'list_countries'/'list_tags' (which list metadata). However, it doesn't explicitly differentiate from potential siblings like 'search_stations_by_genre' or 'search_stations_by_country', so it's not fully specific to all alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when to choose 'search_stations' over 'get_top_stations' (e.g., for popularity-based results vs. search-based), 'list_countries' (e.g., for filtering by country), or 'list_tags' (e.g., for genre-based searches). There's only an implied usage based on the purpose, with no explicit context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
validate_claimARead-onlyIdempotentInspect
Fact-check, verify, validate, or confirm/refute a natural-language factual claim or statement against authoritative sources. Use when an agent needs to check whether something a user said is true ("Is it true that…?", "Was X really…?", "Verify the claim that…", "Validate this statement…"). v1 supports company-financial claims (revenue, net income, cash position for public US companies) via SEC EDGAR + XBRL. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), extracted structured form, actual value with pipeworx:// citation, and percent delta. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → numeric comparison).
| Name | Required | Description | Default |
|---|---|---|---|
| claim | Yes | Natural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries burden. It discloses return values (verdict types, citation, delta) and mentions it replaces sequential calls. However, it does not disclose limitations (e.g., only US public companies, only certain metrics) or error cases, leaving gaps for an agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose, no wasted words. Efficient and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description explains return values well. For a single-parameter tool, it is fairly complete, but lacks mention of error handling, authentication, or scope limitations beyond 'company-financial claims'. Almost complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (one param with description). The description adds examples and explains output structure, but the schema already documents the parameter adequately. Baseline 3 is appropriate as description reinforces but does not significantly extend meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it fact-checks natural-language claims against authoritative sources, with specific domain (company-financial) and sources (SEC EDGAR+XBRL). However, it does not differentiate from sibling tools explicitly; no sibling seems to overlap directly, so clarity is high but missing explicit distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage for company-financial claims (v1 support), but no explicit when-to-use, when-not-to-use, or alternatives. Mentions it replaces 4-6 sequential agent calls, which hints at efficiency but not clear exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!