Skip to main content
Glama

Server Details

data.gov.ie MCP — Ireland's national open-data portal (CKAN API).

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.4/5 across 25 of 25 tools scored. Lowest: 3.6/5.

Server CoherenceC
Disambiguation2/5

Many tools have overlapping purposes. 'ask_pipeworx' is a catch-all that can answer the same questions as specialized tools like 'entity_profile', 'compare_entities', 'validate_claim', and 'bet_research'. The Polymarket tools also overlap with each other. This makes it hard for an agent to select the right tool.

Naming Consistency3/5

All tool names use snake_case, but the pattern is inconsistent. Some start with verbs ('ask', 'compare', 'list', 'resolve'), others with nouns ('ai_visibility', 'bet', 'dataset', 'entity'). There is no uniform verb_noun convention, but the naming is still readable.

Tool Count3/5

25 tools is on the high side for a server ostensibly focused on Irish government data. Only 5 tools directly relate to data.gov.ie; the rest are from disparate domains. The number feels excessive for the stated purpose, but not extremely so.

Completeness2/5

For the implied domain of Irish government data, the surface is complete (search, list, details, query). However, the server name suggests that focus, yet most tools cover unrelated areas like prediction markets, company research, and memory. This mismatch creates significant gaps for users expecting a coherent Irish data toolkit.

Available Tools

25 tools
ai_visibility_checkA
Read-onlyIdempotent
Inspect

Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.

ParametersJSON Schema
NameRequiredDescriptionDefault
entityYesThe thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing".
modelsNoWhich models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
_apiKeyNoOptional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com.
contextNoOptional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, destructiveHint=false, covering safety and idempotency. Description adds value by disclosing that Anthropic calls require a BYO API key with direct billing, and that it returns per-model data including raw_response. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is 4 sentences, front-loaded with the core action ('Probe one or more LLMs...'). Every sentence adds necessary context: purpose, default behavior, API key note, return structure, and examples of use cases. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 4 parameters and no output schema, description covers purpose, all parameters, return format (per-model object with score, confidence, signals, raw_response + combined), and use cases. It does not explain internal scoring algorithm or confidence interpretation, but that is acceptable without output schema. Reasonably complete for agent selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 4 parameters with 100% description coverage. Description adds meaning beyond schema: clarifies default model behavior, conditional need for _apiKey, and that context helps disambiguate. It does not repeat schema descriptions but supplements them effectively.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description starts with a specific verb 'Probe' and resource 'LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model'. It clearly distinguishes from siblings like 'scan_competitor_ai_presence' by being more general (any topic) and focused on multi-model probing and scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description states default model is Workers AI Llama-3.3-70b (free) and that passing _apiKey also probes Anthropic (BYO key). It lists use cases: AI-marketing audits, pre-launch brand checks, competitive monitoring. However, it does not explicitly state when NOT to use or compare to alternatives like scan_competitor_ai_presence.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworxA
Read-onlyIdempotent
Inspect

PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 3,350 tools across 751 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1".

ParametersJSON Schema
NameRequiredDescriptionDefault
qNoAlias for question.
textNoAlias for question.
inputNoAlias for question.
queryNoAlias for question.
promptNoAlias for question.
questionYesYour question or request in natural language. Accepts query, q, prompt, text, input as aliases.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description aligns with all annotations (readOnlyHint, openWorldHint, idempotentHint, destructiveHint false) and adds context: it returns structured answers with pipeworx:// citation URIs, handles routing, and fills arguments. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured with front-loaded guidance, but somewhat verbose at over 100 words. Every sentence adds value, though could be slightly trimmed without losing clarity. Still, no waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (broad domain, no output schema, annotations present), the description is fully complete: it covers when to use, how it works, examples, and output format. Agent has all needed context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with required 'question' param and multiple aliases. Description reinforces that the parameter accepts a natural language question, and examples show usage. No ambiguity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool routes natural language questions to authoritative structured data sources, explicitly distinguishing it from web search with 'PREFER OVER WEB SEARCH' and listing specific domains (SEC, FDA, etc.). Examples solidify purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises when to use this tool over alternatives ('PREFER OVER WEB SEARCH'), provides a list of appropriate use cases (e.g., 'current US unemployment rate'), and indicates it routes to the correct underlying tool among 3,300.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bet_researchA
Read-onlyIdempotent
Inspect

Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred (DFEDTARU + EFFR + CPIAUCSL) + kalshi_macro (KXFED implied probs) + recent_fed_actions (federal-register rules, last 365d); Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; hottest-year bet → climate_projection_nyc + gistemp_latest (NASA global anomaly, rank since 1880) + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires PLUS a 24h-move warning ("Market moved X.Xpp in 24h, comparable to model edge — your edge may already be priced in") when relevant; result.evidence is keyed by source. RESOLVER CONTRACT: result.market_match_confidence ∈ {high, medium, low, none}, market_match_score (0-1 token-overlap), market_match_alternatives[] (other candidate markets the resolver considered), and suggestions[] (explicit re-query hints when the match is fuzzy) — ALWAYS inspect these before trusting the analysis block, because medium/low matches can still surface other fields. PARENT_EVENT EXTRACTOR: when the bet is one leg of a partition (Yankees WS, Romania election), result.parent_event{matched_candidate, top_legs_by_price[], partition_size, placeholders_filtered} gives you the peer prices in one place — that's the headline for elections/championships. NEWS FIELDS: news entries carry _fallback_attempted / _fallback_failed_reason / retry_after_sec when GDELT 429s and GNews backfill ran or failed. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets that ARE still indexed by Polymarket (yes_price≈0, no volume, no liquidity) return status:"market_closed_or_inactive" and skip fan-out. In practice resolved markets are usually de-indexed and instead surface via the low_confidence_match path above — both routes are BLOCKING, just different mechanisms. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note.

ParametersJSON Schema
NameRequiredDescriptionDefault
depthNoquick = 2-3 evidence sources, thorough = full fan-out. Default thorough.
marketYesPolymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?")
include_rawNoDefault false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive. The description adds extensive behavioral details: resolver contract, parent_event extractor, news fallback, safety short-circuit, wide spread handling, and closed market detection — going well beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is excessively long (multiple paragraphs, lists of classifiers and fan-out examples). While well-structured with headings, it sacrifices conciseness, making it harder for an agent to parse quickly. The first sentence is strong, but the rest is a wall of text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity, no output schema, and rich response structure, the description thoroughly covers all aspects: response shapes, resolver contract, parent_event, news fields, safety conditions, and special statuses. It is fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description enriches each parameter: depth options ('quick vs thorough'), market input types (slug, URL, question), and include_raw behavior with size implications. It adds significant value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it researches a Polymarket bet by pulling Pipeworx data, with specific verb-resource. It lists input types and use cases ('should I bet on X'), effectively distinguishing from sibling tools like polymarket_edges or polymarket_arbitrage by focusing on a single market research.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool (e.g., 'should I bet on X') and includes handling of low-confidence matches and closed markets. However, it does not explicitly state when NOT to use it or provide direct alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_entitiesA
Read-onlyIdempotent
Inspect

Compare 2-5 companies (or drugs) side by side in one call. Use for "compare X and Y", "X vs Y", "which is bigger", or rank-by-metric questions. type="company" — pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (post-Run-6 fix: returns the actual most-recent FY filing per concept, not arbitrarily-old data; off-calendar fiscal years like AAPL Sep, NVDA Jan handled correctly). type="drug" — pulls adverse-event report counts from FAERS, FDA approval counts, active trial counts. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8-15 sequential lookups; results are sorted by the primary metric (revenue for company, adverse events for drug) so "largest" / "most" reads off the top of the response.

ParametersJSON Schema
NameRequiredDescriptionDefault
typeYesEntity type: "company" or "drug".
valuesYesFor company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark it as read-only and idempotent. The description adds valuable behavioral context: it pulls the latest 10-K data with a fix for off-calendar fiscal years, returns sorted results, and includes citation URIs. This goes well beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative but slightly verbose. However, every sentence provides essential details about functionality, edge cases, and usage. It could be trimmed slightly but remains effective and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity with two entity types and multiple data sources, the description covers all necessary aspects: entity types, input formatting, data sources, return content, and behavioral notes (fix for fiscal years). No output schema exists, but the return description (paired data + citation URIs) is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema already provides full descriptions for both parameters (100% coverage). The description adds further meaning by detailing what data is pulled for each type and how to format values (tickers/CIKs for companies, drug names for drugs), enhancing the agent's understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it compares 2-5 companies or drugs side by side, specifying data sources (SEC EDGAR/XBRL for companies, FAERS/FDA for drugs) and the types of metrics returned. It distinguishes itself from sibling tools by emphasizing it replaces multiple sequential lookups and provides sorted results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use with example queries like 'compare X and Y' and 'which is bigger', and explains how to format input values. While it doesn't explicitly state when not to use it, the context is clear for typical comparison scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dataset_detailsA
Read-onlyIdempotent
Inspect

Full dataset record by id or slug (CKAN package_show), including its resources. Each resource has a download "url" (often PDF/CSV/XLSX) and a "datastore_active" flag; resources with datastore_active=true can be read row-by-row via datastore_query using the resource "id".

ParametersJSON Schema
NameRequiredDescriptionDefault
idYesDataset id or slug, e.g. "budget" or "1968fourthamendmentcsv".
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint) already indicate safe read behavior. The description adds value by specifying return structure (resources with download url and datastore_active flag) and links to related tool datastore_query. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, front-loaded sentence that efficiently communicates the tool's purpose and key output details without waste. Every part earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple one-parameter input and no output schema, the description fully explains what the tool returns and how to leverage results for datastore_query. No gaps for a read-only detail tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'id' is fully described in the schema with example values. The description adds that it accepts both id and slug, providing concrete examples, enhancing clarity.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a full dataset record by id or slug, listing included resources and their fields. It distinguishes from siblings like search_datasets (which searches) and datastore_query (which reads datastore rows).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use when needing full dataset details with resources, but does not explicitly state when to use this tool vs alternatives like search_datasets or datastore_query. No exclusions or comparison provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

datastore_queryA
Read-onlyIdempotent
Inspect

Read actual table rows from a resource via CKAN datastore_search. Works only for resources with datastore_active=true (get the resource_id from dataset_details).

ParametersJSON Schema
NameRequiredDescriptionDefault
qNoFull-text filter across the table.
limitNoMax rows, 1-32000 (default 100).
offsetNo0-based row offset for paging.
filtersNoExact-match column filters, e.g. {"Constituency":"Cork"}.
resource_idYesResource UUID from dataset_details, e.g. "a51e4cf1-733d-4ed6-9ab5-01251a4348f7".
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. The description adds value by specifying the underlying CKAN datastore_search mechanism and the resource activation requirement, providing context beyond the annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with a parenthetical instruction, containing zero wasted words. It communicates the core action, a key condition, and a practical tip efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a query tool with read-only, idempotent, open-world annotations and no output schema, the description covers the essential purpose, prerequisite, and parameter source. It could mention return format or pagination, but given annotations and schema, it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all 5 parameters. The description adds a small but useful hint about resource_id coming from dataset_details, complementing the schema. It does not detail other parameters, but the schema already covers them adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description specifies 'Read actual table rows' via CKAN datastore_search, clearly stating the action and resource. It distinguishes from sibling tools like search_datasets (which searches dataset metadata) by focusing on table rows and mentioning the prerequisite datastore_active=true.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explicitly states the condition 'Works only for resources with datastore_active=true' and instructs how to obtain resource_id from dataset_details. This guides when to use the tool, though it does not explicitly list when not to use or alternatives beyond the prerequisite.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_toolsA
Read-onlyIdempotent
Inspect

Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).

ParametersJSON Schema
NameRequiredDescriptionDefault
qNoAlias for query.
taskNoAlias for query.
limitNoMaximum number of tools to return (default 20, max 50)
queryYesNatural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries"). Accepts task, q, description, search as aliases.
searchNoAlias for query.
descriptionNoAlias for query.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare the tool as read-only, idempotent, and non-destructive. The description adds that each result is 'ready to call directly, no second schema lookup needed,' which is useful behavioral context but does not significantly expand beyond what annotations implicitly cover. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph that front-loads the purpose and quickly moves to usage guidance. Every sentence adds value: the first defines the action, the second lists domains, the third explains the output format, and the fourth gives the strategic directive. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (search and return a list of tools) and no output schema, the description adequately explains the return format (names, descriptions, schemas with examples). It mentions the limit parameter but does not specify sorting or pagination. A small gap, but overall sufficient for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already explains all parameters. The description provides additional value by listing example queries and providing a curated list of topics (SEC filings, financials, etc.), which guides users in constructing effective queries. This extra semantic guidance nudges the score above baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a clear verb-resource pair ('Find tools by describing the data or task') and lists specific domains it covers (SEC filings, FDA drugs, etc.), immediately distinguishing it from sibling tools as a discovery meta-tool. The explicit instruction to call it first when exploring options differentiates it well.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use ('when you need to browse, search, look up, or discover what tools exist for...') and provides a clear behavioral directive: 'Call this FIRST when you have many tools available and want to see the option set (not just one answer).' This leaves no ambiguity about the tool's role.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

entity_profileA
Read-onlyIdempotent
Inspect

Get everything about a US public company in one call. Use when a user asks "tell me about X", "research Acme", "brief me on Tesla", or you'd otherwise call 10+ pack tools across SEC EDGAR, XBRL, USPTO, news, GLEIF. Returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC — Run 6 fix landed real FY2025 numbers, not stale FY2022); patents (USPTO PatentsView API was sunset May 2025; pack soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first).

ParametersJSON Schema
NameRequiredDescriptionDefault
typeYesEntity type. Only "company" supported today; person/place coming soon.
valueYesTicker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description supplements with important behavioral details: patents API sunset (soft-fail), fundamentals are LATEST 10-K with specific fields and a fix note, and recent_filings URIs are pipeworx URIs. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is fairly long but well-structured. It front-loads the main purpose, then lists return fields in bullet-like prose, and ends with input constraints. Almost every sentence adds value, though the parenthetical about 'Run 6 fix' may be temporary. Overall, it's dense but not verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (aggregating multiple data sources), the description is thorough. It covers output fields, known issues (patents sunset, fundamentals fix), input limitations, and when to use alternatives. Since there is no output schema, the description fully explains return values, making it complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds meaning beyond the schema: for 'type', it clarifies only 'company' is supported; for 'value', it explains ticker or CIK format and explicitly warns that names are not supported, directing to resolve_entity. This is helpful context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a clear verb+resource: 'Get everything about a US public company in one call.' It then lists specific returned fields (cik, company_name, recent_filings, fundamentals, patents, news, LEI), differentiating from siblings like resolve_entity (name resolution) and compare_entities (comparison). The scope is precise.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use when a user asks "tell me about X", "research Acme", "brief me on Tesla", or you'd otherwise call 10+ pack tools...' Also gives clear exclusion: 'Pass ticker "AAPL" or zero-padded CIK — names not supported (use resolve_entity first).' This provides both usage context and alternative steps.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forgetA
DestructiveIdempotent
Inspect

Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.

ParametersJSON Schema
NameRequiredDescriptionDefault
keyYesMemory key to delete
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate destructiveHint: true and idempotentHint: true. The description adds context by stating the tool 'clears sensitive data' and is used for deletion, aligning with annotations. While no additional behavioral nuance is provided, the description does not contradict annotations and complements them effectively.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: first clearly states the action, second provides usage context and tool pairing. No filler or redundancy; every sentence contributes essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema and is a destructive operation. While the description covers purpose and usage, it does not mention return values or success/failure indicators, leaving an important gap for an agent to understand the tool's post-invocation behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a single 'key' parameter described as 'Memory key to delete'. The description merely echoes 'by key' without adding new meaning or constraints, so it provides no added value beyond the schema baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states 'Delete a previously stored memory by key' with a specific verb ('Delete') and resource ('memory by key'). It distinguishes itself from sibling tools like 'remember' and 'recall' by explicitly pairing with them, making its role among memory operations clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides explicit when-to-use scenarios: 'when context is stale, the task is done, or you want to clear sensitive data'. It also names alternatives ('remember' and 'recall'), guiding the agent to select appropriately between memory operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_llms_txtA
Read-onlyIdempotent
Inspect

Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.

ParametersJSON Schema
NameRequiredDescriptionDefault
urlYesFull URL of the site to summarize, e.g. "https://example.com" or a specific landing page.
max_linksNoMaximum number of link entries to include (default 25, max 50).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description explains the internal process (fetch, extract, emit) and output format, adding context beyond annotations. No contradictions with annotations; aligns with readOnly and idempotent hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with the main verb and output, structured into purpose and use cases. Concise with minimal redundancy, though the use case list could be trimmed slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description compensates by clearly describing the output as a single text blob. Covers process and use cases adequately for the tool's low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description repeats schema info (e.g., example URL) but adds no new parameter semantics. No additional constraints or clarifications.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates an llms.txt file for AI crawlers, specifying the output format and use cases. It distinguishes itself from all sibling tools, none of which perform this function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases (indexing client sites, drafting own llms.txt, auditing competitors). Does not include when-not-to-use, but the tool's uniqueness makes alternatives implicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_groupsA
Read-onlyIdempotent
Inspect

List thematic groups/categories on data.gov.ie (CKAN group_list).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax groups, 1-1000 (default 100).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false, so the description adds minimal behavioral context (only mentions CKAN data source). No details on return format, pagination, or limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single informative sentence with no redundancy. It is appropriately sized and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (1 parameter, no output schema) and rich annotations, the description covers the essential purpose. However, it could mention the return format or behavior for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; the description does not add meaning beyond the schema's description of the 'limit' parameter. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (list), resource (thematic groups/categories), and specific platform (data.gov.ie CKAN). It effectively distinguishes from sibling tool list_organizations by specifying 'groups'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs alternatives like list_organizations. The description assumes the agent can infer based on resource type, but no when-not or exclusion criteria are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_organizationsA
Read-onlyIdempotent
Inspect

List publishing organizations (government departments, state agencies, county/city councils) on data.gov.ie (CKAN organization_list).

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax orgs, 1-1000 (default 100).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, idempotent, openWorld, and non-destructive hints. The description adds no further behavioral context (e.g., rate limits, response format). With rich annotations, a 3 is appropriate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, 14 words, front-loaded with verb. Efficient and to the point, with no extraneous detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description provides sufficient context (data source, type of organizations). However, it could explicitly mention the return format (e.g., list of organization objects).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter (limit) described in the schema. The tool description does not add any parameter information beyond the schema, so baseline 3 is correct.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists publishing organizations on data.gov.ie, specifying the resource (organizations) and source (CKAN). It distinguishes from siblings like list_groups by specifying the data domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like list_groups or search_datasets. The description only states what it does, leaving selection criteria unaddressed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_feedbackAInspect

Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.

ParametersJSON Schema
NameRequiredDescriptionDefault
typeYesbug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else.
contextNoOptional structured context: which tool, pack, or vertical this relates to.
messageYesYour feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds context beyond annotations: notes that feedback is read daily by the team and affects the roadmap, and discloses rate limits (5 per identifier per day) and that it doesn't count against quota. No annotation contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured and front-loaded with purpose, but slightly long (4 sentences, 110 words). Nearly every sentence adds value, though could be trimmed slightly without loss.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a feedback tool with no output schema, the description covers all necessary aspects: purpose, usage guidelines, parameter details, and behavioral constraints (rate limits, non-quota). No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value by explaining the meaning of each enum value and providing usage tips for the context fields and message length. This surpasses the baseline expectation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool's purpose: to tell the Pipeworx team about bugs, features, data gaps, or praise. It uses specific verbs ('tell', 'is broken, missing, or needs to exist') and distinguishes itself from sibling tools that serve different functions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use the tool for each feedback type (bug, feature/data_gap, praise) and provides clear exclusions ('don't paste the end-user's prompt'). Also mentions rate limits and that it's free, offering complete guidance on usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_arbitrageA
Read-onlyIdempotent
Inspect

Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. TWO MODES: (1) event — pass a single Polymarket event slug; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). (2) topic — pass a seed question ("Strait of Hormuz traffic returns to normal"); searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response: opportunities[] (gap_pp, suggested_trade, reasoning, monotonicity violation context), and in event mode partition_check{sum_yes_prices, gap_from_1, placeholders_filtered, suggested_trade}.

ParametersJSON Schema
NameRequiredDescriptionDefault
eventNoSingle-event mode: Polymarket event slug (e.g. "when-will-bitcoin-hit-150k") or full URL.
topicNoCross-event mode: a topic or seed question. Tool searches Polymarket for related markets across separate events and checks monotonicity across them. E.g. "Strait of Hormuz traffic returns to normal".
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, idempotent, and non-destructive hints. The description adds extensive behavioral details, including walking child markets, checking monotonicity ordering, computing partition sums, Jaccard similarity filtering, and placeholder removal. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative but somewhat verbose, including detailed thresholds (e.g., 3pp, 0.30 Jaccard) and filter logic that could be streamlined. However, it is well-structured with clear sections for each mode and the response format.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (two modes, algorithmic checks), the description is thorough. It explains the types of arbitrage, the mechanisms (monotonicity violations, partition checks), filtering criteria, and the response structure (opportunities[] fields). Without an output schema, the description compensates by detailing return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with both parameters described. The description adds meaning by explaining how each parameter activates a mode, providing examples (e.g., event slug vs. topic seed question), and elaborating on the tool's behavior in each mode. This adds value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds arbitrage opportunities on Polymarket via monotonicity violations and partition-sum checks. It distinguishes two modes (event and topic) with specific explanations, and it differentiates from sibling tools like polymarket_edges and polymarket_kalshi_spread by focusing on arbitrage detection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly guides when to use event mode vs topic mode, providing examples. It does not explicitly state when not to use the tool or compare to alternatives, but the context is clear and the two-mode explanation offers sufficient usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_edgesA
Read-onlyIdempotent
Inspect

Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥75% AND ≥2 longshots ≤8% AND portfolio return ≥25:1; rare-by-design (gates relaxed Run 8 from prior 85%/5%/50:1). EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume, plus a 24h-move warning ("Market moved X.Xpp in 24h") when the recent move alone exceeds the edge — your edge may already be in the price. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. RESPONSE TOP-LEVEL: by_segment{model_driven,structural_arbitrage,concentrated_longshot}, fed_candidates/fed_note (Fed bets surface here, excluded from ranking — 1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data), and _diagnostics{concentrated_longshot:{...funnel counters},category_counts,filter_skips} so callers can see WHY a segment is empty (top-N stale, all candidates failed gates, knob dropped them). Cached 1h at the KV level keyed on all knobs.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoTop N edges to return after ranking. Default 10, max 25.
windowNoPolymarket volume window to filter markets. Default 1wk.
min_kellyNoMinimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large.
min_edge_ppNoMinimum |edge| in percentage points to include (default 0.5). Edge is evaluated NET of slippage.
slippage_ppNoAssumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw |edge| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model.
max_spread_ppNoTradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges.
min_liquidityNoTradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven.
category_filterNoComma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all.
min_partition_leg_kellyNoMinimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint), the description discloses caching behavior (1h KV cache keyed on all knobs), response structure with diagnostics, and edge calculation details including slippage and Kelly caps. This adds significant value beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is verbose and contains multiple paragraphs with extensive detail. While it is front-loaded with the main purpose, the length could be reduced without losing essential information, affecting conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, three response segments, diagnostics, caching), the description is remarkably complete. It explains the return structure, filtering knobs, and rationale for segments, making it fully actionable without requiring external knowledge.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter description coverage, so the baseline is 3. The description adds some extra context (e.g., Polymarket zero fees, rationale for min_partition_leg_kelly) but does not substantially increase meaning beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: scanning Polymarket markets to return opportunities where Pipeworx data disagrees with market price. It specifies the verb 'scan' and 'return opportunities' and distinguishes from siblings like polymarket_arbitrage by focusing on model-driven vs. pure arbitrage, emphasizing 'what should I bet on today'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use ('for discovering opportunities without paging hundreds of markets') and explains the three segments. However, it does not explicitly state when not to use or directly name alternatives among siblings, though the detail implies differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_kalshi_spreadA
Read-onlyIdempotent
Inspect

Cross-venue spread between Kalshi and Polymarket for the same resolving question. The two venues sometimes price the same outcome 2-25pp apart because their participant pools differ — when the bet shapes are equivalent that delta is a real signal, when they aren't the tool says so. TWO MODES: (1) topic — 10 pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope", "next_uk_pm", "next_israel_pm", "2028_president") auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. RESPONSE: each venue's leg-by-leg prices (raw probability 0-1) plus matched spread[].top_spreads_pp (Kalshi − Polymarket) where the same outcome shows up on both sides. SAFETY FIELDS: compatibility_warning fires in two cases — (a) matched_pairs:0 with skipped_cross_type>0 means the venues frame the topic with non-equivalent bet shapes (e.g. Kalshi range_bucket point-in-time vs Polymarket cumulative_threshold touch-anywhere — no arb exists), (b) matched_pairs:0 with skipped_cross_type:0 and both venues >5 legs means the token-overlap matcher found nothing in common — events likely semantically unrelated despite the topic keyword. temporal_alignment{polymarket_month,kalshi_month,aligned} tells you whether the two events resolve in the same calendar period; aligned:false means spreads are mathematically meaningless across the temporal gap. skipped_cross_type / skipped_cross_subtype counters expose how many leg-pair comparisons were dropped (cross-type = metric_type mismatch like MoM vs YoY; cross-subtype = inequality mismatch like cum_ge vs cum_le). Real cross-venue spreads are rarer than the macro-shortcut list suggests — most pre-mapped topics return compatibility_warning today; pre-mapped ≠ tradeable.

ParametersJSON Schema
NameRequiredDescriptionDefault
topicNoPre-mapped: fed | btc | cpi | gdp | sp500 | recession | next_pope | next_uk_pm | next_israel_pm | 2028_president
kalshi_event_tickerNoExplicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side.
polymarket_event_slugNoExplicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint, openWorldHint, idempotentHint. The description adds detailed behavioral context: explains compatibility_warning conditions, skipped cross types, temporal alignment, and the fact that pre-mapped topics often yield warnings. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough but verbose; it includes enumerations (shortcut list, warning cases) that could be condensed or referenced in schema. While every sentence adds value, the length may hinder quick scanning. Still structured with clear sections.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema provided, so description must explain return values. It does so comprehensively: leg-by-leg prices, spread array, safety fields (compatibility_warning, temporal_alignment, skipped counters). Covers edge cases and complex behavior, making the tool fully understandable for selection and invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all three parameters. The description adds value by explaining the two modes, how topic maps to shortcuts, and how explicit parameters override. Provides context beyond schema, justifying a score above baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly identifies the tool as computing cross-venue spreads between Kalshi and Polymarket. The two modes (topic and explicit) are explicitly described, and the response structure is outlined. Distinguishes from siblings like polymarket_arbitrage by focusing on inter-venue comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: comparing identical outcomes across venues, with pre-mapped topics or custom tickers. Includes warnings about compatibility issues and temporal alignment, advising when spreads are invalid. However, does not explicitly state when to avoid using the tool, though warnings imply caution.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recallA
Read-onlyIdempotent
Inspect

Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.

ParametersJSON Schema
NameRequiredDescriptionDefault
keyNoMemory key to retrieve (omit to list all keys)
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds valuable behavioral context: scoping to identifier, the ability to list all keys when key is omitted, and that it retrieves values saved via remember. This goes beyond the annotations without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three sentences front-loading the core action, use cases, and scoping. Every sentence adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter, no output schema, and comprehensive annotations, the description covers all necessary context: how to use (with or without key), examples, scoping, and pairing with sibling tools. It is complete without further elaboration.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the parameter description 'Memory key to retrieve (omit to list all keys)'. The description adds examples (target ticker, address) and reinforces the two modes, providing extra context beyond the schema definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Retrieve a value previously saved via remember, or list all saved keys (omit the key argument).' It specifies the verb (retrieve/list) and resource (saved values/keys), and distinguishes from siblings remember and forget by pairing them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear usage context: 'Use to look up context the agent stored earlier ... without re-deriving it from scratch.' It also mentions pairing with remember and forget, implying when to use this tool. However, it does not explicitly state when not to use it or provide alternatives beyond siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recent_changesA
Read-onlyIdempotent
Inspect

What's new with a company in the last N days/months? Use for "what's happening with X", "updates on Y", "news on Apple this month", or change-monitoring. Fans out in parallel to: SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.

ParametersJSON Schema
NameRequiredDescriptionDefault
typeYesEntity type. Only "company" supported today.
sinceYesWindow start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring.
valueYesTicker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193").
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds significant context: parallel fan-out to multiple sources (SEC EDGAR, GDELT→GNews fallback, USPTO), fallback logic, API sunset caveat for USPTO, input format details, and output structure (changes grouped by source, total_changes, citation URIs). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough and well-structured, starting with purpose, then usage examples, internal behavior, parameter details, output, and alternative tool. While it is somewhat long, every sentence adds value and the front-loading of purpose is effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple sources, fallback, parameter details) and no output schema, the description covers all essential aspects: input constraints (type only 'company'), parameter formats, internal fan-out behavior, output structure, and limitations (USPTO soft-fail). It is fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for all parameters. The description adds value by providing concrete examples (ISO date vs relative shorthand), typical usage ('Use '30d' or '1m''), and clarifying that value can be ticker or CIK. This goes beyond the schema but does not introduce entirely new semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states what the tool does: 'What's new with a company in the last N days/months?' It specifies the resource (company), action (get recent changes), and scope (time window). It also explicitly distinguishes from the sibling tool 'entity_profile', making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit usage examples ('what's happening with X', 'updates on Y') and alternative guidance ('Use entity_profile instead when you want the static profile...'). It also suggests typical parameter values like '30d' or '1m' for monitoring.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rememberA
Idempotent
Inspect

Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.

ParametersJSON Schema
NameRequiredDescriptionDefault
keyYesMemory key (e.g., "subject_property", "target_ticker", "user_preference")
valueYesValue to store (any text — findings, addresses, preferences, notes)
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotent hint and non-destructive, non-readonly. Description adds value by explaining scoping by identifier and persistence duration (24h for anonymous, persistent for authenticated). However, it does not explicitly state that setting the same key overwrites the value, though idempotent hint covers this.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is five sentences, front-loads purpose and usage, then technical details and sibling references. No redundant information; every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple parameters and no output schema, description covers purpose, usage, parameter semantics, persistence, and pairing with siblings. It is complete for a save operation with two string params.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers both parameters with descriptions. Description adds meaningful examples for key (e.g., 'subject_property', 'target_ticker') and clarifies value is free text, enhancing the schema beyond baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'save' and resource 'data for reuse', specifies cross-conversation or cross-session scope, and distinguishes from siblings like recall and forget by explicitly pairing with them. It provides concrete examples of what to remember.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explicitly says 'Use when you discover something worth carrying forward' and provides examples (resolved ticker, target address, etc.). It also directs to pair with recall and forget, giving clear alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resolve_entityA
Read-onlyIdempotent
Inspect

Resolve a user-spoken name to the canonical/official identifiers other tools require as input. Use FIRST when you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.

ParametersJSON Schema
NameRequiredDescriptionDefault
typeYesEntity type: "company" or "drug".
valueYesFor company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin").
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint false. The description adds value by disclosing internal cascading behavior and input flexibility (e.g., auto-disambiguation for company). It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph but well-structured: a clear purpose statement, usage instruction, and detailed type breakdowns. It is not overly verbose, though could benefit from separate sections. Still, it packs necessary information efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has moderate complexity with two enum types and no output schema. The description compensates by explaining return values and internal processing. It lacks details on error handling or invalid inputs, but overall it is sufficiently complete for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds meaning by providing concrete examples (ticker 'AAPL', generic name 'ozempic') and specifying what each parameter accepts and what is returned. This exceeds the schema's minimal descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool resolves user-spoken names to canonical identifiers. It specifies the verb 'resolve', the resource 'entity', and details the returned information for each supported type (company: ticker, CIK, company_name; drug: RxCUI, ingredient, brand). It distinguishes itself from sibling tools like entity_profile and compare_entities by positioning itself as a prerequisite for other tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit usage guidance is provided: 'Use FIRST when you have a name but need an ID.' It also explains how it replaces manual lookups. However, it does not mention when not to use this tool (e.g., if an ID is already available) or suggest alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_competitor_ai_presenceA
Read-onlyIdempotent
Inspect

Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.

ParametersJSON Schema
NameRequiredDescriptionDefault
modelsNoWhich models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
_apiKeyNoOptional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe.
contextNoOptional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names.
entitiesYesArray of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate safe, idempotent read operations. The description adds behavioral details: performs multiple probes, ranks by score, and returns score, confidence, and signal density per entity. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the action, each sentence adding essential information. No redundant or unnecessary phrases.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description fully explains the return format (ranked list with score, confidence, signal density) and sets entity count limits (2-8). It also references the underlying tool. Complete for a well-annotated composite tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaningful context beyond the schema: clarifies that the first entity in 'entities' is treated as the 'subject' for narrative, and explains the optional 'models' and 'context' parameters' purpose. This adds value for the agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'compare' and resource 'AI visibility across multiple entities side-by-side.' It clearly distinguishes from the sibling 'ai_visibility_check' by stating it probes each entity with that tool and ranks results.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a clear use case ('competitive AI-marketing audits') and a concrete question. It implies when to use but does not explicitly state exclusions or alternatives, though the sibling list provides context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_dependencyA
Read-onlyIdempotent
Inspect

Composite "should I add this npm package to my project" check in ONE call — fans out across deps.dev (license + advisories + version history) and bundlephobia (gzipped/minified bundle size, dependency count, ESM/tree-shake support). Use whenever an agent asks "is X safe / popular / small" or "what does adding lodash cost me". Returns a summary block (is_latest, license, published_at, advisory_count, bundle_kb_min, bundle_kb_gz, dependency_count, has_esm, tree_shakeable), per-advisory detail, links, and a list of recent alternative versions. NPM ecosystem only in v1; PyPI / Maven / Cargo / Go fall under deps.dev:version directly. Partial failures degrade gracefully — bundlephobia's first measurement on a new version can take 5-30s; sources_failed will list it if it times out, the rest still returns.

ParametersJSON Schema
NameRequiredDescriptionDefault
packageYesnpm package name. Scoped packages (e.g. "@types/node") are accepted.
versionNoSpecific version to check (e.g., "18.3.1"). Defaults to the latest published version when omitted.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses fan-out behavior across two external services, graceful degradation on partial failures, and the possibility of a 5-30s delay on first bundlephobia measurement. This adds significant context beyond the annotations (readOnly, idempotent, etc.).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, dense paragraph that is front-loaded with the core purpose. Every sentence provides value (scope, return structure, failure handling), though it could be slightly more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Even without an output schema, the description details the return structure (summary block fields, per-advisory detail, links, alternative versions), covers failure modes (sources_failed), and specifies ecosystem scope. This is highly complete for a tool with two parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage for both parameters. The description adds useful context: package accepts scoped names and version defaults to latest. While not extensive, it enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines scan_dependency as a composite check for npm packages, listing specific data sources (deps.dev, bundlephobia) and return fields. It distinguishes itself from siblings by its unique combination of license, advisory, and bundle size analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'whenever an agent asks "is X safe / popular / small" or "what does adding lodash cost me"'. Also provides guidance on scope (NPM only in v1) and fallback for other ecosystems, and warns about potential timing issues with bundlephobia.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_datasetsA
Read-onlyIdempotent
Inspect

Search Ireland's national open-data catalogue (data.gov.ie, CKAN package_search). Returns matching datasets from government departments, state agencies, and local councils, with titles/descriptions and their resources (English).

ParametersJSON Schema
NameRequiredDescriptionDefault
fqNoSolr filter query, e.g. "organization:kildarecoco-ckan" or "res_format:CSV".
rowsNoMax results, 1-1000 (default 25).
sortNoSort spec, e.g. "metadata_modified desc".
queryYesSearch terms in English. e.g. "budget", "referendum", "housing", "transport".
startNo0-based offset for paging.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, which fully cover the safety and idempotence profile. The description adds value by stating the return format (titles/descriptions and resources, English) and the specific backend (CKAN package_search). However, it does not mention pagination behavior, default sorting, or rate limits, which would improve transparency. Given strong annotations, the description is adequate but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, directly stating the action, source, and return type. Every word earns its place; there is no repetition or fluff. It is front-loaded with the key purpose and provides sufficient detail without excess.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main concerns for a search tool: what it searches, what it returns (including that titles/descriptions and resources are in English), and the domain. There is no output schema, so the description must explain returns, which it does. However, it does not address pagination or the fact that results are limited by the 'rows' parameter, and it does not position itself relative to sibling tools like 'dataset_details'. Still, it is largely complete for a straightforward search tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all five parameters (query, fq, rows, sort, start) with descriptions. The tool's description does not add any new information about parameter usage or semantics beyond what is in the schema. Baseline score of 3 is appropriate since the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches Ireland's national open-data catalogue (data.gov.ie via CKAN), specifying the source, target data types (government departments, agencies, councils), and what is returned (datasets with titles/descriptions and resources). This distinguishes it from sibling tools like 'dataset_details' (which likely retrieves a single dataset) and 'datastore_query' (which queries data within a dataset).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for searching Irish open data, but does not explicitly state when to use this tool versus alternatives (e.g., 'use this to find datasets, then dataset_details for details'). It lacks guidance on prerequisites, query best practices, or when not to use it. The context is clear, but explicit usage guidance is missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_claimA
Read-onlyIdempotent
Inspect

Fact-check, verify, validate, or confirm/refute a natural-language factual claim or statement against authoritative sources. Use when an agent needs to check whether something a user said is true ("Is it true that…?", "Was X really…?", "Verify the claim that…", "Validate this statement…"). v1 supports company-financial claims (revenue, net income, cash position for public US companies) via SEC EDGAR + XBRL. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), extracted structured form, actual value with pipeworx:// citation, and percent delta. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → numeric comparison).

ParametersJSON Schema
NameRequiredDescriptionDefault
claimYesNatural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year".
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, destructiveHint=false. Description adds context: uses SEC EDGAR + XBRL, returns specific verdicts, and describes internal steps (NL parsing, entity resolution, data lookup, numeric comparison). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with purpose, each sentence adds value. No redundant or missing information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple input (one string), no output schema but description fully explains return format, and annotations cover safety/idempotency. Description also defines scope limitations (company-financial claims). Complete context for agent decision.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single parameter 'claim' has descriptive schema with examples. Description further clarifies claim format and scope, adding value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it fact-checks natural-language claims, gives usage examples, and specifies return values (verdict, structured form, actual value with citation, percent delta). Distinguishes from siblings by noting it replaces 4-6 sequential calls.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when an agent needs to check whether something a user said is true' and provides example queries. Also indicates scope: v1 supports company-financial claims, implying when not to use. Mentions it replaces sequential calls, guiding agents to prefer it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.

Resources