Census Trade

Name: Census Trade
Author: pipeworx-io

by io.github.pipeworx-io

Server Details

Census Trade MCP — US Census Bureau International Trade data

Status: Healthy
Last Tested: 2026-05-31 10:53
Transport: Streamable HTTP
URL
Repository: pipeworx-io/mcp-census-trade
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

B3.3/5.0

Tool DescriptionsA

Average 4.1/5 across 21 of 23 tools scored. Lowest: 3.2/5.

Server CoherenceC

Disambiguation2/5

Multiple tools have overlapping purposes, notably ask_pipeworx which subsumes many other tools, and the Polymarket tools (bet_research, polymarket_arbitrage, polymarket_edges, polymarket_kalshi_spread) have unclear boundaries. Memory tools and AI visibility tools are unrelated to the server name, causing confusion.

Naming Consistency3/5

All tool names use snake_case consistently, but the verb/noun pattern is mixed: some start with verbs (ask, generate, validate, resolve, scan) while others are noun phrases (bet_research, entity_profile, recent_changes). This is readable but lacks a uniform pattern.

Tool Count2/5

With 23 tools, the server is bloated for its stated 'Census Trade' focus. Many tools (Polymarket betting, AI visibility, memory) are unrelated and should be separate servers. The count feels excessive and dilutes the server's purpose.

Completeness3/5

For US trade data, the census tools provide a complete CRUD-like surface (exports, imports, balance, trends). However, the server attempts to cover many other domains (company financials, betting, AI) but is far from complete in those areas, leading to dead ends.

Available Tools

23 tools

ai_visibility_checkA

Read-onlyIdempotent

Inspect

Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.

ParametersJSON Schema

Name	Required	Description
`entity`	Yes	The thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing".
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com.
`context`	No	Optional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names.

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint as false, so the agent knows it's safe. The description adds operational details: cost model (free for Workers AI, BYO key for Anthropic), return structure (score, confidence, signals, raw_response + combined view), and that it probes external APIs. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph of four sentences, all essential. It is front-loaded with the action verb 'Probe,' explains models, returns, and use cases without waste. Every sentence contributes useful context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the return format (per-model and combined view) adequately. It covers inputs, outputs, and use cases. It could mention error handling or rate limits but is sufficient for the tool's complexity and current annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description reinforces schema details (default model, API key requirement) but does not add significant new parameter meaning beyond what the schema descriptions already provide. It adds minimal extra value for parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it probes LLMs for knowledge about an entity and scores visibility per model. It distinguishes itself from siblings by focusing on multi-model AI visibility scoring, though it doesn't explicitly differentiate from similar tools like 'scan_competitor_ai_presence'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions use cases: AI-marketing audits, pre-launch brand checks, competitive monitoring. It explains the default model and how to add Anthropic with a BYO key. However, it does not exclude alternative tools or provide when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworxA

Read-onlyIdempotent

Inspect

PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 2,902 tools across 633 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1".

ParametersJSON Schema

Name	Required	Description	Default
`question`	Yes	Your question or request in natural language

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It states Pipeworx picks the right tool and fills arguments, which gives insight into behavior. However, it does not disclose limitations, data freshness, or whether the tool has internet access. Could be more transparent about what 'best available data source' means.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is three sentences with examples, front-loading the main purpose. It is concise but includes valuable examples. Could be slightly more structured (e.g., bullet points) but efficient overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given only one parameter, no output schema, and no annotations, the description is fairly complete. It explains the tool's purpose, usage, and behavior. However, without annotations, it would benefit from stating if it is read-only or has side effects. The examples enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with a single parameter 'question' described as 'Your question or request in natural language'. The description adds context with examples of questions, which is helpful beyond the schema. Baseline 3 increased to 4 due to examples enriching the parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool answers plain English questions by selecting the best data source, which is distinct from sibling tools that are specific (e.g., census_trade_balance). The verb 'ask' and resource 'Pipeworx' are clear, but it could better distinguish from discover_tools which also provides information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explicitly states to use when you want an answer without browsing tools or learning schemas, and provides examples. However, it does not explicitly mention when NOT to use this tool (e.g., for specific tool actions) or alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bet_researchA

Read-onlyIdempotent

Inspect

Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred + kalshi_macro + federal_register; Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires; result.evidence is keyed by source. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets return status:"market_closed_or_inactive" and skip fan-out. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note.

ParametersJSON Schema

Name	Required	Description
`depth`	No	quick = 2-3 evidence sources, thorough = full fan-out. Default thorough.
`market`	Yes	Polymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?")
`include_raw`	No	Default false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process.

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and openWorldHint; description adds detail about fan-out to multiple data packs and market classification. It explains behavior beyond what annotations convey.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single paragraph with no wasted words. Every sentence adds value: input, process, output, use cases. Well-structured and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without output schema, description explains returns (evidence packet, comparison). Covers inputs, process, and output. Sufficient for an agent to use correctly, though return format specifics are vague.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are fully documented. Description reiterates the market parameter but adds no new semantics beyond the schema. Depth is implicitly described but already in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool researches Polymarket bets, specifies inputs (slug, URL, question text), and outputs (evidence packet, market-vs-model comparison). It is distinct from siblings like ask_pipeworx or validate_claim.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit use cases are provided: 'should I bet on X?', 'what does the data say?', 'is there edge?'. It also notes this is the core demo product. However, it does not mention when not to use or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

census_exportsA

Read-onlyIdempotent

Inspect

Search US export data by HS commodity code (e.g., "8471" for computers) and/or country (e.g., "Mexico"). Returns export values, quantities, and commodity details.

ParametersJSON Schema

Name	Required	Description
`year`	Yes	Trade year (e.g., "2024")
`limit`	No	Maximum number of records to return (default 20)
`month`	No	Trade month 01-12. Optional — omit for annual data.
`hs_code`	Yes	HS commodity code at 2, 4, or 6 digit level (e.g., "8471" for computers)
`country_code`	No	Census country code (e.g., "5700" for China). Optional — omit for all countries.

Output Schema

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Trade direction indicator
`count`	Yes	Number of records returned
`period`	Yes	Trade period (year or year-month)
`hs_code`	Yes	HS commodity code queried
`records`	Yes

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description indicates the tool is a read operation that returns data from the US Census Bureau. Since there are no annotations (e.g., destructiveHint or readOnlyHint), the description carries the burden but adequately implies non-destructive behavior. However, it does not disclose any limitations, rate limits, or data freshness details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at two sentences, front-loading the core purpose. Each sentence adds value: first states what it does, second lists output types. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity (5 parameters, 2 required) and no output schema, the description sufficiently covers the purpose and output. It lacks mention of return limits or pagination, but for a data retrieval tool with sibling tools, it provides adequate context to select the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the baseline is 3. The description adds little beyond the schema: it mentions 'HS commodity code' and 'country' but does not clarify the meaning of 'limit' or 'month' beyond what the schema already states. No significant extra context is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool gets US export data by HS commodity code and/or country, specifying the returned data types (export values, quantities, commodity details, country names). It distinguishes from sibling tools like 'census_imports' and 'census_trade_balance' by focusing on exports. However, it does not explicitly differentiate from 'census_trade_trends' which may also deal with exports.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving US export data with specific filters (HS code, country, time period) but does not provide explicit guidance on when to use this tool versus alternatives like 'census_imports' or 'census_trade_trends'. No exclusions or when-not-to-use advice is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

census_importsA

Read-onlyIdempotent

Inspect

Search US import data by HS commodity code (e.g., "8471" for computers) and/or country (e.g., "China"). Returns import values, quantities, and commodity details.

ParametersJSON Schema

Name	Required	Description
`year`	Yes	Trade year (e.g., "2024")
`limit`	No	Maximum number of records to return (default 20)
`month`	No	Trade month 01-12 (e.g., "06" for June). Optional — omit for annual data.
`hs_code`	Yes	HS commodity code at 2, 4, or 6 digit level (e.g., "8471" for computers, "87" for vehicles)
`country_code`	No	Census country code (e.g., "5700" for China, "2010" for Mexico). Optional — omit for all countries.

Output Schema

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Trade direction indicator
`count`	Yes	Number of records returned
`period`	Yes	Trade period (year or year-month)
`hs_code`	Yes	HS commodity code queried
`records`	Yes

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries burden. It discloses return fields (import values, quantities, commodity details, country names) but does not mention any behavioral traits like rate limits, pagination, or data freshness. The description is accurate but incomplete for full transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is a single sentence that is well-front-loaded with the core action and filters. It efficiently conveys what the tool does without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, all described in schema, and no output schema, the description adequately summarizes inputs and outputs. However, could mention optional month vs annual data distinction, which is clear from schema but not description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description does not add new meaning beyond what the schema already provides for parameters; it merely summarizes the tool's purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Get', resource 'US import data', and key filters 'HS commodity code and/or country'. It distinguishes from siblings (e.g., census_exports, census_trade_balance) by specifying 'import data' and mentioning US Census Bureau as source.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for retrieving US import data, but does not explicitly state when to prefer this over census_exports or census_trade_trends. No exclusion criteria or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

census_trade_balanceB

Read-onlyIdempotent

Inspect

Check US trade balance with a specific country for a given year. Returns net trade value and breakdown by end-use commodity category.

ParametersJSON Schema

Name	Required	Description	Default
`year`	Yes	Trade year (e.g., "2024")
`country_code`	Yes	Census country code (e.g., "5700" for China, "2010" for Mexico)

Output Schema

ParametersJSON Schema

Name	Required	Description
`year`	Yes	Trade year
`country`	Yes	Country name
`country_code`	Yes	Census country code
`total_exports_usd`	Yes	Total exports in USD
`total_imports_usd`	Yes	Total imports in USD
`trade_balance_usd`	Yes	Net trade balance (exports minus imports)
`deficit_or_surplus`	Yes	Trade balance classification

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description bears full burden. It discloses the tool aggregates using end-use commodity categories, but does not mention data freshness, potential errors, or return format. Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, efficient and front-loaded with the core purpose. No fluff, but could mention that year is string format if not obvious.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and moderate complexity, the description is adequate but incomplete: does not specify if trade balance is in USD, if the result is a single number or a breakdown, or if data is available for all years. An output schema would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds no additional parameter context beyond what the schema provides (e.g., no examples of country codes beyond those in schema). Neutral.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it gets the US trade balance with a specific country for a given year, using end-use commodity categories. It distinguishes from siblings like census_exports and census_imports by focusing on the balance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for retrieving trade balance data but does not explicitly state when to use this tool vs alternatives. Siblings include exports, imports, and trends, but no guidance on selection is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

census_trade_trendsB

Read-onlyIdempotent

Inspect

Get monthly US trade trends for a commodity and/or country over time. Returns month-by-month values to identify seasonal patterns and shifts.

ParametersJSON Schema

Name	Required	Description
`hs_code`	No	HS commodity code. Optional — omit for aggregate trade.
`end_year`	Yes	End year (e.g., "2024")
`start_year`	Yes	Start year (e.g., "2022")
`country_code`	No	Census country code. Optional — omit for all countries.

Output Schema

ParametersJSON Schema

Name	Required	Description
`months`	Yes	Number of monthly data points
`trends`	Yes
`hs_code`	Yes	HS commodity code or 'all' for aggregate
`end_year`	Yes	End year of trend range
`start_year`	Yes	Start year of trend range
`country_code`	Yes	Census country code or 'all' for all countries

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavioral traits. It mentions it gets trends and shows changes but does not state whether it is read-only, if there are rate limits, or what the return format looks like. This is insufficient for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no wasted words. Efficiently conveys purpose and key optional parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description does not explain return values or behavior (e.g., whether it returns aggregated or raw data, pagination). It is incomplete for a tool with 4 parameters and no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds that hs_code and country_code are optional, but the schema already states that. No additional semantic meaning beyond the schema is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool gets monthly US trade trends over a period and shows how trade values change month by month. It mentions optional filtering by commodity and/or country, distinguishing it from sibling tools like census_exports and census_imports, though not explicitly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for monthly trend analysis but does not specify when to use this versus other trade tools (e.g., census_trade_balance). No explicit when-not or alternative guidance is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_entitiesA

Read-onlyIdempotent

Inspect

Compare 2-5 companies (or drugs) side by side in one call. Use for "compare X and Y", "X vs Y", "which is bigger", or rank-by-metric questions. type="company" — pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (post-Run-6 fix: returns the actual most-recent FY filing per concept, not arbitrarily-old data; off-calendar fiscal years like AAPL Sep, NVDA Jan handled correctly). type="drug" — pulls adverse-event report counts from FAERS, FDA approval counts, active trial counts. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8-15 sequential lookups; results are sorted by the primary metric (revenue for company, adverse events for drug) so "largest" / "most" reads off the top of the response.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`values`	Yes	For company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]).

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must fully disclose behavior. It states it 'returns paired data + pipeworx:// resource URIs' and explains data sources (SEC EDGAR for companies, FDA-related for drugs). It does not mention permissions, rate limits, or any side effects, but as a read-only comparison tool, this is acceptable. The description is adequate but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences, each dense with information. It front-loads the core purpose and then details specifics per type. No redundant or filler content. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the return format (paired data + URIs) and what metrics are included for each entity type. It covers both use cases. It could optionally mention data freshness or limitations, but overall it provides sufficient context for an agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents parameters. The description adds value by explaining the meaning of each 'type' value and providing example formats for 'values' (tickers/CIKs for companies, drug names). This goes beyond the schema's minimal descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'compare' and specifies the resource (2–5 entities of type company or drug). It lists exact data points for each type (revenue, net income, etc. for companies; adverse-event counts, FDA approvals, trials for drugs). It distinguishes from siblings by noting it replaces 8–15 sequential agent calls, implying it is more efficient than individual lookups.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context: for comparing multiple entities efficiently. The phrase 'Replaces 8–15 sequential agent calls' suggests it should be used instead of multiple calls to other tools. It does not explicitly state when not to use, but the purpose is clear enough for an agent to decide. No explicit alternative tools are named, but the sibling list provides context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_toolsA

Read-onlyIdempotent

Inspect

Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No	Maximum number of tools to return (default 20, max 50)
`query`	Yes	Natural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries")

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the tool searches a catalog and returns tool names and descriptions, which is the core behavior. It also hints at the scope ('500+ tools'). However, it does not mention any rate limits, auth requirements, or side effects. Since this is a search tool, destructive behavior is not expected, but transparency is still good.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the key purpose, and contains no wasted words. It is well-structured for an agent to quickly understand.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool is a search/discovery tool with no output schema, the description explains what it returns ('most relevant tools with names and descriptions') and when to use it. It is complete enough for an agent to invoke correctly. Lacks info on whether results are ranked, but that is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both 'query' and 'limit' parameters. The description adds a brief note about default and max for limit but does not add significant meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Search' and the resource 'Pipeworx tool catalog'. It specifies the purpose: finding relevant tools by describing what you need, and distinguishes itself by telling the agent to call this FIRST when many tools are available.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Call this FIRST when you have 500+ tools available and need to find the right ones for your task', providing a clear directive on when to use this tool. It implies that this tool is for discovery before invoking other tools, distinguishing it from sibling tools that perform specific operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

entity_profileA

Read-onlyIdempotent

Inspect

Get everything about a US public company in one call. Use when a user asks "tell me about X", "research Acme", "brief me on Tesla", or you'd otherwise call 10+ pack tools across SEC EDGAR, XBRL, USPTO, news, GLEIF. Returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC — Run 6 fix landed real FY2025 numbers, not stale FY2022); patents (USPTO PatentsView API was sunset May 2025; pack soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first).

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type. Only "company" supported today; person/place coming soon.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It transparently states only 'company' type is supported, mentions return format (pipeworx:// URIs), and explains it replaces many sequential calls. Does not discuss rate limits or authentication, but overall clear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences with no fluff. Front-loaded with main purpose, then specifics, then return format, then usage alternative. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, description adequately explains return value (citation URIs) and lists all bundled data sources. Provides enough context for an agent to understand what it gets and how to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so baseline is 3. The description adds no additional parameter meaning beyond what schema already provides (e.g., value description already mentions ticker/CIK and not names).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it provides a full entity profile across multiple Pipeworx packs, listing specific data sources for type='company' (SEC filings, XBRL, patents, news, LEI). It differentiates from siblings like resolve_entity and compare_entities by focusing on comprehensive bundling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use the alternative usa_recipient_profile for federal contracts and hints at using resolve_entity for name resolution. Could be improved by more direct comparisons to siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forgetB

DestructiveIdempotent

Inspect

Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key to delete

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It states the action but does not disclose side effects, irreversibility, or authorization needs. For a deletion tool, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with no redundancy. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and no annotations, the description should provide more behavioral detail. It is too minimal for a deletion tool that could have irreversible effects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters, so baseline is 3. Description does not add meaning beyond schema; it merely restates 'key' as 'Memory key to delete'. No additional value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses a clear verb ('Delete') and resource ('stored memory by key'), immediately distinguishing it from sibling tools like 'recall' (retrieve) and 'remember' (store).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives; no mention of prerequisites or safety considerations. Description is purely functional without context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_llms_txtA

Read-onlyIdempotent

Inspect

Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.

ParametersJSON Schema

Name	Required	Description	Default
`url`	Yes	Full URL of the site to summarize, e.g. "https://example.com" or a specific landing page.
`max_links`	No	Maximum number of link entries to include (default 25, max 50).

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds concrete behavioral details: it fetches the page, extracts title/description/key links, and emits standard markdown. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, front-loading the core action and output. Each sentence serves a purpose: the first explains what it does, the second explains output and use cases. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with two well-documented parameters and no output schema, the description fully explains the process, output format, and practical applications. The context signals (annotations, schema coverage) support completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with adequate parameter descriptions. The description does not add new semantic information beyond what the schema provides (e.g., 'url' and 'max_links' are documented). Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: generating an llms.txt file for any URL, specifying the output format and how it works. It distinguishes itself from siblings by focusing on AI crawler indexing, which is unique among the listed sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases (e.g., indexing a client's site, drafting for own project, auditing competitors), but does not specify when not to use it or mention alternatives. Still, the context is clear enough for an AI agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_feedbackAInspect

Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	bug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else.
`context`	No	Optional structured context: which tool, pack, or vertical this relates to.
`message`	Yes	Your feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max.

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses the rate limit and the instruction to avoid including prompts. However, it does not specify the outcome of sending feedback (e.g., whether a response is expected), if the action is synchronous or asynchronous, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise: three short sentences. The first sentence states the core purpose. The second lists use cases. The third gives critical constraints and rate limit. No unnecessary words, front-loaded with the most important info.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple feedback tool with no output schema, the description covers purpose, usage scenarios, parameter guidelines, and rate limits. While it does not describe the return value (e.g., success confirmation), this is acceptable given the tool's straightforward nature. The nested object in schema is not elaborated, but it's optional and self-explanatory.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by mapping the 'type' enum to real-world use cases (bug, feature, data_gap, praise) and advising on what to include in the 'message' (describe tools/data tried, avoid prompt verbatim). This enhances understanding beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Send feedback to the Pipeworx team.' It lists specific use cases (bug reports, feature requests, missing data, praise) and distinguishes itself from sibling tools (which focus on data querying and recall).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage guidance: use for bug reports, features, missing data, or praise. It includes specific instructions (describe what you tried in Pipeworx tools/data, do not include end-user prompt verbatim) and mentions rate limits (5 per day). However, it does not explicitly contrast against alternatives or mention when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_trendingA

Read-onlyIdempotent

Inspect

What other AI agents are calling on Pipeworx right now. Returns the top tools, top packs, and total call volume over a recent window (24h, 7d, or 30d). Useful for: (1) discovering what data sources are hot for current events, (2) confirming a popular tool is the canonical choice before asking your own question, (3) seeing whether your use case aligns with what most agents need. Self-aggregating signal — derived from CF analytics-engine, no PII, just (pack, tool, count). Cached 5min-1h depending on window.

ParametersJSON Schema

Name	Required	Description	Default
`window`	No	24h (default) \| 7d \| 30d. Shorter windows surface what's hot right now; longer windows show steady-state demand.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide safety hints (readOnly, idempotent, non-destructive). The description adds valuable behavioral context: data is derived from CF analytics-engine, no PII, just counts, and caching behavior (5min-1h). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with bullet points for use cases, making key information scannable. It is moderately concise but could be slightly tightened without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers purpose, usage guidance, data source, caching, and parameter semantics comprehensively. No gaps remain given the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear enum and description for the 'window' parameter. The description adds extra context about shorter vs. longer windows and defaults, which enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it returns top tools, top packs, and total call volume over a time window, with clear verb+resource. It distinguishes itself from siblings by focusing on real-time call volume, unlike 'discover_tools' which likely lists available tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides three specific use cases (discovering hot data sources, confirming canonical choice, checking alignment) that guide when to use the tool. However, it does not explicitly state when NOT to use it or mention sibling alternatives, which would improve clarity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_arbitrageA

Read-onlyIdempotent

Inspect

Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. TWO MODES: (1) event — pass a single Polymarket event slug; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). (2) topic — pass a seed question ("Strait of Hormuz traffic returns to normal"); searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response carries opportunities[] (gap_pp, suggested_trade, reasoning) plus partition_check when in event mode (with placeholders_filtered count).

ParametersJSON Schema

Name	Required	Description	Default
`event`	No	Single-event mode: Polymarket event slug (e.g. "when-will-bitcoin-hit-150k") or full URL.
`topic`	No	Cross-event mode: a topic or seed question. Tool searches Polymarket for related markets across separate events and checks monotonicity across them. E.g. "Strait of Hormuz traffic returns to normal".

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description details the tool's behavior: walking child markets, searching across events, grouping, checking monotonicity, and returning ranked opportunities. Annotations already indicate read-only and non-destructive nature; the description adds significant behavioral context beyond those.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear mode separation and explanation of cross-event benefit. It is somewhat verbose but every sentence contributes value, making it appropriately sized for the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's two-mode complexity and no output schema, the description covers the purpose, usage, and logic comprehensively. It provides sufficient context for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Both parameters are described in the schema (100% coverage). The description adds extra context, explaining how each is used in the respective mode and providing an example for the topic parameter, adding meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it finds arbitrage opportunities via monotonicity violations on Polymarket, clearly distinguishing two modes (event and topic) and contrasting with single-event mode. This provides a precise and unique purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives explicit guidance on when to use each mode: pass an event slug for single event, pass a topic for cross-event. It explains why cross-event mode is necessary, providing clear usage context, though it does not explicitly mention alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_edgesA

Read-onlyIdempotent

Inspect

Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥85% AND ≥2 longshots ≤5% AND portfolio return ≥50:1; rare-by-design. EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. Cached 1h at the KV level keyed on all knobs. fed_rate bets are scanned but EXCLUDED from ranking (1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data); see fed_rate_context for raw spread.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Top N edges to return after ranking. Default 10, max 25.
`window`	No	Polymarket volume window to filter markets. Default 1wk.
`min_kelly`	No	Minimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large.
`min_edge_pp`	No	Minimum \|edge\| in percentage points to include (default 0.5). Edge is evaluated NET of slippage.
`slippage_pp`	No	Assumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw \|edge\| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model.
`max_spread_pp`	No	Tradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges.
`min_liquidity`	No	Tradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven.
`category_filter`	No	Comma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all.
`min_partition_leg_kelly`	No	Minimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost.

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, non-destructive, open-world behavior. The description adds detailed behavioral context: scanning, grouping by asset, single fetch per asset, model computation, ranking by edge, and returning suggestions. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is concise (3-4 sentences), front-loaded with purpose, and efficiently covers what, how, and use case. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description states it returns top N with suggested direction, which is adequate. Could specify exact fields, but not critical given tool simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already documented. The description adds no new semantics beyond what the schema provides (limit, window, min_edge_pp). Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool scans high-volume Polymarket markets and returns those where Pipeworx data disagrees with market price, using a clear verb-resource pair (scan, return). It distinguishes from siblings like polymarket_arbitrage by focusing on model-driven edge detection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description frames the tool as answering 'what should I bet on today,' implying discovery use case. It doesn't explicitly state when not to use or name alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_kalshi_spreadA

Read-onlyIdempotent

Inspect

Cross-venue spread between Kalshi and Polymarket for the same resolving question. Kalshi and Polymarket frequently price the same event 2-25pp apart because the venues have different participant pools — that delta is a real arb signal. TWO MODES: (1) topic — pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope") that auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. Returns: each venue's leg-by-leg prices (in raw probability, 0-1), and where a leg from each side maps to the same outcome, the spread (Kalshi − Polymarket) in percentage points.

ParametersJSON Schema

Name	Required	Description
`topic`	No	Pre-mapped: fed \| btc \| cpi \| gdp \| sp500 \| recession \| next_pope \| next_uk_pm \| next_israel_pm \| 2028_president
`kalshi_event_ticker`	No	Explicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side.
`polymarket_event_slug`	No	Explicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds value by explaining the behavioral traits: it fetches live prices from both venues, computes the spread, and returns raw probabilities and percentage-point deltas. It also notes that the delta is a real arbitrage signal, providing context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise but packs significant detail. It front-loads the main purpose, then describes two modes, and ends with return format. Every sentence adds value, though a slightly more streamlined structure could improve readability without losing information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains return values: leg-by-leg prices (0-1) and spread in percentage points. It covers all essential aspects given the tool's moderate complexity: modes, parameters, and expected output. No gaps are apparent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for all three parameters. However, the description adds meaning by explaining how explicit parameters override the topic-mapped side, and lists the available topic shortcuts. This extra context helps the agent understand parameter relationships and default behaviors.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes the cross-venue spread between Kalshi and Polymarket for the same resolving question. It uses a specific verb ('cross-venue spread') and resource ('Kalshi and Polymarket'), and distinguishes itself from siblings like 'polymarket_arbitrage' and 'polymarket_edges' by focusing on inter-venue differences.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly outlines two modes—pre-mapped topic shortcuts and explicit event tickers—with examples. It implies when to use each (topic for convenience, explicit for custom pairings) but does not explicitly state when not to use or mention alternative tools. The guidance is clear but lacks exclusive conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recallA

Read-onlyIdempotent

Inspect

Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	No	Memory key to retrieve (omit to list all keys)

Tool Definition Quality

A4.1/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool retrieves previously stored memories and that omitting the key lists all memories. However, it doesn't mention potential side effects (none likely), performance implications, or whether retrieval is read-only. For a simple retrieval tool, this is adequate but not detailed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loading the core action and then adding an alternative use case. It is concise but could be slightly more structured by separating retrieval and listing into distinct usage notes.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single optional parameter, no output schema), the description is largely complete. It explains both retrieval modes. However, it doesn't describe the output format or what happens if the key doesn't exist, which could be useful for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds value by explaining the behavior when the parameter is omitted ('list all stored memories'), which is not obvious from the schema alone. This goes beyond the schema definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: retrieving a memory by key or listing all memories when key is omitted. It distinguishes itself from 'remember' (store) and 'forget' (delete).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: use when you need to retrieve previously saved context, and how to list all keys by omitting the parameter. However, it doesn't contrast with siblings like 'forget' or 'remember' in terms of when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recent_changesA

Read-onlyIdempotent

Inspect

What's new with a company in the last N days/months? Use for "what's happening with X", "updates on Y", "news on Apple this month", or change-monitoring. Fans out in parallel to: SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Entity type. Only "company" supported today.
`since`	Yes	Window start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193").

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden and discloses the parallel fan-out, date format acceptance, return structure (structured changes + total_changes count + URIs). It does not cover rate limits or error handling, but the core behavior is transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each adding essential information: purpose, type-specific behavior, parameter details, and usage advice. No redundancy or filler; front-loaded with the main goal.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a multi-source fan-out tool with no output schema, the description covers the core behavior, return format, and parameter usage. Minor omissions (e.g., pagination, empty results) but still fairly complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions, so baseline is 3. The description adds value by explaining the `since` parameter in detail (examples of ISO and relative formats) and the `value` parameter (ticker or CIK), plus the specialized behavior of `type` (only company).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns what's new about an entity since a given time. It specifies the type parameter (only "company") and details the fan-out to SEC, GDELT, and USPTO sources, which distinguishes it from sibling tools like entity_profile that likely provide static information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says use for 'brief me on what happened with X' or change-monitoring workflows, and explains date formats ('ISO date or relative'). It does not mention when not to use it or name alternative tools, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rememberA

Idempotent

Inspect

Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key (e.g., "subject_property", "target_ticker", "user_preference")
`value`	Yes	Value to store (any text — findings, addresses, preferences, notes)

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses persistence behavior: 'Authenticated users get persistent memory; anonymous sessions last 24 hours'. No annotations provided, so description carries full burden, which it meets well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each adding distinct value: purpose, use cases, persistence behavior. No wasted words. Front-loaded with main action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple store tool with no output schema, description covers purpose, usage, and important behavioral detail (persistence). Could mention that keys are case-sensitive or naming conventions, but not necessary for basic use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds example values for key ('subject_property', etc.) and clarifies value accepts any text, but does not add meaning beyond schema's own descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states 'Store a key-value pair in your session memory', with clear verb 'store' and resource 'session memory'. Differentiates from sibling 'recall' (retrieve) and 'forget' (delete) by its write nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States use cases: 'save intermediate findings, user preferences, or context across tool calls'. Does not explicitly say when NOT to use or list alternatives, but siblings 'forget' and 'recall' cover complementary operations.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resolve_entityA

Read-onlyIdempotent

Inspect

Resolve a user-spoken name to the canonical/official identifiers other tools require as input. Use FIRST when you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`value`	Yes	For company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin").

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses the v1 limitation to 'company' type and the output format, but does not discuss error behavior, rate limits, or side effects beyond what is described.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three focused sentences with no redundancy. Front-loaded with the core action and followed by specific details. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters and no output schema, the description explains return fields and references alternative approaches. Missing error handling or constraints, but adequate for the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, and the description adds value by explaining the accepted formats for 'value' and the v1 restriction on 'type', reinforcing the enum meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it resolves an entity to canonical IDs across Pipeworx data sources, provides examples of input formats, and frames it as a replacement for 2-3 lookup calls, distinguishing it from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies efficiency by noting it replaces multiple calls, but does not explicitly state when not to use or list alternatives. The sibling tools offer search or recall functions, but no direct comparison is made.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_competitor_ai_presenceA

Read-onlyIdempotent

Inspect

Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.

ParametersJSON Schema

Name	Required	Description
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe.
`context`	No	Optional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names.
`entities`	Yes	Array of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, openWorldHint, and no destructiveness. The description goes beyond by detailing the probing mechanism, ranking, and returned fields (score, confidence, signal density). It also explains default models and API key requirements, adding rich behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with no waste. It front-loads the core purpose, then explains the mechanism and use case. Every sentence contributes unique information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return format (ranked list with score, confidence, signal density). It covers parameter behavior, internal tool usage, and provides a concrete example. The tool's function is fully understandable without additional context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions, so baseline is 3. The description adds value by explaining that the first entity is treated as the 'subject' for narrative (subtle but important nuance) and that entities are compared against each other. It reinforces the purpose of each parameter implicitly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it compares AI visibility across multiple entities side-by-side, uses ai_visibility_check internally, and ranks results. It distinguishes from sibling tools like ai_visibility_check (single entity) and compare_entities (generic comparison) by specifying the exact purpose and method.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives an explicit use case ('competitive AI-marketing audits') and context ('does Claude know about us?'). It does not explicitly list when not to use, but the purpose is well-defined. Sibling tool ai_visibility_check is implied for single-entity checks, so alternatives are implicitly clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_claimA

Read-onlyIdempotent

Inspect

Fact-check, verify, validate, or confirm/refute a natural-language factual claim or statement against authoritative sources. Use when an agent needs to check whether something a user said is true ("Is it true that…?", "Was X really…?", "Verify the claim that…", "Validate this statement…"). v1 supports company-financial claims (revenue, net income, cash position for public US companies) via SEC EDGAR + XBRL. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), extracted structured form, actual value with pipeworx:// citation, and percent delta. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → numeric comparison).

ParametersJSON Schema

Name	Required	Description	Default
`claim`	Yes	Natural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year".

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully bears the burden. It discloses that the tool returns a verdict, structured form, actual value with citation, and percent delta. It is a read-only fact-checking operation, and the description accurately conveys the behavior without contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with three sentences covering purpose, scope, and return value. It is front-loaded with the primary action and efficiently packs information without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one required parameter, no output schema), the description is remarkably complete. It specifies supported claim types, data sources, and the full return structure, leaving no ambiguity for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'claim' is described in the schema as a natural-language claim, and the description adds concrete examples (e.g., 'Apple's FY2024 revenue was $400 billion'). Schema coverage is 100%, so the description adds meaningful context beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it fact-checks natural-language claims, specifies the domain (company-financial claims for US public companies), and lists the verdict types and return fields. It distinguishes itself from sibling tools like ask_pipeworx by focusing on structured fact-checking with citations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'v1 supports company-financial claims... via SEC EDGAR + XBRL', which tells the agent when to use it. It also mentions replacing 4-6 sequential agent calls, implying efficiency. It doesn't explicitly list when not to use or suggest alternatives, but the scope is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Census Trade

Server Details

Tool Definition Quality

Available Tools

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Output Schema

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Tool Definition Quality

Discussions

Your Connectors

Resources