Fmp

Name: Fmp
Author: pipeworx-io

by io.github.pipeworx-io

Server Details

Financial Modeling Prep MCP (/stable API; v3 deprecated 2025-08-31).

Status: Healthy
Last Tested: 2026-07-25 09:43
Transport: Streamable HTTP
URL
Repository: pipeworx-io/mcp-fmp
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

C2.8/5.0

Tool DescriptionsC

Average 3.5/5 across 55 of 55 tools scored. Lowest: 1.2/5.

Server CoherenceB

Disambiguation4/5

Most tools have distinct purposes with clear descriptions, but there is some overlap, especially among the multiple Polymarket and ask_pipeworx variants. The high number of tools (55) increases potential for confusion, but careful reading of descriptions helps differentiate them.

Naming Consistency3/5

Tool names follow no single pattern; some are verb_noun (ask_pipeworx, generate_llms_txt), some are noun-only (balance_sheet, profile), and some are longer phrases (polymarket_kalshi_spread). While lowercase underscores are used consistently, the variety in structure makes names less predictable.

Tool Count2/5

With 55 tools, the set is quite large for a single server. While each tool serves a specific purpose, many are variations on similar themes (e.g., multiple ask_pipeworx versions, numerous financial and Polymarket tools), suggesting that not every tool fully earns its place.

Completeness4/5

The server covers a broad domain including company financials, SEC filings, FDA data, economics, prediction markets, news, and utility functions like memory and subscriptions. There are minor gaps (e.g., lack of a statement of equity, limited free news), but overall coverage is extensive for the intended data-access purpose.

Available Tools

55 tools

ai_visibility_checkAI Visibility CheckA

Read-onlyIdempotent

Inspect

Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.

ParametersJSON Schema

Name	Required	Description
`entity`	Yes	The thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing".
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com.
`context`	No	Optional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names.

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, openWorldHint, and idempotentHint. The description adds valuable behavioral context such as the default free model, the need for a BYO key for Anthropic, and the return format per model. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph. It front-loads the core purpose, provides key details, and avoids fluff. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters, no output schema, and no nested objects, the description covers all essential aspects: what it does, how to use parameters, output format (per-model score, confidence, signals, raw_response), and use cases. It is complete for effective agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaning beyond schema descriptions: it explains the default model for 'models', clarifies that '_apiKey' is only needed for Anthropic, and gives usage examples for 'context'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it probes LLMs for knowledge about a business/brand/product/topic and scores visibility (0-100) per model. It clearly distinguishes from sibling tools like scan_competitor_ai_presence by focusing on per-model scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides context on when to use (AI-marketing audits, pre-launch brand checks, competitive monitoring) and explains the default model and API key for Anthropic. However, it does not explicitly state when not to use or compare alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworxAsk PipeworxA

Read-onlyIdempotent

Inspect

PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 5,132 tools across 1350 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1". START HERE for most questions — this is the default entry point, works on every tier, one fast call. Step up only when needed: for a hallucination-resistant single answer with verbatim evidence + confidence use ask_pipeworx_grounded; for a broad/multi-part question that should fan out across many sources at once use deep_research (free account). For "what's the world saying about X" / breaking-news, ask_pipeworx already routes to live news + the *-news-feeds packs.

ParametersJSON Schema

Name	Required	Description
`q`	No	Alias for question.
`text`	No	Alias for question.
`input`	No	Alias for question.
`query`	No	Alias for question.
`prompt`	No	Alias for question.
`question`	Yes	Your question or request in natural language. Accepts query, q, prompt, text, input as aliases.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds context beyond annotations: routes to 5,008 tools, fills arguments, returns structured answers with citation URIs. Annotations already convey safety and idempotency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with key guidance, well-structured sentences, no wasted words. Efficiently conveys extensive information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Thoroughly covers the tool's purpose, usage, and output for a complex meta-tool. No gaps identified given the richness of the description and schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions of the question parameter and aliases. Description does not add significant meaning beyond the schema; examples are already in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool is a routing/query tool that prefers over web search for structured data questions. Distinguishes from siblings like ask_pipeworx_grounded and deep_research.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'START HERE for most questions' and provides when to use alternatives (grounded for single answer, deep_research for multi-part). Includes examples.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworx_betaAsk Pipeworx BetaA

Read-onlyIdempotent

Inspect

Beta version of ask_pipeworx: identical universal router (same 5,132 tools, same arguments, same response shape) with candidate routing improvements enabled live — currently source-preference ranking: when your question names a source, it wins ("using OpenAlex, not PubMed" boosts OpenAlex and demotes PubMed). Use it exactly like ask_pipeworx when you want the newest routing; results are compared against the stable router to decide what merges. Falls back to nothing — this IS a full working router, just the experimental edge.

ParametersJSON Schema

Name	Required	Description
`q`	No	Alias for question.
`text`	No	Alias for question.
`input`	No	Alias for question.
`query`	No	Alias for question.
`prompt`	No	Alias for question.
`question`	Yes	Your question or request in natural language. Accepts query, q, prompt, text, input as aliases.

Tool Definition Quality

A4.4/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses experimental nature, source-preference ranking behavior, and fallback to nothing. Annotations (readOnly, openWorld, idempotent) are consistent and supplemented with detailed routing context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with core purpose and well-structured, but could be slightly tighter. Every sentence adds value, though minor redundancy exists.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive for a complex beta router: explains identity, behavioral differences, fallback, and comparison with stable version. No output schema needed since response shape is identical to ask_pipeworx.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% coverage with multiple aliases described. Description does not add extra param meaning beyond schema, but baseline is sufficient as schema covers all parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Explicitly states it is the beta version of ask_pipeworx with identical routing capabilities and candidate improvements, clearly differentiating from the stable sibling by emphasizing it is the experimental edge.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States use it exactly like ask_pipeworx when you want the newest routing, and mentions results are compared against the stable router. Provides clear context but does not explicitly list when not to use or name alternatives beyond the stable router.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ask_pipeworx_groundedAsk Pipeworx — GroundedA

Read-onlyIdempotent

Inspect

Hallucination-resistant answer mode for high-stakes reads. Same routing as ask_pipeworx — picks the right tool from 5,132 across 1350 sources, fills arguments, fetches the data — then EXTRACTS the answer using ONLY what the tool result contains. Returns {answer, evidence (verbatim quote), confidence, source, fetched_at, refusal_reason:null} on success, OR an explicit refusal {answer:null, refusal_reason:"not_in_source"|"no_tool_match"|"tool_error"|"data_truncated"|"llm_error"} when the data doesn't directly answer. Use whenever an answer will be quoted, cited, or acted on, and the agent must not invent facts (financial verdicts, legal claims, medical lookups, public statements). Costs one extra LLM call vs ask_pipeworx — prefer ask_pipeworx for casual lookups.

ParametersJSON Schema

Name	Required	Description
`q`	No	Alias for question.
`text`	No	Alias for question.
`input`	No	Alias for question.
`query`	No	Alias for question.
`prompt`	No	Alias for question.
`question`	Yes	Your question in natural language. Accepts query, q, prompt, text, input as aliases.

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint, destructiveHint. Description adds that it fetches data, extracts answer from results, returns evidence and confidence, and can refuse with specific reasons. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but slightly verbose. However, every sentence adds necessary detail, making it effective rather than wasteful. Well-structured with clear information flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains return format (answer, evidence, confidence, etc.) and possible refusal reasons. Context of high-stakes use and 5008 tools is addressed. Complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with many aliases documented. Description explains that the main parameter accepts natural language question and lists aliases, adding context beyond the schema field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is a 'hallucination-resistant answer mode for high-stakes reads' that extracts answers using only tool results, distinguishing it from ask_pipeworx by mentioning extra cost and preferring the latter for casual lookups.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use: 'whenever an answer will be quoted, cited, or acted on, and the agent must not invent facts' with examples like financial verdicts, and advises preferring ask_pipeworx for casual lookups.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

balance_sheetBalance SheetD

Read-onlyIdempotent

Inspect

Balance sheet.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true, but the description adds no behavioral context beyond what annotations already provide. No mention of what data is returned, pagination, or rate limits. With annotations present, the bar is lower, but the description fails to add any value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness1/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While the description is extremely short, it is under-specified. For a tool with 3 parameters and no output schema, this brevity is harmful, not efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and 3 parameters lacking descriptions, the description is completely insufficient. It does not help an agent understand what the tool returns or how to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning no parameter descriptions in the schema. The tool has 3 parameters (limit, period, symbol), but the description does not explain any of them. The agent has no guidance on their meaning or usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose1/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description is simply 'Balance sheet.', which is a tautology of the title. It does not specify the action (e.g., retrieve, list) or the resource context, and fails to distinguish it from siblings like income_statement or cash_flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There is no mention of context, prerequisites, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

bet_researchBet ResearchA

Read-onlyIdempotent

Inspect

Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred (DFEDTARU + EFFR + CPIAUCSL) + kalshi_macro (KXFED implied probs) + recent_fed_actions (federal-register rules, last 365d); Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; hottest-year bet → climate_projection_nyc + gistemp_latest (NASA global anomaly, rank since 1880) + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires PLUS a 24h-move warning ("Market moved X.Xpp in 24h, comparable to model edge — your edge may already be priced in") when relevant; result.evidence is keyed by source. RESOLVER CONTRACT: result.market_match_confidence ∈ {high, medium, low, none}, market_match_score (0-1 token-overlap), market_match_alternatives[] (other candidate markets the resolver considered), and suggestions[] (explicit re-query hints when the match is fuzzy) — ALWAYS inspect these before trusting the analysis block, because medium/low matches can still surface other fields. PARENT_EVENT EXTRACTOR: when the bet is one leg of a partition (Yankees WS, Romania election), result.parent_event{matched_candidate, top_legs_by_price[], partition_size, placeholders_filtered} gives you the peer prices in one place — that's the headline for elections/championships. NEWS FIELDS: news entries carry _fallback_attempted / _fallback_failed_reason / retry_after_sec when GDELT 429s and GNews backfill ran or failed. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets that ARE still indexed by Polymarket (yes_price≈0, no volume, no liquidity) return status:"market_closed_or_inactive" and skip fan-out. In practice resolved markets are usually de-indexed and instead surface via the low_confidence_match path above — both routes are BLOCKING, just different mechanisms. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note. RESOLUTION-RULE RISK: market.cancellation_rule parses the void/postponement settlement out of the resolution text — refund_50_50 (shares settle flat 50¢ on void; EV-material for any entry away from 50¢, with ev_impact quantified), resolves_no_on_cancel, resolves_yes_on_cancel, carries_to_reschedule, or mentioned_unclear. null means the description never mentions cancellation. Check this before sizing sports/esports/event-occurrence bets — audited arb-bot ledgers show flat-50¢ void settlements are a recurring pure-rules loss.

ParametersJSON Schema

Name	Required	Description
`depth`	No	quick = 2-3 evidence sources, thorough = full fan-out. Default thorough.
`market`	Yes	Polymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?")
`include_raw`	No	Default false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, openWorldHint, and destructiveHint. The description adds extensive behavioral detail: resolution flow, low-confidence short-circuiting, closed market handling, illiquid spread warnings, cancellation rule risks, and fallback handling for news sources. This far exceeds annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured with clear sections (RESOLVER CONTRACT, SAFETY, etc.) and front-loaded with the core purpose. Every sentence adds value, though some detail could be streamlined for a shorter tooltip.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool (3 parameters, 100% schema coverage, no output schema), the description is extremely thorough. It covers all edge cases, response shapes, safety mechanisms, and actionable warnings (e.g., inspect market_match_confidence). No gaps remain for effective agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. The description adds value by explaining the market parameter input types (slug, URL, question text), depth with quick vs thorough fan-out examples, and include_raw with size implications (~20KB vs 50-500KB). This provides meaningful context beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool researches a Polymarket bet by pulling Pipeworx data, with specific usage examples like 'should I bet on X'. It distinguishes from siblings such as polymarket_edges by focusing on comprehensive research rather than edge analysis alone.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases ('Use for "should I bet on X"...') and mentions classifiers and fan-out examples. However, it does not directly compare to sibling tools or specify when not to use it, missing some exclusion guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cash_flowCash FlowA

Read-onlyIdempotent

Inspect

Financial Modeling Prep cash-flow statement for a US-listed ticker: operating, investing, financing activities, free cash flow, capex, net change in cash. Annual (period=annual) or quarterly. Use for fundamental analysis, DCF inputs, cash-flow valuation.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. The description adds valuable context: it specifies the data categories returned (operating, investing, financing, free cash flow, capex, net change in cash) and the annual/quarterly period options, which are beyond annotation information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that efficiently conveys the data source, content, period options, and use cases. No wasted words; all information is relevant and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has three parameters, no output schema, and many sibling tools, the description is largely complete: it identifies the data returned and usage context. However, it omits details about the 'limit' parameter, leaving a small gap. The annotations cover behavioral guarantees, which helps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It only explains the 'period' parameter ('Annual (period=annual) or quarterly') and mentions 'symbol' implicitly via 'US-listed ticker'. The 'limit' parameter is not explained, and no details on format or constraints are provided. This is insufficient for the three parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves a cash-flow statement from Financial Modeling Prep for US-listed tickers, listing specific data items like operating, investing, financing activities, free cash flow, etc. This distinguishes it from sibling tools like balance_sheet and income_statement.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use for fundamental analysis, DCF inputs, cash-flow valuation,' providing clear usage context. It does not explicitly mention when not to use or alternatives, but the purpose is well-defined relative to related financial statement tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_entitiesCompare EntitiesA

Read-onlyIdempotent

Inspect

"Compare X and Y" / "X vs Y" / "X versus Y" / "which is bigger / better / larger / more profitable" / "rank these companies" / "head to head" — side-by-side comparison of 2–5 companies or drugs in ONE parallel call. ALWAYS PREFER over sequential single-pack lookups when comparing entities. type="company" pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (off-calendar fiscal years handled correctly — AAPL Sep, NVDA Jan, etc.). type="drug" pulls FAERS adverse-event counts, FDA approval counts, active trial counts. Results sorted by primary metric so "largest" / "most" / "biggest" reads off the top of the response. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8–15 sequential lookups.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`values`	Yes	For company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]).

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate non-destructive, idempotent read. Description adds details about off-calendar fiscal year handling, sorting by primary metric, and citation URIs, which are beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Long but well-organized with query examples, entity-type specifics, and performance claims. Minor redundancy but effective overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers entity types, data sources, entity count limits, sorting, and citations. Lacks explicit output structure but paired data + URIs is reasonably described.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and description adds meaning by explaining what data each entity type returns (e.g., latest 10-K revenue, FAERS counts) and how results are sorted.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool performs side-by-side comparison of 2–5 entities with examples like 'X vs Y' and clearly distinguishes from sequential single-pack lookups.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States 'ALWAYS PREFER over sequential single-pack lookups when comparing entities', providing clear when-to-use guidance and implicitly covers when-not-to by contrast with sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deep_researchDeep ResearchA

Read-onlyIdempotent

Inspect

ACCOUNT REQUIRED (free — sign in via GitHub at https://pipeworx.io/signup; depth:"thorough" needs a paid plan). If you are not signed in, use ask_pipeworx instead — it works on every tier. Grounded multi-source research across Pipeworx's 1350 STRUCTURED data sources (SEC filings, FRED/BLS economics, FDA, USPTO patents, markets, science, government records, etc.) in ONE call — this is NOT open-web search. Decomposes your question into focused facets, routes each to the right one of 5,132 tools IN PARALLEL, and returns a findings packet: verbatim evidence + confidence + source + fetched_at + a stable pipeworx:// citation per finding, with explicit gaps[] for facets the data couldn't answer (never invented). Best for broad/multi-part questions over structured data ("compare X and Y's regulatory + financial exposure", "research the filings + market picture for ACME"). For a single lookup use ask_pipeworx (one LLM call, not many). For BREAKING or colloquial CURRENT-NEWS / "what's the world saying about X" topics, prefer ask_pipeworx — it routes to live news APIs and the *-news-feeds packs; deep_research returns mostly empty gaps[] when the topic isn't in the structured catalog. Second-hop iteration: depth:"standard" re-angles unanswered gaps (gap recovery); depth:"thorough" additionally chases the best leads from the first pass — so multi-step questions resolve in one call. Every finding carries a hop field and a citation_uri (record-level pipeworx:// when the source emits one, else source-level). "standard" and "thorough" also return contradictions[] flagging findings that disagree. Large records are semantically excerpted to the passages relevant to each facet (not head-truncated), so answers deep in a long filing/series aren't missed. Expect 15-60s (thorough with its follow-up + contradiction pass: up to ~90s).

ParametersJSON Schema

Name	Required	Description	Default
`depth`	No	How many facets to research in parallel: quick=3 (single hop), standard=5 (default; adds a gap-recovery hop that re-angles unanswered facets + a contradictions[] scan across findings), thorough=8 (paid; adds a full iterative hop that chases leads + recovers gaps, plus the contradictions[] scan).
`question`	Yes	The research question, in natural language. Broad/multi-part is fine — decomposition is the point.

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, destructiveHint=false. Description adds extensive behavioral details: not open-web search, returns findings packet with evidence/confidence/source/citation/gaps[], contradictions[] for standard/thorough, semantic excerpting, expected timing 15-90s, account requirement, and alternative tools. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but front-loads critical info (account required, alternative tool). Every sentence adds value given tool complexity. Could be slightly more concise but is well-structured and informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description fully explains return values (findings packet with evidence, gaps[], contradictions[], hop field, citation_uri). Covers edge cases (large records excerpted, timing, second-hop iteration). Achieves high completeness with detailed behavioral and usage guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (both parameters described). Description adds meaning: for depth, explains quick/standard/thorough differences (facets, hops, contradictions). For question, clarifies natural language and that broad/multi-part is fine. Adds context beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Grounded multi-source research across Pipeworx's 1318 STRUCTURED data sources' in ONE call, decomposing questions into facets and routing to tools. It distinguishes from siblings like ask_pipeworx (single lookup) and specifies appropriate use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit when-to-use: 'Best for broad/multi-part questions over structured data.' Explicit when-not-to-use: for single lookup use ask_pipeworx; for breaking/current news prefer ask_pipeworx. Also mentions account requirement and that depth:'thorough' needs a paid plan.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delisted_companiesDelisted CompaniesD

Read-onlyIdempotent

Inspect

Delisted companies.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No

Tool Definition Quality

D1.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, so the safety profile is clear. However, the description adds no further behavioral context (e.g., pagination, sorting, or request limits).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While the description is short, it sacrifices necessary information. It is under-specified rather than concise, failing to provide even a minimal explanation of the tool's functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter and no output schema, the description is insufficient. It does not explain what the tool returns, how results are ordered, or the effect of the limit parameter.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the schema provides no documentation for the 'limit' parameter. The tool description does not mention the parameter, leaving its purpose and acceptable values entirely unspecified.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose1/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description is a tautology that merely restates the tool's name ('Delisted companies') without including a verb or specific action. It does not clarify whether the tool retrieves a list, searches, or filters delisted companies.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like entity_profile or stock_screener. There is no mention of context, prerequisites, or comparisons to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_toolsDiscover ToolsA

Read-onlyIdempotent

Inspect

Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).

ParametersJSON Schema

Name	Required	Description
`q`	No	Alias for query.
`task`	No	Alias for query.
`limit`	No	Maximum number of tools to return (default 20, max 50)
`query`	Yes	Natural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries"). Accepts task, q, description, search as aliases.
`search`	No	Alias for query.
`description`	No	Alias for query.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive. Description adds that results include schemas with examples and are ready to call, enhancing transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose and usage, but the long list of domains could be trimmed. Still well-structured and informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains what is returned (names, descriptions, schemas) and when to use, making it complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; description only mentions aliases without adding new semantics beyond the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds tools by describing data or task, lists many domains, and distinguishes from siblings by being the discovery tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when you need to browse...' and 'Call this FIRST when you have many tools...', providing clear when-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

earnings_calendarEarnings CalendarD

Read-onlyIdempotent

Inspect

Earnings calendar (paid).

ParametersJSON Schema

Name	Required	Description	Default
`to`	No
`from`	No

Tool Definition Quality

D1.5/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds nothing beyond mentioning 'paid,' which is unclear. It does not disclose data boundaries, time range behavior, or any other operational details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness1/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At three words, it is extremely minimal. However, conciseness must be balanced with informativeness; this lacks essential details, making it under-specified rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has two undocumented parameters, no output schema, and no description of return values, the description is severely incomplete. It does not provide enough context for an agent to correctly invoke or interpret the tool's results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has two parameters ('to', 'from') with no descriptions, and schema description coverage is 0%. The description does not explain these parameters, leaving the agent to guess that they are date range filters. No additional semantic value is provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description merely states 'Earnings calendar (paid).' It vaguely indicates the tool relates to earnings calendars but does not specify what it does (e.g., list upcoming earnings dates, filter by ticker). It fails to distinguish from sibling tools like 'economic_calendar' or 'ipos_calendar'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. There is no mention of prerequisite knowledge, such as date format expectations, or scenarios where this tool is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

economic_calendarEconomic CalendarC

Read-onlyIdempotent

Inspect

Economic events (paid).

ParametersJSON Schema

Name	Required	Description	Default
`to`	No
`from`	No

Tool Definition Quality

C2.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, covering safety and idempotency. The description adds the 'paid' attribute, which is a behavioral constraint not captured in annotations, partially compensating for the lack of other behavioral detail.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short (3 words) but at the cost of essential information. It is concise but not effective, as it omits details that would aid tool selection and usage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, no parameter descriptions, and complex context (financial calendar, many siblings). The description does not explain return values, date range handling, or how it differs from similar tools, making it severely incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for both 'to' and 'from' parameters. The description provides no explanation of their meaning, format, or required usage, leaving the agent without semantic guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description identifies the resource as 'economic events' and hints at a paid feature. However, it lacks a verb (e.g., list, get) and does not specify the tool's action. It marginally distinguishes from siblings like earnings_calendar or ipos_calendar through the word 'economic'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives. The 'paid' note implies a subscription requirement but does not elaborate on contexts or exclusions. Sibling tools are numerous but no comparison is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

enterprise_valueEnterprise ValueD

Read-onlyIdempotent

Inspect

Enterprise value.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the agent knows it's a safe read operation. However, the description adds no behavioral details such as what the tool returns or any constraints beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely short but under-specified. It sacrifices necessary information for brevity, failing to earn its place by providing any useful content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of many similar sibling tools and no output schema, the description is grossly incomplete. It does not explain what enterprise value is, what data it returns, or how it differs from alternatives.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description offers no explanation for parameters like limit, period, or symbol. The agent must rely solely on the schema without any additional context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose1/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description is a tautology, restating the tool's name as 'Enterprise value.' It provides no specific verb or resource, and fails to distinguish from sibling tools like key_metrics or ratios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is given on when to use this tool versus alternatives. An agent cannot determine the appropriate context for enterprise value compared to similar financial metrics.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

entity_profileEntity ProfileA

Read-onlyIdempotent

Inspect

"Tell me about X" / "research Acme" / "brief me on Tesla" / "what does Apple do" / "company profile for Microsoft" / "give me the rundown on NVDA" / "everything you know about $TICKER" — full cross-source profile of a US public company in ONE parallel call. ALWAYS PREFER over chaining single-pack SEC/XBRL/news lookups when the user asks for a holistic view. Fans out across SEC EDGAR, XBRL, USPTO, news, GLEIF and returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC); patents (USPTO PatentsView API sunset May 2025 — soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first if you only have a name).

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type. Only "company" supported today; person/place coming soon.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name.

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral details beyond annotations: it explains parallel data fetching, USPTO API sunset and soft-fail, input requirements (ticker or CIK). No contradiction with annotations (readOnlyHint, idempotentHint, openWorldHint are consistent).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph but front-loaded with example queries. Every sentence adds value, though it could be slightly more concise. Overall informative without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description is complete for the tool's complexity: covers data sources, returned fields, limitations (patents sunset, name unsupported), and provides fallback info. No output schema, but return values are explained adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds meaning by explaining that type is fixed to 'company' for now, value must be ticker or zero-padded CIK, and names are not supported. Examples enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: providing a full cross-source profile of a US public company. It lists specific data sources (SEC EDGAR, XBRL, USPTO, news, GLEIF) and returned fields. It distinguishes from siblings by advising preference over chaining single-pack lookups.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance (holistic view), example queries, and what not to use (names not supported, recommending resolve_entity first). It clearly differentiates from alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

etf_holdingsEtf HoldingsC

Read-onlyIdempotent

Inspect

ETF holdings (paid).

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes

Tool Definition Quality

C2.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, idempotent, non-destructive. Description adds the useful behavioral trait that this is a paid feature, which helps set expectations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise with no wasted words, but the brevity sacrifices clarity. The structure is minimal but functional.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple parameter set and annotations, the description is still incomplete: no details on output format, data scope, or usage context. Leaves significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain the 'symbol' parameter at all. It adds no meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'ETF holdings (paid)' vaguely indicates the tool provides ETF holdings and requires payment. It distinguishes from sibling tools but lacks a specific verb or clear action.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. No context on prerequisites or scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

financial_growthFinancial GrowthD

Read-onlyIdempotent

Inspect

Growth rates.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint, openWorldHint, destructiveHint) already indicate safe read-only behavior. The description adds no additional behavioral context, such as data scope, time periods, or output format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely short, but this is under-specification, not conciseness. It fails to convey essential information, making it unhelpful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters, no output schema, and many sibling tools, the description is far from complete. It does not explain what growth rates are, the required symbol, or how to interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description should explain parameter meanings. It does not. 'Growth rates' gives no hint about what 'limit', 'period', or 'symbol' represent or how to use them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose1/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Growth rates.' is essentially a tautology of the tool name 'financial_growth'. It fails to specify a verb or the resource being acted upon, making it impossible for an agent to understand what the tool returns without additional context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus its siblings like balance_sheet, income_statement, or ratios. The description offers no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forgetForgetA

DestructiveIdempotent

Inspect

Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key to delete

Tool Definition Quality

A4.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare destructiveHint=true and readOnlyHint=false. Description adds minor context about clearing sensitive data, but does not significantly elaborate beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded purpose, no unnecessary words. Highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple 1-param tool with annotations present, description fully covers purpose, usage context, and effect. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with description of 'key' as 'Memory key to delete'. Description adds no additional meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Delete a previously stored memory by key', specifying verb and resource. It also distinguishes from siblings 'remember' and 'recall'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use conditions: 'context is stale, task is done, or clear sensitive data'. Also suggests pairing with remember and recall.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_llms_txtGenerate llms.txtA

Read-onlyIdempotent

Inspect

Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.

ParametersJSON Schema

Name	Required	Description	Default
`url`	Yes	Full URL of the site to summarize, e.g. "https://example.com" or a specific landing page.
`max_links`	No	Maximum number of link entries to include (default 25, max 50).

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations (readOnlyHint, idempotentHint, etc.) already declare non-destructive behavior. The description adds valuable process details (fetches page, extracts title/description/key links, emits standard markdown format) that go beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single concise paragraph that front-loads the primary purpose and output, then lists use cases. Every sentence adds value, with no filler or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool is simple with two parameters and clear annotations, the description explains output format and process. It could mention potential limitations (e.g., public accessibility of URL, timeouts) but overall is sufficiently complete for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Both parameters (url, max_links) have full schema descriptions (100% coverage). The description does not add extra meaning beyond the schema, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates a llms.txt file for any URL, specifies the output format, and lists concrete use cases. It is distinct from all sibling tools, which are financial or other utilities, making purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage examples (client site indexing, own project drafting, competitor auditing). While it doesn't explicitly state when not to use or name alternatives, the context is clear and sufficient for an agent to decide when to invoke this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

historical_priceHistorical PriceC

Read-onlyIdempotent

Inspect

Daily EOD history.

ParametersJSON Schema

Name	Required	Description	Default
`to`	No
`from`	No
`symbol`	Yes

Tool Definition Quality

C2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds no extra behavioral details beyond 'Daily EOD history', so it fails to disclose any additional traits (e.g., rate limits, data sources, or return volume).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While short, the description is too terse to be useful. It does not earn its place by adding substantive value; it merely states the obvious from the title. Better to expand slightly to clarify usage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and zero coverage in description, this tool definition is severely incomplete. The agent lacks essential context to invoke the tool correctly (e.g., date format, required symbols, response shape).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, yet the description does not explain any parameters (symbol, from, to). It leaves all three parameters undocumented, providing no semantic help beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Daily EOD history' conveys that the tool returns end-of-day historical price data. It distinguishes it from intraday data tools but is too brief to fully clarify the scope (e.g., no mention of adjustable parameters like date range).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like intraday, quote, or quote_short. The description does not provide any context for appropriate usage or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

income_statementIncome StatementD

Read-onlyIdempotent

Inspect

Income statement.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations accurately convey readOnlyHint=true and destructiveHint=false, so the description need not repeat. However, it adds no extra behavioral context such as data scope or format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise at 2 words, but this is under-specification rather than effective conciseness. More details are needed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and many siblings, the description is severely incomplete. It does not explain return values or how the tool fits among alternatives.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage for its 3 parameters. The description adds no parameter meaning, failing to explain the 'limit', 'period', or 'symbol' parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose1/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description is 'Income statement.' which is a tautology of the tool name and title. It does not specify any action or resource, failing to convey what the tool does.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus siblings like balance_sheet, cash_flow, or ratios. The description provides no context or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

insider_tradingInsider TradingD

Read-onlyIdempotent

Inspect

Insider trading (paid).

ParametersJSON Schema

Name	Required	Description	Default
`page`	No
`limit`	No
`symbol`	No

Tool Definition Quality

D1.5/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. Description adds only 'paid', which is not a behavioral trait but a cost indication, providing minimal additional transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness1/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely brief (3 words) but at the cost of missing essential information. Not a model of conciseness; it is under-specified.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 undocumented parameters, no output schema, and a minimalist description, the tool is severely incomplete for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 3 parameters (page, limit, symbol) with 0% description coverage in schema. Description does not explain their meaning or usage, leaving the agent without clues.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Insider trading (paid)' is vague. It suggests the tool provides insider trading data but does not specify the action (e.g., list, search, retrieve transactions). Given siblings like 'institutional_ownership' and 'stock_news', it lacks differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, no prerequisites, and no mention of paid nature requiring subscription.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

institutional_ownershipInstitutional OwnershipD

Read-onlyIdempotent

Inspect

Institutional ownership (paid).

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes

Tool Definition Quality

D1.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, openWorldHint, idempotentHint, destructiveHint. The description adds 'paid', which implies access/cost constraints beyond annotations. However, it omits other traits like data recency, rate limits, or pagination. This is minimal added value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At three words, the description is maximally brief but sacrifices informativeness. It lacks necessary detail, making it under-specification rather than concise. Every word is necessary, but critical context is missing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one param, no output schema) and the presence of annotations, the description is still incomplete. It doesn't explain what institutional ownership data is returned, its units, time range, or any format hints. Agents are left guessing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'symbol' has no schema description (0% coverage). The tool description adds no meaning to it; agents cannot infer expected format, such as ticker symbol or exchange code. This provides no added utility over the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Institutional ownership (paid)' indicates the tool provides institutional ownership data, but it is vague and fails to distinguish it from numerous sibling tools like balance_sheet, cash_flow, etc. The term 'paid' hints at subscription but doesn't clarify scope or uniqueness.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. Sibling tools cover many financial data types, and the description offers no context, prerequisites, or exclusions. Agents have no basis to choose this over similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

intradayIntradayD

Read-onlyIdempotent

Inspect

Intraday OHLC (paid).

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes
`interval`	Yes

Tool Definition Quality

D1.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds that the tool is 'paid', which is useful behavioral context beyond annotations. However, no other traits (e.g., rate limits, data freshness) are disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At only three words, the description is overly terse. While concise, it sacrifices necessary information, making it under-specified rather than efficiently communicative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a data retrieval tool with no output schema and minimal description, the context is severely lacking. The agent cannot understand return structure, data source, or limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description provides no explanation of parameters 'symbol' and 'interval'. The agent receives no guidance on valid values or expected formats, severely limiting correct usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Intraday OHLC (paid)' vaguely indicates it provides intraday OHLC data but lacks a verb (e.g., 'Get', 'Retrieve') and does not clearly state the action the tool performs. It is nearly tautological with the tool name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. With many sibling tools like 'historical_price' and 'quote', the description offers no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ipos_calendarIpos CalendarC

Read-onlyIdempotent

Inspect

IPO calendar (paid).

ParametersJSON Schema

Name	Required	Description	Default
`to`	No
`from`	No

Tool Definition Quality

C2.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly, openWorld, idempotent, non-destructive behavior. The description adds the fact that it is 'paid', which is useful context beyond annotations, but does not explain return format or other traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at 3 words, but this brevity sacrifices clarity and completeness. It is under-specified rather than efficiently informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, 2 undocumented parameters, and many sibling tools, this description is insufficient. It does not specify what data is returned or how the parameters affect results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% and the description does not explain the 'to' and 'from' parameters. The agent receives no guidance on what these string parameters represent or how to format them.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it provides an IPO calendar and mentions it is paid, but lacks a specific verb and does not differentiate from sibling tools like earnings_calendar or economic_calendar. It is vague but not a tautology.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives, no prerequisites or context provided. The description does not help the agent decide when to invoke it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

key_metricsKey MetricsD

Read-onlyIdempotent

Inspect

TTM key metrics.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already convey readOnly, openWorld, idempotent, non-destructive behavior. The description adds no further behavioral context, such as data source or update frequency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is terse (3 words), but this is under-specification rather than conciseness. Critical information is missing, making it ineffective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description is incomplete given the complexity (3 params, no output schema, many siblings). It fails to explain what the tool returns or how to use it effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description provides no explanation for the three parameters (limit, period, symbol). Schema has 0% coverage, so the agent has no insight into how to use these parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'TTM key metrics' is vague. It does not specify what kind of metrics (e.g., financial ratios, performance indicators) and fails to differentiate from siblings like balance_sheet, cash_flow, or ratios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidelines provided. The description does not indicate when to use this tool instead of similar financial metric tools (e.g., ratios, financial_growth).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_subscriptionsList SubscriptionsA

Read-onlyIdempotent

Inspect

List the caller's active subscriptions. Returns id, type, params, created_at, last_fired_at, fire_count for each. Use this to review what you're monitoring before adding more or to find an id to cancel.

ParametersJSON Schema

Name	Required	Description	Default
`include_inactive`	No	Include cancelled subscriptions in the response (default false).

Tool Definition Quality

A4.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering safety and behavior. Description adds return field details but doesn't disclose additional behavioral traits like pagination or rate limits, so it's adequate but not exceptional.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with essential information front-loaded; no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description lists return fields, compensating well. It lacks mention of pagination or ordering, but for a simple list tool with one optional param, completeness is high.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers the single parameter 'include_inactive' with description, so description adds no extra meaning. Baseline 3 is appropriate given 100% schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'List the caller's active subscriptions' and enumerates return fields (id, type, params, created_at, last_fired_at, fire_count), making purpose specific and distinguishable from sibling tools like subscribe and unsubscribe.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using this tool to review subscriptions before adding more or to find an id for cancellation, providing clear when-to-use guidance and implicit contrast with subscribe/unsubscribe.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

mergers_acquisitionsMergers AcquisitionsA

Read-onlyIdempotent

Inspect

Financial Modeling Prep recent M&A activity feed: announced deals with acquirer, target, value, date. Use for "who did $TICKER acquire", "recent deals in sector X", deal-flow monitoring.

ParametersJSON Schema

Name	Required	Description	Default
`page`	No

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, destructiveHint=false; the description adds that it's a 'feed' of recent deals, but doesn't disclose further behaviors like pagination limits or data recency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, front-loads the core purpose, and includes useful example queries without extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with one optional parameter and no output schema, the description gives a reasonable overview of returned fields and use cases, though missing pagination details slightly detracts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage and only one parameter 'page', the description fails to explain that pagination is controlled via the page number, leaving the agent to guess its semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides a 'recent M&A activity feed' with specific fields (acquirer, target, value, date) and example queries, distinguishing it from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides example use cases like 'who did $TICKER acquire' and 'deal-flow monitoring', but lacks explicit guidance on when not to use this tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_feedbackSend Pipeworx FeedbackAInspect

Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	bug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else.
`context`	No	Optional structured context: which tool, pack, or vertical this relates to.
`message`	Yes	Your feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max.

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses important behavioral traits beyond annotations: rate-limited to 5 per day per identifier, free, does not count against quota, and explains how feedback is processed (team reads daily, affects roadmap). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise yet comprehensive; all information is front-loaded and no unnecessary sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the outcome of sending feedback (digest consumption, roadmap impact) and covers all necessary context including rate limits and quota.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with good descriptions, but the description adds practical usage context for parameters (e.g., how to describe the issue in terms of tools/packs).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: sending feedback about bugs, missing features, or praise. It distinguishes itself from sibling tools which are data-retrieval tools, not feedback mechanisms.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly specifies when to use each feedback type (bug, feature, data_gap, praise) and provides guidance on what to avoid (e.g., not pasting the end-user prompt).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pipeworx_trendingPipeworx TrendingA

Read-onlyIdempotent

Inspect

What other AI agents are calling on Pipeworx right now. Returns the top tools, top packs, and total call volume over a recent window (24h, 7d, or 30d). Useful for: (1) discovering what data sources are hot for current events, (2) confirming a popular tool is the canonical choice before asking your own question, (3) seeing whether your use case aligns with what most agents need. Self-aggregating signal — derived from CF analytics-engine, no PII, just (pack, tool, count). Cached 5min-1h depending on window.

ParametersJSON Schema

Name	Required	Description	Default
`window`	No	24h (default) \| 7d \| 30d. Shorter windows surface what's hot right now; longer windows show steady-state demand.

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent, non-destructive), the description adds details: self-aggregating signal, no PII, derived from CF analytics-engine, caching durations (5min-1h). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured: first sentence gives the core function, followed by bullet-like use cases, then additional behavioral notes. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers purpose, usage, and behavior well but does not detail the exact output format (e.g., structure of top tools/packs). However, given the simple parameter and no output schema, this is acceptable.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for the single parameter 'window' with enum and description. The description adds value by explaining that shorter windows surface current hotness while longer windows show steady-state demand, enhancing semantic meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool returns 'top tools, top packs, and total call volume' over a specified window, using specific verbs and distinguishing it from sibling tools by focusing on trending analytics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Three explicit use cases are provided (discovering hot data, confirming canonical tools, aligning use cases), offering clear guidance on when to use this tool versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_arbitragePolymarket ArbitrageA

Read-onlyIdempotent

Inspect

Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. Call with NO args for a trending_scan of the top ~200 markets by weekly volume; pass event for the strongest per-event partition_check, or topic for a themed cross-event scan. event (recommended for a specific market): pass a Polymarket event slug like "fed-decision-may-2026" or "when-will-bitcoin-hit-150k"; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). topic (for cross-event scanning): pass a seed question like "Strait of Hormuz traffic returns to normal" or "Fed rate decision"; searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response: opportunities[] (gap_pp, suggested_trade, reasoning, monotonicity violation context), and in event mode partition_check{sum_yes_prices, gap_from_1, placeholders_filtered, suggested_trade}. FILL CHECK: when the partition signal fires, arbitrage.fill_check prices it against live CLOB depth (theoretical_edge_pp_at_book vs realizable_edge_pp at 1000 shares/leg, thin_legs[]) — realizable_edge_pp ≤ 0 means the overround exists only at last-trade, not in the book; do not trade it. For custom sizing use polymarket_fill_risk.

ParametersJSON Schema

Name	Required	Description	Default
`event`	No	Single-event mode (use this if you know the specific Polymarket event): event slug like "fed-decision-may-2026" or "when-will-bitcoin-hit-150k". Full Polymarket URLs also accepted.
`topic`	No	Cross-event mode (use this if you want to scan related events across the platform): a topic or seed question like "Fed rate decision" or "Strait of Hormuz traffic returns to normal". Tool searches Polymarket for related events and checks monotonicity across them.

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, idempotent, open-world. Description adds substantial behavioral context: semantic anchor with Jaccard similarity, partition filter, fill check against live book depth, exact output format, and trading caution.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Long but well-structured. Each section (modes, semantic anchor, partition filter, fill check) is useful. Could trim slightly but fully justified by complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description outlines output structure (opportunities array, partition_check object, fill check details). References sibling tool for custom sizing. Complete given complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with descriptions. Description enriches both parameters with examples and explains behavior differences (e.g., event mode walks child markets, topic mode searches platform). Adds meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it finds arbitrage opportunities via monotonicity violations and partition-sum checks. Distinguishes between event mode and topic mode with explicit use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: call with no args for trending scan, use event for specific market, topic for cross-event scanning. Includes when-not-to-trade (realizable_edge_pp ≤ 0) and fallback to another tool for custom sizing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_edgesPolymarket EdgesA

Read-onlyIdempotent

Inspect

Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥75% AND ≥2 longshots ≤8% AND portfolio return ≥25:1; rare-by-design (gates relaxed Run 8 from prior 85%/5%/50:1). EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume, plus a 24h-move warning ("Market moved X.Xpp in 24h") when the recent move alone exceeds the edge — your edge may already be in the price. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. RESPONSE TOP-LEVEL: by_segment{model_driven,structural_arbitrage,concentrated_longshot}, fed_candidates/fed_note (Fed bets surface here, excluded from ranking — 1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data), and _diagnostics{concentrated_longshot:{...funnel counters},category_counts,filter_skips} so callers can see WHY a segment is empty (top-N stale, all candidates failed gates, knob dropped them). Cached 1h at the KV level keyed on all knobs.

ParametersJSON Schema

Name	Required	Description
`limit`	No	Top N edges to return after ranking. Default 10, max 25.
`window`	No	Polymarket volume window to filter markets. Default 1wk.
`min_kelly`	No	Minimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large.
`min_edge_pp`	No	Minimum \|edge\| in percentage points to include (default 0.5). Edge is evaluated NET of slippage.
`slippage_pp`	No	Assumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw \|edge\| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model.
`max_spread_pp`	No	Tradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges.
`min_liquidity`	No	Tradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven.
`category_filter`	No	Comma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all.
`min_partition_leg_kelly`	No	Minimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost.

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes far beyond annotations (readOnlyHint, destructiveHint) by disclosing caching behavior (1h KV-level), response structure (by_segment with diagnostics), edge calculation details (slippage, Kelly caps, 24h-move warning), and limitations (Fed data reliability). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but verbose, with dense paragraphs that could be better structured (e.g., bullet points for segments, knobs, diagnostics). While all information is valuable, the length may hinder quick parsing. Front-loading of purpose is good.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description thoroughly explains the response layout (by_segment, fed_candidates, diagnostics) and field semantics (edge_pp_net, kelly_fraction, liquidity). It also covers parameter usage, caching, and data limitations, making it self-contained for a complex tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage, the description adds significant value by explaining the rationale behind parameters (e.g., min_partition_leg_kelly for basket trades, max_spread_pp for tight books), tradeable-edge knobs, and how they interact. This surpasses the baseline 3 for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: scanning Polymarket markets for opportunities where Pipeworx data disagrees with market price, specifically built for daily betting discovery. It distinguishes itself from siblings by detailing unique model families (MODEL_DRIVEN, STRUCTURAL_ARBITRAGE, CONCENTRATED_LONGSHOT) and edge calculation methodology.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use the tool ('what should I bet on today', avoids paging hundreds of markets) and explains how to configure filters like min_liquidity and max_spread_pp. However, it does not explicitly compare against sibling tools such as polymarket_arbitrage or state when to prefer alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_edge_trackerPolymarket Edge TrackerA

Read-onlyIdempotent

Inspect

Edge persistence and decay telemetry built from daily polymarket_edges snapshots. Answers "how long has this edge existed and is it shrinking?" — a fresh wide edge and a 3-week-old wide edge are different trades (the latter is wide for a reason nobody is willing to take). Args: days (lookback, default 14, max 30), window (snapshot family, default "1wk"). RESPONSE: tracked[] = every opportunity in the LATEST snapshot with its full edge_pp_net time-series across prior snapshots, first_seen, trend (new | widening | stable | decaying) and decay_pp_per_day (both computed on |edge_pp_net| — the value itself is signed by trade direction, negative = SELL YES); expired[] = opportunities that appeared in earlier snapshots but are GONE from the latest (closed, resolved, or arbed away) with their lifespan_days — the median lifespan is your competition clock; snapshot_dates[] = which days actually have data (snapshots are written when polymarket_edges runs on a cache-miss, so gaps mean nobody scanned that day). LIMITS: history depth is bounded by the 60-day snapshot TTL and starts from when snapshotting was enabled; decay numbers come from daily closes of edge_pp_net (net of default slippage), not intraday.

ParametersJSON Schema

Name	Required	Description	Default
`days`	No	Lookback in days (default 14, clamp 2-30).
`window`	No	Which polymarket_edges window family to read snapshots for: 24hr \| 1wk \| 1mo (default 1wk).

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate read-only and non-destructive behavior. The description adds significant behavioral context: snapshots are written on cache-miss, gaps mean no scan, 60-day TTL, and the exact response structure including decay computation and fields like 'trend' and 'decay_pp_per_day'. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with a clear summary and uses structured text with labeled sections (RESPONSE, LIMITS). It is slightly verbose but every sentence adds value, so it remains effective without being overly wordy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description fully explains the response structure (tracked[], expired[], snapshot_dates[]) and their fields. It also covers limitations (TTL, snapshot gaps) and edge cases (decay from close, not intraday). No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with basic descriptions. The description adds default values, clamping range for 'days', and clarifies 'window' as the snapshot family. This provides meaningful guidance beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: providing edge persistence and decay telemetry from daily snapshots. It answers a specific question about edge history and distinguishes itself from sibling tools like polymarket_edges, which likely provides current edges.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (to understand edge longevity and decay) and provides an example contrasting fresh vs old edges. It also notes limits like TTL and snapshot gaps, but does not explicitly state when not to use or offer direct alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_fill_riskPolymarket Fill RiskA

Read-onlyIdempotent

Inspect

Realizable-vs-theoretical edge check against live CLOB order-book depth. REQUIRES one of market (single-market mode) or event (basket/partition mode). SINGLE-MARKET: pass a market slug/URL + side (buy_yes|sell_yes|buy_no|sell_no, default buy_yes) + size_usd (default 1000 — max spend on buys, target proceeds on sells); walks the ladder and returns top_of_book, vwap_fill_price, slippage_pp, shares_filled, max_fillable_usd, and a verdict (clean|degraded|cannot_fill). BASKET: pass an event slug/URL + side (sell_yes = capture overround by selling every leg, buy_yes = capture underround; default auto from partition sum) + size_usd interpreted as settlement notional S (shares per leg; each share pays $1); returns theoretical_sum vs realizable_sum (top-of-book vs VWAP across all legs), capture_ratio, profit_usd at executed size, per-leg fill detail, thin_legs[], max_clean_notional_usd, and forced_directional_risk naming the legs most likely to strand you unhedged. USE THIS before acting on any polymarket_arbitrage SELL/BUY-EVERY-LEG signal or any polymarket_edges trade above ~$500 — theoretical overround on thin books is not capturable, and partial basket fills convert an arb into an unhedged directional position (the dominant loss mode in real arb-bot P&L).

ParametersJSON Schema

Name	Required	Description
`side`	No	Single-market: buy_yes \| sell_yes \| buy_no \| sell_no (default buy_yes). Basket: sell_yes \| buy_yes (default auto — sell if partition sum > 1, buy if < 1).
`event`	No	Basket mode: event slug or full polymarket.com URL — checks every leg of the partition.
`market`	No	Single-market mode: market slug or full polymarket.com URL.
`size_usd`	No	Single-market: USD to spend (buys) or target proceeds (sells). Basket: settlement notional — shares per leg, each paying $1 at resolution. Default 1000, clamp 10–1,000,000.

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint as true and destructiveHint as false, indicating a safe, non-mutating tool. The description adds value by detailing the tool's behavior: walks the order-book ladder, returns fill details, and warns about risks like partial basket fills converting an arb into an unhedged directional position. It does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured with capitalized sections (REQUIRES, SINGLE-MARKET, BASKET) and front-loaded with the main purpose. However, it includes extensive detail that could be more concise without losing clarity. Every sentence serves a purpose, but the length is borderline excessive for a tool description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully documents the return values for both modes (e.g., top_of_book, vwap_fill_price, slippage_pp, verdict for single-market; theoretical_sum, realizable_sum, capture_ratio, profit_usd, thin_legs, forced_directional_risk for basket). It also explains the rationale for using the tool (thin books, partial fills) and warns of risks. No gaps are evident.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are already documented. The description adds contextual meaning beyond schema, such as interpreting `size_usd` in basket mode as 'settlement notional S (shares per leg; each share pays $1)' and explaining the auto-detection of side for baskets. This enrichment justifies a score above baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Realizable-vs-theoretical edge check against live CLOB order-book depth.' It distinguishes between single-market and basket modes, each with specific return values. It also explicitly differentiates from sibling tools by stating 'USE THIS before acting on any polymarket_arbitrage SELL/BUY-EVERY-LEG signal or any polymarket_edges trade above ~$500.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes clear usage instructions: 'REQUIRES one of `market` or `event`' and specifies when to use (before acting on arb signals or edges above $500). It explains mode selection (single-market vs basket) and default behaviors. However, it does not explicitly state when NOT to use this tool or describe alternative tools for alternative scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

polymarket_kalshi_spreadPolymarket–Kalshi SpreadA

Read-onlyIdempotent

Inspect

Cross-venue spread between Kalshi and Polymarket for the same resolving question. The two venues sometimes price the same outcome 2-25pp apart because their participant pools differ — when the bet shapes are equivalent that delta is a real signal, when they aren't the tool says so. TWO MODES: (1) topic — 10 pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope", "next_uk_pm", "next_israel_pm", "2028_president") auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. RESPONSE: each venue's leg-by-leg prices (raw probability 0-1) plus matched spread[].top_spreads_pp (Kalshi − Polymarket) where the same outcome shows up on both sides. SAFETY FIELDS: compatibility_warning fires in two cases — (a) matched_pairs:0 with skipped_cross_type>0 means the venues frame the topic with non-equivalent bet shapes (e.g. Kalshi range_bucket point-in-time vs Polymarket cumulative_threshold touch-anywhere — no arb exists), (b) matched_pairs:0 with skipped_cross_type:0 and both venues >5 legs means the token-overlap matcher found nothing in common — events likely semantically unrelated despite the topic keyword. temporal_alignment{polymarket_month,kalshi_month,aligned} tells you whether the two events resolve in the same calendar period; aligned:false means spreads are mathematically meaningless across the temporal gap. skipped_cross_type / skipped_cross_subtype counters expose how many leg-pair comparisons were dropped (cross-type = metric_type mismatch like MoM vs YoY; cross-subtype = inequality mismatch like cum_ge vs cum_le). Real cross-venue spreads are rarer than the macro-shortcut list suggests — most pre-mapped topics return compatibility_warning today; pre-mapped ≠ tradeable.

ParametersJSON Schema

Name	Required	Description
`topic`	No	Pre-mapped: fed \| btc \| cpi \| gdp \| sp500 \| recession \| next_pope \| next_uk_pm \| next_israel_pm \| 2028_president
`kalshi_event_ticker`	No	Explicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side.
`polymarket_event_slug`	No	Explicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side.

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description significantly expands on the annotations (readOnlyHint, openWorldHint, idempotentHint) by detailing the response structure, safety fields (compatibility_warning, temporal_alignment, skipped_cross_type), and the logic behind skipped leg-pair comparisons. It honestly discloses limitations such as temporal misalignment and non-equivalent bet shapes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections and uses bold for key terms. It is somewhat lengthy but each sentence contributes essential information. The front-loading of purpose and modes is good, and the safety field explanations are detailed but necessary for correct usage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of cross-venue arbitrage detection and the absence of an output schema, the description provides a thorough explanation of the response structure, safety fields, and edge cases (e.g., compatibility_warning conditions, temporal alignment). It covers all critical aspects for an agent to understand the tool's behavior and limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage, the description adds substantial value by explaining the topic shortcuts (listing all 10), the explicit overrides, and how they interact. It provides examples and clarifies the mode-switching behavior, which is not evident from the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Cross-venue spread between Kalshi and Polymarket for the same resolving question.' It specifies two distinct modes (topic and explicit) and distinguishes itself from sibling tools like polymarket_arbitrage by focusing on cross-venue comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly guides when to use the tool: it explains the two modes, notes that pre-mapped topics often yield compatibility_warning, and states that 'pre-mapped ≠ tradeable.' It also provides clear cases for when the compatibility_warning fires and advises on temporal alignment, giving agents clear context on when the spread is meaningful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

profileProfileA

Read-onlyIdempotent

Inspect

Fetch FMP company profile for a ticker symbol, including sector, industry, description, CEO, employee count, website, market cap, and exchange listing details.

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, open-world, idempotent, and non-destructive behavior. The description adds the output fields but lacks disclosure of any additional behavioral traits such as rate limits or data freshness, so it provides only marginal extra transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with no filler, immediately starting with the action verb 'Fetch', and perfectly sized for the tool's simplicity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple single-parameter fetch tool, the description lists all key return fields. However, it omits potential error conditions or limitations (e.g., data availability for delisted symbols), so slightly incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage for the 'symbol' parameter, but the description explicitly states 'ticker symbol', which clarifies the parameter's meaning and format, compensating for the schema gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool fetches a company profile for a ticker symbol and lists specific fields included (sector, industry, CEO, etc.), making its purpose unambiguous and distinguishing it from siblings like balance_sheet or cash_flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description implies usage for getting company profile data but does not explicitly guide when to use this tool versus alternatives like entity_profile, which likely provides similar data, nor does it mention when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

quoteQuoteA

Read-onlyIdempotent

Inspect

Fetch the current real-time quote for a ticker symbol from FMP, including price, change, percent change, day range, 52-week range, volume, and market cap.

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes

Tool Definition Quality

A3.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, idempotent, and destructive=false. The description adds value by specifying real-time data from FMP and the fields returned, providing behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence that front-loads the tool's purpose and key data points. No extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lists the main returned fields, which compensates for the lack of an output schema. However, it does not mention error handling or data freshness, leaving some gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description adds that the parameter is a ticker symbol, but does not provide format, validation, or examples. This is minimal value; baseline for low coverage would expect more.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool fetches a real-time quote for a ticker symbol, listing the data fields included. However, it does not explicitly distinguish from sibling tools like quote_short, which would improve clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for obtaining a full quote but provides no explicit guidance on when to use this versus alternatives, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

quote_shortQuote ShortA

Read-onlyIdempotent

Inspect

Fetch a lightweight FMP quote for a ticker symbol returning only price, volume, and percent change; use when only the current price is needed.

ParametersJSON Schema

Name	Required	Description	Default
`symbol`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, and destructiveHint. The description adds value by specifying the exact return fields (price, volume, percent change) and that it is a lightweight operation, complementing the annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that front-loads the verb and resource, then specifies return fields and use case. Every word contributes, with no wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity, good annotations, and no output schema, the description adequately covers purpose, return fields, and usage context. It lacks details on exact output structure but is sufficient for a lightweight quote tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description clarifies that 'symbol' is a ticker symbol, adding basic semantic value beyond the parameter name. No further details like format or examples are given, but for a single simple parameter, this is minimally adequate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (Fetch), resource (FMP quote for ticker symbol), and specific return fields (price, volume, percent change). It distinguishes itself from a full quote tool by emphasizing 'lightweight' and 'only' the current price when needed.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'use when only the current price is needed,' providing clear context. However, it does not explicitly list alternatives or when not to use this tool, though the sibling 'quote' implies a fuller alternative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ratiosRatiosD

Read-onlyIdempotent

Inspect

Financial ratios.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`period`	No
`symbol`	Yes

Tool Definition Quality

D1.8/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. The description adds no behavioral context beyond that, missing details like whether it returns historical or current ratios, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is too short to be useful. While brevity is valued, this is under-specification, not conciseness. One sentence does not provide enough information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 3 parameters, no output schema, and no parameter descriptions, the description is critically incomplete. It fails to explain what ratios are returned, the data source, time periods, or any default behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% (no descriptions on any parameters). The description does not explain the meaning of 'limit', 'period', or 'symbol', leaving the agent with no semantic guidance beyond the parameter names.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description says 'Financial ratios.' This is a vague noun phrase without a verb indicating the tool's action. It does not specify whether it retrieves, calculates, or lists ratios. Among sibling tools like income_statement, balance_sheet, and key_metrics, it fails to distinguish itself.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like key_metrics or financial_growth. The description does not specify context, prerequisites, or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recallRecallA

Read-onlyIdempotent

Inspect

Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	No	Memory key to retrieve (omit to list all keys)

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnly=true, idempotent=true, destructive=false. Description adds value by explaining the dual mode (single key retrieval vs listing all) and scoping, without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with action and purpose. Every sentence adds essential context (dual behavior, use case, scoping, pairing). No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read tool with 1 optional param and no output schema, the description fully covers behavior, use case, and relationship to siblings. No missing information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (parameter description already explains omitting key to list all). The tool description does not add new parameter semantics beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'retrieve' and resource 'value previously saved via remember', and distinguishes from siblings by mentioning 'list all saved keys (omit key argument)'. It is specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explanation of when to use: 'look up context the agent stored earlier... without re-deriving it from scratch.' Explicitly mentions scoping (by identifier) and pairs with remember/forget for complete workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recent_alertsRecent AlertsA

Read-onlyIdempotent

Inspect

Pull fired events from your subscription feed. Returns the most recent alerts the evaluator has written to your persisted feed — each carries source, citation_uri (pipeworx:// when available), and the raw event payload. Filter by type (e.g. "sec_8k") and/or since (ISO timestamp). Set mark_read:true to flag returned events read so the next call only shows newer ones. Polls work fine; the same feed is also at GET registry.pipeworx.io/alerts.json for scripts and dashboards.

ParametersJSON Schema

Name	Required	Description
`type`	No	Optional — filter to one subscription type.
`limit`	No	Max events to return (1-200, default 50).
`since`	No	Optional ISO timestamp — return events fired_at >= this time.
`mark_read`	No	Flag the returned events read in the same call (default false).
`unread_only`	No	Return only events where read_at is null (default false).

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds value by explaining return content (source, citation_uri, payload) and the side effect of mark_read (flagging events as read), which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise, providing essential information in a few sentences. It front-loads the core action and then details parameters and behavior. Could be slightly more structured, but every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, so description compensates by describing return fields. Covers filtering, mark_read behavior, and polling suitability. However, it omits pagination details and error handling, which are minor gaps given the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds semantic value by explaining the type filter with an example ('sec_8k'), how to use since (ISO timestamp), and the behavior of mark_read. Does not repeat all parameter details but provides enough context to use them correctly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool 'pulls fired events from your subscription feed' and returns recent alerts. It specifies the key fields (source, citation_uri, payload) and filtering options, distinguishing it from sibling tools like list_subscriptions or subscribe.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides guidance on polling behavior and mentions an alternative REST endpoint for scripts/dashboards. Could be more explicit about when to choose this tool over siblings, but the description of filtering and mark_read gives practical usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recent_changesRecent ChangesA

Read-onlyIdempotent

Inspect

"What's new with X" / "latest on Y" / "what happened to Z this week / month / quarter" / "updates on Acme" / "news on Tesla recently" / "what's happening with Apple" — change feed for a company in the last N days/weeks/months in ONE parallel call. Fans out to SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Entity type. Only "company" supported today.
`since`	Yes	Window start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring.
`value`	Yes	Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193").

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare safe read-only and idempotent behavior. The description adds details about data sources, fallback mechanisms, and soft-fail for USPTO, providing extra context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with front-loaded example queries and clear sections. It is comprehensive without being verbose, though slightly longer than strictly necessary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with multiple data sources and no output schema, the description covers return structure (changes[], total_changes, citation URIs), parameter behavior, and fallback logic, making it highly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with parameter descriptions. The description adds value by explaining 'since' formats (ISO date or relative shorthand) and providing usage examples, though the schema already describes them sufficiently.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: providing a change feed for a company over a specified window. It uses specific verbs like 'fans out' and distinguishes from sibling tool 'entity_profile' which offers a static profile.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance is given: use for queries like 'What's new with X' and alternative 'entity_profile' when static profile is needed. Limitations and fallback behavior are also described.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rememberRememberA

Idempotent

Inspect

Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.

ParametersJSON Schema

Name	Required	Description	Default
`key`	Yes	Memory key (e.g., "subject_property", "target_ticker", "user_preference")
`value`	Yes	Value to store (any text — findings, addresses, preferences, notes)

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotentHint=true and destructiveHint=false. Description adds important behavioral details about persistence (24 hours for anonymous, persistent for authenticated) beyond annotations. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences with no wasted words. Front-loaded with the primary action, then context, then pairing instructions. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple write tool with no output schema, the description fully explains what the tool does, when to use it, and how data is stored. Minor gap: does not specify maximum length for key or value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing clear key-value description. Description adds value with examples (e.g., 'subject_property', 'target_ticker') and explains the purpose of each field, going beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verbs ('save', 'reuse') and clearly identifies the resource ('data the agent will need to reuse later'). It distinguishes from sibling tools by explicitly mentioning 'pair with recall to retrieve later, forget to delete.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use guidance ('when you discover something worth carrying forward') and references companion tools (recall, forget). However, lacks explicit when-not-to-use scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

resolve_entityResolve EntityA

Read-onlyIdempotent

Inspect

"What's the ticker for…" / "find the CIK for…" / "what's the RxCUI for…" / "look up the ID for…" / "what is X's official identifier" — resolve a user-spoken NAME to the canonical/official identifier other tools require as input. Use FIRST whenever you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.

ParametersJSON Schema

Name	Required	Description	Default
`type`	Yes	Entity type: "company" or "drug".
`value`	Yes	For company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin").

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds crucial behavioral context: internal cascading through multiple endpoints, auto-disambiguation for companies, and specific return formats (ticker, CIK, citation URI for company; RxCUI, ingredient, brand for drug), going beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately lengthy but well-structured with examples and bullet points. Every sentence contributes meaning, though slight condensation could improve conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description fully describes return values for each entity type, including citation URIs. It covers edge cases like accepting multiple input formats (ticker, CIK, name) and auto-disambiguation, making it complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds value by providing concrete examples and formats for the 'value' parameter (e.g., ticker 'AAPL', drug 'ozempic'), and explains the different return types per 'type' enum value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool resolves user-spoken names to canonical identifiers, with specific examples (ticker, CIK, RxCUI) and supported types (company, drug). It distinguishes from sibling tools by noting that other tools require these IDs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use FIRST whenever you have a name but need an ID,' providing clear context for when to use. It also mentions internal cascading replacing manual lookups, but does not explicitly state when not to use or list alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_competitor_ai_presenceScan Competitor AI PresenceA

Read-onlyIdempotent

Inspect

Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.

ParametersJSON Schema

Name	Required	Description
`models`	No	Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai.
`_apiKey`	No	Optional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe.
`context`	No	Optional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names.
`entities`	Yes	Array of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors.

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnly, idempotent, non-destructive), the description adds valuable behavioral details: it probes each entity using ai_visibility_check, ranks by score, and returns score, confidence, and signal density. It also notes that the first entity is treated as the subject. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences—front-loaded with purpose, then mechanism, use case, and output. Every sentence adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately describes the return value (ranked list with score, confidence, signal density per entity). It covers constraints (entity count) and behavior (first entity treated as subject), making the tool easy to use without further lookup.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds extra meaning by specifying that entities array must have 2-8 items and that the first entry is the subject. It reinforces model parameter options but does not add new info beyond schema for models and context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compares AI visibility across multiple entities side-by-side, using a specific verb 'Compare' and resource 'AI visibility'. It distinguishes from the sibling ai_visibility_check by being a batch comparison that ranks entities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for use ('competitive AI-marketing audits') with a concrete example question. However, it does not explicitly exclude use cases or mention alternatives like compare_entities.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_dependencyScan DependencyA

Read-onlyIdempotent

Inspect

Composite "should I add this npm package to my project" check in ONE call — fans out across deps.dev (license + advisories + version history) and bundlephobia (gzipped/minified bundle size, dependency count, ESM/tree-shake support). Use whenever an agent asks "is X safe / popular / small" or "what does adding lodash cost me". Returns a summary block (is_latest, license, published_at, advisory_count, bundle_kb_min, bundle_kb_gz, dependency_count, has_esm, tree_shakeable), per-advisory detail, links, and a list of recent alternative versions. NPM ecosystem only in v1; PyPI / Maven / Cargo / Go fall under deps.dev:version directly. Partial failures degrade gracefully — bundlephobia's first measurement on a new version can take 5-30s; sources_failed will list it if it times out, the rest still returns.

ParametersJSON Schema

Name	Required	Description	Default
`package`	Yes	npm package name. Scoped packages (e.g. "@types/node") are accepted.
`version`	No	Specific version to check (e.g., "18.3.1"). Defaults to the latest published version when omitted.

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds important behavioral details beyond annotations: partial failures degrade gracefully, bundlephobia first measurement delay (5-30s), sources_failed list. Annotations already indicate read-only, idempotent, not destructive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose, each sentence adds value. Slightly long but justified by complexity of multi-source behavior and edge cases. Could be tightened without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description thoroughly explains what is returned: summary fields, per-advisory detail, links, alternatives. Covers timing, partial failure, and ecosystem scope.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already has 100% coverage with descriptions. Description reinforces usage (npm package name, scoped accepted, version defaults to latest) and provides context on how parameters are used in the composite check.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a composite check for npm packages, covering license, advisories, bundle size, etc. It distinguishes well from siblings which are mostly finance/company tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use: when agent asks about safety, popularity, size, cost. Also specifies ecosystem limitation (NPM only v1) and fallback for other ecosystems. Mentions partial failure behavior.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_nameSearch NameC

Read-onlyIdempotent

Inspect

Company-name search.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`query`	Yes
`exchange`	No

Tool Definition Quality

C2.3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, destructiveHint. The description adds no behavioral context beyond that, such as rate limits or output format specifics. It merely repeats the search verb.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

At only two words, the description is under-specified rather than concise. It lacks structure and fails to provide essential information that would fit in a short description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and minimal description, the tool is incomplete. An agent cannot infer return format, pagination, or how results are ordered. Sibling tools are better documented, making this tool harder to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Parameter descriptions are completely absent from both the schema (0% coverage) and the description. The description does not explain what 'query', 'limit', or 'exchange' mean or how to use them effectively, leaving the agent to guess.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Company-name search' clearly states the tool searches by company name, distinguishing it from siblings like search_symbol (by symbol) and search_within (content-based). However, it could be more specific about what is returned.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives, no mention of prerequisites or when not to use it. The context signals and sibling list imply competition but the description offers no direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_symbolSearch SymbolD

Read-onlyIdempotent

Inspect

Symbol search.

ParametersJSON Schema

Name	Required	Description	Default
`limit`	No
`query`	Yes
`exchange`	No

Tool Definition Quality

D1.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly, openWorld, and idempotent hints. The description adds no behavioral context beyond what is already known, so it does not enhance transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (two words) but sacrifices informativeness, making it under-specified rather than efficiently concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters, no output schema, and zero schema description coverage, the minimal description is completely inadequate for an AI agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description does not explain the meaning of query, limit, or exchange parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Symbol search.' indicates the tool searches for symbols, but it is too vague and does not differentiate from sibling tools like search_name or search_within.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives, nor any context on prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_withinSearch Within a SourceA

Read-onlyIdempotent

Inspect

Semantic search INSIDE a fetched record. Pass the text you already pulled (e.g. a SEC 10-K body, an article, a long tool result) plus a natural-language query; get back the top-N passages with character offsets and similarity scores. Use when the record is too big to cram into the prompt — search_within saves context, returns only the passages that matter, and every passage carries an offset so the agent can verify a verbatim quote. Pairs with ask_pipeworx_grounded: fetch with the gateway, ground over the relevant passages instead of the whole document. BGE-base-en embeddings + cosine over 500-char overlapping windows; cap is 200K chars (longer inputs are truncated and flagged).

ParametersJSON Schema

Name	Required	Description
`text`	Yes	The document text to search inside (max ~200K chars).
`limit`	No	Max passages to return (1-20, default 5).
`query`	Yes	Natural-language query — what passages do you want? E.g. "supply-chain risk", "fiscal year 2024 revenue", "drug interactions with warfarin".

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint, idempotentHint, destructiveHint), the description discloses the embedding model (BGE-base-en), similarity metric (cosine), windowing (500-char overlapping windows), behavior on truncation at 200K chars, and that results include character offsets and similarity scores. This is highly transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the main purpose and use case. It includes some technical details (e.g., BGE-base-en embeddings, cosine similarity, 500-char windows) that, while informative, could be slightly more concise without losing essential behavior guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex tool with 3 parameters and no output schema, the description covers behavior, limits, integration with sibling tools, and output features. It lacks explicit mention of error handling or edge cases, but overall is sufficiently complete for agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, but the description adds value with example queries for the 'query' parameter and clarifies that 'text' should be 'already pulled' content. The examples and contextualization go beyond basic schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it performs semantic search inside a fetched record. It uses specific verbs ('Search Within a Source') and distinguishes the tool from siblings by highlighting its use for large records that cannot fit in the prompt and pairing with ask_pipeworx_grounded.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use when the record is too big to cram into the prompt' and explains the workflow with ask_pipeworx_grounded. It provides clear context for when to use the tool, though it does not explicitly mention when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

stock_newsStock NewsD

Read-onlyIdempotent

Inspect

News (paid).

ParametersJSON Schema

Name	Required	Description	Default
`page`	No
`limit`	No
`symbols`	No

Tool Definition Quality

D1.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false. The description adds 'paid', indicating cost involvement, which is useful but lacks details on pricing model or access constraints. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise (two words), but this under-specification sacrifices clarity. The description should provide enough context to be useful, not just be short.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 optional parameters, no output schema, and many sibling tools, the description is severely incomplete. It lacks details on return format, pagination, symbol format, and any other behavioral traits.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% (no descriptions for page, limit, symbols), and the description does not explain any parameter. Agents have no clue about parameter meaning or formatting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'News (paid)' is vague. It indicates stock news but does not specify the scope (e.g., general news for symbols, headlines, or articles) or differentiate from sibling tools like earnings_calendar or economic_calendar.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description provides no context on appropriate scenarios or exclusions, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

stock_screenerStock ScreenerD

Read-onlyIdempotent

Inspect

Stock screener (paid).

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

D1.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, destructiveHint. The description adds only 'paid', which is minimal extra context. No contradictions, but little value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely brief, but the brevity comes from under-specification rather than conciseness. Critical information is missing, making the description unhelpful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, empty input schema, and 0 parameters, the description must explain the tool's behavior and usage. It fails entirely, providing no actionable guidance for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has no properties (0 parameters, schema coverage 100% only due to emptiness) but allows additionalProperties. The description does not clarify what parameters are accepted or how to structure them, leaving the agent guessing.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Stock screener (paid)' hints at the domain but lacks a specific verb and resource. It does not detail what the tool does (e.g., filter stocks by criteria), and with many sibling financial tools, it fails to differentiate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidelines provided. The description offers no context on when to use the stock screener versus alternatives like earnings_calendar or insider_trading, nor any exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

subscribeSubscribe to AlertsA

Idempotent

Inspect

Create a proactive monitoring subscription to a live-data event stream. Returns the new subscription id. Requires a Pipeworx OAuth account (anonymous + BYO cannot persist subscriptions). Supported types: "sec_8k" (8-K filings matching ticker + item codes — e.g. items:["5.02"] = officer change), "polymarket_edge" (Polymarket↔Kalshi cross-venue mispricings — params:{topic:"fed"}), "fred_series" (new FRED observations — params:{series_id:"UNRATE"}). Delivery channels: feed (always on — pull via recent_alerts or GET registry.pipeworx.io/alerts.json), and optionally email (set delivery:{email:"you@x.com"}) or sms (delivery:{sms:"+15551234567"} — phone must be verified at /account first; 10/day cap).

ParametersJSON Schema

Name	Required	Description
`type`	Yes	Subscription type.
`params`	Yes	Type-specific filter. sec_8k: {ticker:"AAPL", items?:["5.02","1.01"]}. polymarket_edge: {topic:"fed", min_spread_bps?:500}. fred_series: {series_id:"UNRATE"}. patent_grant: {applicant:"Apple Inc."}. clinical_trial: {sponsor?:"Pfizer", condition?:"lung cancer", phase?:"PHASE3"} (sponsor or condition required).
`delivery`	No	Optional delivery channels in addition to the always-on persistent feed. {email:"you@x.com"} sends a templated alert per fired event. {sms:"+15551234567"} sends an SMS per event — must match the verified phone on the caller's account (verify at https://pipeworx.io/account first; 10/day cap). {webhook:"https://..."} POSTs each event JSON to your endpoint, HMAC-signed — the response includes delivery.webhook_secret (whsec_…) ONCE; verify X-Pipeworx-Signature = sha256 HMAC of "<X-Pipeworx-Timestamp>.<raw body>". Auto-disabled after 10 consecutive failing runs.

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds value beyond annotations: details auth requirements, SMS cap (10/day), and webhook auto-disable after failures. Annotations already indicate write operation (readOnlyHint=false), so description complements well without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with core action and requirements. Well-structured with separate sections for types and delivery. Slightly long but every sentence adds value; minor redundancy in delivery descriptions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (multiple types, delivery channels, auth requirements), description is comprehensive. Covers all supported types with detail, delivery options with limitations, and return value (subscription id). No output schema needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description significantly enriches meaning with concrete examples for each type (e.g., sec_8k items codes, polymarket_edge topic) and delivery options with constraints (SMS must match verified phone, webhook HMAC details). Goes well beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool creates a proactive monitoring subscription to live-data event streams. Specifies action (Create), resource (subscription), and domain (alerts). Distinguishes itself from siblings like list_subscriptions and unsubscribe by focusing on creation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context for usage: requires Pipeworx OAuth account, warns that anonymous/BYO cannot persist subscriptions. Explains supported types and delivery channels. No explicit when-not-to-use statements, but context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

suggest_questionsWhat Can I Ask Pipeworx?A

Read-onlyIdempotent

Inspect

What can I ask Pipeworx? / what is Pipeworx good for? / what can you do? / give me ideas / show me examples / getting started / what data do you have? — the onboarding entry point for an agent that just connected and wants to know what is worth asking. Returns category-bucketed example questions (company financials, drugs & clinical trials, economics, real estate, prediction markets, weather, government & patents, science & academia, news) — each with the exact tool + argument shape that answers it, drawn from the live catalog of thousands of tools. Call with no arguments for the full spread, or pass topic (e.g. "finance", "pharma", "betting") to focus. Use this FIRST when you do not yet know what Pipeworx can do for you, or to learn how to call the meta-tools (ask_pipeworx, entity_profile, compare_entities, etc.).

ParametersJSON Schema

Name	Required	Description	Default
`topic`	No	Optional focus area: finance \| pharma \| economics \| real-estate \| betting \| weather \| government \| science \| news. Omit for a cross-category spread.

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds that it returns example questions with tool+argument shapes drawn from a live catalog, providing useful context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Packed with information yet highly efficient: front-loaded with common phrasings, then purpose, then details. Every sentence serves a purpose with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with no output schema, the description covers usage, parameter guidance, and behavioral traits comprehensively, leaving no obvious gaps for an agent to misuse.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one optional parameter. The description adds meaning by listing possible values for 'topic' and explaining behavior when omitted (full spread vs. focused), which goes beyond the schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is the onboarding entry point, returns category-bucketed example questions with exact tool+argument shapes, and distinguishes from siblings like ask_pipeworx and discover_tools by being the first tool to use when learning Pipeworx's capabilities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises 'Use this FIRST when you do not yet know what Pipeworx can do for you, or to learn how to call the meta-tools'. While it doesn't list exclusions, the context makes the appropriate use clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

unsubscribeUnsubscribe from AlertsA

Idempotent

Inspect

Cancel a subscription by id. Ownership is enforced — you can only cancel your own subscriptions. The row is deactivated (not deleted) so its historical events stay available via recent_alerts.

ParametersJSON Schema

Name	Required	Description	Default
`id`	Yes	Subscription id (uuid) returned by subscribe.

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses behavioral trait beyond annotations: the row is deactivated (not deleted) and historical events remain available. Aligns with destructiveHint=false and adds valuable context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with action, every sentence adds value. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with annotations and no output schema, the description covers purpose, ownership rule, and behavioral effect completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a description for the id parameter. The description mentions 'returned by subscribe' but adds limited extra meaning. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the action 'Cancel a subscription by id' and specifies the resource. Distinguishes from siblings like subscribe and list_subscriptions by mentioning ownership enforcement and deactivation vs deletion.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states ownership enforcement ('you can only cancel your own subscriptions'), providing a clear usage rule. Does not explicitly mention alternative tools, but the context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_claimValidate ClaimA

Read-onlyIdempotent

Inspect

"Is it true that…" / "fact check" / "verify the claim that…" / "did X really…" / "was Y actually…" / "confirm or refute" / "true or false" — natural-language claim verification against authoritative sources. Use whenever the agent needs to check whether something a user said is factually correct. Company-financial claims (revenue, net income, cash for public US companies) verify via the structured SEC EDGAR + XBRL fast path with exact percent-delta math; ANY OTHER factual claim (macro statistics, rates, prices, drug data, records) automatically falls through to the grounded pipeline — routed to the right live source, answered with verbatim evidence, then judged. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), the grounded or structured actual value with pipeworx:// citation, and reasoning. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → comparison).

ParametersJSON Schema

Name	Required	Description	Default
`claim`	Yes	Natural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year".
`tolerance_pct`	No	Max percent deviation still graded approximately_correct (0.5–50). Overrides the tolerance implied by the claim wording — set 1–2 for hallucination detection where any material error must be refuted. Default: implied by wording, capped at 5.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds context about the two processing pathways and return format (verdict, value, reasoning), which is useful but not critical given the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with usage examples and a clear summary of functionality. While it is somewhat lengthy, every sentence adds value, and the structure is logical.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description fully explains the return values (verdict types, citation, reasoning). It covers the tool's behavior for both claim types and is complete given the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds practical guidance for tolerance_pct (e.g., use 1-2 for hallucination detection) and provides claim examples. This elevates the parameter understanding beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs natural-language claim verification against authoritative sources, with specific verb+resource. It distinguishes between company-financial claims (structured SEC EDGAR+ XBRL) and other factual claims (grounded pipeline), setting it apart from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises to use whenever verifying factual correctness and differentiates between financial and non-financial claims. However, it does not provide explicit when-not-to-use cases or alternative tool references beyond the implicit distinction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Server Details

Available Tools

Discussions

Your Connectors

Resources