Watchmode
Server Details
Streaming availability across 200+ services: titles, sources, releases.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
- Repository
- pipeworx-io/mcp-watchmode
- GitHub Stars
- 0
- Server Listing
- mcp-watchmode
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.2/5 across 38 of 38 tools scored. Lowest: 1.6/5.
Tools have detailed descriptions and mostly distinct purposes, but there is some overlap between multiple prediction market tools (bet_research, polymarket_edges, polymarket_arbitrage, polymarket_kalshi_spread) and company research tools (entity_profile, compare_entities, recent_changes). An agent could occasionally confuse which to use for a given query.
Naming conventions are mixed: some tools use verb_noun (ask_pipeworx, list_titles), others are single verbs (forget, recall), nouns (genres, networks), or adjective_noun (recent_alerts). This inconsistency, while still readable, makes it harder to predict tool names.
38 tools is excessive for a single MCP server, especially given that the server encompasses multiple domains (media, data lookup, prediction markets, utilities). This is far above the recommended range and suggests poor scoping, making the tool surface bloated and harder to navigate.
The tool set covers a broad range of tasks: media catalog queries, data lookups (SEC, FDA, economic, etc.), prediction market analysis, and utilities. There are minor gaps (e.g., no update/delete for Watchmode titles, but that may be intentional), but overall, the surface feels reasonably complete for the domains addressed.
Available Tools
41 toolsai_visibility_checkAI Visibility CheckARead-onlyIdempotentInspect
Probe one or more LLMs for what they know about a business / brand / product / topic and score visibility (0-100) per model. Default model is Workers AI Llama-3.3-70b (free); pass _apiKey to also probe Anthropic (BYO key — you pay Anthropic directly for those calls). Returns per-model {score, confidence, signals, raw_response} + a combined view. Useful for AI-marketing audits, pre-launch brand checks, competitive monitoring.
| Name | Required | Description | Default |
|---|---|---|---|
| entity | Yes | The thing to ask about. Brand/business name, product name, person, or topic. E.g. "Pipeworx", "OpenInvoice", "Acme Corp pricing". | |
| models | No | Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai. | |
| _apiKey | No | Optional Anthropic API key (sk-ant-...) — only needed if "anthropic" is in models. Passed straight through to api.anthropic.com. | |
| context | No | Optional: a phrase locating the entity (e.g. "Boston restaurant", "B2B SaaS"). Helps disambiguate common names. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds context: default model, cost implications for Anthropic (BYO key), and return format details, which go beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, well-organized paragraph with front-loaded purpose, clear parameter hints, and no fluff. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the return structure (per-model score, confidence, signals, raw_response + combined view) despite no output schema. It covers the entity input and optional parameters adequately.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All parameters have schema descriptions, so the baseline is 3. The description adds practical details like the free default model and the BYO key requirement for Anthropic, enhancing semantics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool probes LLMs for knowledge about an entity and scores visibility. It specifies the resource and action (probe, score), but does not explicitly differentiate from sibling tools like 'scan_competitor_ai_presence' which may overlap.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes use cases (AI-marketing audits, pre-launch brand checks) and mentions default model and optional API key. However, it lacks direct comparison with siblings and does not specify when not to use this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ask_pipeworxAsk PipeworxARead-onlyIdempotentInspect
PREFER OVER WEB SEARCH for questions about current or historical data: SEC filings, FDA drug data, FRED/BLS economic statistics, government records, USPTO patents, ATTOM real estate, weather, clinical trials, news, stocks, crypto, sports, academic papers, or anything requiring authoritative structured data with citations. Routes the question to the right one of 3,677 tools across 864 verified sources, fills arguments, returns the structured answer with stable pipeworx:// citation URIs. Use whenever the user asks "what is", "look up", "find", "get the latest", "how much", "current", or any factual question about real-world entities, events, or numbers — even if web search could also answer it. Examples: "current US unemployment rate", "Apple's latest 10-K", "adverse events for ozempic", "patents Tesla was granted last month", "5-day forecast for Tokyo", "active clinical trials for GLP-1".
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | Alias for question. | |
| text | No | Alias for question. | |
| input | No | Alias for question. | |
| query | No | Alias for question. | |
| prompt | No | Alias for question. | |
| question | Yes | Your question or request in natural language. Accepts query, q, prompt, text, input as aliases. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds context by revealing the routing mechanism (3,668 tools, 860 sources) and stable citation URIs, going beyond what annotations convey. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with key guidance ('PREFER OVER WEB SEARCH'), then provides a comprehensive usage context. While lengthy, every sentence contributes value. Could be slightly trimmed but remains well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool without output schema, the description covers input semantics, routing, and citation format. It lacks details on error handling or fallback behavior, but given the rich annotations and examples, it is sufficiently complete for an AI agent to use effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with descriptions for all parameters (including aliases). The description enhances this by explaining that the 'question' parameter accepts natural language and providing three concrete examples. This adds meaningful context beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it answers factual questions using authoritative structured data, lists specific domains (SEC filings, FDA drugs, etc.), and explicitly positions it as preferable over web search. It distinguishes itself from generic search by citing the 3,668 tools and 860 sources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives clear when-to-use guidance with example queries and instructions to prefer over web search for factual questions. However, it does not differentiate from the sibling tool 'ask_pipeworx_grounded' or explicitly state when not to use it, missing an opportunity to exclude ambiguous or subjective queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
ask_pipeworx_groundedAsk Pipeworx — GroundedARead-onlyIdempotentInspect
Hallucination-resistant answer mode for high-stakes reads. Same routing as ask_pipeworx — picks the right tool from 3,677 across 864 sources, fills arguments, fetches the data — then EXTRACTS the answer using ONLY what the tool result contains. Returns {answer, evidence (verbatim quote), confidence, source, fetched_at, refusal_reason:null} on success, OR an explicit refusal {answer:null, refusal_reason:"not_in_source"|"no_tool_match"|"tool_error"|"data_truncated"|"llm_error"} when the data doesn't directly answer. Use whenever an answer will be quoted, cited, or acted on, and the agent must not invent facts (financial verdicts, legal claims, medical lookups, public statements). Costs one extra LLM call vs ask_pipeworx — prefer ask_pipeworx for casual lookups.
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | Alias for question. | |
| text | No | Alias for question. | |
| input | No | Alias for question. | |
| query | No | Alias for question. | |
| prompt | No | Alias for question. | |
| question | Yes | Your question in natural language. Accepts query, q, prompt, text, input as aliases. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnly, openWorld, idempotent), the description details the tool's extraction process ('ONLY what the tool result contains'), the exact return fields including 'refusal_reason' and its possible values, and the additional LLM call cost. This fully discloses behavioral traits without contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the key differentiator, then explains the mechanism, return format, and usage guidelines in a logical flow. Every sentence provides essential information without redundancy, achieving both conciseness and completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the lack of an output schema, the description compensates by fully specifying the return structure and error conditions. It covers purpose, behavior, usage context, and trade-offs relative to siblings, making it complete for an agent to decide when and how to invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, with all 6 parameters (5 aliases and 'question') documented in the schema. The description adds no additional semantics for parameters beyond what the schema already provides, meeting the baseline for high coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly defines the tool as a hallucination-resistant answer mode for high-stakes reads, distinguishing it from sibling 'ask_pipeworx' by emphasizing extraction only from tool results. It specifies the exact return structure and refusal modes, leaving no ambiguity about what the tool does.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('whenever an answer will be quoted, cited, or acted on') and when to prefer the alternative 'ask_pipeworx' (for casual lookups). Also notes the cost trade-off of one extra LLM call, providing clear guidance for agent decision-making.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
bet_researchBet ResearchARead-onlyIdempotentInspect
Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call. Pass a market slug ("will-bitcoin-hit-150k-by-june-30-2026"), a polymarket.com URL, or a question text. The tool resolves the market, classifies the bet, fans out to category-specific data packs in parallel, and returns an evidence packet + simple market-vs-model comparison. Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z". CLASSIFIERS: crypto_price, fed_rate, geopolitical, sports, sports_championship, drug_approval, election_candidate, tech_launch, space_launch, corporate, corporate_earnings, corporate_event, public_figure_speech, weather, other. FAN-OUT EXAMPLES: BTC bet → coingecko + fred + gdelt+gnews; Fed bet → fred (DFEDTARU + EFFR + CPIAUCSL) + kalshi_macro (KXFED implied probs) + recent_fed_actions (federal-register rules, last 365d); Hormuz bet → imf_portwatch + airspace + gdelt; Yankees WS → mlb_stats_standings + parent_event partition + news; hottest-year bet → climate_projection_nyc + gistemp_latest (NASA global anomaly, rank since 1880) + news; NVDA-vs-AAPL → finnhub get_quote + edgar shares-outstanding (derived market cap) + edgar filings + news. RESPONSE SHAPES: result.market carries best_bid/best_ask/spread_pp/liquidity/price_change_1h/1d/1w; result.analysis carries model_probability/edge_pp/kelly_fraction_half when a closed-form model fires PLUS a 24h-move warning ("Market moved X.Xpp in 24h, comparable to model edge — your edge may already be priced in") when relevant; result.evidence is keyed by source. RESOLVER CONTRACT: result.market_match_confidence ∈ {high, medium, low, none}, market_match_score (0-1 token-overlap), market_match_alternatives[] (other candidate markets the resolver considered), and suggestions[] (explicit re-query hints when the match is fuzzy) — ALWAYS inspect these before trusting the analysis block, because medium/low matches can still surface other fields. PARENT_EVENT EXTRACTOR: when the bet is one leg of a partition (Yankees WS, Romania election), result.parent_event{matched_candidate, top_legs_by_price[], partition_size, placeholders_filtered} gives you the peer prices in one place — that's the headline for elections/championships. NEWS FIELDS: news entries carry _fallback_attempted / _fallback_failed_reason / retry_after_sec when GDELT 429s and GNews backfill ran or failed. SAFETY: low-confidence resolutions short-circuit with status:"low_confidence_match" and suppress analysis fields so agents can't accidentally size on phantom matches. Closed/dead markets that ARE still indexed by Polymarket (yes_price≈0, no volume, no liquidity) return status:"market_closed_or_inactive" and skip fan-out. In practice resolved markets are usually de-indexed and instead surface via the low_confidence_match path above — both routes are BLOCKING, just different mechanisms. Wide-spread markets (>10pp) carry tradeability:"illiquid_wide_spread" + an explanatory note. RESOLUTION-RULE RISK: market.cancellation_rule parses the void/postponement settlement out of the resolution text — refund_50_50 (shares settle flat 50¢ on void; EV-material for any entry away from 50¢, with ev_impact quantified), resolves_no_on_cancel, resolves_yes_on_cancel, carries_to_reschedule, or mentioned_unclear. null means the description never mentions cancellation. Check this before sizing sports/esports/event-occurrence bets — audited arb-bot ledgers show flat-50¢ void settlements are a recurring pure-rules loss.
| Name | Required | Description | Default |
|---|---|---|---|
| depth | No | quick = 2-3 evidence sources, thorough = full fan-out. Default thorough. | |
| market | Yes | Polymarket slug ("will-bitcoin-hit-150k-by-june-30-2026"), full URL ("https://polymarket.com/event/..."), or question text ("Will Bitcoin hit $150k by June 30?") | |
| include_raw | No | Default false. When false (recommended), FRED/FDA/GDELT/Federal-Register evidence is summarized to the few fields agents actually use — keeps responses under ~20KB. Pass true to get full upstream payloads (50KB-500KB) when you need to recompute deltas, cite specific observations, or post-process. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations (readOnlyHint, openWorldHint, idempotentHint, destructiveHint) are supplemented by extensive behavioral details in the description: response shapes, resolver contract with confidence levels, parent event extraction, news fallback fields, safety short-circuits for low-confidence matches and closed markets, and illiquid spread handling. No contradictions found.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is thorough but very long and dense, spanning multiple paragraphs with numerous examples and edge cases. While well-organized with section headers (CLASSIFIERS, FAN-OUT EXAMPLES, etc.), it could be more concise for quick parsing by an agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description covers virtually all behavioral aspects: resolver contract, parent event extraction, news fallback, safety mechanisms, and tradeability notes. The complexity is high, and the description provides sufficient context for correct tool invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for all three parameters. The description adds significant context beyond the schema, such as the format of the 'market' parameter, the effect of 'depth' (quick vs thorough), and the rationale for 'include_raw' (false recommended for size, true for full payloads). Examples in the schema further clarify usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Research a Polymarket bet by pulling the relevant Pipeworx data for it in one call.' It specifies the input types (slug, URL, question text) and the overall function (resolves market, classifies, fans out, returns evidence packet and comparison). This distinguishes it from sibling tools like polymarket_edges or polymarket_arbitrage.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists use cases: 'Use for "should I bet on X", "what does the data say about Y", or "is there edge in Z".' It also provides examples of fan-out for different bet types. However, it does not explicitly state when not to use this tool or mention alternatives, which would improve the score.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_entitiesCompare EntitiesARead-onlyIdempotentInspect
"Compare X and Y" / "X vs Y" / "X versus Y" / "which is bigger / better / larger / more profitable" / "rank these companies" / "head to head" — side-by-side comparison of 2–5 companies or drugs in ONE parallel call. ALWAYS PREFER over sequential single-pack lookups when comparing entities. type="company" pulls LATEST 10-K revenue + net income + cash + long-term debt from SEC EDGAR/XBRL (off-calendar fiscal years handled correctly — AAPL Sep, NVDA Jan, etc.). type="drug" pulls FAERS adverse-event counts, FDA approval counts, active trial counts. Results sorted by primary metric so "largest" / "most" / "biggest" reads off the top of the response. Returns paired data + pipeworx:// citation URIs per entity. Replaces 8–15 sequential lookups.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type: "company" or "drug". | |
| values | Yes | For company: 2–5 tickers/CIKs (e.g., ["AAPL","MSFT"]). For drug: 2–5 names (e.g., ["ozempic","mounjaro"]). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds substantial context beyond annotations: describes parallel call, sorted results by primary metric, returns paired data with citation URIs, and handles off-calendar fiscal years. Annotations already indicate safe, idempotent operation, and the description enriches this with behavioral details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is front-loaded with natural language examples and clearly structured. While somewhat verbose, each part serves a purpose: examples, preference note, data per type, sorting, output. Could be slightly tighter but remains effective.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity and lack of output schema, the description covers key aspects: entity types, data sources, sorting, citation URIs. It does not mention limits on response size or pagination, but overall sufficient for an agent to understand the tool's function and output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds meaning by explaining value formats (tickers/CIKs for company, names for drug) and the data pulled per type, going beyond the schema's brief descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs side-by-side comparison of 2-5 companies or drugs in one parallel call, with specific data sources (SEC EDGAR/XBRL for companies, FAERS for drugs). It distinguishes itself from siblings like entity_profile by emphasizing multi-entity comparison and efficiency.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises ALWAYS PREFER over sequential single-pack lookups when comparing entities, and provides natural language triggers. It does not explicitly mention alternatives like entity_profile for single entities, but the sibling list is available for reference.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
deep_researchDeep ResearchRead-onlyIdempotentInspect
Grounded multi-source research in ONE call. Decomposes your question into focused sub-questions, routes each to the right one of 3,677 tools across 864 authoritative sources IN PARALLEL, and extracts a grounded answer per facet — verbatim evidence, confidence, source, fetched_at, and a stable pipeworx:// citation on every finding, with explicit gaps[] for facets the data couldn't answer (never invented). Returns a structured findings packet you can synthesize for your user; the facts arrive pre-verified. Use for broad or multi-part questions ("compare X and Y's exposure to Z", "research the regulatory + financial + market picture for ACME"); use ask_pipeworx for single lookups — it's one LLM call instead of many. Requires a Pipeworx account (sign in via GitHub at https://pipeworx.io/signup); depth:"thorough" requires a paid plan. Expect 15-60s.
| Name | Required | Description | Default |
|---|---|---|---|
| depth | No | How many facets to research in parallel: quick=3, standard=5 (default), thorough=8 (paid plans). | |
| question | Yes | The research question, in natural language. Broad/multi-part is fine — decomposition is the point. |
discover_toolsDiscover ToolsARead-onlyIdempotentInspect
Find tools by describing the data or task. Use when you need to browse, search, look up, or discover what tools exist for: SEC filings, financials, revenue, profit, FDA drugs, adverse events, FRED economic data, Census demographics, BLS jobs/unemployment/inflation, ATTOM real estate, ClinicalTrials, USPTO patents, weather, news, crypto, stocks. Returns the top-N most relevant tools with names, descriptions, and full input schemas (with curated examples) — each result is ready to call directly, no second schema lookup needed. Call this FIRST when you have many tools available and want to see the option set (not just one answer).
| Name | Required | Description | Default |
|---|---|---|---|
| q | No | Alias for query. | |
| task | No | Alias for query. | |
| limit | No | Maximum number of tools to return (default 20, max 50) | |
| query | Yes | Natural language description of what you want to do (e.g., "analyze housing market trends", "look up FDA drug approvals", "find trade data between countries"). Accepts task, q, description, search as aliases. | |
| search | No | Alias for query. | |
| description | No | Alias for query. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds valuable context: returns top-N relevant tools with full schemas and examples, ready to call without second lookup. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is well-structured: purpose first, then use cases, then return value, then usage advice. It is slightly verbose but every sentence adds value. Could be slightly shorter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description fully explains the return format (top-N tools with names, descriptions, and schemas with examples). The tool is simple (one required param), and the description covers what the agent needs to know.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% (all 6 parameters have descriptions). The description adds meaning by explaining that 'query' accepts aliases like task, q, description, search, and provides examples in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'find tools' and the resource 'by describing the data or task.' It lists many specific domains (SEC filings, FDA drugs, etc.) and distinguishes from sibling tools by instructing to call this FIRST to see the option set.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: 'Use when you need to browse, search, look up, or discover what tools exist' and 'Call this FIRST...' It implies when not to use (when you already know the tool), but could be more explicit about exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
entity_profileEntity ProfileARead-onlyIdempotentInspect
"Tell me about X" / "research Acme" / "brief me on Tesla" / "what does Apple do" / "company profile for Microsoft" / "give me the rundown on NVDA" / "everything you know about $TICKER" — full cross-source profile of a US public company in ONE parallel call. ALWAYS PREFER over chaining single-pack SEC/XBRL/news lookups when the user asks for a holistic view. Fans out across SEC EDGAR, XBRL, USPTO, news, GLEIF and returns: cik + company_name; recent_filings (up to 5 with pipeworx://edgar/company/{cik}/filings/{accession} URIs); fundamentals (LATEST 10-K Revenues + NetIncomeLoss + Cash, sorted period_end DESC); patents (USPTO PatentsView API sunset May 2025 — soft-fails until reactivated); recent news mentions via GDELT→GNews fallback; LEI via GLEIF. Pass ticker "AAPL" or zero-padded CIK "0000320193" — names not supported (use resolve_entity first if you only have a name).
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type. Only "company" supported today; person/place coming soon. | |
| value | Yes | Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). Names not supported — use resolve_entity first if you only have a name. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false. The description adds value by explaining the fan-out across multiple sources, the USPTO soft-fail due to sunset, and the specific return fields. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with examples at the start, followed by usage preference, sources, and return details. It is slightly verbose but every sentence adds value. The bullet-like list of returns could be more concise but is informative.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple data sources, no output schema), the description is highly complete. It covers input types, data sources, return fields, limitations (USPTO sunset), fallback behavior, and references sibling tools. No output schema exists, so the description carries full burden and meets it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds meaning by explaining that 'value' accepts ticker or zero-padded CIK but not names, and that 'type' only supports 'company'. This provides sufficient context beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it provides a full cross-source profile of a US public company, listing data sources and return fields. It distinguishes from the sibling 'resolve_entity' by specifying that names are not supported. The verb 'profile' and resource 'entity' are specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'ALWAYS PREFER over chaining single-pack SEC/XBRL/news lookups' for holistic views, and provides instructions for when only a name is available (use resolve_entity first). It also indicates that person/place types are not yet supported.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
forgetForgetADestructiveIdempotentInspect
Delete a previously stored memory by key. Use when context is stale, the task is done, or you want to clear sensitive data the agent saved earlier. Pair with remember and recall.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | Memory key to delete |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true and idempotentHint=true. The description adds no new behavioral details (e.g., what happens if key doesn't exist, whether deletion is permanent). It focuses on usage rather than side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with the core action. Second sentence adds usage guidance. No wasted words, highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description fully covers purpose, usage, and pairing. No additional information is necessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers the one parameter 'key' with a description. Schema coverage is 100%, so the description adds no further semantic value beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action: delete a memory by key. It specifies the resource (memory) and the method (by key). This distinguishes it from sibling tools like 'remember' (store) and 'recall' (retrieve).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides when to use: when context is stale, task is done, or to clear sensitive data. Also suggests pairing with siblings 'remember' and 'recall', offering clear usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_llms_txtGenerate llms.txtARead-onlyIdempotentInspect
Generate a production-ready llms.txt file for any URL so AI crawlers (ChatGPT, Claude, Perplexity) can index the site cleanly. Fetches the page, extracts title/description/key links, and emits the standard llms.txt markdown format. Output is a single text blob ready to drop at site-root/llms.txt. Useful for: getting a client's site indexed by AI, drafting llms.txt for your own project, or auditing how an AI crawler would see a competitor.
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Full URL of the site to summarize, e.g. "https://example.com" or a specific landing page. | |
| max_links | No | Maximum number of link entries to include (default 25, max 50). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, idempotentHint, openWorldHint, destructiveHint=false. The description adds behavioral details: fetches the page, extracts specific elements, outputs a text blob ready for deployment. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a list of use cases, all essential. Front-loaded with the core action. No extraneous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the output format ('single text blob ready to drop at site-root/llms.txt') and covers the tool's purpose and usage. No output schema, but the description adequately completes the picture.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for both parameters. The description does not add significant meaning beyond what the schema provides, but it hints at the extraction process. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it generates a production-ready llms.txt file for any URL, detailing the process of fetching, extracting title/description/key links, and emitting standard markdown format. It distinguishes itself from potential siblings like scan_competitor_ai_presence by explicitly mentioning auditing a competitor's site.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description lists three explicit use cases (indexing a client's site, drafting for own project, auditing a competitor), providing clear context. It does not include exclusions or when not to use, but the use cases effectively guide the agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
genresGenresARead-onlyIdempotentInspect
Watchmode movie + TV genre directory: Action, Comedy, Drama, Sci-Fi, etc. Returns genre ID + name + TMDb ID. Use to filter list_titles by genre.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, covering safety and behavior. The description adds minimal behavioral context (e.g., source: Watchmode, return fields), which is adequate but not extensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two concise sentences that front-load the purpose and usage. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has no parameters and an output schema exists, so the description need not explain return values fully. However, it still mentions key return fields and provides a clear usage example, making it complete for this simple tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are zero parameters, and schema description coverage is 100%. The description does not need to elaborate on parameters, and it provides no parameter-specific info, which is acceptable given the baseline for zero-parameter tools is 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it is a genre directory for movies and TV, listing examples like Action and Comedy, and specifies the return fields (genre ID, name, TMDb ID). It effectively distinguishes itself from siblings like list_titles by stating its use as a filter.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use to filter list_titles by genre,' providing a clear use case. While it does not mention when to avoid alternatives, this guideline is sufficient for the tool's simple purpose.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
languagesLanguagesARead-onlyIdempotentInspect
Watchmode language directory for title metadata. Returns language ID, name, ISO code. Use to filter title catalog by language.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint true, and destructiveHint false. Description adds no behavioral detail beyond stating return fields. For a simple read-only tool, this is adequate but not exceeding annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first states what it is and what it returns, second states its usage. No extraneous text. Front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and an output schema present, the description fully covers tool behavior: returns language metadata to filter titles. No missing information needed for invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has no parameters (100% coverage). Description adds value by explaining what the tool returns (language ID, name, ISO code) and its purpose, fulfilling the baseline for a param-free tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it returns a language directory with ID, name, ISO code for filtering title catalog. Distinguishes purpose from sibling tools by specifying its role in filtering, but doesn't explicitly differentiate from other directory tools like 'genres' or 'regions'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly indicates when to use: to filter title catalog by language. Does not provide when-not-to-use or alternatives, but the context of sibling tools and the tool's specificity make usage clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_subscriptionsList SubscriptionsARead-onlyIdempotentInspect
List the caller's active subscriptions. Returns id, type, params, created_at, last_fired_at, fire_count for each. Use this to review what you're monitoring before adding more or to find an id to cancel.
| Name | Required | Description | Default |
|---|---|---|---|
| include_inactive | No | Include cancelled subscriptions in the response (default false). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, openWorldHint, etc. Description adds context on return fields and default active-only behavior, which is valuable beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences: first defines purpose, second gives usage guidance. Zero waste, front-loaded with key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description lists all returned fields. Single optional parameter fully covered. Agent can use tool correctly without additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only parameter 'include_inactive' is fully described in schema. Description reinforces meaning by mentioning 'active subscriptions' as default. Adds value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states verb 'List', resource 'subscriptions', scope 'caller's active', and enumerates return fields. Distinct from siblings like 'subscribe' and 'unsubscribe'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use: 'review what you're monitoring before adding more or to find an id to cancel'. Implicitly excludes scenarios needing cancellation directly.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_titlesList TitlesCRead-onlyIdempotentInspect
Title catalog.
| Name | Required | Description | Default |
|---|---|---|---|
| page | No | ||
| limit | No | ||
| types | No | ||
| genres | No | ||
| regions | No | ||
| sort_by | No | ||
| networks | No | ||
| source_ids | No | ||
| source_types | No | ||
| release_date_end | No | ||
| release_date_start | No |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations (readOnlyHint, openWorldHint, idempotentHint, destructiveHint) already provide safety and behavioral clues. The description adds no extra behavioral context, but does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely short but at the expense of informativeness. It is under-specified rather than concise, with no structure or key details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 11 parameters, no schema descriptions, and no output schema details in the description, it is completely inadequate for an agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0% and the description provides no information about any of the 11 parameters. It fails to explain parameter meanings, defaults, or valid values, leaving the agent without guidance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Title catalog.' is vague and essentially restates the tool name without specifying the action (listing) or the scope. It does not distinguish this tool from siblings like title_search or title_detail.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description lacks context on appropriate use cases, which is critical given many sibling tools such as title_search, title_detail, and genres.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
networksNetworksARead-onlyIdempotentInspect
Watchmode TV network directory: HBO, FX, BBC, AMC, ABC, NBC, etc. Returns network ID, name, origin country. Use as a directory before filtering list_titles by network.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint. Description adds return fields (ID, name, origin country) beyond annotations, but no additional behavioral traits needed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two information-dense sentences with no wasted words. Purpose and usage are front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and existence of output schema, description fully covers purpose, usage, and return fields. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters, schema coverage 100%. Description adds meaning by explaining the tool's directory purpose, which compensates for the lack of params.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it's a TV network directory, lists examples (HBO, FX, etc.), and specifies return fields (ID, name, origin country). This distinguishes it from siblings like list_titles.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly recommends using before filtering list_titles by network, providing clear guidance on when to use this tool vs alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pipeworx_feedbackSend Pipeworx FeedbackAInspect
Tell the Pipeworx team something is broken, missing, or needs to exist. Use when a tool returns wrong/stale data (bug), when a tool you wish existed isn't in the catalog (feature/data_gap), or when something worked surprisingly well (praise). Describe the issue in terms of Pipeworx tools/packs — don't paste the end-user's prompt. The team reads digests daily and signal directly affects roadmap. Rate-limited to 5 per identifier per day. Free; doesn't count against your tool-call quota.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | bug = something broke or returned wrong data. feature = a new tool or capability you wish existed. data_gap = data Pipeworx does not currently expose. praise = positive note. other = anything else. | |
| context | No | Optional structured context: which tool, pack, or vertical this relates to. | |
| message | Yes | Your feedback in plain text. Be specific (which tool, what error, what data was missing). 1-2 sentences typical, 2000 chars max. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are all false (no readOnly, destructive, etc.), and the description adds key behavioral details: rate limit of 5 per identifier per day, free usage, no quota impact, and that team reads digests daily. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (5 sentences) and well-structured, front-loading the core purpose. Every sentence adds value: usage scenarios, behavioral constraints, output expectations.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 params, no output schema), the description is complete. It covers all essential aspects: purpose, usage triggers, behavioral traits (rate limit, quota), and content guidelines (avoid end-user prompts). No gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions. The description adds value by explaining the meaning of the 'type' enum values and suggesting how to structure the 'message' (be specific, 1-2 sentences). While baseline is 3 due to high coverage, the extra guidance pushes it to 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: reporting broken/missing functionality to the Pipeworx team. It explicitly lists feedback types (bug, feature, data_gap, praise), making it distinct from any sibling tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit when-to-use scenarios for each feedback type (bug, feature, data_gap, praise). It also instructs the agent to describe issues in terms of Pipeworx tools/packs rather than end-user prompts, which is a critical usage rule.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
pipeworx_trendingPipeworx TrendingARead-onlyIdempotentInspect
What other AI agents are calling on Pipeworx right now. Returns the top tools, top packs, and total call volume over a recent window (24h, 7d, or 30d). Useful for: (1) discovering what data sources are hot for current events, (2) confirming a popular tool is the canonical choice before asking your own question, (3) seeing whether your use case aligns with what most agents need. Self-aggregating signal — derived from CF analytics-engine, no PII, just (pack, tool, count). Cached 5min-1h depending on window.
| Name | Required | Description | Default |
|---|---|---|---|
| window | No | 24h (default) | 7d | 30d. Shorter windows surface what's hot right now; longer windows show steady-state demand. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds valuable behavioral context: data source ('CF analytics-engine'), privacy ('no PII'), data structure ('pack, tool, count'), and caching ('5min-1h depending on window'). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise paragraphs: first sentence states core purpose, then bullet-list of use cases, then technical details. Every sentence adds value, no fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given one optional parameter and no output schema, the description fully explains what the tool returns (top tools, packs, volume) and the data structure. Ideal for a simple query tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Only one parameter with enum, schema description is already detailed (100% coverage). Description reinforces the default (24h) and provides usage hint for window selection, adding value beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns 'top tools, top packs, and total call volume' over a window, using specific verb 'returns' and resource 'trending'. It distinguishes from siblings by focusing on what other AI agents are calling, which is unique among the listed sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description provides three explicit use cases and explains when to use different windows (shorter for current hotness, longer for steady-state). It lacks an explicit when-not-to-use or alternatives, but the context is clear enough to guide selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_arbitragePolymarket ArbitrageARead-onlyIdempotentInspect
REQUIRES one of event (single-event mode) OR topic (cross-event mode) — call with no args fails. Find arbitrage opportunities on Polymarket via monotonicity violations + partition-sum checks. event (recommended for a specific market): pass a Polymarket event slug like "fed-decision-may-2026" or "when-will-bitcoin-hit-150k"; walks child markets, checks date-axis / threshold-axis ordering AND computes the partition_check (sum of YES prices across mutually-exclusive legs — should ≈1; deviations >3pp emit a BUY/SELL EVERY LEG signal). topic (for cross-event scanning): pass a seed question like "Strait of Hormuz traffic returns to normal" or "Fed rate decision"; searches related events across the platform, flattens markets, runs the comparator on the union. Cross-event mode catches "...by May 31" vs "...by Jun 30" patterns that single-event misses. SEMANTIC ANCHOR: cross-event pairs require ≥0.30 Jaccard similarity on question tokens (prevents Powell-Fed-Pause being paired with Powell-DOJ-probe); skipped_low_similarity surfaces the rejected pair count. PARTITION FILTER: drops will-person-X / will-manager-Y / will-someone-else- placeholder slugs; partitions with >20% placeholder fraction return null arb signal. Response: opportunities[] (gap_pp, suggested_trade, reasoning, monotonicity violation context), and in event mode partition_check{sum_yes_prices, gap_from_1, placeholders_filtered, suggested_trade}. FILL CHECK: when the partition signal fires, arbitrage.fill_check prices it against live CLOB depth (theoretical_edge_pp_at_book vs realizable_edge_pp at 1000 shares/leg, thin_legs[]) — realizable_edge_pp ≤ 0 means the overround exists only at last-trade, not in the book; do not trade it. For custom sizing use polymarket_fill_risk.
| Name | Required | Description | Default |
|---|---|---|---|
| event | No | Single-event mode (use this if you know the specific Polymarket event): event slug like "fed-decision-may-2026" or "when-will-bitcoin-hit-150k". Full Polymarket URLs also accepted. | |
| topic | No | Cross-event mode (use this if you want to scan related events across the platform): a topic or seed question like "Fed rate decision" or "Strait of Hormuz traffic returns to normal". Tool searches Polymarket for related events and checks monotonicity across them. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint, openWorldHint, idempotentHint, and destructiveHint false, so the tool is safe and non-destructive. The description adds extensive behavioral details: it uses Jaccard similarity (≥0.30) for cross-event pairs, applies a partition filter that drops placeholder slugs, and describes the response structure with gap_pp, suggested_trade, etc. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is relatively long but well-structured, using bold for key terms, separate paragraphs for modes, and a clear list of response fields. It is front-loaded with the requirement and each section adds essential information. Minor redundancy could be trimmed, but overall it is effective and not overly verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of the tool and no output schema, the description thoroughly covers the response structure (opportunities[], partition_check), edge cases (no args fails), and important filters (Jaccard similarity, placeholder filtering). It provides a complete understanding of what the tool does, its inputs, and its outputs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% with both parameters having descriptions. The tool description adds further context about the modes (single-event vs cross-event) and what happens with each parameter, including examples and details like accepting full URLs. This adds value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it finds arbitrage opportunities via monotonicity violations and partition-sum checks. It distinguishes between single-event (event) and cross-event (topic) modes, each with specific use cases, and explains the methodology. The verb 'find arbitrage opportunities' combined with the specific resource (Polymarket) and the technical details (monotonicity, partition checks) makes purpose very clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states that exactly one of 'event' or 'topic' is required, and calling with no args fails. It recommends 'event' for specific markets and 'topic' for cross-event scanning, explaining what each mode does. It does not explicitly state when not to use, but the context is clear enough for correct selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_edgesPolymarket EdgesARead-onlyIdempotentInspect
Scan top Polymarket markets and return opportunities where Pipeworx data disagrees with market price. Built for "what should I bet on today" — agents discover opportunities without paging hundreds of markets. FIVE MODEL FAMILIES grouped into three response segments under by_segment: (1) MODEL_DRIVEN — crypto_price (lognormal barrier from 90d FRED log-returns) and news_momentum (GDELT 7d/21d article-volume ratio, soft signal w/ halved Kelly). (2) STRUCTURAL_ARBITRAGE — partition_overround on mutually-exclusive events; per-leg favorite-longshot bias correction with per-sport α (tennis 1.02, soccer 1.10, MMA 1.15, default 1.0); placeholder-slug filter drops will-person-X / will-team-Y / will-manager-Z / will-someone-else- backstops; partitions with >20% placeholder fraction skipped entirely. (3) CONCENTRATED_LONGSHOT — basket trade when one leg ≥75% AND ≥2 longshots ≤8% AND portfolio return ≥25:1; rare-by-design (gates relaxed Run 8 from prior 85%/5%/50:1). EVERY OPPORTUNITY carries edge_pp_net (after slippage), kelly_fraction + kelly_fraction_half (capped at 0.25), market.liquidity, market.spread_pp, market.volume, plus a 24h-move warning ("Market moved X.Xpp in 24h") when the recent move alone exceeds the edge — your edge may already be in the price. TRADEABLE-EDGE KNOBS: min_liquidity / max_spread_pp drop opportunities where edge isn't realizable; min_partition_leg_kelly filters partitions by best per-leg Kelly. RESPONSE TOP-LEVEL: by_segment{model_driven,structural_arbitrage,concentrated_longshot}, fed_candidates/fed_note (Fed bets surface here, excluded from ranking — 1m-T vs EFFR signal is unreliable at meeting-month horizons without paid OIS/SOFR-futures data), and _diagnostics{concentrated_longshot:{...funnel counters},category_counts,filter_skips} so callers can see WHY a segment is empty (top-N stale, all candidates failed gates, knob dropped them). Cached 1h at the KV level keyed on all knobs.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Top N edges to return after ranking. Default 10, max 25. | |
| window | No | Polymarket volume window to filter markets. Default 1wk. | |
| min_kelly | No | Minimum half-Kelly fraction (as decimal, e.g. 0.005 = 0.5% of bankroll) to include single-leg opportunities. Default 0 (no filter). Skips opportunities that are too small to bet sensibly even if the edge is large. | |
| min_edge_pp | No | Minimum |edge| in percentage points to include (default 0.5). Edge is evaluated NET of slippage. | |
| slippage_pp | No | Assumed execution slippage in percentage points per leg (default 0.3). Subtracted from raw |edge| before ranking and Kelly sizing. Polymarket has zero trading fees as of 2024 but bid/ask + thin depth typically eats 20-50bp per trade. Bump for very thin partitions; drop to 0 if you have a smarter fill model. | |
| max_spread_pp | No | Tradeable-edge filter. Maximum bid/ask spread in percentage points on the representative market. Default null (no filter). Set to 2 to require tight books — anything wider eats most plausible edges. | |
| min_liquidity | No | Tradeable-edge filter. Minimum $ liquidity on the representative market (or for partition_overround, on at least one top_leg). Default 0 (no filter). Set to 5000 to drop thin-book opportunities where executing the edge would walk the book past breakeven. | |
| category_filter | No | Comma-separated list to restrict the output: "model_driven" (crypto_price + news_momentum), "structural_arbitrage" (partition_overround), "concentrated_longshot". Combine like "model_driven,structural_arbitrage". Default: all. | |
| min_partition_leg_kelly | No | Minimum BEST per-leg half-Kelly fraction across a partition_overround opportunity's top_legs (or longshot_basket legs). Default 0 (no filter). Partition arbs always return kelly_fraction_half=0 at the parent level by design (basket trades don't compose to single-leg Kelly), so min_kelly never filters them — this knob applies to the per-leg Kelly inside top_legs instead. Use to suppress thin partitions whose individual leg edges aren't worth the per-leg slippage cost. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint, idempotentHint, openWorldHint, and no destructive actions. The description adds extensive behavioral context: it describes caching (1h at KV level), response structure with diagnostics, 24h move warnings, and details of model families and filters. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense and detailed, covering model families, response structure, and filter knobs. While every sentence adds value, it is quite long and could be more concise. The structure is organized but heavy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description fully explains the response top-level fields (by_segment with three categories, fed_candidates, _diagnostics) and their contents (e.g., edge_pp_net, kelly_fraction, market info). It also covers caching and filter skip diagnostics, making the tool's behavior completely understandable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers all 9 parameters with descriptions (100% coverage). The description adds significant value by explaining the purpose of tradeable-edge knobs (e.g., min_liquidity, max_spread_pp) and detailing special cases like min_partition_leg_kelly applying to per-leg Kelly. This goes beyond the schema's basic descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans Polymarket markets for discrepancies with Pipeworx data, targeting 'what should I bet on today'. It explicitly details three opportunity segments (MODEL_DRIVEN, STRUCTURAL_ARBITRAGE, CONCENTRATED_LONGSHOT), distinguishing it from sibling tools like polymarket_arbitrage and polymarket_kalshi_spread by focusing on edge opportunities from Pipeworx data.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the tool is built for daily opportunity discovery without paging through hundreds of markets. It provides tradeable-edge knobs (min_liquidity, max_spread_pp) to filter unrealizable edges. However, it does not explicitly state when not to use it or compare to alternatives beyond the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
polymarket_edge_trackerPolymarket Edge TrackerRead-onlyIdempotentInspect
Edge persistence and decay telemetry built from daily polymarket_edges snapshots. Answers "how long has this edge existed and is it shrinking?" — a fresh wide edge and a 3-week-old wide edge are different trades (the latter is wide for a reason nobody is willing to take). Args: days (lookback, default 14, max 30), window (snapshot family, default "1wk"). RESPONSE: tracked[] = every opportunity in the LATEST snapshot with its full edge_pp_net time-series across prior snapshots, first_seen, trend (new | widening | stable | decaying) and decay_pp_per_day (both computed on |edge_pp_net| — the value itself is signed by trade direction, negative = SELL YES); expired[] = opportunities that appeared in earlier snapshots but are GONE from the latest (closed, resolved, or arbed away) with their lifespan_days — the median lifespan is your competition clock; snapshot_dates[] = which days actually have data (snapshots are written when polymarket_edges runs on a cache-miss, so gaps mean nobody scanned that day). LIMITS: history depth is bounded by the 60-day snapshot TTL and starts from when snapshotting was enabled; decay numbers come from daily closes of edge_pp_net (net of default slippage), not intraday.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Lookback in days (default 14, clamp 2-30). | |
| window | No | Which polymarket_edges window family to read snapshots for: 24hr | 1wk | 1mo (default 1wk). |
polymarket_fill_riskPolymarket Fill RiskRead-onlyIdempotentInspect
Realizable-vs-theoretical edge check against live CLOB order-book depth. REQUIRES one of market (single-market mode) or event (basket/partition mode). SINGLE-MARKET: pass a market slug/URL + side (buy_yes|sell_yes|buy_no|sell_no, default buy_yes) + size_usd (default 1000 — max spend on buys, target proceeds on sells); walks the ladder and returns top_of_book, vwap_fill_price, slippage_pp, shares_filled, max_fillable_usd, and a verdict (clean|degraded|cannot_fill). BASKET: pass an event slug/URL + side (sell_yes = capture overround by selling every leg, buy_yes = capture underround; default auto from partition sum) + size_usd interpreted as settlement notional S (shares per leg; each share pays $1); returns theoretical_sum vs realizable_sum (top-of-book vs VWAP across all legs), capture_ratio, profit_usd at executed size, per-leg fill detail, thin_legs[], max_clean_notional_usd, and forced_directional_risk naming the legs most likely to strand you unhedged. USE THIS before acting on any polymarket_arbitrage SELL/BUY-EVERY-LEG signal or any polymarket_edges trade above ~$500 — theoretical overround on thin books is not capturable, and partial basket fills convert an arb into an unhedged directional position (the dominant loss mode in real arb-bot P&L).
| Name | Required | Description | Default |
|---|---|---|---|
| side | No | Single-market: buy_yes | sell_yes | buy_no | sell_no (default buy_yes). Basket: sell_yes | buy_yes (default auto — sell if partition sum > 1, buy if < 1). | |
| event | No | Basket mode: event slug or full polymarket.com URL — checks every leg of the partition. | |
| market | No | Single-market mode: market slug or full polymarket.com URL. | |
| size_usd | No | Single-market: USD to spend (buys) or target proceeds (sells). Basket: settlement notional — shares per leg, each paying $1 at resolution. Default 1000, clamp 10–1,000,000. |
polymarket_kalshi_spreadPolymarket–Kalshi SpreadARead-onlyIdempotentInspect
Cross-venue spread between Kalshi and Polymarket for the same resolving question. The two venues sometimes price the same outcome 2-25pp apart because their participant pools differ — when the bet shapes are equivalent that delta is a real signal, when they aren't the tool says so. TWO MODES: (1) topic — 10 pre-mapped macro shortcuts ("fed", "btc", "cpi", "gdp", "sp500", "recession", "next_pope", "next_uk_pm", "next_israel_pm", "2028_president") auto-fetch the matching event on each venue. (2) explicit kalshi_event_ticker + polymarket_event_slug for custom pairings. RESPONSE: each venue's leg-by-leg prices (raw probability 0-1) plus matched spread[].top_spreads_pp (Kalshi − Polymarket) where the same outcome shows up on both sides. SAFETY FIELDS: compatibility_warning fires in two cases — (a) matched_pairs:0 with skipped_cross_type>0 means the venues frame the topic with non-equivalent bet shapes (e.g. Kalshi range_bucket point-in-time vs Polymarket cumulative_threshold touch-anywhere — no arb exists), (b) matched_pairs:0 with skipped_cross_type:0 and both venues >5 legs means the token-overlap matcher found nothing in common — events likely semantically unrelated despite the topic keyword. temporal_alignment{polymarket_month,kalshi_month,aligned} tells you whether the two events resolve in the same calendar period; aligned:false means spreads are mathematically meaningless across the temporal gap. skipped_cross_type / skipped_cross_subtype counters expose how many leg-pair comparisons were dropped (cross-type = metric_type mismatch like MoM vs YoY; cross-subtype = inequality mismatch like cum_ge vs cum_le). Real cross-venue spreads are rarer than the macro-shortcut list suggests — most pre-mapped topics return compatibility_warning today; pre-mapped ≠ tradeable.
| Name | Required | Description | Default |
|---|---|---|---|
| topic | No | Pre-mapped: fed | btc | cpi | gdp | sp500 | recession | next_pope | next_uk_pm | next_israel_pm | 2028_president | |
| kalshi_event_ticker | No | Explicit Kalshi event ticker, e.g. "KXFED-26OCT". Overrides the topic-mapped Kalshi side. | |
| polymarket_event_slug | No | Explicit Polymarket event slug, e.g. "fed-decision-in-june-825". Overrides the topic-mapped Polymarket side. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, open-world, idempotent, non-destructive. The description adds rich behavioral context: two modes, response structure (leg-by-leg prices, matched spreads, top_spreads_pp), safety fields (compatibility_warning with two cases), temporal_alignment, and skipped_cross_type/subtype counters. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with clear sections (TWO MODES, RESPONSE, SAFETY FIELDS) and front-loads the core purpose. However, it is somewhat verbose and could be slightly more concise, e.g., some details in SAFETY FIELDS could be streamlined. Still, every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (two venues, multiple parameters, safety warnings, no output schema), the description is exceptionally complete. It explains response structure, how to interpret temporal alignment and compatibility warnings, and the meaning of skipped counters. It sets proper expectations about rarity of real spreads.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for all 3 parameters. The description goes beyond by explaining the topic's pre-mapped macros (list of 10), the override behavior of kalshi_event_ticker and polymarket_event_slug over topics, and providing examples in the schema. The description also clarifies the purpose of each parameter in the context of the tool's two modes.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it computes 'Cross-venue spread between Kalshi and Polymarket', specifying the resource (cross-venue spread) and the action (computes). It distinguishes from sibling polymarket_arbitrage by focusing on spread rather than arbitrage opportunities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly provides two modes (topic vs explicit kalshi_event_ticker+polymarket_event_slug), details when to use each, and includes extensive warnings about compatibility and when spreads are meaningless (e.g., temporal alignment false, mismatched bet shapes). Also notes that pre-mapped topics often yield warnings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recallRecallARead-onlyIdempotentInspect
Retrieve a value previously saved via remember, or list all saved keys (omit the key argument). Use to look up context the agent stored earlier — the user's target ticker, an address, prior research notes — without re-deriving it from scratch. Scoped to your identifier (anonymous IP, BYO key hash, or account ID). Pair with remember to save, forget to delete.
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | Memory key to retrieve (omit to list all keys) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnly=true, idempotent=true, destructive=false. Description adds critical scoping detail: 'Scoped to your identifier (anonymous IP, BYO key hash, or account ID)'. Also explains behavior when key is omitted. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences efficiently cover purpose, usage, and scoping. Examples slightly verbose but overall compact and well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but description implies returns a value or list. For 1-param read-only tool, covers key behavioral aspects (scoping, key omission). Could mention return format explicitly but acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage 100% with parameter 'key' described. Description adds usage nuance: 'omit the key argument' for listing all keys, which is not in schema. Provides examples of typical keys.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description states 'Retrieve a value previously saved via remember, or list all saved keys' with specific verb (retrieve, list) and resource (saved keys). Distinguishes from sibling 'forget' and pairs with 'remember'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit when to use: 'look up context the agent stored earlier... without re-deriving it from scratch'. Clear that omitting key lists all keys. No explicit when-not or alternatives, but context is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recent_alertsRecent AlertsARead-onlyIdempotentInspect
Pull fired events from your subscription feed. Returns the most recent alerts the evaluator has written to your persisted feed — each carries source, citation_uri (pipeworx:// when available), and the raw event payload. Filter by type (e.g. "sec_8k") and/or since (ISO timestamp). Set mark_read:true to flag returned events read so the next call only shows newer ones. Polls work fine; the same feed is also at GET registry.pipeworx.io/alerts.json for scripts and dashboards.
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Optional — filter to one subscription type. | |
| limit | No | Max events to return (1-200, default 50). | |
| since | No | Optional ISO timestamp — return events fired_at >= this time. | |
| mark_read | No | Flag the returned events read in the same call (default false). | |
| unread_only | No | Return only events where read_at is null (default false). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds context beyond that: it reveals the mark_read parameter mutates state by flagging returned events read, affecting subsequent calls. It also explains the return payload fields, which is useful for the agent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences covering purpose, details, and usage. No redundancy. The most important information (what the tool returns) is front-loaded. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the return fields, filtering, mark_read effect, and even an alternative REST endpoint. Given no output schema, this is good coverage. Minor gaps: no mention of pagination or ordering beyond 'most recent', but the limit parameter is in schema. Overall quite complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. The description reinforces parameter usage with examples (e.g., type='sec_8k') and explains the effect of mark_read, but does not add new semantic meaning beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'pull'/'returns' and the resource 'fired events from your subscription feed'. It specifies what each alert carries (source, citation_uri, raw payload). This distinguishes it from sibling tools like 'list_subscriptions' which manage subscriptions rather than retrieving events.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides examples for filtering by type and since, explains the mark_read flag effect, and mentions that the same feed is available as a REST endpoint for scripts. However, it does not explicitly say when not to use this tool or suggest alternatives for specific needs like historical searches (which might be covered by other tools).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
recent_changesRecent ChangesARead-onlyIdempotentInspect
"What's new with X" / "latest on Y" / "what happened to Z this week / month / quarter" / "updates on Acme" / "news on Tesla recently" / "what's happening with Apple" — change feed for a company in the last N days/weeks/months in ONE parallel call. Fans out to SEC EDGAR (filings since since), GDELT→GNews fallback (news mentions in window — GDELT preferred, GNews when rate-limited or 5xx), USPTO (patents granted; PatentsView API sunset May 2025 so this soft-fails until reactivated). since accepts ISO date ("2026-04-01") or relative shorthand ("7d", "30d", "3m", "1y"). Returns structured changes[] grouped by source + total_changes count + pipeworx:// citation URIs. Use entity_profile instead when you want the static profile (filings + fundamentals + LEI + patents) regardless of window.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type. Only "company" supported today. | |
| since | Yes | Window start — ISO date ("2026-04-01") or relative ("7d", "30d", "3m", "1y"). Use "30d" or "1m" for typical monitoring. | |
| value | Yes | Ticker (e.g., "AAPL") or zero-padded CIK (e.g., "0000320193"). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description extensively discloses behavior beyond annotations: it explains the fan-out to multiple sources, fallback mechanism (GDELT→GNews), USPTO soft-fail, `since` parameter semantics, and output structure. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is dense and front-loaded with examples, but it could be slightly more structured. Every sentence adds value, and it efficiently covers the tool's purpose, behavior, and parameters.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 3 parameters and no output schema, the description fully explains inputs, behavior, return format (changes grouped by source, total_changes, citation URIs), edge cases (USPTO soft-fail), and alternative tool. Very complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, and the description adds significant value by explaining `since` formats (ISO vs relative), default usage ('30d'), and that `value` can be ticker or CIK. However, the schema already covers basics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs and examples ('What's new with X', 'latest on Y') to clearly state it provides a change feed for a company. It explicitly distinguishes from the sibling tool 'entity_profile' which offers static profiles.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives explicit query examples and explains when to use this tool versus 'entity_profile'. It also details the `since` parameter format. However, it does not contrast with many other siblings, but the single contrast is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
regionsRegionsARead-onlyIdempotentInspect
Watchmode supported regions/countries for streaming availability (US, UK, CA, AU, etc.). Returns 2-letter codes. Use to constrain title_sources or list_titles to a region.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnly, openWorld, idempotent, non-destructive. The description adds non-obvious behavioral details: output format (2-letter codes) and its role in constraining other tools. This complements the annotations well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, immediately stating the purpose and output, then the usage. No unnecessary words, highly efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters, extensive annotations, and an output schema, the description is fully complete. It explains what the tool returns and how to use it, meeting all contextual needs.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With no parameters and 100% schema coverage (empty schema), the description adds meaning by explaining the output format and usage context. Baseline is 4 for zero-param tools, and the description meets this.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly indicates the tool returns supported regions/countries for streaming availability, lists example codes, and mentions its output format (2-letter codes). It also relates to sibling tools like title_sources and list_titles, distinguishing its purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states the primary use case: to constrain title_sources or list_titles to a region. It provides clear context on when to use the tool, though it does not mention when not to use it or alternatives, which is acceptable given its specific role.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
releasesReleasesDRead-onlyIdempotentInspect
Recent/upcoming releases.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | ||
| types | No | ||
| regions | No | ||
| end_date | No | ||
| source_ids | No | ||
| start_date | No | ||
| source_types | No | ||
| regions_pricing | No |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, destructiveHint=false, idempotentHint=true. Description adds no additional behavioral context (e.g., pagination, date range interpretation, or what 'recent/upcoming' means operationally). Minimal added value.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise (3 words) but at the expense of informativeness. No structure, no actionable details. Conciseness without substance scores low.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 8 parameters and an output schema, the description is woefully incomplete. Agent cannot determine how to filter, what data is returned, or how to formulate a valid request. Requires significant additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description must compensate. It does not mention any of the 8 parameters or their semantics. Examples in schema provide some clues but description itself offers zero parameter guidance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description is 'Recent/upcoming releases.' which is a noun phrase, not a verb+resource. It vaguely indicates the tool deals with releases but doesn't state what action (list, search, etc.). Lacks specificity to distinguish from sibling tools like 'list_titles' or 'title_search'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. No context on prerequisites, filters, or exclusions. The agent is left to infer usage from the name alone.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
rememberRememberAIdempotentInspect
Save data the agent will need to reuse later — across this conversation or across sessions. Use when you discover something worth carrying forward (a resolved ticker, a target address, a user preference, a research subject) so you don't have to look it up again. Stored as a key-value pair scoped by your identifier. Authenticated users get persistent memory; anonymous sessions retain memory for 24 hours. Pair with recall to retrieve later, forget to delete.
| Name | Required | Description | Default |
|---|---|---|---|
| key | Yes | Memory key (e.g., "subject_property", "target_ticker", "user_preference") | |
| value | Yes | Value to store (any text — findings, addresses, preferences, notes) |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate idempotentHint=true and destructiveHint=false. The description adds context about key-value scope, persistence differences between authenticated and anonymous sessions, and 24-hour retention. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, no filler, front-loaded with the core purpose. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool (2 required params, no output schema, good annotations), the description covers purpose, usage, behavior, and pairing with siblings. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for key and value. The description adds context beyond the schema by explaining the key-value pair scoping and giving examples, which helps the agent understand usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Save') and resource ('data'), clearly defines the scope ('across this conversation or across sessions'), and distinguishes from siblings like 'recall' and 'forget' by mentioning them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It explicitly states when to use ('when you discover something worth carrying forward') and references 'pair with recall to retrieve later, forget to delete' for alternatives. It could improve by noting when not to use it, but it's clear enough.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resolve_entityResolve EntityARead-onlyIdempotentInspect
"What's the ticker for…" / "find the CIK for…" / "what's the RxCUI for…" / "look up the ID for…" / "what is X's official identifier" — resolve a user-spoken NAME to the canonical/official identifier other tools require as input. Use FIRST whenever you have a name but need an ID. SUPPORTED TYPES: "company" (returns ticker + 10-digit CIK + company_name from SEC EDGAR + pipeworx://edgar/company/{cik} citation URI; accepts ticker, CIK, or company name as input — auto-disambiguated), "drug" (returns RxCUI + ingredient + brand from RxNorm + pipeworx://rxnorm/{rxcui} citation; accepts brand or generic name). Each call cascades through several lookup endpoints internally — using resolve_entity replaces 2-3 manual lookups.
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Entity type: "company" or "drug". | |
| value | Yes | For company: ticker (AAPL), CIK (0000320193), or name. For drug: brand or generic name (e.g., "ozempic", "metformin"). |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnly, idempotent, non-destructive. Description adds that each call cascades through several internal endpoints, explaining potential latency or rate-limit implications. Also specifies exact return fields and citation URIs per type.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is fairly lengthy but well-structured with clear sections (examples, usage, supported types, return details). Every sentence adds value, though could be slightly tightened without losing clarity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (multiple entity types, multiple input formats, no output schema), the description is fully complete: covers input, behavior, output details, and context. Annotations cover safety. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds significant context: for 'type' it lists allowed values and their meanings; for 'value' it provides input format examples (ticker, CIK, name for company; brand or generic for drug) and mentions auto-disambiguation, greatly aiding parameter usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool resolves a name to a canonical identifier, with specific examples like ticker, CIK, RxCUI. It differentiates from siblings by focusing on resolution, and mentions it replaces 2-3 manual lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says 'Use FIRST whenever you have a name but need an ID', providing strong guidance on when to use. Also details supported types and what each returns, guiding the agent on input selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
scan_competitor_ai_presenceScan Competitor AI PresenceARead-onlyIdempotentInspect
Compare AI visibility across multiple entities side-by-side. Probes each entity (your brand + N competitors) with ai_visibility_check, ranks by score, surfaces which is most/least recognized. Useful for competitive AI-marketing audits: "does Claude know about us as well as our competitors?". Returns ranked list with score, confidence, signal density per entity.
| Name | Required | Description | Default |
|---|---|---|---|
| models | No | Which models to probe. Supported: "workers-ai" (free default), "anthropic" (requires _apiKey). Omit for just workers-ai. | |
| _apiKey | No | Optional Anthropic API key — only if "anthropic" is in models. Passed to api.anthropic.com per probe. | |
| context | No | Optional shared context applied to every probe (e.g. "B2B SaaS", "Boston restaurant"). Disambiguates common names. | |
| entities | Yes | Array of 2-8 entities to compare (brand/business/product names). First entry treated as the "subject" for narrative; rest are competitors. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint true, idempotentHint true, destructiveHint false, so it's safe. Description adds behavioral context: it probes each entity with ai_visibility_check, treats first entity as subject for narrative, and returns ranked list with score, confidence, signal density. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences plus a parenthetical example and result description. Front-loaded with main purpose, each sentence adds value. Very efficient and no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, no output schema, and rich annotations, the description adequately covers what the tool does, how it works (probes via ai_visibility_check), and what it returns (ranked list with metrics). Could mention the entity count limit (2-8 from schema) but does not hinder completeness significantly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value beyond schema: clarifies first entity is treated as subject for narrative, and context disambiguates common names. It also reinforces the models parameter defaults, though that is in schema. Provides useful narrative context for the entities parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it compares AI visibility across multiple entities side-by-side, probes each with ai_visibility_check, ranks by score, and surfaces most/least recognized. It distinguishes from siblings like ai_visibility_check (single entity probe) by specifying it is a batch comparison tool.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Description explicitly says when to use: competitive AI-marketing audits, including a concrete example ('does Claude know about us as well as our competitors?'). It implies use for multiple entities but does not explicitly state not to use for single-entity queries (use ai_visibility_check instead).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
scan_dependencyScan DependencyARead-onlyIdempotentInspect
Composite "should I add this npm package to my project" check in ONE call — fans out across deps.dev (license + advisories + version history) and bundlephobia (gzipped/minified bundle size, dependency count, ESM/tree-shake support). Use whenever an agent asks "is X safe / popular / small" or "what does adding lodash cost me". Returns a summary block (is_latest, license, published_at, advisory_count, bundle_kb_min, bundle_kb_gz, dependency_count, has_esm, tree_shakeable), per-advisory detail, links, and a list of recent alternative versions. NPM ecosystem only in v1; PyPI / Maven / Cargo / Go fall under deps.dev:version directly. Partial failures degrade gracefully — bundlephobia's first measurement on a new version can take 5-30s; sources_failed will list it if it times out, the rest still returns.
| Name | Required | Description | Default |
|---|---|---|---|
| package | Yes | npm package name. Scoped packages (e.g. "@types/node") are accepted. | |
| version | No | Specific version to check (e.g., "18.3.1"). Defaults to the latest published version when omitted. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, idempotentHint, and non-destructive. The description adds valuable behavioral context: fans out to external services, returns a specific summary block, handles partial failures gracefully, and notes that bundlephobia's first measurement can take 5-30 seconds. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is information-dense but front-loads the core composite nature. Every sentence contributes (ecosystem limitation, partial failures, return structure). Slightly verbose but well-structured; a minor reduction from 5 due to length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, so the description fully covers return values: summary block fields, per-advisory detail, links, and alternative versions. It also addresses edge cases like scoped packages, partial failures, and bundlephobia timing. Given complexity, it leaves no gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with clear descriptions for both parameters. The description adds extra meaning by explaining that version defaults to latest when omitted and that scoped packages like '@types/node' are accepted. This goes beyond the schema, justifying a score above baseline 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: a composite check for 'should I add this npm package' by aggregating data from deps.dev and bundlephobia. It uses specific verbs ('scan', 'fans out') and distinguishes itself from sibling tools like 'bet_research' or 'scan_competitor_ai_presence' by being npm-focused and multi-source.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicit usage triggers are given: 'whenever an agent asks "is X safe / popular / small" or "what does adding lodash cost me"'. It also notes NPM ecosystem only and directs to deps.dev:version for other ecosystems. Partial failure behavior is described, providing clear when-to-use and when-not-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
search_withinSearch Within a SourceARead-onlyIdempotentInspect
Semantic search INSIDE a fetched record. Pass the text you already pulled (e.g. a SEC 10-K body, an article, a long tool result) plus a natural-language query; get back the top-N passages with character offsets and similarity scores. Use when the record is too big to cram into the prompt — search_within saves context, returns only the passages that matter, and every passage carries an offset so the agent can verify a verbatim quote. Pairs with ask_pipeworx_grounded: fetch with the gateway, ground over the relevant passages instead of the whole document. BGE-base-en embeddings + cosine over 500-char overlapping windows; cap is 200K chars (longer inputs are truncated and flagged).
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | The document text to search inside (max ~200K chars). | |
| limit | No | Max passages to return (1-20, default 5). | |
| query | Yes | Natural-language query — what passages do you want? E.g. "supply-chain risk", "fiscal year 2024 revenue", "drug interactions with warfarin". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses embedding model (BGE-base-en), similarity metric (cosine), window size (500-char overlapping), character cap (200K), and truncation behavior. None of this is in annotations, so adds significant value beyond readOnlyHint etc.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single paragraph, front-loaded with purpose, efficient examples, and technical details. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, usage, parameters, output format (passages with offsets and scores), behavior, and pairing with sibling tool. No output schema, but description sufficiently explains return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description adds concrete examples for 'text' (e.g., SEC 10-K body) and 'query' (e.g., supply-chain risk), and clarifies 'limit' as top-N passages. Provides meaningful context beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states semantic search inside a fetched record, specifying verb, resource, and output (passages with offsets and scores). Distinguishes from sibling tools by mentioning ask_pipeworx_grounded.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says when to use: when record is too large for prompt. Also describes pairing with ask_pipeworx_grounded for grounding over passages.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sourcesSourcesARead-onlyIdempotentInspect
Watchmode streaming directory — all sources/services worldwide (Netflix, Disney+, Hulu, Prime Video, etc.) with IDs, name, type (subscription/free/rent/buy), regions, logo. Use as a directory before filtering list_titles by source.
| Name | Required | Description | Default |
|---|---|---|---|
No parameters | |||
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and no destruction. The description adds valuable details about the returned fields (IDs, name, type, regions, logo) and confirms global coverage, enhancing transparency beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with no wasted words. It front-loads the core purpose and includes a practical usage hint, making it highly efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has no parameters, a rich output schema, and thorough annotations, the description is complete. It explains what the tool returns and its purpose, leaving no gaps for an agent to misunderstand.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters (100% coverage), so the description naturally lacks parameter-specific info. Baseline score of 4 is appropriate as no additional parameter context is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a directory of all streaming sources worldwide with specific data fields (IDs, name, type, regions, logo). It distinguishes itself from siblings by advising use as a directory before filtering with list_titles.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Use as a directory before filtering list_titles by source'), providing a clear use case and referencing a sibling tool. It does not provide explicit when-not-to-use guidance, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
subscribeSubscribe to AlertsAIdempotentInspect
Create a proactive monitoring subscription to a live-data event stream. Returns the new subscription id. Requires a Pipeworx OAuth account (anonymous + BYO cannot persist subscriptions). Supported types: "sec_8k" (8-K filings matching ticker + item codes — e.g. items:["5.02"] = officer change), "polymarket_edge" (Polymarket↔Kalshi cross-venue mispricings — params:{topic:"fed"}), "fred_series" (new FRED observations — params:{series_id:"UNRATE"}). Delivery channels: feed (always on — pull via recent_alerts or GET registry.pipeworx.io/alerts.json), and optionally email (set delivery:{email:"you@x.com"}) or sms (delivery:{sms:"+15551234567"} — phone must be verified at /account first; 10/day cap).
| Name | Required | Description | Default |
|---|---|---|---|
| type | Yes | Subscription type. | |
| params | Yes | Type-specific filter. sec_8k: {ticker:"AAPL", items?:["5.02","1.01"]}. polymarket_edge: {topic:"fed", min_spread_bps?:500}. fred_series: {series_id:"UNRATE"}. patent_grant: {applicant:"Apple Inc."}. clinical_trial: {sponsor?:"Pfizer", condition?:"lung cancer", phase?:"PHASE3"} (sponsor or condition required). | |
| delivery | No | Optional delivery channels in addition to the always-on persistent feed. {email:"you@x.com"} sends a templated alert per fired event. {sms:"+15551234567"} sends an SMS per event — must match the verified phone on the caller's account (verify at https://pipeworx.io/account first; 10/day cap). {webhook:"https://..."} POSTs each event JSON to your endpoint, HMAC-signed — the response includes delivery.webhook_secret (whsec_…) ONCE; verify X-Pipeworx-Signature = sha256 HMAC of "<X-Pipeworx-Timestamp>.<raw body>". Auto-disabled after 10 consecutive failing runs. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses behavioral traits beyond annotations: requires OAuth, cannot be anonymous/BYO, delivery limits (SMS cap, webhook auto-disable), and signing secret returned once. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured and front-loaded with purpose. Detailed but concise given complexity. Could trim some verbose examples slightly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Highly complete for a complex subscription tool with multiple types and delivery channels. Covers requirements, examples, constraints. No output schema, but description sufficiently explains return (new subscription id). Minor gaps like error handling not covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Adds significant meaning beyond the input schema: explains each subscription type with concrete examples and clarifies delivery channel details (email, SMS with verification, webhook with HMAC). Schema covers 100% but description enriches understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it creates a proactive monitoring subscription to a live-data event stream and returns a subscription id. Distinguishes from siblings like list_subscriptions and unsubscribe.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explains when to use (to create subscriptions for proactive monitoring) and mentions requirements (Pipeworx OAuth account). Could explicitly mention alternatives like list_subscriptions for existing subs, but still strong guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
suggest_questionsWhat Can I Ask Pipeworx?ARead-onlyIdempotentInspect
What can I ask Pipeworx? / what is Pipeworx good for? / what can you do? / give me ideas / show me examples / getting started / what data do you have? — the onboarding entry point for an agent that just connected and wants to know what is worth asking. Returns category-bucketed example questions (company financials, drugs & clinical trials, economics, real estate, prediction markets, weather, government & patents, science & academia, news) — each with the exact tool + argument shape that answers it, drawn from the live catalog of thousands of tools. Call with no arguments for the full spread, or pass topic (e.g. "finance", "pharma", "betting") to focus. Use this FIRST when you do not yet know what Pipeworx can do for you, or to learn how to call the meta-tools (ask_pipeworx, entity_profile, compare_entities, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| topic | No | Optional focus area: finance | pharma | economics | real-estate | betting | weather | government | science | news. Omit for a cross-category spread. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already cover safety (readOnlyHint, idempotentHint, destructiveHint). The description adds valuable behavioral context: it returns example questions, can be called with no arguments or a focused topic, and draws from a live catalog. No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single focused paragraph with front-loaded alternative phrasings. It efficiently conveys purpose, usage, and parameters without extraneous words, though slightly dense.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple tool (1 optional parameter, no output schema, rich annotations), the description fully covers purpose, when to use, parameter behavior, and what to expect. It is complete for an agent to make an informed decision.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the topic parameter described. The description enhances semantics by explaining that omitting topic yields a cross-category spread and listing example values, adding meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly identifies the tool as an onboarding entry point that returns category-bucketed example questions, distinguishing it from siblings like ask_pipeworx and discover_tools. It uses specific verbs ('returns', 'call') and specifies the resource ('category-bucketed example questions').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool ('Use this FIRST when you do not yet know what Pipeworx can do for you') and how to use the optional topic parameter. While it doesn't explicitly list when not to use it, the context and sibling list provide sufficient guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
title_detailTitle DetailDRead-onlyIdempotentInspect
Title detail.
| Name | Required | Description | Default |
|---|---|---|---|
| title_id | Yes | ||
| append_to_response | No |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide strong behavioral hints (readOnly, idempotent, not destructive), but the description adds no additional context beyond 'Title detail.' There is no contradiction, but the description fails to enhance understanding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise but under-specified. A single phrase 'Title detail.' is too minimal to be helpful; it is not a proper description of functionality.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema and rich annotations, the description omits critical usage context. It doesn't explain that it returns details for a specific title or how to use the 'append_to_response' parameter.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description provides no information about the parameters 'title_id' or 'append_to_response'. Without any compensation, the agent cannot infer proper usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Title detail.' is a tautology that merely restates the tool name and title. It lacks a specific verb and resource, and does not distinguish the tool from siblings like 'list_titles' or 'title_search'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The name implies it retrieves a single title's details, but no when-not or context is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
title_searchTitle SearchDRead-onlyIdempotentInspect
Search titles.
| Name | Required | Description | Default |
|---|---|---|---|
| types | No | ||
| search_field | No | ||
| search_value | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint, openWorldHint, idempotentHint, and destructiveHint=false, so the safety profile is clear. The description adds no further behavioral context, but does not contradict annotations. A score of 3 is appropriate as the description adds minimal value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely short, but this is under-specification rather than efficient conciseness. It fails to convey necessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has three parameters, many sibling tools, and an output schema, the description is critically incomplete. It provides no context about search capabilities, scope, or return format.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. However, it provides no explanation for the three parameters (types, search_field, search_value), leaving their meanings entirely unspecified.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Search titles.' is a tautology, merely restating the tool name without specifying the resource or behavior. It provides no differentiation from sibling tools like list_titles or search_within.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No usage guidance is provided. The description does not indicate when to use this tool versus alternatives, such as comparing with list_titles for listing all titles or search_within for narrower scope.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
title_seasonsTitle SeasonsARead-onlyIdempotentInspect
Seasons + episodes for a Watchmode TV show ID. Returns episode count, season number, year. Use after title_search to enumerate a show's structure.
| Name | Required | Description | Default |
|---|---|---|---|
| title_id | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare read-only, idempotent, non-destructive behavior. The description adds specific information about return values (episode count, season number, year), which provides useful behavioral detail beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with an additional usage instruction. It is front-loaded with the core functionality and returns fields, with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and strong annotations, the description covers purpose, usage context, and return fields. No additional details (e.g., pagination, rate limits) are necessary given the tool's straightforward nature.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, and the description only mentions 'Watchmode TV show ID' for the title_id parameter without specifying format, constraints, or examples. The description does not adequately compensate for the lack of schema parameter descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves seasons and episodes for a Watchmode TV show ID, listing specific return fields (episode count, season number, year). It distinguishes from sibling tools like title_search by specifying usage after searching for a show.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly instructs the agent to use this tool 'after title_search to enumerate a show's structure', providing a clear workflow and context for when to invoke it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
title_sourcesTitle SourcesDRead-onlyIdempotentInspect
Streaming sources.
| Name | Required | Description | Default |
|---|---|---|---|
| regions | No | ||
| title_id | Yes |
Output Schema
| Name | Required | Description |
|---|---|---|
No output parameters | ||
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already indicate readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds no behavioral context beyond stating it is about 'streaming sources'. It does not disclose return format, pagination, rate limits, or any constraints. The description is essentially a label, not a behavioral disclosure.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
While only two words, the description is under-specified rather than concise. It sacrifices clarity for brevity, making it unhelpful for an agent. Concision should maintain utility; here it does not.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists but its content is unknown, the description still fails to specify what the tool returns. For a tool with two parameters and siblings like 'sources', the description should explain scope or filter behavior. It is grossly incomplete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate by explaining parameter meanings. It fails to do so entirely. The parameters 'title_id' and 'regions' are not explained; the description does not indicate that 'title_id' is required or that 'regions' is an optional comma-separated list.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'Streaming sources' is extremely vague. It does not specify whether the tool retrieves, lists, or manages streaming sources for a title. The name 'title_sources' implies a relationship to titles, but the description provides no action verb or resource clarification, and it does not distinguish this tool from similar siblings like 'sources'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is absolutely no guidance on when to use this tool versus alternatives. The description lacks any context about prerequisites, typical use cases, or situations where another tool would be more appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
unsubscribeUnsubscribe from AlertsAIdempotentInspect
Cancel a subscription by id. Ownership is enforced — you can only cancel your own subscriptions. The row is deactivated (not deleted) so its historical events stay available via recent_alerts.
| Name | Required | Description | Default |
|---|---|---|---|
| id | Yes | Subscription id (uuid) returned by subscribe. |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds significant context beyond annotations: ownership enforcement, row deactivation vs deletion, and historical event availability via recent_alerts. Annotations already provide idempotentHint, readOnlyHint, destructiveHint.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, 26 words, front-loaded with the primary action. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description covers ownership, mutability, side effects, and ties to a sibling tool perfectly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a clear parameter description. The description only implicitly refers to the parameter via 'by id' and adds no extra semantics beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'cancel' and the resource 'subscription by id', distinguishing it from sibling tools like subscribe, list_subscriptions, and recent_alerts.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It indicates when to use (to cancel a subscription), specifies ownership enforcement, and explains the deactivation behavior with a link to recent_alerts. Does not explicitly exclude certain cases but provides sufficient context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
validate_claimValidate ClaimARead-onlyIdempotentInspect
"Is it true that…" / "fact check" / "verify the claim that…" / "did X really…" / "was Y actually…" / "confirm or refute" / "true or false" — natural-language claim verification against authoritative sources. Use whenever the agent needs to check whether something a user said is factually correct. v1 supports company-financial claims (revenue, net income, cash position for public US companies) via SEC EDGAR + XBRL. Returns a verdict (confirmed / approximately_correct / refuted / inconclusive / unsupported), extracted structured form, actual value with pipeworx:// citation, and percent delta. Replaces 4–6 sequential calls (NL parsing → entity resolution → data lookup → numeric comparison).
| Name | Required | Description | Default |
|---|---|---|---|
| claim | Yes | Natural-language factual claim, e.g., "Apple's FY2024 revenue was $400 billion" or "Microsoft made about $100B in profit last year". |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description fully discloses behavioral traits: it returns a verdict, structured form, actual value with citation, and percent delta. It also mentions it replaces 4–6 sequential calls. Annotations already provide readOnlyHint, openWorldHint, idempotentHint, and destructiveHint, and the description adds value beyond these.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise yet comprehensive. It starts with natural-language examples, defines the tool, states usage context, specifies scope, and details return values. Every sentence adds value, and the structure is well-organized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the single parameter with full schema, rich annotations, and detailed description, the tool definition is complete. The description explains the return values even without an output schema. It addresses all necessary aspects for an AI agent to use the tool correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'claim' has complete schema description. The description adds context about the type of claims supported (company-financial) and natural-language format. With 100% schema coverage, baseline is 3, but the added context justifies a 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: natural-language claim verification against authoritative sources. It provides specific examples of queries and distinguishes itself by noting it replaces multiple sequential calls, which differentiates it from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use whenever the agent needs to check whether something a user said is factually correct' and specifies the current scope (company-financial claims). It doesn't provide explicit 'when not to use' or alternative tools, but the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.
Discussions
No comments yet. Be the first to start the discussion!
Your Connectors
Sign in to create a connector for this server.