Glama
Ownership verified

Server Details

Over 45 API endpoints spanning prediction market topics, weather, macroeconomics, predictive possibilities, and market stats.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
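
For a rough sense of what such a connection carries at the protocol level, the sketch below shows a single JSON-RPC tools/call request sent over Streamable HTTP. It is illustrative only: the endpoint URL is a placeholder, and session initialization, authentication, and x402 payment for paid tools are omitted.

```python
# Minimal sketch (not taken from this page): one JSON-RPC "tools/call" request
# sent to an MCP server over Streamable HTTP. The endpoint URL is a placeholder,
# and real clients also perform session initialization, auth, and, for paid
# tools, x402 payment handling.
import requests

MCP_ENDPOINT = "https://example.invalid/mcp"  # placeholder, not the server's actual URL

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "crypto_health_check",  # a free, zero-parameter tool listed below
        "arguments": {},
    },
}

response = requests.post(
    MCP_ENDPOINT,
    json=payload,
    headers={"Accept": "application/json, text/event-stream"},
)
print(response.status_code, response.text)
```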

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

Score is being calculated. Check back soon.

Available Tools

47 tools
air_quality_index (A)

[$0.001 USDC per call (x402)] Air quality index for any US location. AQI value, pollutant levels (PM2.5, ozone), and health category. Powered by PROWL. Use this to answer 'what is the air quality?' or 'is it safe to go outside?'

Parameters (JSON Schema)
lat (required): Latitude
lng (required): Longitude
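
To make the required parameters concrete, an illustrative arguments object for this tool might look like the following; the coordinates are example values, not taken from this page.

```python
# Illustrative arguments for a tools/call to air_quality_index.
# Both fields are required; the values are example US coordinates (Los Angeles).
arguments = {
    "lat": 34.0522,
    "lng": -118.2437,
}
```
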
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden and successfully communicates critical behavioral traits: the cost per call ($0.001 USDC), the data provider (PROWL), and the geographic constraint (US-only). It could be improved by mentioning rate limits or error handling for non-US coordinates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with cost information front-loaded in brackets, followed immediately by core functionality, return value summary, attribution, and usage examples. Every sentence serves a distinct purpose without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple two-parameter lookup tool without output schema or annotations, the description is complete. It explains what data is returned (compensating for missing output schema), discloses cost prerequisites, and defines the geographic scope, leaving no critical gaps for an agent to successfully invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While the input schema has 100% description coverage (establishing a baseline of 3), the description adds valuable semantic context by specifying 'US location,' which implies the lat/lng parameters should correspond to United States coordinates. This geographic constraint is not present in the schema itself.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific resource (Air Quality Index) and scope (any US location), and explicitly lists returned data types (AQI value, PM2.5, ozone, health category). This effectively distinguishes it from weather-focused siblings like weather_current or weather_forecast by focusing on pollution metrics rather than meteorological conditions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive use cases ('Use this to answer what is the air quality? or is it safe to go outside?'), giving clear guidance on when to invoke the tool. However, it lacks negative constraints or explicit differentiation from the weather-related sibling tools that might also address outdoor safety questions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

conflict_escalation_tracker (A)

[$0.005 USDC per call (x402)] Conflict escalation tracking -monitors whether an active conflict is intensifying, stabilizing, or de-escalating based on event frequency, severity, and actor involvement trends. Powered by PROWL. Use this to answer 'is the conflict in [area] getting worse?' or 'escalation trend for [conflict]?'

Parameters (JSON Schema)
conflict (required): Conflict or region to track

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It successfully discloses critical cost information ('$0.005 USDC per call'), data provenance ('Powered by PROWL'), and analytical methodology. Missing only minor operational details like rate limits or explicit read-only status, but cost transparency is particularly valuable for agent decision-making.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense with no wasted words. Front-loads pricing (critical for cost-aware agents), followed by purpose, methodology, source attribution, and usage examples. Minor structural awkwardness with the dash in 'tracking -monitors' and parenthetical pricing prefix, but every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool without output schema, coverage is strong: purpose, cost, methodology, data source, and usage patterns are documented. Minor gap: does not describe the return value structure (e.g., whether it returns a categorical label, score, or text analysis), though the three possible states (intensifying/stabilizing/de-escalating) are enumerated.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage ('Conflict or region to track'), establishing baseline 3. Description reinforces this through usage examples showing '[area]' and '[conflict]' placeholders but does not add significant semantic depth, data format constraints, or example values beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent specificity: states it 'monitors whether an active conflict is intensifying, stabilizing, or de-escalating' and lists the specific factors used (event frequency, severity, actor involvement trends). Clearly distinguishes from generic conflict monitoring tools through this methodology disclosure.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit example queries ('is the conflict in [area] getting worse?', 'escalation trend for [conflict]?') that clarify when to invoke. However, lacks explicit mention of sibling tool 'conflict_spillover_analysis' or exclusion criteria for when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

conflict_spillover_analysis (A)

[$0.03 USDC per call (x402)] Conflict spillover analysis -given an active conflict, scores probability of geographic spread using adjacency, alliance networks, historical patterns, and cross-border event frequency. Powered by PROWL intelligence engine. Use this to answer 'will this conflict spread?' or 'which neighboring countries are at risk?'

Parameters (JSON Schema)
conflict (required): Active conflict to analyze

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and successfully discloses critical behavioral traits: the $0.03 USDC per call cost (x402), the use of PROWL intelligence engine, and the analytical methodology. However, it omits whether this is a read-only operation or details about rate limits and response format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent structure with cost front-loaded (critical for agent decision-making), followed by function definition, methodology details, and usage examples. No wasted words; every clause provides necessary context for tool selection.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex geopolitical analysis tool with no output schema, the description provides good methodological context but lacks information about return value structure (e.g., probability scores, risk ratings, country lists) and operational constraints beyond pricing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the single 'conflict' parameter fully described as 'Active conflict to analyze'. The description mentions 'given an active conflict' which aligns with but does not extend beyond the schema documentation. Baseline score appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states the tool 'scores probability of geographic spread' using specific factors (adjacency, alliance networks, historical patterns, cross-border event frequency). It clearly distinguishes from sibling 'conflict_escalation_tracker' by focusing specifically on geographic spillover rather than intensity escalation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use-case guidance: 'Use this to answer will this conflict spread? or which neighboring countries are at risk?' However, it does not explicitly mention when NOT to use this (e.g., for tracking conflict intensity) or cite 'conflict_escalation_tracker' as the alternative for escalation monitoring.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crypto_health_check (A)

[Free] Free crypto health check -basic market status, BTC/ETH prices, 24h change, and overall market cap. No payment required. Powered by PROWL.

Parameters (JSON Schema)

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden. It successfully communicates the cost model (free) and return data types (prices, changes, market cap), but lacks operational details like rate limits, data freshness/latency, or attribution requirements implied by 'Powered by PROWL'.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

While the description is brief, it wastes space with redundant cost declarations: '[Free]', 'Free', and 'No payment required' appear in close proximity, effectively stating the same information three times. The structure front-loads cost information before functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (no parameters, simple data retrieval), the description adequately covers the essential context: what data is returned (compensating for the missing output schema) and the access cost. For a basic health check endpoint, this is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema contains zero parameters. According to the rubric, 0 parameters establishes a baseline score of 4. The description correctly omits parameter discussions since none exist, and the schema requires no additional semantic clarification.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns basic market status, BTC/ETH prices, 24h change, and market cap. The use of 'basic' implicitly distinguishes it from more comprehensive siblings like crypto_market_briefing and crypto_market_regime, though it doesn't explicitly name them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The '[Free]' label and 'No payment required' text imply usage context (use when cost-free access is needed), but there are no explicit when-to-use guidelines or comparisons against the nine sibling crypto tools that would help an agent select this over crypto_trading_edges or crypto_market_briefing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crypto_market_briefing (A)

[$0.05 USDC per call (x402)] Comprehensive crypto market briefing -regime, key levels, flow analysis, funding rates, liquidation maps, and actionable trade ideas synthesized into a structured report. The deepest crypto analysis PROWL offers. Powered by PROWL intelligence engine. Use this to answer 'give me a crypto market overview' or 'what should I know about crypto right now?'

Parameters (JSON Schema)

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses critical cost information ('$0.05 USDC per call') which affects agent selection, and describes the output as a 'structured report.' It could further improve by mentioning data freshness or rate limits, but the cost disclosure is the key behavioral trait added.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description efficiently front-loads cost information, follows with capability scope, positioning against siblings, and explicit usage examples. Every sentence serves a distinct purpose, though the 'Powered by PROWL intelligence engine' clause adds marginal value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description adequately compensates by enumerating the report components (regime, key levels, etc.) and format. For a zero-parameter tool, the context provided is sufficient for correct invocation, though specifying coverage scope (e.g., which assets) would further improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema contains zero parameters. Per the rubric, 0 params establishes a baseline score of 4. The description appropriately requires no parameter clarification.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it generates a 'Comprehensive crypto market briefing' listing specific components (regime, key levels, flow analysis, funding rates, liquidation maps, trade ideas). It clearly distinguishes from siblings by claiming to be 'The deepest crypto analysis PROWL offers,' positioning it above narrower tools like crypto_health_check or crypto_market_regime.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage triggers: 'Use this to answer 'give me a crypto market overview' or 'what should I know about crypto right now?'' This gives the agent clear conversational contexts for invocation vs. more specialized sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crypto_market_regime (A)

[Free] Crypto market regime detection -risk-on, risk-off, or transitioning. Based on volatility structure, correlation patterns, and funding rates. Know whether the market environment favors momentum or mean-reversion strategies. Powered by PROWL. Use this to answer 'what regime is crypto in?' or 'is crypto risk-on or risk-off?'

Parameters (JSON Schema)

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses methodology (volatility structure, correlation patterns, funding rates) and cost ('[Free]'), but lacks operational details like caching behavior, data freshness, or specific output format/structure since no output schema exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence earns its place: opening establishes purpose and states, second sentence details methodology, third explains strategic implications, fourth attributes source, and fifth specifies exact query patterns. No redundant or filler content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and no output schema, the description adequately explains the conceptual return values (regime classifications) and their trading implications. It misses only the specific data structure of the response, which would be helpful given the absence of an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema contains zero parameters. According to calibration guidelines, zero-parameter tools receive a baseline score of 4, as there are no parameter semantics to describe beyond what the schema already communicates (100% coverage of empty set).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool performs 'Crypto market regime detection' with specific classification states (risk-on, risk-off, transitioning). It distinguishes itself from siblings like crypto_health_check and crypto_trading_edges by focusing specifically on regime classification rather than general health or trading opportunities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use-case guidance: 'Use this to answer what regime is crypto in?' and explains it helps determine whether the environment favors momentum or mean-reversion strategies. It lacks explicit 'when not to use' exclusions, but clearly defines the specific questions it addresses.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

crypto_trading_edges (A)

[Free] Crypto trading edge signals -funding rate anomalies, basis trades, liquidation cascades, and flow imbalances. Actionable signals with entry/exit levels and confidence scores. Powered by PROWL. Use this to answer 'where is there edge in crypto?' or 'crypto trading opportunities?'

Parameters (JSON Schema)

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses output structure (entry/exit levels, confidence scores) and data source (PROWL), adding context beyond the name. However, lacks operational details like real-time vs cached data, rate limits, or safety profile that would be essential for a trading signal tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with efficient front-loading: signal type enumeration, output format description, attribution, and usage example. Each sentence delivers distinct value. '[Free]' and 'Powered by PROWL' marginally reduce purity but provide relevant cost and provenance context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and no output schema, the description partially compensates by describing the semantic content of returns (actionable signals with levels/scores). However, it omits critical context like signal frequency, time horizons, or return structure (array vs object), leaving gaps for an agent attempting to parse responses.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema contains zero parameters. Per rubric, 0 params equals baseline 4. The description correctly omits parameter discussion as none exist, and the '[Free]' prefix hints at accessibility without requiring parameterization.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the resource (crypto trading edge signals) and specific types provided (funding rate anomalies, basis trades, liquidation cascades, flow imbalances). It effectively distinguishes from sibling crypto tools by specifying these technical signal categories rather than general market data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive guidance: 'Use this to answer where is there edge in crypto? or crypto trading opportunities?' This clearly frames the interrogative use case. Lacks explicit 'when not to use' or direct sibling contrasts, but the specific signal types imply differentiation from general market briefings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

economic_calendar (A)

[$0.001 USDC per call (x402)] Upcoming economic data releases -jobs reports, CPI, GDP, FOMC meetings, and more. Know what's coming before it moves markets. Each entry includes expected date, previous value, and consensus forecast when available. Powered by PROWL. Use this to answer 'when is the next jobs report?' or 'what economic data comes out this week?'

Parameters (JSON Schema)
days_ahead (optional): How many days forward (default 14)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. Adds critical cost information ($0.001 USDC), data provenance (Powered by PROWL), and return structure (expected date, previous value, consensus forecast) despite absence of output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense with zero waste. Front-loads the cost (critical for a paid tool), followed by purpose, value proposition, return data description, and concrete usage examples. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple single-parameter tool with no output schema, description adequately compensates by detailing return values (date, previous, consensus) and providing usage context. Complexity-appropriate coverage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with clear description ('How many days forward'). Description implies parameter usage via 'what economic data comes out this week?' example but doesn't enhance parameter semantics beyond schema baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb+resource ('Upcoming economic data releases') with concrete examples (jobs reports, CPI, GDP, FOMC). Effectively distinguishes from sibling economic_indicator (current values) and fed_speeches by emphasizing calendar/forecast nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit usage examples ('Use this to answer when is the next jobs report?') provide clear positive guidance. Lacks explicit 'when not to use' exclusions (e.g., historical data queries), but examples effectively steer toward calendar-style queries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

economic_indicator (A)

[Free] Economic indicator data from PROWL macro data. Access GDP, unemployment, inflation (CPI/PCE), interest rates, housing starts, retail sales, and 800,000+ time series. Latest values with previous period comparison. Powered by PROWL. Use this to answer 'what is the current GDP?' or 'what is the unemployment rate?' Combine with prediction_market_odds for how markets react to economic data.

Parameters (JSON Schema)
series_id (required): FRED series ID (e.g. GDP, UNRATE, CPIAUCSL, DGS10, FEDFUNDS)
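
As an illustration, a call could pass one of the FRED series IDs the schema lists as examples; UNRATE is the headline unemployment rate series.

```python
# Illustrative arguments for economic_indicator, using a series ID drawn from
# the schema's own examples.
arguments = {"series_id": "UNRATE"}
```
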
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses key behaviors: returns 'Latest values with previous period comparison' (comparative data), 'Free' (cost), and '800,000+ time series' (scope). Missing rate limits, caching behavior, or detailed return format disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence earns its place: [Free] tag, data source, scope (800k+ series), behavioral feature (previous comparison), usage examples, and sibling integration. Front-loaded with critical metadata (cost/source) and zero redundancy across six efficient sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter retrieval tool with complete schema coverage and no output schema, the description adequately covers data content (latest+previous values), source attribution, and practical application. Minor gap in not explicitly describing the return structure, though 'latest values' implies current data points.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with examples (GDP, UNRATE, etc.). Description reinforces these examples by listing indicator types in the text, but adds no syntax or semantic detail beyond the schema's 'FRED series ID' definition. Baseline 3 appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb+resource combination ('Economic indicator data from PROWL macro data') with explicit scope (GDP, unemployment, inflation, 800,000+ time series). Distinguishes from siblings like economic_calendar or macro_briefing by focusing on specific time series retrieval rather than events or summaries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides concrete use cases ('Use this to answer what is the current GDP?') and explicitly names complementary sibling tool ('Combine with prediction_market_odds'). Lacks explicit negative guidance on when to use economic_calendar instead, but clear positive examples suffice.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

economic_surprise_index (A)

[$0.005 USDC per call (x402)] Economic surprise index -are economic releases beating or missing expectations? Positive means data is stronger than forecast, negative means weaker. Tracks rolling surprise score and individual release surprises. Powered by PROWL. Use this to answer 'is the economy doing better or worse than expected?' or 'did the jobs report beat expectations?'

Parameters (JSON Schema)
days_back (optional): Window for rolling index (default 30)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It successfully adds critical behavioral context: cost disclosure ('$0.005 USDC per call'), data source ('Powered by PROWL'), and output interpretation ('Positive means data is stronger than forecast, negative means weaker'). Does not mention error handling, caching, or explicit read-only status, but covers the most critical operational constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is efficiently structured with cost upfront, followed by definition, value interpretation, data source, and usage examples. Every sentence earns its place; no redundant or filler content. Front-loaded with the most critical information (cost and core function).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema exists, the description appropriately explains return values (rolling vs individual surprises, positive/negative interpretation) and covers the paid pricing model. For a single-parameter tool of moderate complexity, the description provides sufficient context for correct invocation and result interpretation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage for the single parameter ('days_back'). The description mentions 'rolling surprise score' which loosely connects to the rolling window concept, but does not explicitly discuss parameter semantics, syntax, or provide examples. Baseline 3 is appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly defines the tool's purpose with specific verbs ('Tracks rolling surprise score and individual release surprises') and distinguishes from siblings like 'economic_indicator' or 'economic_calendar' by emphasizing 'beating or missing expectations' and 'surprise' analysis rather than raw data retrieval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance through example questions ('is the economy doing better or worse than expected?', 'did the jobs report beat expectations?'), clearly indicating when to invoke the tool. Lacks explicit naming of alternative tools (e.g., 'use economic_indicator for actual values'), though the examples effectively differentiate usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fed_rate_odds (A)

[Free] Federal Reserve rate decision probabilities. What are the odds of a rate cut, hold, or hike at the next FOMC meeting? Synthesized from futures markets and prediction market data. Powered by PROWL cross-source analysis. Use this to answer 'will the Fed cut rates?' or 'what are the odds of a rate hike?'

Parameters (JSON Schema)
meetings_ahead (optional): How many meetings forward (default 3)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It adds valuable data provenance ('Synthesized from futures markets and prediction market data', 'Powered by PROWL cross-source analysis') and cost indication ('[Free]'). However, it omits operational details like data freshness, rate limits, caching behavior, or explicit confirmation that this is a read-only operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with zero waste: cost flag ([Free]), core purpose, data sources, attribution, and usage examples each serve distinct purposes. Front-loaded with the essential function. Minor deduction for promotional content ('Powered by PROWL') that, while relevant to trust, slightly dilutes technical clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description effectively hints at return values by specifying the three probability outcomes (cut, hold, hike). For a single-parameter tool, this provides adequate completeness. Could improve by explicitly stating the return format (e.g., percentages or decimals) and whether it returns single or multiple meeting forecasts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (meetings_ahead is fully documented). The description implies temporal scope through references to 'next FOMC meeting' but does not explicitly discuss the parameter syntax or semantics. With high schema coverage, baseline 3 is appropriate as the description doesn't need to compensate but doesn't add param-specific usage examples either.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool provides 'Federal Reserve rate decision probabilities' for rate cuts, holds, or hikes at FOMC meetings. It clearly distinguishes from siblings like 'prediction_market_odds' by specifying the Fed/FOMC domain and from 'fed_tone_analysis' by focusing on quantitative probabilities rather than qualitative sentiment.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear usage context through specific example queries ('will the Fed cut rates?', 'what are the odds of a rate hike?'). However, it lacks explicit 'when-not-to-use' guidance or direct comparison to siblings like 'fed_tone_analysis' (for qualitative signals) or 'economic_indicator' (for historical data rather than forward-looking odds).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fed_speeches (A)

[$0.001 USDC per call (x402)] Recent Federal Reserve speeches with speaker, date, and key excerpts. Track what Fed governors are saying about inflation, employment, and interest rates. Powered by PROWL. Use this to answer 'what did the Fed say recently?' or 'what is the Fed's stance on rates?'

Parameters (JSON Schema)
speaker (optional): Filter by speaker name
days_back (optional): How far back (default 30)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries significant burden by disclosing pricing ('$0.001 USDC per call'), data provenance ('Powered by PROWL'), and output content structure ('speaker, date, and key excerpts'). Missing minor operational details like rate limits or caching behavior, but covers economic and data transparency well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Exceptionally structured with critical cost information front-loaded in brackets, followed by resource definition, value proposition, data source, and use cases. Every sentence delivers distinct value without redundancy. The flow moves logically from economic cost to functional capability to practical application.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter tool without output schema, the description adequately compensates by describing return content (speaker, date, excerpts) and covering the monetary/policy domain comprehensively. Includes rare but important cost transparency. Would benefit from explicit output format or pagination notes to achieve completeness for production use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, providing 'Filter by speaker name' and 'How far back (default 30)'. The description aligns parameters to concepts ('track what Fed governors are saying' maps to speaker, 'recent' maps to days_back) but adds no technical syntax, constraints, or format details beyond the schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'Recent Federal Reserve speeches with speaker, date, and key excerpts' using specific verbs and resources. It distinguishes itself from siblings like fed_rate_odds and fed_tone_analysis by emphasizing raw speech content and specific tracking capabilities for inflation, employment, and interest rates.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive guidance with concrete use cases: 'Use this to answer what did the Fed say recently?' or 'what is the Fed's stance on rates?' This clarifies when to select this tool over siblings. Lacks explicit negative guidance (when NOT to use) or named alternative tools, preventing a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fed_tone_analysis (A)

[$0.005 USDC per call (x402)] Fed communication tone analysis -hawkish vs dovish scoring of recent speeches and statements. Tracks shifts in Fed rhetoric that precede policy changes. Score from -1 (extremely dovish) to +1 (extremely hawkish) with trend direction. Powered by PROWL. Use this to answer 'is the Fed getting more hawkish?' or 'what direction is Fed policy headed?'

Parameters (JSON Schema)
days_back (optional): Analysis window (default 90)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and discloses critical behavioral traits: per-call pricing ($0.005 USDC), output format (scoring range -1 to +1 with trend direction), data source (PROWL), and predictive nature (tracks shifts preceding policy changes).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense single paragraph front-loaded with pricing and purpose. Efficiently combines cost transparency, functionality, output specification, and usage guidance. Minor technical notation (x402) is necessary for cost structure but slightly reduces readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Compensates effectively for missing output schema by explicitly documenting the return value format (scale from -1 to +1 with trend direction) and cost structure. Given the tool's focused scope (single scoring metric), the description provides sufficient context for invocation decisions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the 'days_back' parameter fully documented as 'Analysis window (default 90)'. The description implies temporal scope via 'recent speeches' but does not add syntax, format constraints, or semantic details beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Explicitly states the tool performs 'hawkish vs dovish scoring' of Fed communications, using specific terminology (-1 to +1 scale) that clearly distinguishes it from sibling tools like fed_speeches (raw data) or fed_rate_odds (probabilistic forecasting).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer is the Fed getting more hawkish?' and 'what direction is Fed policy headed?' giving clear positive guidance on when to invoke. Lacks explicit exclusions or named alternatives, preventing a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

geopolitical_events (A)

[Free] Geopolitical events by region -conflicts, diplomatic actions, protests, military movements, and cooperation events. Each event scored on the Goldstein scale (-10 extreme conflict to +10 extreme cooperation). Powered by PROWL. Use this to answer 'what is happening in [region]?' or 'are there conflicts in [area]?'

Parameters (JSON Schema)
limit (optional): Max results (default 50)
region (required): Region or country (e.g. 'Middle East', 'Ukraine', 'Taiwan')
days_back (optional): How far back (default 7)
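
As an illustration of how the required and optional parameters combine (the values are examples; anything omitted falls back to the defaults in the table above):

```python
# Illustrative arguments for geopolitical_events. Only "region" is required;
# "limit" is omitted and would use its default of 50, while "days_back"
# overrides the 7-day default.
arguments = {
    "region": "Middle East",
    "days_back": 14,
}
```
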
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the Goldstein scale methodology and data source (PROWL), and cost status ([Free]), but lacks operational details like data latency, update frequency, or behavior when no events match the query.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently packed with no wasted words: opens with cost/model info, defines the resource and event types, explains the scoring methodology, cites the provider, and closes with usage examples. Slightly dense but well front-loaded with critical selection criteria.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriately complete for tool selection despite lacking an output schema. Covers data content (events with Goldstein scores), scope (regional), cost, and source. Could be improved by briefly describing the return structure (e.g., 'returns list of scored events'), but the Goldstein reference provides sufficient output context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage for all 3 parameters (region, limit, days_back). The description mentions 'by region' aligning with the required parameter, but adds no additional semantic guidance (e.g., valid date ranges, region formats) beyond what the schema already provides, warranting the baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states the resource (geopolitical events) and specific types covered (conflicts, diplomatic actions, protests, military movements, cooperation), distinguishing it from siblings like conflict_escalation_tracker or sanctions_tracker which focus narrowly. The Goldstein scale scoring (-10 to +10) further clarifies the data's nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit natural language usage patterns: 'Use this to answer what is happening in [region]?' or 'are there conflicts in [area]?' giving clear invocation triggers. However, it does not explicitly state when to prefer siblings like conflict_spillover_analysis or regional_tension_index over this general-purpose tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hurricane_landfall_probability (A)

[$0.005 USDC per call (x402)] Hurricane landfall probability for active storms. Probability of landfall by state/region, projected intensity at landfall, and historical comparison. Powered by PROWL. Use this to answer 'will the hurricane hit Florida?' or 'what are the chances of hurricane landfall?'

Parameters (JSON Schema)
storm_id (optional): Specific storm (or latest active)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden and importantly reveals the cost '$0.005 USDC per call (x402)' and data source 'Powered by PROWL.' It also describes the return data categories, though it omits error handling behavior, rate limits, and specific output structure details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description efficiently front-loads critical cost information, followed by purpose, output specifications, and usage examples in a compact format. Each sentence delivers value, though the 'Powered by PROWL' attribution provides marginal utility for tool selection.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations and output schema, the description adequately covers the tool's cost, purpose, and return data categories, but lacks detail on error states, output format structure, and caching behavior that would be necessary for a higher score.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for the single `storm_id` parameter, establishing a baseline score of 3. The description implicitly supports this by mentioning 'active storms' but does not add additional semantic details about parameter format or validation beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool calculates 'Hurricane landfall probability for active storms' and specifies outputs including 'Probability of landfall by state/region, projected intensity at landfall, and historical comparison.' While it effectively defines the scope, it does not explicitly differentiate from the sibling tool `hurricane_tracker`.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear positive usage guidance with specific example queries: 'Use this to answer 'will the hurricane hit Florida?' or 'what are the chances of hurricane landfall?'' However, it lacks explicit guidance on when not to use this tool or when to prefer alternatives like `hurricane_tracker`.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

hurricane_tracker

[$0.001 USDC per call (x402)] Active hurricane and tropical storm tracking. Current position, intensity, projected path, and estimated landfall. Powered by PROWL. Use this to answer 'are there any active hurricanes?' or 'where is the hurricane headed?'

Parameters (JSON Schema)
No parameters

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden. It successfully reveals cost ('$0.001 USDC per call') and data source ('Powered by PROWL'). However, it omits rate limits, data freshness/frequency, and behavior when no hurricanes exist (empty response vs error).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Optimal structure front-loaded with critical cost information, followed by capability summary, data source attribution, and usage examples. Every clause earns its place with zero redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Sufficient for a zero-parameter tool without output schema. The description compensates by enumerating return content domains (position, intensity, path, landfall). Could improve by specifying the return data structure (JSON object vs string) or caching behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema contains 0 parameters. The description correctly implies no filtering arguments are required by referring to 'Active' storms generally, matching the schema. Baseline 4 applies for zero-parameter tools.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the core function ('Active hurricane and tropical storm tracking') with specific data points returned (position, intensity, projected path, estimated landfall). However, it fails to explicitly differentiate from the sibling tool 'hurricane_landfall_probability', creating potential ambiguity about which tool to use for landfall-specific queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit example queries ('are there any active hurricanes?', 'where is the hurricane headed?') that guide the agent on when to invoke the tool. Lacks explicit guidance on when NOT to use it versus the sibling landfall probability tool, or what happens when no storms are active.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

macro_briefing

[$0.05 USDC per call (x402)] Comprehensive macro briefing synthesizing latest economic indicators, Fed signals, yield curve status, and relevant prediction market odds into a structured report. Covers growth, inflation, employment, and policy outlook with connections to active prediction markets. Powered by PROWL. Use this to answer 'give me a macro overview' or 'what is the economic outlook?'

Parameters (JSON Schema)
No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and successfully discloses critical behavioral traits: the $0.05 USDC cost per call, the data provider (PROWL), and the output format (structured report). Could improve by mentioning rate limits, caching behavior, or real-time vs. delayed data status.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Exceptionally well-structured with zero waste. Front-loads critical cost information ($0.05 USDC), follows with synthesis description, coverage areas, attribution, and usage examples. Every sentence earns its place in a compact description.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero parameters and no output schema, the description adequately covers the essential context: cost, data sources synthesized, coverage domains, and usage scenarios. Minor gap in not describing the specific structure or fields of the returned report, though 'structured report' provides some guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema contains zero parameters. Per scoring guidelines, 0 parameters warrants a baseline score of 4. The description appropriately does not invent parameter semantics where none exist.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool synthesizes economic indicators, Fed signals, yield curve status, and prediction market odds into a structured report, covering growth, inflation, employment, and policy. The comprehensive scope clearly distinguishes it from specific sibling tools like economic_indicator or fed_rate_odds.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance with example queries ('give me a macro overview' or 'what is the economic outlook?'), giving clear context for when to invoke this tool. Lacks explicit exclusions or named alternatives for specific indicator lookups, though the comprehensive nature implies differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

narrative_divergence

[$0.005 USDC per call (x402)] Narrative divergence - when different sources tell different stories. News says one thing, Reddit says another, Google searches point elsewhere. Divergence often precedes market moves as different audiences process information at different speeds. Powered by PROWL. Use this to answer 'are media and public opinion aligned on X?' or 'where do narratives disagree?'

Parameters (JSON Schema)
topic (required): Topic to analyze
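
A hedged sketch of the arguments payload, reusing the tools/call envelope shown above for hurricane_landfall_probability; the topic string is illustrative.

```python
# narrative_divergence takes a single required free-form topic.
arguments = {"topic": "central bank rate cuts"}  # illustrative topic
```
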
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses cost ($0.005 USDC per call), payment protocol (x402), provider (PROWL), and predictive utility ('precedes market moves'). However, it lacks details on rate limits, caching, or response structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense but well-structured with cost front-loaded, followed by definition, examples, significance, attribution, and use cases. Every sentence serves a purpose, though the bracketed cost prefix slightly disrupts flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool, the description adequately covers the value proposition and operational context (cost, provider). However, given the lack of output schema and annotations, it should ideally describe the return format or data sources included, which it omits.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the 'topic' parameter fully described as 'Topic to analyze'. The description implies semantic context through use case examples ('aligned on X'), but does not explicitly elaborate on parameter format, constraints, or valid values beyond the schema baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines narrative divergence as detecting when different sources (News, Reddit, Google searches) tell different stories, using specific examples. It effectively distinguishes from siblings like narrative_velocity or trending_narratives by focusing specifically on cross-source disagreement and alignment detection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer are media and public opinion aligned on X?' and 'where do narratives disagree?' This gives clear guidance on when to select the tool, though it does not explicitly name sibling alternatives to avoid.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

narrative_lifecycle

[$0.03 USDC per call (x402)] Narrative lifecycle analysis - classify where a topic sits in its attention cycle: emerging (accelerating, < 48h old), peaking (high volume, decelerating), fading (declining), or dormant. Includes projected trajectory and historical comparable narratives. Powered by PROWL intelligence engine. Use this to answer 'is this story still growing or dying?' or 'how long will this narrative last?'

Parameters (JSON Schema)
topic (required): Topic to analyze
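
As with the other topic-based narrative tools, only the arguments payload changes; the topic value below is illustrative.

```python
# narrative_lifecycle classifies the topic as emerging, peaking, fading, or dormant.
arguments = {"topic": "AI chip export controls"}  # illustrative topic
```
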
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses pricing ($0.03 USDC per call), data source (PROWL intelligence engine), classification methodology (time-based and volume-based criteria), and output components (projected trajectory, historical comparables). Missing error handling or rate limit details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent structure: cost upfront, core classification logic with parenthetical definitions, additional features (trajectory/comparables), and usage examples. Every sentence delivers distinct value (pricing, methodology, applicability) without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description effectively compensates by detailing the four lifecycle categories and additional return components (trajectory projections, historical comparables). Covers cost, source, classification logic, and use cases. Deducted one point for missing error state descriptions (e.g., topic not found).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a single parameter 'topic' described as 'Topic to analyze'. The description mentions analyzing 'where a topic sits' but adds minimal semantic detail beyond the schema regarding expected format (e.g., keywords vs. sentences). Baseline score appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool classifies 'where a topic sits in its attention cycle' with specific categorical outputs (emerging, peaking, fading, dormant) including definitions (e.g., emerging = accelerating, <48h old). The lifecycle focus clearly distinguishes it from sibling tools like narrative_velocity or narrative_search.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases via example questions: 'is this story still growing or dying?' and 'how long will this narrative last?'. However, lacks explicit guidance on when to prefer this over narrative_velocity or trending_narratives siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

narrative_prediction_relevance

[$0.03 USDC per call (x402)] Score how relevant a narrative trend is to active prediction markets. Fuzzy matches trending topics against active market questions. High scores mean the narrative likely moves prediction market odds. Powered by PROWL intelligence engine. Use this to answer 'will this news affect prediction markets?' or 'which trending stories could move betting odds?'

Parameters (JSON Schema)
topic (optional): Specific topic (or omit for auto-scan of trending)
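
Because topic is optional here, two hedged payloads are worth sketching: a targeted score and the auto-scan mode. The topic value is illustrative; the auto-scan behavior comes from the schema above.

```python
# Score one specific narrative against active markets (value is illustrative).
targeted = {"topic": "government shutdown"}

# Omit topic entirely to auto-scan trending narratives, per the schema default.
auto_scan = {}
```
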
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden. It successfully discloses cost ('$0.03 USDC per call'), matching algorithm ('Fuzzy matches'), and score interpretation ('High scores mean the narrative likely moves prediction market odds'). Minor gap on rate limits or error conditions prevents a 5.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense and front-loaded: cost and core function appear immediately, followed by mechanism, interpretation, attribution, and usage examples. Every sentence earns its place with zero redundancy despite covering pricing, technical approach, and usage guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the single optional parameter and lack of output schema, the description adequately covers tool purpose, cost structure, and result interpretation. Minor deduction for not specifying the score format/range (e.g., 0-100 vs 0-1) given no output schema exists to document this.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage for the single 'topic' parameter. The description references 'trending topics' which aligns with the parameter's purpose, but does not add syntax details, format constraints, or semantic nuances beyond what the schema already provides. Baseline 3 is appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'Score[s] how relevant a narrative trend is to active prediction markets' with specific mechanism details ('Fuzzy matches trending topics against active market questions'). It effectively distinguishes this tool from pure narrative tools (siblings like narrative_search, narrative_velocity) and pure prediction market tools by occupying the intersection domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage patterns: 'Use this to answer will this news affect prediction markets?' or 'which trending stories could move betting odds?' These example questions clearly signal when to select this tool versus alternatives like narrative_divergence or prediction_market_odds.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

narrative_velocity

[$0.005 USDC per call (x402)] Narrative velocity - rate of change in coverage/attention across news, social, and search channels. Accelerating velocity means a topic is breaking out. Decelerating means it's fading. Cross-source agreement makes signals stronger. Powered by PROWL. Use this to answer 'is this story gaining or losing momentum?' or 'what narratives are accelerating?'

Parameters (JSON Schema)
topic (required): Topic to analyze
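
A minimal hedged payload; comparing momentum across several stories would mean one call per topic, with each topic string chosen by the agent.

```python
# narrative_velocity takes a single required topic per call.
arguments = {"topic": "hurricane season forecasts"}  # illustrative topic
```
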
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses critical cost information ('$0.005 USDC per call'), data source ('Powered by PROWL'), and signal quality factors ('Cross-source agreement makes signals stronger'). It explains how to interpret outputs (accelerating = breaking out, decelerating = fading). Missing only operational details like rate limits or cache behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent structure: leads with cost (critical for agent decision-making), defines the metric, explains interpretation logic, cites data source, and closes with explicit use cases. No filler content; every sentence provides actionable information for tool selection.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter analytical tool without an output schema, the description adequately explains the conceptual return value (momentum status/velocity). However, it could briefly indicate the return structure (e.g., velocity scores, trend direction) to complete the picture. Given the complexity and lack of output schema, it meets most but not all informational needs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (topic described as 'Topic to analyze'), establishing a baseline of 3. The description adds semantic context by implying the topic should be a 'story' or 'narrative' through usage examples, but does not elaborate on format requirements, valid topic lengths, or example inputs beyond the conceptual level.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool's function as measuring 'rate of change in coverage/attention across news, social, and search channels' and explains the core concept (accelerating vs decelerating velocity). It specifically targets 'narrative velocity' which distinguishes it from sibling tools like narrative_divergence or narrative_lifecycle.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer is this story gaining or losing momentum?' and 'what narratives are accelerating?'. This gives clear positive guidance on when to invoke the tool. However, it lacks explicit differentiation from the 5+ sibling narrative tools (e.g., when to choose this over narrative_divergence or trending_narratives).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_arbitrage

[$0.03 USDC per call (x402)] Live cross-platform arbitrage opportunities in prediction markets. Finds events where you can buy Yes on one platform and No on another for guaranteed profit regardless of outcome. Includes estimated profit margin, required capital, and execution instructions. Powered by PROWL. Use this to answer 'are there arbitrage opportunities?' or 'how can I make risk-free profit from prediction markets?'

Parameters (JSON Schema)
limit (optional): Max results (default 10)
category (optional): Filter by category
min_profit (optional): Minimum profit margin percentage (default 1)
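
A hedged sketch of how the three optional filters might be combined; the category value and thresholds are illustrative, and the defaults come from the schema above.

```python
# prediction_market_arbitrage: tighten the default filters.
arguments = {
    "category": "politics",  # illustrative category value
    "min_profit": 2,         # raise the floor above the documented default of 1 percent
    "limit": 5,              # return fewer rows than the default of 10
}
```
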
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden and succeeds by revealing critical behavioral traits: the cost per call ($0.03 USDC), the read-only nature (implied by 'finds' and 'Live' opportunities), and the specific return value structure (estimated profit margin, required capital, execution instructions).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Every sentence earns its place: cost disclosure is front-loaded, core functionality follows immediately, output structure is specified, and usage examples conclude the description. No redundant or wasted text despite packing in pricing, mechanics, return values, and provider attribution.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the moderate complexity (financial arbitrage with 3 optional filters) and absence of an output schema, the description adequately compensates by detailing what the tool returns. It could be improved by explicitly stating the read-only safety profile given the lack of annotations, but otherwise covers cost, mechanics, and usage comprehensively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage (limit, category, min_profit all documented). The description does not add semantic meaning, syntax examples, or parameter relationships beyond what the schema provides, meeting the baseline for high-coverage schemas.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool 'Finds events where you can buy Yes on one platform and No on another for guaranteed profit regardless of outcome,' providing a specific verb (finds), resource (cross-platform arbitrage opportunities), and mechanism that clearly distinguishes it from siblings like prediction_market_edge or prediction_market_mispricing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance with example query patterns: 'Use this to answer 'are there arbitrage opportunities?' or 'how can I make risk-free profit from prediction markets?'' This offers clear context for invocation, though it does not explicitly contrast with alternative tools or provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_calibration

[$0.003 USDC per call (x402)] Platform accuracy scores and calibration analysis. How often do markets priced at 70% actually resolve Yes 70% of the time? Identifies which platforms are most reliable, which categories are well-calibrated, and where systematic biases exist. Powered by PROWL historical analysis. Use this to answer 'how accurate are prediction markets?' or 'which platform is most reliable?'

Parameters (JSON Schema)
category (optional): Filter by category
platform (optional): Filter by platform
time_period (optional): Analysis period
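
A hedged payload sketch; note that the schema does not document the expected time_period format or valid platform identifiers, so both values below are assumptions.

```python
# prediction_market_calibration: all filters are optional.
arguments = {
    "platform": "polymarket",  # assumed platform identifier
    "category": "politics",    # illustrative category
    "time_period": "90d",      # assumed period format; the schema does not specify one
}
```
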
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, but the description adds critical behavioral context: pricing ($0.003 USDC per call via the x402 payment protocol) and data provenance (PROWL historical analysis). This cost transparency is valuable operational information. It omits the safety profile (read-only status) and error behavior, but the data source and cost disclosure exceed baseline expectations for unannotated tools.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with cost information front-loaded, followed by definition, concrete example (70% calibration question), capabilities, data source, and usage guidance. Information-dense with minimal waste, though the cost prefix slightly disrupts the flow. Each sentence earns its place by conveying distinct operational or functional information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but description adequately covers conceptual return values (accuracy scores, reliability rankings, bias identification). Given the analytical complexity (calibration analysis) and 100% input schema coverage, the description provides sufficient context for an agent to understand what data will be returned, though specific return structure (JSON vs report) remains unspecified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (all 3 parameters have descriptions), establishing baseline 3. Description conceptually links parameters to outputs (platform→reliability scores, category→calibration analysis, time_period→historical analysis) but does not add parameter-specific semantics, formats, or validation rules beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Exceptionally clear: defines calibration analysis specifically ('How often do markets priced at 70% actually resolve Yes 70% of the time?'), identifies exact outputs (platform reliability, category calibration, systematic biases), and distinguishes from trading-focused siblings (arbitrage, edge, flow) by focusing on historical accuracy scoring rather than trading opportunities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive guidance ('Use this to answer how accurate are prediction markets? or which platform is most reliable?') that clearly signals when to select this tool. Lacks explicit 'when not to use' exclusions or named alternatives, though the descriptive differentiation from siblings is strong enough to imply proper usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_edge

[$0.05 USDC per call (x402)] Comprehensive edge analysis combining all PROWL intelligence signals. For each market, synthesizes cross-platform spread data, money flow patterns, mispricing signals, calibration bias, and volume anomalies into a single edge score with full reasoning. The deepest analysis PROWL offers. Use this to answer 'give me the full picture on this prediction market' or 'what is the best opportunity right now?'

Parameters (JSON Schema)
limit (optional): Max results (default 5)
category (optional): Filter by category
event_hash (optional): Specific event hash for deep dive
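
Two hedged payloads illustrate the two modes the description implies: a broad scan and a single-market deep dive. The event hash below is a placeholder; real hashes would presumably come from other market tools.

```python
# Broad scan across a category (values illustrative).
broad_scan = {"category": "crypto", "limit": 3}

# Deep dive on one market; the hash is a placeholder, not a real event.
deep_dive = {"event_hash": "example-event-hash"}
```
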
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It successfully discloses pricing ('$0.05 USDC per call') and output characteristics ('single edge score with full reasoning'), but lacks other critical behavioral details like rate limits, caching behavior, or the specific data structure/format of the returned analysis.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with zero waste: opens with critical cost info (front-loaded), defines capability, establishes positioning ('deepest analysis'), and closes with usage examples. Every sentence earns its place with high information density.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high complexity (synthesizing 5+ signal types) and absence of both output schema and annotations, the description provides minimum viable coverage by conceptualizing the output ('edge score with reasoning'). However, significant gaps remain regarding return value structure, pagination (if applicable to the limit parameter), and side effects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage ('Max results', 'Filter by category', 'Specific event hash'), the schema fully documents parameters. The description implies the 'event_hash' use case through phrases like 'deep dive' and 'for each market', but doesn't explicitly add semantic meaning beyond the schema's self-documentation. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool's function using specific verbs ('combining', 'synthesizes') and resources ('PROWL intelligence signals', 'cross-platform spread data', etc.). It effectively distinguishes from the 9 sibling prediction market tools by positioning this as the comprehensive 'all-in-one' analysis that aggregates their individual signals into a single edge score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases ('Use this to answer...') with concrete example queries ('give me the full picture', 'best opportunity right now'). Implicitly contrasts with siblings by claiming to be the 'deepest analysis' and listing the specific signals it synthesizes, though it doesn't explicitly name which sibling tools to use for narrower analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_flow

[Free] Money flow analysis and smart money tracking across prediction markets. Detects large trades, sudden shifts in position, and whale activity that moves odds. Identifies when informed traders are positioning before the crowd. Powered by PROWL. Use this to answer 'where is smart money going in prediction markets?' or 'are whales buying or selling on X?'

Parameters (JSON Schema)
limit (optional): Max results (default 20)
category (optional): Filter by category
min_size (optional): Minimum trade size in USD
platform (optional): Filter by platform
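
A hedged payload showing how min_size narrows results to whale-sized trades; the dollar threshold and platform identifier are assumptions.

```python
# prediction_market_flow: surface only large trades on one platform.
arguments = {
    "min_size": 10000,     # USD threshold, illustrative
    "platform": "kalshi",  # assumed platform identifier
    "limit": 10,
}
```
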
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden. It successfully indicates cost ('[Free]') and analytical capabilities (detecting whale activity, informed positioning), but omits operational details like rate limits, authentication requirements, data freshness, or error behaviors that would be necessary for safe invocation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured and front-loaded with the core '[Free] Money flow analysis' purpose. Each sentence earns its place: capabilities, specific detections, attribution, and usage examples. It is information-dense without being verbose, though the single-paragraph structure packs multiple concepts that could benefit from slight breathing room.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 100% schema coverage for 4 simple parameters, the description adequately covers input requirements. However, with no output schema provided, the description should ideally hint at return structure (e.g., trade lists vs. aggregations) or supported platforms, which are absent. It satisfies minimum requirements but leaves operational gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description adds conceptual context by mentioning 'large trades' and 'whale activity,' which semantically supports the min_size parameter, but does not explicitly document parameter syntax, valid values, or interdependencies beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'Money flow analysis and smart money tracking' with specific focus on 'large trades, sudden shifts in position, and whale activity.' It effectively distinguishes from siblings like prediction_market_odds or prediction_market_volume by emphasizing flow tracking and informed trader detection rather than static prices or aggregate volumes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance through example questions: 'where is smart money going?' and 'are whales buying or selling on X?' This gives clear context for when to invoke the tool. However, it lacks explicit exclusions or references to sibling alternatives (e.g., when to use prediction_market_volume instead).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_mispricing

[$0.03 USDC per call (x402)] AI-computed mispricing detection across prediction markets. Compares current odds against cross-platform consensus, volume-weighted fair value, historical calibration, and money flow signals to identify markets where the price is wrong. Each result includes a confidence score and reasoning. Powered by PROWL intelligence engine. Use this to answer 'which prediction markets are mispriced?' or 'what are the best bets right now?'

Parameters (JSON Schema)
limit (optional): Max results (default 10)
category (optional): Filter by category
min_confidence (optional): Minimum confidence 0-1 (default 0.5)
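
A hedged payload raising the confidence floor above the documented default; values are illustrative.

```python
# prediction_market_mispricing: min_confidence is a 0-1 value per the schema.
arguments = {
    "min_confidence": 0.7,   # stricter than the default of 0.5
    "category": "politics",  # illustrative category
    "limit": 5,
}
```
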
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden and successfully communicates critical cost information ('$0.03 USDC per call'), the analytical engine used ('PROWL intelligence engine'), and output format ('confidence score and reasoning'), but omits data freshness, caching policies, or computation latency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The four-sentence structure efficiently fronts the pricing model and core value proposition, followed by methodology, outputs, and use cases. Each sentence delivers distinct value, though the '(x402)' notation is cryptic and 'Powered by PROWL intelligence engine' adds limited functional clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description partially compensates by noting that results include confidence scores and reasoning, but fails to describe the structure of market identifiers, odds data, or the full result schema. With 3 simple parameters fully documented in schema, the description meets minimum viability but leaves gaps for a complex analytical tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, establishing a baseline of 3. The description references the confidence scoring mechanism which implicitly contextualizes the min_confidence parameter, but does not augment the schema's definitions of limit or category parameters with additional usage guidance or valid category values.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the core function ('AI-computed mispricing detection across prediction markets') and methodology ('Compares current odds against cross-platform consensus...'), clearly distinguishing it from sibling tools that merely report raw odds or volume by emphasizing the detection of pricing errors.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it provides helpful example queries ('which prediction markets are mispriced?', 'what are the best bets right now?'), it fails to explicitly differentiate when to use this versus siblings like prediction_market_arbitrage, prediction_market_edge, or prediction_market_calibration, leaving ambiguity about tool selection among similar options.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_new

[$0.001 USDC per call (x402)] Newly created prediction markets across all platforms. Be first to spot emerging topics - new markets often reflect breaking news, upcoming events, or trending narratives before mainstream coverage. Powered by PROWL. Use this to answer 'what new prediction markets just launched?' or 'what events are people starting to bet on?'

Parameters (JSON Schema)
limit (optional): Max results (default 50)
category (optional): Filter by category
hours_back (optional): How far back to look (default 24)
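
A hedged payload widening the discovery window beyond the 24-hour default; values are illustrative.

```python
# prediction_market_new: look back two days instead of one.
arguments = {
    "hours_back": 48,
    "category": "science",  # illustrative category
    "limit": 20,
}
```
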
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It adds the critical cost disclosure ('$0.001 USDC per call') and data source ('Powered by PROWL') not found elsewhere. However, it omits the safety profile (read-only vs destructive) and the return value structure, which matters given the lack of an output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent structure: cost upfront (critical for agent), followed by purpose, value proposition, and example queries. Every sentence earns its place with zero waste. Front-loaded with essential operational context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Strong coverage for a data retrieval tool: includes pricing, data provenance, scope ('across all platforms'), and usage patterns. Only gap is description of return values/structure, which would be helpful given no output schema exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage (limit, category, hours_back all documented). Description does not explicitly discuss parameters, but with full schema coverage, baseline 3 is appropriate—no additional parameter semantics required.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Explicitly states 'Newly created prediction markets across all platforms' with clear differentiation from the 10+ sibling prediction market tools (arbitrage, odds, volume, etc.) by focusing specifically on discovery of newly launched markets. Specific verb and resource identified.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance: 'Use this to answer what new prediction markets just launched?' and explains value proposition (spotting breaking news before mainstream coverage). Lacks explicit 'when not to use' or alternative naming (e.g., don't use for existing market analysis).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_odds

[Free] Live prediction market probabilities aggregated across multiple platforms. Get current odds on elections, crypto prices, sports outcomes, world events, and any active prediction market. Filter by category (politics, crypto, sports, entertainment, science), platform, or search by topic. Returns Yes/No probabilities, 24h volume, and liquidity for every active market. Powered by PROWL cross-platform data collection. Use this to answer 'what are the odds of X happening?' or 'what do prediction markets say about Y?'

Parameters (JSON Schema)
limit (optional): Max results (default 50)
search (optional): Search market questions by keyword
category (optional): Filter by category
platform (optional): Filter by platform
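
A hedged payload combining keyword search with a category filter; the keyword is illustrative, and the category is drawn from the list in the description.

```python
# prediction_market_odds: search plus filters.
arguments = {
    "search": "bitcoin",   # keyword matched against market questions
    "category": "crypto",  # one of the categories enumerated in the description
    "limit": 10,
}
```
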
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden effectively. It reveals cost ('[Free]'), data freshness ('Live'), scope ('aggregated across multiple platforms'), data source ('Powered by PROWL'), and return format ('Yes/No probabilities, 24h volume, and liquidity') despite the absence of an output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense structure front-loads the core function ('Live prediction market probabilities'). Every element earns its place: cost indicator [Free], capability details, return value description, and concrete usage examples. Slightly verbose but no wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequately compensates for missing annotations and output schema by describing return values and behavioral traits. Given the tool has only 4 optional parameters with 100% schema coverage, the description provides sufficient context for correct invocation without overwhelming detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description confirms the category enum values ('politics, crypto, sports, entertainment, science') and maps 'search' to 'topic', but does not add semantic depth beyond what the schema already provides for the four parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb-resource pair ('Live prediction market probabilities') and clearly distinguishes this as the primary odds-fetching tool among specialized siblings like prediction_market_arbitrage and prediction_market_volume. It specifies aggregation across multiple platforms and enumerates supported categories.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases ('Use this to answer what are the odds of X happening?') that help identify when to invoke the tool. While it doesn't explicitly name siblings as alternatives, the focus on current probabilities implicitly distinguishes it from specialized analysis tools (arbitrage, edge, flow) in the sibling list.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_resolution

[$0.001 USDC per call (x402)] Recently resolved prediction markets and their outcomes. Track which predictions came true, who was right, and what the final odds were before resolution. Useful for backtesting strategies, verifying claims about past events, or tracking platform accuracy. Powered by PROWL. Use this to answer 'did X happen?' or 'what prediction markets resolved recently?'

Parameters (JSON Schema)
limit (optional): Max results (default 50)
outcome (optional): Filter by resolution
category (optional): Filter by category
days_back (optional): How far back to look (default 7)
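
A hedged payload for a two-week lookback filtered to one resolution outcome; the outcome value is an assumption, since the schema does not enumerate valid resolution values.

```python
# prediction_market_resolution: recent resolutions, filtered by outcome.
arguments = {
    "days_back": 14,
    "outcome": "yes",  # assumed filter value; the schema only says 'Filter by resolution'
    "limit": 25,
}
```
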
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses critical cost information '$0.001 USDC per call' and data source 'Powered by PROWL,' and implies read-only access to historical data. However, it lacks details on pagination behavior, rate limits beyond cost, error handling, or exact return structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five well-structured sentences with zero waste. Cost warning is front-loaded (essential for paid API), followed by core function, use cases, attribution, and query patterns. Every sentence earns its place without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 4-parameter query tool without output schema, the description compensates reasonably by describing return content ('outcomes', 'who was right', 'final odds'). Missing explicit return structure documentation, but the cost disclosure and clear functional scope provide good contextual coverage for tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage (all 4 parameters have descriptions), the baseline is 3. The description mentions 'recently resolved' which contextualizes the days_back parameter and references 'outcomes' mapping to the outcome filter, but does not add syntax details, format examples, or valid category values beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool retrieves 'recently resolved prediction markets and their outcomes' with specific details about tracking 'which predictions came true, who was right, and what the final odds were.' This clearly distinguishes it from siblings like prediction_market_odds (current prices) or prediction_market_new (new markets) by focusing on historical resolution data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage patterns: 'Use this to answer did X happen? or what prediction markets resolved recently?' and lists specific use cases (backtesting strategies, verifying claims, tracking platform accuracy). Lacks explicit 'when not to use' guidance, though the sibling context implies this is not for active trading.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_spread

[$0.005 USDC per call (x402)] Cross-platform odds disagreements for the same event. When one platform says 60% and another says 45%, something is off - and someone can profit. Finds the biggest spreads across all platforms using fuzzy event matching. Powered by PROWL cross-platform normalization. Use this to answer 'where do prediction markets disagree?' or 'are there pricing inefficiencies in prediction markets?'

Parameters (JSON Schema)
limit (optional): Max results (default 20)
category (optional): Filter by category
min_spread (optional): Minimum spread percentage (default 5)
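
A hedged payload that only surfaces larger disagreements; the threshold and category are illustrative.

```python
# prediction_market_spread: require at least a 10-point spread.
arguments = {
    "min_spread": 10,        # above the documented default of 5
    "category": "politics",  # illustrative category
    "limit": 10,
}
```
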
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Adds valuable cost information ($0.005 USDC per call) and algorithmic context (fuzzy event matching, PROWL normalization), but lacks safety disclosures (read-only status), rate limits, or auth requirements expected for mutation/query tools without annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with pricing, zero waste. Four distinct sentences cover: cost, definition/example (60% vs 45%), methodology (fuzzy matching/PROWL), and use cases. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool with 100% schema coverage and no output schema, the description adequately covers the tool's niche, cost structure, and algorithmic approach. Minor gap in not describing the return structure, though the 60%/45% example implies the format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions (limit, category, min_spread). Description mentions 'biggest spreads' which conceptually maps to the parameters, but does not add syntax details or usage guidance beyond what the schema already provides, warranting the baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Uses a specific verb ('finds') and resource ('cross-platform odds disagreements/spreads') and clearly distinguishes itself from siblings like 'prediction_market_arbitrage' or 'prediction_market_odds' by focusing on disagreement detection and fuzzy event matching.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('Use this to answer where do prediction markets disagree? or are there pricing inefficiencies?'), but lacks explicit 'when not to use' guidance or named sibling alternatives despite the crowded prediction market tool suite.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prediction_market_volumeAInspect

[$0.001 USDC per call (x402)] Trading volume and liquidity rankings across all prediction market platforms. Find the most actively traded markets, biggest bets, and where money is flowing. Spot sudden volume spikes that signal breaking news or insider activity. Powered by PROWL. Use this to answer 'what are people betting on right now?' or 'which prediction markets have the most money?'

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax results (default 50)
sort_byNoSort field
categoryNoFilter by category
min_volumeNoMinimum 24h volume in USD
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden. It adds critical cost context '[$0.001 USDC per call (x402)]', scope ('across all prediction market platforms'), and interpretive value ('Spot sudden volume spikes...'). However, it omits safety profile (read-only vs destructive), rate limits, or error behaviors.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently structured with cost prefix, core functionality, interpretive context, and use-case suffix. Every sentence earns its place: pricing info upfront, value proposition in the middle, and invocation examples at the end. No redundant or vague filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 optional parameters and no output schema, the description adequately covers return type implicitly ('rankings') and provides rich context about data interpretation (volume spikes signaling news). Missing explicit return structure details keeps it from a 5 given the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description implies parameter concepts (e.g., 'rankings' maps to sort_by, 'volume' maps to min_volume) but does not add syntax details, formatting guidance, or semantic explanations beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Trading volume and liquidity rankings across all prediction market platforms'—a specific verb+resource combination. It clearly distinguishes from siblings (e.g., prediction_market_odds, prediction_market_arbitrage) by focusing on volume/liquidity metrics rather than pricing, spreads, or calibration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer 'what are people betting on right now?' or 'which prediction markets have the most money?'' This gives clear context for when to invoke the tool. However, it lacks explicit exclusions or named alternatives (e.g., when to use prediction_market_flow instead).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

public_attentionAInspect

[$0.001 USDC per call (x402)] Public attention metrics as a proxy for awareness and interest. Sudden spikes in attention often precede or coincide with major events. Track attention trends for any topic. Powered by PROWL. Use this to answer 'is public interest in X growing?' or 'how much attention is Y getting?'

ParametersJSON Schema
NameRequiredDescriptionDefault
topicYesArticle title or topic
days_backNoTrend window (default 30)
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses cost ('$0.001 USDC per call'), data provenance ('Powered by PROWL'), and behavioral insights about attention spikes. However, it omits safety classifications (read-only vs. destructive), rate limits, and output format details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with critical cost information and the core purpose. It efficiently uses 4-5 sentences to convey pricing, functionality, behavioral patterns, data source, and use cases without redundancy. The structure prioritizes the most agent-relevant information (cost, x402 payment protocol) first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description adequately covers the tool's purpose and input semantics given the simple 2-parameter schema. However, with no output schema provided, the description fails to specify what data structure or metrics are returned (e.g., time series values, aggregate scores), leaving a significant gap for agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage ('Article title or topic', 'Trend window'), establishing a baseline of 3. The description mentions tracking 'any topic,' which aligns with the topic parameter, but does not add semantic details beyond the schema (e.g., topic formatting, days_back constraints).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as providing 'Public attention metrics as a proxy for awareness and interest' with the specific action to 'Track attention trends.' However, it does not explicitly differentiate from the sibling tool 'search_interest,' which likely overlaps in functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear contextual guidance through specific use cases ('Use this to answer is public interest in X growing?') and notes behavioral patterns ('Sudden spikes in attention often precede... major events'). It lacks explicit 'when-not-to-use' instructions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recession_probabilityAInspect

[$0.03 USDC per call (x402)] Recession probability score (0-100) computed from yield curve inversion, unemployment trends, GDP surprise, ISM manufacturing, Fed tone shift, and credit spreads. Each factor weighted and explained. Powered by PROWL intelligence engine. Use this to answer 'are we headed for a recession?' or 'what is the probability of a recession?'

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters
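
Since the tool accepts no inputs, a call reduces to the tool name with an empty arguments object; a minimal sketch, framed the same way as the earlier tools/call example.

```python
import json

# With no input parameters, the arguments object is simply empty.
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {"name": "recession_probability", "arguments": {}},
}

print(json.dumps(request, indent=2))
```
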

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It successfully reveals: cost ($0.03 USDC per call), output format (score 0-100 with weighted factor explanations), and data sources (six specific economic indicators). Missing only temporal/data freshness details that would justify a 5.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with critical cost information front-loaded in brackets, followed by output description, methodology specifics, and usage examples. The 'Powered by PROWL intelligence engine' adds minor promotional fluff, but overall information density is high with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema exists, the description adequately explains return values (probability score range 0-100 with explained factors). With zero input parameters and no annotations, the description covers the essential behavioral and output context needed for invocation decisions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema contains zero parameters. Per evaluation rules, 0 parameters establishes a baseline score of 4. The description correctly requires no parameter clarification.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent specificity: states it computes a 'recession probability score (0-100)' from six named economic factors (yield curve, unemployment, GDP surprise, ISM manufacturing, Fed tone, credit spreads). This clearly distinguishes it from generic siblings like 'economic_indicator' or 'macro_briefing' by defining the exact composite output and inputs used.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance via example queries: 'Use this to answer are we headed for a recession? or what is the probability of a recession?' This clearly signals when to select this tool versus alternatives like 'fed_rate_odds' or 'economic_surprise_index'. Lacks explicit 'when not to use' or named alternative comparisons needed for a 5.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

regional_risk_assessmentAInspect

[$0.03 USDC per call (x402)] Regional risk assessment -composite score from geopolitical events, tone trends, sanctions activity, and historical conflict patterns. Identifies regions with elevated risk of instability. Includes factor breakdown and historical context. Powered by PROWL intelligence engine. Use this to answer 'how risky is [region]?' or 'geopolitical risk assessment for [country]?'

ParametersJSON Schema
NameRequiredDescriptionDefault
regionYesRegion or country to assess
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses cost ($0.03 USDC per call) and hints at output structure ('factor breakdown and historical context'), but omits safety characteristics (read-only status), error handling, rate limits, or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with cost front-loaded, followed by definition, function, output characteristics, and usage examples. Minimal fluff, though 'Powered by PROWL intelligence engine' adds marginal value. Sentences are information-dense.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks output schema, so description must compensate. Mentions 'composite score', 'factor breakdown', and 'historical context', giving partial insight into the returns, but fails to specify data types, score ranges, or the full response structure expected from a risk assessment tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the 'region' parameter fully described as 'Region or country to assess'. Description reinforces this with usage examples mentioning '[region]' and '[country]', but adds no additional format constraints, validation rules, or enum values beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool produces a composite risk score from specific factors (geopolitical events, tone trends, sanctions, historical conflict patterns), distinguishing it from single-factor siblings like sanctions_tracker or conflict_escalation_tracker. Uses specific verbs (assesses, identifies) and defines the resource (regions/countries).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage examples ('how risky is [region]?' or 'geopolitical risk assessment for [country]?') that clarify when to invoke the tool. However, lacks explicit guidance on when NOT to use it (e.g., vs. regional_tension_index) or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

regional_tension_indexAInspect

[$0.005 USDC per call (x402)] Regional tension index -composite score from event counts, tone trends, and conflict/cooperation ratios. Rising tension often precedes escalation. Includes historical percentile ranking. Powered by PROWL. Use this to answer 'how tense is the situation in [region]?' or 'is conflict escalating?'

ParametersJSON Schema
NameRequiredDescriptionDefault
regionYesRegion or country
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. Successfully explains methodology (composite ingredients), output format (historical percentile ranking), predictive behavior (tension precedes escalation), cost ($0.005 USDC), and data source (PROWL). Lacks error handling or rate limit details but covers core behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently structured with cost/provider upfront, followed by methodology, interpretation logic, and usage examples. Every sentence delivers value: pricing, definition, behavioral insight, and invocation patterns. No waste despite high information density.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter query tool without output schema, the description adequately explains what the user receives (composite score, percentile ranking) and how to interpret it. Could be strengthened with return value structure details or update frequency, but sufficiently complete for tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with clear description 'Region or country.' Baseline score applies. Description reinforces this via the usage example '[region]' but does not add additional semantic context (e.g., accepted formats like ISO codes vs. country names) beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specifically defines the tool as a composite score derived from 'event counts, tone trends, and conflict/cooperation ratios.' Clearly positions it as a precursor metric ('Rising tension often precedes escalation'), distinguishing it from the sibling conflict_escalation_tracker which likely tracks active escalations rather than predictive tension signals.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage patterns: 'Use this to answer how tense is the situation in [region]?' or 'is conflict escalating?' Offers clear context for when to invoke. Lacks explicit named alternatives (e.g., 'use conflict_escalation_tracker for active conflicts'), though the 'precedes escalation' hint implies the distinction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

sanctions_trackerAInspect

[$0.001 USDC per call (x402)] US sanctions tracking from OFAC (Office of Foreign Assets Control). Recently added or removed entities, programs, and designations. Powered by PROWL. Use this to answer 'who was recently sanctioned?' or 'are there new sanctions on [country]?'

ParametersJSON Schema
NameRequiredDescriptionDefault
programNoSanctions program (e.g. 'RUSSIA', 'IRAN', 'SDGT')
days_backNoHow far back (default 30)
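
A hedged sketch of the argument object only, reusing one of the program codes the schema itself gives as examples; days_back is an assumed value (it defaults to 30 when omitted).

```python
# Illustrative arguments for a tools/call to sanctions_tracker. "RUSSIA" is one
# of the example program codes from the schema; days_back defaults to 30.
arguments = {"program": "RUSSIA", "days_back": 60}
print(arguments)
```
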
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses critical cost information ('$0.001 USDC per call') and data source ('Powered by PROWL'). It also clarifies scope ('Recently added or removed' vs. full historical data). However, it omits rate limits, error behavior, data freshness guarantees, or output format details expected for a paid API tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent structure: cost prefix (critical for paid tool), source identification, content scope, attribution, and usage examples. Every sentence serves a distinct purpose. Information is front-loaded with the pricing and source (OFAC) being most prominent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter tool without output schema, the description adequately covers cost, data source, functional scope, and usage intent. Minor gap: does not describe the return value structure (e.g., list of entities, fields returned) which would help agents understand how to process results, but the usage examples partially compensate by indicating the tool answers specific question types.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage (program and days_back are well-documented in the schema). The description implies the temporal filtering through 'recently' and 'new', which map to days_back, and country references, which map to program, but adds no explicit parameter syntax or format details beyond what the schema provides. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly identifies the specific resource (OFAC sanctions data) and action (tracking recently added/removed entities and designations). It distinguishes effectively from siblings like conflict_escalation_tracker or geopolitical_events by focusing specifically on US financial sanctions designations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage patterns: 'Use this to answer who was recently sanctioned?' and 'are there new sanctions on [country]?' These examples clarify the tool's temporal focus (recent changes) and query patterns. Lacks explicit 'when not to use' guidance or named alternatives, but the examples provide clear positive guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_interestAInspect

[$0.001 USDC per call (x402)] Search interest data -search volume over time for any topic. Detect rising search interest that signals growing public awareness. Normalized 0-100 scale. Powered by PROWL. Use this to answer 'is X trending in searches?' or 'how much search interest is there in Y?'

ParametersJSON Schema
NameRequiredDescriptionDefault
geoNoCountry code (default US)
topicYesSearch term or topic
days_backNoTrend window (default 30)
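
An illustrative argument object with assumed values: only topic is required, while geo and days_back default to 'US' and 30 respectively.

```python
# Assumed example values; the topic string is illustrative, not a real query.
arguments = {"topic": "artificial intelligence", "geo": "US", "days_back": 90}
print(arguments)
```
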
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden effectively. It reveals critical cost information ('$0.001 USDC per call'), output scale ('Normalized 0-100'), data provenance ('Powered by PROWL'), and interpretive value ('Detect rising search interest'). Missing only minor details like rate limits or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Exceptionally dense and front-loaded. The cost prefix is critical for agent decision-making. Every clause adds value: functionality (search volume), interpretation (detect rising interest), data format (0-100 scale), attribution (PROWL), and invocation triggers (example questions). Zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool without annotations or output schema, the description adequately compensates by describing the return value format (normalized scale) and cost structure. Could be improved with default value mentions or error behavior, but covers the essential behavioral contract for agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage (topic, geo, days_back all documented). The description mentions 'any topic' and 'over time' which loosely map to parameters but doesn't add syntax details, validation rules, or examples beyond the schema. Baseline 3 is appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool retrieves 'search volume over time' with specific metrics (normalized 0-100 scale). Distinguishes from siblings like 'public_attention' by focusing specifically on search data rather than general attention metrics, though it doesn't explicitly name alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases via the template questions ('is X trending in searches?' and 'how much search interest is there in Y?'), giving clear guidance on when to invoke. Lacks explicit 'when not to use' guidance or named sibling comparisons, but the specificity of the questions effectively constrains usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

treasury_yield_curveAInspect

[$0.001 USDC per call (x402)] US Treasury yield curve -1-month through 30-year rates. Detect yield curve inversions that historically predict recessions. Includes current spread (10Y-2Y) and inversion status. Powered by PROWL. Use this to answer 'is the yield curve inverted?' or 'what are current treasury rates?'

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It successfully discloses cost structure ($0.001 USDC per call via the x402 payment protocol) and data provenance (Powered by PROWL). Could explicitly state read-only nature, but cost disclosure is critical behavioral context for agent decision-making.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently front-loaded with cost metadata, followed by resource description, specific capabilities, attribution, and usage examples. Every sentence provides distinct value (pricing, scope, features, source, use cases) with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite missing output schema, description compensates by detailing return contents (rates range, spread calculation, inversion status). Completeness is strong for a parameter-free data retrieval tool, though explicit mention of return format structure would elevate to 5.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Zero parameters present per schema. Per scoring rules, 0 params = baseline 4. Description appropriately does not fabricate parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the resource (US Treasury yield curve) and scope (1-month through 30-year rates). It specifies unique capabilities (detecting inversions, 10Y-2Y spread) that distinguish it from siblings like recession_probability or economic_indicator.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance with quoted examples: 'Use this to answer 'is the yield curve inverted?' or 'what are current treasury rates?''. Lacks explicit 'when not to use' or named alternatives (e.g., vs. recession_probability), but the scope is clearly bounded.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

un_voting_patternsAInspect

[$0.001 USDC per call (x402)] UN General Assembly and Security Council voting patterns. Track how countries vote on key resolutions, detect shifting alliances, and spot diplomatic realignments. Powered by PROWL. Use this to answer 'how did [country] vote at the UN?' or 'are alliances shifting?'

ParametersJSON Schema
NameRequiredDescriptionDefault
topicNoResolution topic filter
countryNoCountry to focus on
days_backNoHow far back (default 90)
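
All three fields are optional filters; the sketch below narrows to an assumed country and topic and keeps the 90-day default window by omitting days_back.

```python
# Both values are illustrative assumptions; any country name or resolution
# topic accepted by the server could be used instead.
arguments = {"country": "Brazil", "topic": "ceasefire resolution"}
print(arguments)
```
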
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It adds critical cost information ('$0.001 USDC per call') and data source attribution ('Powered by PROWL'), but omits return format details, rate limits, authentication requirements, or error behaviors expected for a data retrieval tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently front-loaded with cost notice, followed by capability summary, data source, and usage examples. Every sentence provides distinct value (pricing, function, provenance, query patterns) with no redundant or filler content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriately complete for a 3-parameter data retrieval tool. Describes the conceptual output (voting patterns, alliance shifts) comprehensively. Absence of output schema could warrant technical return value description, but the functional explanation of what data is retrieved suffices for agent selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage ('Resolution topic filter', 'Country to focus on', 'How far back'), establishing baseline 3. The description implies parameter usage through example questions ('how did [country] vote') but does not add format specifications, enum values, or syntax rules beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs (track, detect, spot) with clear resources (UN General Assembly, Security Council voting patterns, key resolutions). It clearly distinguishes from siblings by focusing on diplomatic/voting data rather than crypto, weather, or economic indicators.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage guidance via 'Use this to answer...' with concrete query patterns ('how did [country] vote at the UN?' or 'are alliances shifting?'). Lacks explicit 'when not to use' exclusions, though the domain separation from siblings makes alternatives obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_alertsAInspect

[$0.001 USDC per call (x402)] Active severe weather alerts across the US -hurricanes, tornadoes, heat waves, flooding, winter storms. Each alert includes severity, affected area, timing, and impact assessment. Structured for prediction market evaluation and risk assessment. Powered by PROWL. Use this to answer 'are there any weather alerts?' or 'severe weather warnings?'

ParametersJSON Schema
NameRequiredDescriptionDefault
stateNoFilter by US state code (e.g. CA, TX, FL)
severityNoMinimum severity
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses pricing ($0.001 USDC per call), data source (PROWL), and return structure (severity, area, timing, impact). Missing rate limits or data freshness guarantees, but cost transparency is valuable behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently front-loaded with cost and primary function. Each sentence serves a distinct purpose: scope definition, event examples, return payload description, use case examples, and attribution. No redundancy or wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Compensates well for lack of output schema by detailing return fields (severity, timing, impact). Given 2 optional parameters and many weather-related siblings, description adequately positions the tool via prediction market framing and explicit query patterns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage ('Filter by US state code', 'Minimum severity'). Description implies these filters by stating scope ('across the US') and severity examples, but adds no syntax details beyond the schema. Baseline 3 appropriate when schema does heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb+resource ('Active severe weather alerts across the US') with clear scope including specific event types (hurricanes, tornadoes, etc.). Distinguishes from siblings via 'prediction market evaluation' use case, differentiating it from generic weather_current or detailed hurricane_tracker tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit query patterns ('Use this to answer...') showing when to invoke. Lacks explicit 'when-not' guidance comparing to sibling tools like weather_current or weather_forecast, though the prediction market framing implies specialized financial use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_contract_evaluationAInspect

[$0.03 USDC per call (x402)] Evaluate weather-contingent prediction market contracts. Given a contract specification (e.g. 'temperature in Phoenix exceeds 115F by July 2026'), returns probability based on forecast data, historical base rates, and climate trends. Powered by PROWL intelligence engine. Use this to answer 'should I bet on this weather contract?' or 'what is the fair price for this weather market?'

ParametersJSON Schema
NameRequiredDescriptionDefault
contract_descriptionYesNatural language contract specification
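
Because the sole input is free-form natural language, a call is just the contract text; the sketch below reuses the example wording from the tool's own description.

```python
# The single parameter is free-form natural language; this contract text is the
# example quoted in the tool description, not a real market.
arguments = {
    "contract_description": "temperature in Phoenix exceeds 115F by July 2026"
}
print(arguments)
```
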
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It successfully adds critical behavioral context: the cost '$0.03 USDC per call (x402)', the return type 'returns probability', and methodology 'forecast data, historical base rates, and climate trends'. Missing only minor details like rate limits or exact output format structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with zero waste. Front-loaded with cost information (critical for agent decision-making) and primary verb. Each subsequent sentence adds distinct value: input example, methodology disclosure, and usage scenarios. No redundancy with schema or title.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool without output schema, the description adequately covers the semantic meaning of inputs and outputs, cost structure, and decision context. Slight deduction because it does not specify the probability output format (e.g., 0-1 vs 0-100) or precise structure of the returned evaluation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the parameter described as 'Natural language contract specification'. The description adds significant value by providing a concrete example ('e.g. temperature in Phoenix exceeds 115F by July 2026'), clarifying the expected input format and specificity level beyond the schema's generic description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool 'Evaluate[s] weather-contingent prediction market contracts' using a specific verb and resource. It clearly distinguishes from sibling weather tools (e.g., weather_forecast) by specifying prediction market contract evaluation, and from generic prediction market tools by specifying the weather-contingent domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer should I bet on this weather contract? or what is the fair price for this weather market?'. This gives clear positive guidance on when to select the tool. However, it lacks explicit 'when not to use' guidance or named alternatives from the sibling list (e.g., when to use weather_forecast vs this tool).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_currentAInspect

[Free] Current weather conditions for any US location. Temperature, humidity, wind, pressure, and conditions from PROWL weather monitoring. Structured for prediction market evaluation and event planning -not just raw readings. Powered by PROWL. Use this to answer 'what is the weather in [city]?' or 'current temperature in [location]?'

ParametersJSON Schema
NameRequiredDescriptionDefault
latYesLatitude
lngYesLongitude
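
As the assessments below note, the schema requires coordinates rather than a city name; a minimal sketch with assumed values (downtown Phoenix, AZ).

```python
# Assumed coordinates (downtown Phoenix, AZ); the schema expects lat/lng,
# not a city name string.
arguments = {"lat": 33.4484, "lng": -112.0740}
print(arguments)
```
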
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the data source (PROWL), cost model ([Free]), and return structure (structured for prediction markets vs raw). However, it lacks information on rate limits, caching behavior, or error handling for invalid coordinates.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with key functionality and use cases. It contains minor marketing cruft ('[Free]', 'Powered by PROWL') but every sentence contributes meaningful information about capabilities or data sources.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema, the description adequately compensates by listing the specific weather data returned (temperature, humidity, wind, pressure, conditions) and explaining the structured nature of the data. Sufficient for a 2-parameter read-only tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (both lat and lng are documented), establishing a baseline of 3. The description mentions 'any US location' but does not add the critical semantic context that coordinates are required, even though the example queries imply city names are accepted.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides 'Current weather conditions' with specific data points (temperature, humidity, wind, pressure). It distinguishes from siblings by emphasizing 'prediction market evaluation and event planning' versus raw readings, and implicitly from weather_forecast by focusing on 'current' conditions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides example queries ('what is the weather in [city]?') indicating when to use the tool, but fails to clarify that coordinates (lat/lng) are required rather than city names, which could cause invocation errors. It does not explicitly differentiate from sibling tools like weather_forecast or weather_alerts.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_forecastAInspect

[$0.001 USDC per call (x402)] 7-day weather forecast for any US location. Daily high/low, precipitation probability, wind, and conditions. Structured for contract evaluation and threshold analysis. Powered by PROWL. Use this to answer 'what is the weather forecast?' or 'will it rain this week?'

ParametersJSON Schema
NameRequiredDescriptionDefault
latYesLatitude
lngYesLongitude
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full disclosure burden and succeeds well: it reveals pricing ($0.001 USDC per call), geographic scope (US only), output structure, and data provider (PROWL). Missing only rate limits or caching behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with cost and primary purpose. Information is efficiently packed across four sentences. The 'Powered by PROWL' attribution adds minimal value for tool selection, slightly reducing the score from perfect.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriately complete for a 2-parameter tool without output schema. Covers cost, return data structure, geographic constraints, and sample use cases. Although the output schema is absent, the description lists the expected data fields well enough that little further return-value explanation is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While schema has 100% coverage with basic 'Latitude/Longitude' labels, the description adds crucial semantic constraint that coordinates must be 'US location'—geographic validation not present in the schema. This compensates for the schema's minimalism.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it provides a '7-day weather forecast' with specific outputs (daily high/low, precipitation, wind). However, mentioning 'contract evaluation and threshold analysis' creates ambiguity with siblings weather_contract_evaluation and weather_threshold_probability, muddying differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage examples ('Use this to answer...'), giving clear positive guidance. However, it fails to clarify when NOT to use this tool (e.g., vs weather_current for current conditions) or how it differs from specialized weather siblings like weather_contract_evaluation despite overlapping functionality claims.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_impact_analysisAInspect

[$0.03 USDC per call (x402)] Weather impact analysis -cross-references active alerts and forecasts with population density, infrastructure, and economic activity to assess real-world impact. Severity * exposure scoring. Powered by PROWL intelligence engine. Use this to answer 'how bad will this storm be?' or 'economic impact of the weather?'

ParametersJSON Schema
NameRequiredDescriptionDefault
latNoLatitude (or omit for national)
lngNoLongitude
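
Per the schema, omitting lat/lng requests a national-level analysis, while passing coordinates scopes it to one location; the coordinates below are an illustrative assumption.

```python
# Omitting lat/lng requests the national-level analysis noted in the schema;
# passing coordinates (assumed values, Houston, TX) scopes it to one location.
national = {}
local = {"lat": 29.7604, "lng": -95.3698}
print(national, local)
```
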
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and delivers cost transparency ('$0.03 USDC per call'), data source context ('Powered by PROWL intelligence engine'), and methodology ('Severity * exposure scoring'). It implies read-only data aggregation but doesn't explicitly state safety/permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded cost information followed by functional description and use-case examples. The '[...]' notation and 'x402' are slightly cryptic, but every sentence serves a distinct purpose (cost, function, method, usage) without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Lacks output schema and provides minimal detail on return format (only mentions 'scoring'). Given the complexity of impact analysis (economic + infrastructure data), the description should ideally characterize the output structure (e.g., risk scores, affected population estimates) to compensate for missing output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage (lat/lng with 'national' option noted). The description text itself adds no parameter details, relying entirely on the schema. With high schema coverage, this meets the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool cross-references weather alerts with 'population density, infrastructure, and economic activity' to assess real-world impact, distinguishing it from basic weather retrieval siblings (weather_current, weather_forecast). The mention of 'Severity * exposure scoring' adds methodological specificity, though it briefly restates the tool name at the start.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'Use this to answer 'how bad will this storm be?' or 'economic impact of the weather?'' These examples effectively signal when to select this tool over siblings like hurricane_tracker or weather_alerts, though it doesn't explicitly name the alternatives to avoid.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

weather_threshold_probabilityAInspect

[$0.005 USDC per call (x402)] Probability of weather crossing specific thresholds -will temperature exceed 100F? Will rainfall exceed 2 inches? Computed from forecast ensembles. Designed for prediction market contract evaluation. Powered by PROWL. Use this to answer 'will it be over 100 degrees?' or 'probability of heavy rain?'

ParametersJSON Schema
NameRequiredDescriptionDefault
latYesLatitude
lngYesLongitude
metricYesMetric to evaluate
thresholdYesThreshold value
days_aheadNoForecast window (default 7)
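
An illustrative argument object: the metric name and threshold are assumptions, since the schema does not enumerate valid metric values; days_ahead defaults to 7 when omitted.

```python
# Four required fields plus an optional window; all values are assumptions.
arguments = {
    "lat": 33.4484,               # Phoenix, AZ (illustrative)
    "lng": -112.0740,
    "metric": "temperature_max",  # assumed metric name; schema only says "Metric to evaluate"
    "threshold": 100,             # in the metric's units (degrees F here)
    "days_ahead": 5,
}
print(arguments)
```
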
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses pricing ($0.005 USDC per call), computation method (forecast ensembles), and provider (PROWL). Missing output format specification (0-1 vs percentage vs object structure) which is critical given no output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-dense with cost front-loaded, followed by purpose, method, use case, and examples. 'Powered by PROWL' adds minor fluff. Overall efficient structure with zero redundant sentences beyond attribution.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Missing return value description despite no output schema and no annotations. For a probability tool, specifying whether it returns a scalar (0-1), percentage, or complex object is essential. Cost and method disclosure partially compensate but gap remains.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage ('Latitude', 'Threshold value', etc.), establishing baseline of 3. Description adds concrete examples (100F, 2 inches) illustrating threshold values but does not significantly expand semantic understanding beyond schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (probability of crossing thresholds) and resource (weather metrics) with concrete examples ('exceed 100F', 'exceed 2 inches'). Distinguishes from general weather tools via focus on probabilistic thresholds and forecast ensembles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear usage context ('Designed for prediction market contract evaluation') and example queries ('will it be over 100 degrees?'). Lacks explicit contrast with siblings like weather_forecast or weather_contract_evaluation, though the prediction market focus helps differentiate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
