Skip to main content
Glama

flopsindex-mcp

Server Details

Live GPU compute & inference token price indices for AI agents — 591 reference indices across H100/A100/B200/B300 spot+on-demand, Claude/GPT/Llama token pricing, and more. Every value is methodology-versioned and citable via the /v1/verify handshake.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.1/5 across 24 of 24 tools scored.

Server CoherenceA
Disambiguation5/5

Each tool targets a distinct purpose: compute margin, forecast, LMP, index quoting, verification, etc. Descriptions are detailed and clearly differentiate even similar-sounding tools like get_regional vs get_regional_spread. No two tools could be confused for the same operation.

Naming Consistency4/5

Most tools follow a 'get_' prefix pattern (18/24) for data retrieval, with some outliers like compute_margin, gpu_capex, list_indices, and spread. The mix of verbs (get, compute, list, verify) is reasonable given the varied operations, but not perfectly uniform.

Tool Count4/5

24 tools is on the higher side, but each serves a clear role within the complex domain of FLOPS indices, LMP, GPU pricing, and verification. The count feels justified rather than excessive, though a slightly tighter set could be considered.

Completeness4/5

The tool surface covers the core workflows: index discovery, quoting, verification, forecasting, margin computation, and regional analysis. However, several tools (get_arbitrage_windows, get_basis, get_lmp_history, get_sla_delta) are indicated as 'endpoint_pending', which represents minor gaps in the current implementation.

Available Tools

24 tools
compute_marginAInspect

Spark-spread (compute margin) per (sku, region): price − chip_power × PUE × $/kWh − rack_amortization. Returns a JSON envelope with price, power_cost, rack_cost, margin, margin_pct. Optional pue + kwh_source override defaults.

ParametersJSON Schema
NameRequiredDescriptionDefault
pueNo
skuYes
regionYes
kwh_sourceNo
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses the formula and return envelope, but does not explicitly state whether it is a read-only computation or has side effects. Missing behavioral traits like 'does not modify data'.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first gives purpose and formula, second gives return envelope and optional overrides. No fluff, front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Explains return envelope fields, core formula, and optional params. Missing default values for PUE and kwh_source, error conditions, and assumptions (e.g., data sources). Still fairly complete for a simple computation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage. Description adds value for optional pue and kwh_source ('override defaults'), but for required sku and region, it adds no further meaning beyond the schema. It does not describe constraints or format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes margin for a given (sku, region) using a specific formula. It distinguishes itself from siblings like get_price or spread by focusing on margin computation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implied usage: when computing margin per SKU and region, possibly with custom PUE or electricity cost source. But lacks explicit when-not to use or references to alternative siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_arbitrage_windowsAInspect

Compute-arbitrage windows: time-of-day periods when spot vs on-demand spread exceeds a threshold, signaling an interruptible-workload migration opportunity. Returns ranked {window_start, window_end, expected_savings_pct, confidence} entries. Returns {ok: false, reason: 'endpoint_pending'} until the public route lands.

ParametersJSON Schema
NameRequiredDescriptionDefault
skuYesGPU SKU (e.g. 'h100_sxm5', 'a100_sxm4', 'b300_sxm7').
regionYesRegion code (e.g. 'us_west', 'eu_north').
min_savings_pctNoMinimum savings threshold (0-1). Default 0.15 = 15%.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, but the description discloses important behavioral traits: the tool returns specific fields (window_start, window_end, expected_savings_pct, confidence) and may return an error 'endpoint_pending' if the route is not available. This adds transparency beyond basic function.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact with two sentences: first states purpose clearly, second covers return format and potential error state. No extraneous information, though could be slightly more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema and annotations, the description provides sufficient context: it explains the computation, return fields, and possible transient error. It lacks details on data sources or prerequisites but is adequate for a compute tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds little beyond what the schema already provides for parameters (sku, region, min_savings_pct). The default for min_savings_pct is mentioned but doesn't significantly enhance semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes arbitrage windows (time-of-day periods) based on spot vs on-demand spread, with a specific resource and return format. It distinguishes itself from siblings like get_price or get_regional_spread by focusing on arbitrage opportunities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for identifying interruptible-workload migration opportunities but does not explicitly state when to use vs alternatives or provide exclusion criteria. The purpose is clear but lacks direct guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_basisAInspect

Forecast-vs-actual basis for one ISO hub — the signed spread between day-ahead forecast LMP and the latest RTM settle. A power-cost-uncertainty proxy for compute-margin sensitivities. Returns {ok: false, reason: 'endpoint_pending'} until the public surface lands.

ParametersJSON Schema
NameRequiredDescriptionDefault
isoYesISO code (caiso, miso, pjm, ercot, isone, nyiso, entso_e, jepx).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses the current endpoint behavior (returns pending message) but lacks details on idempotency, cost, or data retention. Adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey purpose, context, and current state. Front-loaded with essential information, no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers the purpose, use case, and current limitation. Could benefit from hinting at the expected output format when operational, but overall complete enough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers the single parameter (iso) with a list of codes. Description adds no additional meaning beyond the schema, so baseline score of 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly defines the tool as returning the forecast-vs-actual basis (spread) for an ISO hub. The description distinguishes it as a proxy for compute-margin sensitivities, but does not explicitly differentiate from sibling tools like get_forecast or get_price.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for power-cost-uncertainty and margin sensitivity analysis, but no explicit when-to-use or when-not-to-use. Mentions the current pending state, which is a usage caveat.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_forecastAInspect

Day-ahead LMP forecast snapshots across every forecast-capable ISO adapter (CAISO / MISO / PJM / ENTSO-E / JEPX). Returns {generated_at, snapshots, errors, unconfigured, summary}. Auth-free public Track-D surface. Returns {ok: false, reason: 'endpoint_pending'|'auth_required'} if the wire-up is pending.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description discloses auth-free public access, return structure, and possible error reasons ('endpoint_pending', 'auth_required'), providing helpful behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no redundancy. First sentence clearly states purpose and scope; second adds return format and error cases. Front-loaded and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Zero-parameter tool with complete description covering return shape, error states, and authentication status. No additional context needed despite siblings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Tool has zero parameters, so schema coverage is 100%. No param descriptions needed; baseline of 4 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description specifies verb 'get', resource 'forecast snapshots', scope covers multiple ISO adapters. Distinguishes from sibling get_forecast_accuracy by focusing on day-ahead LMP forecasts.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States scope and auth-free nature but lacks explicit when-to-use vs alternatives. No mention of exclusions or when to prefer get_forecast_accuracy.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_forecast_accuracyAInspect

Backtest accuracy for forecast values vs realized settles. Returns {iso, hub, window_days, mae, rmse, mape, n_samples, sample_basis} per hub. Optional region kwarg restricts to a canonical region tag.

ParametersJSON Schema
NameRequiredDescriptionDefault
isoYesISO code.
regionNoOptional canonical region tag (us-west, us-east, ..., eu-west).
window_daysNo
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears the burden. It implies a read-only computation, but does not disclose auth requirements, rate limits, or whether the backtest is cached. It adds some context about return fields but lacks depth.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences with the purpose first and key details second. Every word adds value; no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately lists return fields but does not explain each metric (MAE, RMSE) or the 'hub' field. It is sufficient for a simple tool but leaves some ambiguity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds meaning beyond the schema by specifying the region format and examples (us-west, etc.) and explaining the return structure. However, it does not clarify the 'window_days' parameter beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool backtests forecast accuracy vs realized settles and lists the return fields per hub. It is distinct from sibling tools like 'get_forecast' which likely returns forecast values, not accuracy metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for accuracy backtesting but does not specify when to use versus alternatives like 'get_forecast' or 'get_basis'. No exclusions or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_indexAInspect

Resolve a FLOPS index to its source-opaque PUBLIC payload (the r15-contracts C3 citation contract). Returns exactly {index_id, value, unit, as_of, data_tier, confidence, methodology_url, verify_url, citation_url, permalink} and DELIBERATELY no providers / sources / num_sources / num_providers / computation_proof. Prefer get_index when you intend to CITE the value.

ParametersJSON Schema
NameRequiredDescriptionDefault
index_idYesFLOPS index id / slug, e.g. 'FLOPS-H100-OD', 'FLOPS-A100-SPOT', 'ITPI-...'.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It specifies the data source ('r15-contracts C3 citation contract'), lists exact return fields, and explicitly states what is omitted. This provides sufficient behavioral context for a read tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first describes the core action and return fields, second provides usage guidance. No wasted words; front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter fetch tool with no output schema, the description fully covers return fields, omissions, and source contract. No missing critical context needed for agent selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (index_id) with 100% schema coverage including examples. Description adds minimal value beyond schema (e.g., mentions 'FLOPS index id / slug'). Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool resolves a FLOPS index to its public payload, listing specific return fields. It distinguishes from siblings by noting what is deliberately omitted and provides a usage hint: 'Prefer get_index when you intend to CITE the value.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises when to prefer this tool ('when you intend to CITE the value'), but does not explicitly name sibling tools or state when not to use it. However, the context of deliberate omissions implies alternatives for source/providers info.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_inference_qosAInspect

Inference quality-of-service (IQOS) coverage — the QUALITY companion to ITPI (price). Returns {count, models:[{provider, model, score, ttft_ms, throughput_tps, p95_ms, errorrate_pct, uptime_pct, ts}], stale_count, stale:[...]}. score is 0-100. HONESTY (mig 053): a model is reported ONLY from a genuine remote vantage probe; an unmeasured model is listed under stale with a null score, NEVER scored as 0/downtime. The named provider/model IS the published subject (inference labs are public), distinct from GPU-spot SELLER identities which remain source-opaque.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full burden and discloses key behaviors: models only from genuine probes, stale handling, null scores, identity distinction. Very detailed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Information-rich but dense; return format mixed with behavioral notes could be more concise or structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no params, no annotations, no output schema, description fully covers purpose, structure, and behavioral constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Zero parameters, baseline 4. No parameter info needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states tool returns IQOS coverage, specifies return structure, and positions as quality companion to price tool (ITPI). Differentiates from sibling tools like get_price.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies use for quality metrics via 'QUALITY companion', but no explicit when-to-use, when-not-to-use, or alternatives listed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_iso_health_eventsAInspect

Last N fetch attempts for a single ISO adapter from the AdapterCache ring buffer. Diagnoses whether a feed is healthy or silently 4xx-ing. Admin-gated — set FLOPSINDEX_ADMIN_API_KEY to authenticate; without it returns {ok: false, reason: 'auth_required'}.

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesISO adapter name (e.g. 'caiso', 'miso', 'pjm', 'ercot', 'isone', 'nyiso', 'entso_e', 'jepx').
limitNo
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses authentication requirements, the data source (AdapterCache ring buffer), and the diagnostic purpose. It does not describe the normal return format but addresses auth failure behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences that front-load the purpose and immediately provide key usage context. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without an output schema, the description does not specify the normal return format, only the auth error. For a diagnostic tool, this lack of output description is a gap. However, it covers authentication and input constraints adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50%: the 'name' parameter has a list of examples, and 'limit' has constraints. The description adds no extra meaning beyond the schema, so it meets the baseline but does not compensate for the missing parameter docs.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: retrieving the last N fetch attempts for a single ISO adapter to diagnose feed health. It clearly distinguishes from sibling tools, which focus on margins, forecasts, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description notes the tool is admin-gated and requires an API key, with a specific error response if missing. It implies usage for diagnosing feed health but does not explicitly mention when not to use it or provide alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_lmp_historyAInspect

Historical LMP timeseries for one ISO hub. Returns decimated points (≤500) across the requested window. For basis backtests, forecast-error analyses, and compute-margin sensitivity studies. Returns {ok: false, reason: 'endpoint_pending'} until the public route lands.

ParametersJSON Schema
NameRequiredDescriptionDefault
hubNoOptional hub identifier (e.g. 'SP15' for CAISO, 'INDIANA.HUB' for MISO). Omit for the ISO's primary settlement hub.
isoYesISO code (caiso, miso, pjm, ercot, isone, nyiso, entso_e, jepx).
rangeNo7d
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. The description discloses the decimation limit (≤500 points) and a pending endpoint state returning an error. It does not mention idempotency, auth requirements, or rate limits, but for a read-only tool this is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: function, behavior (decimation), use cases, and a note about pending endpoint. No wasted words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple query tool with 3 params and no output schema, the description covers purpose, usage, key behavioral traits (decimation and pending state). It does not describe successful return structure, but that is acceptable given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 67% description coverage (iso and hub have descriptions, range only has enum and default). The description adds little beyond schema: it mentions decimation which relates to range, but no new parameter-level detail.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns historical LMP timeseries for one ISO hub with decimated points. It mentions specific use cases (basis backtests, etc.), which helps differentiate from sibling tools like get_price or get_basis, though not explicitly contrasting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description lists concrete use cases (basis backtests, forecast-error analyses, compute-margin sensitivity studies) indicating when to use. However, it does not specify when not to use or suggest alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_methodologyAInspect

Fetch a methodology spec for a given slug (and optional version). Returns canonical math + source-weighting policy + trim rules. Cite the methodology version pin URL in contracts.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYese.g. 'flci-h100'.
formatNojson
versionNoOptional pinned version (e.g. 'v0.9'). Defaults to latest.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses return contents (math, policy, trim rules) and a usage note about citing URLs, but does not explicitly state safety, idempotency, or error behavior. The read-only nature is implied by 'Fetch' but not confirmed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two efficient sentences with no wasted words. The first sentence covers purpose, the second adds a critical usage instruction. It is front-loaded and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations or output schema, the description provides a good overview of purpose, key parameters, return content, and a contractual requirement. It lacks details on error handling or URL format, but is sufficient for a simple fetch tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 67%, with 'format' missing a description. The description adds nothing beyond the schema for 'slug' and 'version', and does not mention 'format' at all. The usage note about citing URLs is tangential.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Fetch', the resource 'methodology spec', and the key parameter 'slug'. It distinguishes itself from siblings like 'list_methodologies' by specifying a single slug fetch, and it mentions return content.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for fetching a specific methodology spec, but lacks explicit guidance on when not to use or how it compares to alternatives like 'list_methodologies'. No clear context or exclusions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_priceAInspect

Fetch the current published value for a FLOPS compute or inference price index. Returns {value, unit, ts, tier, confidence, verify_url, citation_url}. Cheapest way to answer 'what does X trade at right now?' Confidence is HIGH/MED/LOW; tier is LIVE/SETTLED/SEED. Carries no methodology or source attribution by design.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYesFLOPS index slug. Accepts FLOPS-<MODEL>-OD, FLOPS-<MODEL>-SPOT, FLOPS-<MODEL>-OD-T1, FLOPS-B300-OD-NORDICS, or a bare gpu_model.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses return fields, confidence levels, tier types, and intentionally notes 'carries no methodology or source attribution by design'. With no annotations, this adequately describes behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two clear sentences: first covers purpose and output, second covers usage and behavioral note. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple inputs and no output schema, the description fully covers purpose, parameters, return format, and behavioral traits. An agent can use it confidently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Explains the single 'slug' parameter in detail with specific format examples (FLOPS-<MODEL>-OD, bare gpu_model, etc.), adding significant value beyond the schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Fetch the current published value for a FLOPS compute or inference price index'. Specifies verb, resource, and scope, distinguishing it from siblings like get_timeseries or get_methodology.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Indicates it's the 'cheapest way to answer "what does X trade at right now?"', providing context for when to use. Implicitly contrasts with more detailed tools. Lacks explicit alternatives but still useful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_regionalAInspect

Regional decomposition for one public spot index across the canonical 14-region taxonomy. Returns the global value + one entry per region: {region, value, data_tier, ts} and, for each LIVE region, a dispersion band spread_band={low, high, iqr} describing the intra-region price RANGE (NOT a source set). data_tier is LIVE / below_k_anon / insufficient_sources; quote/settle only LIVE. index_id is UPPERCASE (case-sensitive public-spot filter). Source-opaque: no provider identities; the band's n is paid-tier-only and absent from this auth-free call.

ParametersJSON Schema
NameRequiredDescriptionDefault
index_idYesPublic spot index id, UPPERCASE (e.g. 'FLOPS-H100-OD', 'FLOPS-A100-OD').
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses source opacity, data tier conditions, that the band's n is paid-tier-only and absent from this auth-free call, and case sensitivity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose and is efficient, though it packs many details into one paragraph. It could be slightly more structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the single parameter and no output schema, the description covers all necessary details: output format, data tiers, dispersion bands, and authentication context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already provides a clear description for index_id (UPPERCASE, examples). The description adds minimal extra meaning beyond repetition, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it performs 'Regional decomposition for one public spot index across the canonical 14-region taxonomy' and details the output structure, distinguishing it from siblings like get_regional_spread.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the tool's scope but does not provide explicit guidance on when to use it versus alternatives or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_regional_spreadAInspect

Catalog-wide regional spread — one row per public spot index with >=2 priced regions, sorted widest-first. {index_id, max_region, min_region, spread, spread_pct} plus max_spread_band / min_spread_band. Answers 'which SKU has the widest regional price dispersion right now?' Auth-free. Source-opaque (regions + ranges, never a seller set).

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses auth requirements (auth-free), data source opacity (never a seller set), output fields in detail, and the filter condition (>=2 priced regions). It does not discuss performance or side effects, but for a read-only query this is acceptable.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, each adding distinct value: purpose and format in first, field list in second, use case in third. No redundancy, front-loaded with core purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no parameters, the description fully covers what the tool returns (field names), how it filters (>=2 priced regions), sorting (widest-first), and important caveats (auth-free, source-opaque). It is sufficient for an agent to invoke and interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are zero parameters, so schema coverage is 100% by default. Per guidelines, 0 parameters warrants a baseline of 4. The description adds no parameter info because none exist.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns 'catalog-wide regional spread' with specific verb+resource: one row per public spot index with >=2 priced regions. It answers the explicit question 'which SKU has the widest regional price dispersion right now?' and distinguishes from siblings like 'spread' or 'get_regional' by scope and aggregation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'Auth-free' as a usage condition. It implies usage for discovering widest dispersion across indices but does not explicitly state when not to use or name alternatives among siblings. However, the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_sla_deltaAInspect

Published-vs-observed provisioning latency for a DePIN provider (Akash / Render / io.net / Aleph.im / Phala / Spheron). Underpins FLOPS-SLA-DELTA + the CLRI-DePIN QoS weighting. Returns {ok: false, reason: 'endpoint_pending'} until the surface lands.

ParametersJSON Schema
NameRequiredDescriptionDefault
providerYesDePIN provider id (e.g. 'akash', 'render', 'io_net', 'aleph_im', 'phala', 'spheron').
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that the tool may return {ok: false, reason: 'endpoint_pending'} until data lands, which is important for an agent. No annotations are provided, so this disclosure adds value, though it doesn't cover all potential failure modes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first states purpose, second adds behavioral context. No unnecessary words, well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one param and no output schema, the description covers purpose, behavioral quirk, and motivational context. Minor gap: no explanation of the return structure or error states beyond the pending case.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'provider' is fully described in the schema (100% coverage). The description redundantly lists the same provider values, adding no new semantic meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns 'Published-vs-observed provisioning latency for a DePIN provider' and lists specific provider names, distinguishing it from siblings like compute_margin or get_price.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context about underlying indices but no explicit guidance on when to use this tool vs alternatives. The mention of 'endpoint_pending' is a behavioral note, not usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_timeseriesAInspect

Fetch a decimated timeseries (≤200 points) for a FLOPS index. Use for plotting recent price history or computing simple trends. Returns ordered [ts, value] tuples. Range is one of 24h / 7d / 30d / 90d / 1y.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYes
rangeNo7d
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description covers key behavioral traits: decimation to ≤200 points, ordered [ts, value] tuples, and available range options. It doesn't cover auth or rate limits, but for a fetch tool this is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, each adding value: operation, use case, return format, range options. Front-loaded and no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 params, no output schema, and no annotations, the description covers purpose, return format, and constraints. It doesn't explain what a FLOPS index is, but that's likely domain knowledge. Complete enough for usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must compensate. It implies slug is an index name via 'for a FLOPS index' and lists the range enum values. However, it doesn't describe slug format or details, providing only partial parameter meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Fetch a decimated timeseries (≤200 points) for a FLOPS index,' specifying the verb, resource, and constraint. It distinguishes from siblings like get_price and get_index by focusing on timeseries with decimation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use for plotting recent price history or computing simple trends,' providing clear use context. However, no mention of alternatives or when not to use, so slightly lacking in exclusion guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_trading_catalogAInspect

The tradeable public spot catalog — one row per index {slug, label, family} the trading terminal can quote. Auth-free. Source-opaque (no seller identities).

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, description carries full burden. It discloses auth-free (read-only implied) and source-opaque, but lacks explicit disclosure of no side effects, rate limits, or data freshness. Could be more comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is extremely concise: three short sentences front-loading the core purpose, then adding behavioral traits. Every phrase adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter, no-output-schema tool, the description adequately explains the output structure (one row per index with slug, label, family) and key behaviors (auth-free, source-opaque). Could mention if the list is dynamic or static, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters, so description need not add parameter meaning. Schema coverage is 100% (vacuously). Baseline for 0 params is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it is a 'tradeable public spot catalog' with one row per index containing slug, label, family. The verb 'get' is implied, and the resource is clearly the catalog. It distinguishes from siblings like get_index (likely for a single index) by describing the full catalog.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It mentions 'Auth-free' which hints at ease of use, but does not explicitly state when to use this tool versus alternatives like get_index or search_indices. The use case (discovering available indices) is implied but not explicitly guided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_trading_terminalAInspect

One desk-ready market-data payload for a public spot index: a quote (last/change%/confidence + a bid/ask PROXY from the dispersion band, NOT an order book), the forward curve, a regional basis ladder (per-region premium vs the global headline, widest-first), a recent price tape, market-depth tier, and realized volatility. INDICATIVE only — no live execution. Source-opaque: regions/ranges, never a seller set; exact depth-n is paid-tier-only and absent from this auth-free call.

ParametersJSON Schema
NameRequiredDescriptionDefault
index_idYesPublic spot index id, UPPERCASE (e.g. 'FLOPS-H100-OD').
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses indicative mode, no live execution, source-opaque, and auth-free nature. Could mention update frequency or caching, but adequately covers key behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a dense single paragraph with all key points front-loaded. Every sentence adds value, though could be improved with bullet points for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description thoroughly explains return components and limitations. It is sufficient for an agent to understand the tool's output and context relative to siblings.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear parameter description. The tool description adds no additional parameter meaning beyond listing return components, so baseline is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies the tool returns a comprehensive market-data payload for a public spot index, listing exact components (quote, forward curve, regional basis ladder, etc.) and contrasts with order books. This clearly distinguishes it from siblings and avoids tautology.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies use for a combined market overview, but does not explicitly name alternatives or when-not-to-use. However, the detail allows agents to infer context; close to explicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gpu_capexAInspect

Reference street price for a single GPU module (chip + carrier, not full DGX/HGX system). 3-tier cascade per SKU: T2 LIVE_USASPENDING (median of last 180d US federal procurement contracts via USAspending, no markup — REAL new-purchase prices, >=3 awards; PREFERRED over T1 when both meet threshold); T1 LIVE_EBAY_X1.15 (trimmed-median of last 30d eBay completed-listings x 1.15 markup, >=5 sales); T3 SEED (quarterly GRVCI v0_benchmark + OEM channel reporting) as fallback. When LIVE, response additionally carries live_observations (source-specific: agency_sample for federal, trim_fraction + secondary_median_usd for ebay) and also_seed audit fallback {price_usd, source, effective}. Omit sku to return the full seed map + meta.live_api_status. Methodology: https://app.flopsindex.com/methodology/GPU_CAPEX_LIVE_METHODOLOGY.md

ParametersJSON Schema
NameRequiredDescriptionDefault
skuNoFLOPS SKU id (e.g. 'h100_sxm5', 'b300_sxm7'). Omit for the full 23-SKU seed map.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and excels: it details the data sources (USAspending, eBay, GRVCI), the logic for each tier (e.g., '>=3 awards', '>=5 sales'), and the structure of the response (live_observations with source-specific fields, also_seed fallback). All behavioral traits are disclosed, and there is no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is dense and packed with useful information, but it is written as a single paragraph without bullet points or clear separation of the cascade logic. It front-loads the core purpose well, but the structure could be improved for readability. Despite this, every sentence is valuable and earned its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the tool (3-tier cascade, no output schema), the description is impressively complete: it covers the methodology, response structure, and fallback behavior. It lacks only minor details like error handling or authentication requirements, but overall it is thorough and includes a link to the full methodology.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (one optional string parameter with a basic description). The description adds substantial meaning: it explains that omitting sku returns the full seed map, and it details the pricing tiers, which are not captured in the schema. This goes well beyond the schema's minimal description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Reference street price for a single GPU module (chip + carrier, not full DGX/HGX system).' It specifies the exact resource and action, and the sibling tools list shows no overlap in domain (e.g., compute margin, arbitrage, indices), so it is distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on how to use the tool: it explains the 3-tier cascade (T2 over T1 when both meet threshold, T3 as fallback) and when to omit the sku parameter to get the full seed map. It does not explicitly state when not to use this tool versus siblings, but the siblings are clearly separate in function, so this is not a major gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_indicesAInspect

List all public FLOPS compute-price indices. Returns a JSON array of {index_id, family, cadence, unit}. Use this to discover available indices before calling get_price or verify. No auth required.

ParametersJSON Schema
NameRequiredDescriptionDefault
family_filterNoOptional family prefix to filter by (e.g. 'FLOPS-H100' or 'FLOPS-B300'). Case-sensitive substring match.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description adds important behavioral details: return format (JSON array of fields) and that no auth is required. This goes beyond a bare listing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with key purpose, no redundant information. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with one optional parameter, the description covers purpose, return format, and usage context. No gaps given the low complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description only restates the optional filter exist. It adds no new meaning beyond the schema's description of 'Case-sensitive substring match'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states a specific verb and resource ('List all public FLOPS compute-price indices') and explicitly distinguishes itself from siblings by advising use 'before calling get_price or verify'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit context for when to use (before get_price/verify) and notes no authentication required. While no explicit when-not-to-use is given, the guidance is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_methodologiesAInspect

List all published methodology specs (slug + latest version). Use to discover available methodology IDs before calling get_methodology.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries full burden. It implies a read-only list operation and specifies the output fields (slug + version). No mention of pagination or errors, but for a simple listing tool, this is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences: first states functionality, second provides usage context. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter listing tool with no output schema, the description sufficiently covers the return type (slug + latest version) and its purpose in the workflow (preceding get_methodology).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, and schema coverage is 100%, so the description cannot add value beyond the schema. The baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'List all published methodology specs (slug + latest version)', providing a specific verb and resource, and distinguishes it from the sibling 'get_methodology' by noting its use for discovery.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clearly says 'Use to discover available methodology IDs before calling get_methodology', which tells the agent when to use this tool and how it relates to a sibling tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recompute_auditAInspect

IOSCO Principle 9+17 recompute receipt — replays the published value from canonical inputs and returns {status: green|yellow|red, variance_bps, methodology_version, inputs_hash, replay_value, published_value}. Use to verify a settlement-grade value end-to-end.

ParametersJSON Schema
NameRequiredDescriptionDefault
as_ofNoOptional ISO timestamp.
index_idYes
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must carry full behavioral transparency. It discloses the replay action and return fields but does not mention whether the tool is read-only, idempotent, or has side effects. The behavioral traits are plausible but not fully explicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at two sentences, with all critical information front-loaded: the purpose, output fields, and usage hint. No unnecessary words or repetitions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (IOSCO principles, audit recompute), the description covers core purpose and output, but lacks parameter explanations and behavioral details beyond the return format. Without an output schema, the agent must infer the meaning of fields like 'variance_bps' and 'inputs_hash'. Adequate but incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 50%: 'as_of' is partially described as 'Optional ISO timestamp' in schema, but 'index_id' has no schema description. The tool description adds minimal meaning—it mentions 'canonical inputs' but does not clarify the role of each parameter or their expected values, leaving gaps for an agent.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool recomputes an IOSCO Principle 9+17 receipt from canonical inputs and returns specific fields, with a clear verb and resource ('recompute_audit'). It distinguishes from siblings (e.g., compute_margin, get_index) through its unique purpose of end-to-end verification of settlement-grade values.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes 'Use to verify a settlement-grade value end-to-end,' implying when to use it, but does not specify when not to use it or mention alternative tools. There is no guidance on context or prerequisites, leaving some ambiguity for the agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_indicesAInspect

Resolve a free-text query to canonical FLOPS index slugs. Use when you don't know the exact slug — e.g. 'H100 spot' returns the matching SPOT family entries. Best paired with get_price to look up + quote a value in one turn.

ParametersJSON Schema
NameRequiredDescriptionDefault
qYesFree-text query (e.g. 'b300 nordics on-demand').
limitNo
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description must convey behavioral traits. It describes the search/resolution operation but does not mention any side effects, auth requirements, or whether the operation is read-only. The description is adequate for a simple lookup, but not rich enough to fully compensate for the lack of annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with an example, front-loaded with purpose. Every word adds value, no redundancy. Extremely concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple tool (2 params, no output schema), the description covers purpose, usage, and a workflow tip. It does not describe the return format or pagination behavior for 'limit', but the tool is straightforward enough that this is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 50% (only 'q' has a description in the schema, but 'limit' is documented via schema metadata). The description adds valuable context for 'q' via an example ('b300 nordics on-demand'), but does not address 'limit' at all. Given the moderate coverage, the description provides useful but incomplete parameter clarification.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Describes the tool as resolving a free-text query to canonical FLOPS index slugs, using a clear verb and resource. Distinguishes from siblings like list_indices (which lists all indices) and get_index (which retrieves a specific index by slug).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use when you don't know the exact slug' with a concrete example. Suggests pairing with get_price for a common workflow. Lacks explicit when-not-to-use, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

spreadAInspect

Time-aligned spread between two index legs. Picks two (gpu_model, index_type) legs, inner-joins them on hourly buckets with k-anon>=3 floor, returns the spread series plus summary stats (mean, stdev, p25/p50/p75, z-score-now, percentile-rank-now). mode=absolute returns A-B; mode=ratio returns A/B.

ParametersJSON Schema
NameRequiredDescriptionDefault
modeNo'absolute' (A-B) or 'ratio' (A/B).
rangeNo'24h'/'7d'/'30d' (default)/'90d'/'180d'.
a_gpu_modelYes
b_gpu_modelYes
a_index_typeYes
b_index_typeYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are provided, the description fully discloses behavioral traits: it performs an inner join on hourly buckets with a k-anon minimum, returns a spread series and summary statistics. It does not explicitly state read-only behavior, but the context implies it is a query tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey both the high-level purpose and key implementation details. Every word adds value, with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 6 parameters and no output schema. The description outlines the output (spread series and summary stats) but lacks specifics on data structure, pagination, or limits. The k-anon constraint is mentioned but could be elaborated.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers only 33% of parameters (mode and range have descriptions). The description adds meaning for the four unnamed parameters by explaining they are legs used in an inner join, compensating for the low schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs like 'picks', 'inner-joins', and 'returns' to clearly define the tool's action. It uniquely identifies the resource as the spread between two index legs, which differentiates it from sibling tools such as 'get_basis' or 'get_regional_spread'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains how to use parameters (e.g., mode, range) but fails to provide guidance on when to use this tool versus its siblings. Given the large number of sibling tools, explicit differentiation is lacking.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verifyAInspect

Verify a FLOPS index value: returns the latest published value, methodology pointer, source-count, and audit receipt for the given index_id. Use to cite a FLOPS index with provenance.

ParametersJSON Schema
NameRequiredDescriptionDefault
index_idYesFLOPS index identifier (e.g. 'FLOPS-H100-OD-T1', 'FLOPS-B300-SPOT').
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must disclose behavior. It states it returns four specific data items, implying read-only operation, but does not mention error conditions, rate limits, or authentication requirements. The behavior is partially transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences with no redundant clauses. It front-loads the action and return values, making it easy to parse. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description explains the return values (value, methodology pointer, source-count, audit receipt). It does not cover error handling or edge cases, but for a simple verification tool, it is sufficiently complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a descriptive parameter doc ('FLOPS index identifier (e.g. 'FLOPS-H100-OD-T1', …)'). The description adds no additional meaning beyond the schema. Baseline 3 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool verifies a FLOPS index value and lists specific return fields (latest published value, methodology pointer, source-count, audit receipt). The verb 'verify' and resource 'FLOPS index value' are specific. It distinguishes from siblings like 'get_index' or 'recompute_audit' by focusing on verification with provenance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions a use case ('Use to cite a FLOPS index with provenance') but lacks comparison with alternatives or when-not-to-use. No explicit guidance on when to choose this over siblings like 'get_index' or 'list_indices'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.

Resources