WaveGuard

by com.emergentphysicslab

Ownership verified

Server Details

Anomaly detection API powered by physics simulation. Scan any data for outliers.

Status: Healthy
Last Tested: 2026-05-20 03:32
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A3.5/5.0

Tool DescriptionsB

Average 3.8/5 across 19 of 19 tools scored. Lowest: 2.6/5.

Server CoherenceA

Disambiguation3/5

The tools have overlapping purposes that could cause confusion, such as waveguard_scan and waveguard_scan_timeseries both detecting anomalies but in different data types, and waveguard_price_manipulation and waveguard_volume_check both targeting crypto manipulation detection. However, descriptions help clarify distinctions, like structural vs. time-series anomalies, preventing complete misselection.

Naming Consistency5/5

All tool names follow a consistent snake_case pattern with a 'waveguard_' prefix and descriptive nouns or noun phrases (e.g., waveguard_action_surface, waveguard_market_data). This uniformity makes the set predictable and easy to parse, with no deviations in style or convention.

Tool Count4/5

With 19 tools, the count is slightly high but reasonable for the broad domain of data analysis and anomaly detection in crypto and structured data. Most tools serve distinct functions, though some overlap suggests minor bloat, but it doesn't feel excessive for the server's scope.

Completeness4/5

The toolset covers a wide range of data analysis tasks, including anomaly detection, similarity comparison, risk assessment, and market data fetching, with clear workflows. Minor gaps exist, such as lacking direct data manipulation or export tools, but agents can work around these using the provided tools effectively.

Available Tools

19 tools

waveguard_action_surfaceC

Read-onlyIdempotent

Inspect

Score candidate actions and extract robust action zones.

ParametersJSON Schema

Name	Required	Description
`training`	Yes	2+ baseline normal samples used to define the reference profile.
`field_level`	No	0 = real scalar field, 1 = complex field.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`action_tests`	Yes	1+ candidate actions/scenarios to score against baseline.
`encoder_type`	No	Optional encoder override. Omit to auto-detect.
`action_labels`	No	Optional labels for each action variant.

Tool Definition Quality

C2.6/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already disclose read-only, idempotent, non-destructive traits. Description adds no behavioral context beyond this—omitting output format, computational cost, or what 'robust' quantification means.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded, and free of redundancy. However, extreme brevity leaves critical gaps given the tool's complexity and lack of output schema.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Missing output schema requires description to compensate with return value documentation, which it fails to provide. Domain complexity (real/complex fields, encoder types) warrants more than 8 words of context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, descriptions map clearly to parameters. The description provides conceptual framing ('candidate actions') but does not add parameter interactions, valid ranges, or syntax beyond the schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific verbs (score, extract) and resources (actions, zones) but employs opaque domain jargon ('action zones') without defining scope or distinguishing from siblings like waveguard_counterfactual or waveguard_mechanism_probe.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus the 17 sibling waveguard tools, nor prerequisites for the baseline training data or when sensitivity tuning is required.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_cascade_riskB

Read-onlyIdempotent

Inspect

Estimate shock propagation and resilience from adjacency-linked entities.

ParametersJSON Schema

Name	Required	Description
`entities`	Yes	2+ entities/nodes participating in the cascade graph.
`field_level`	No	Field representation level. Default 1 for graph interaction dynamics.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.
`shock_indices`	Yes	Indices of initially shocked entities within the entities array.
`shock_strength`	No	Initial perturbation magnitude injected at shock indices.
`adjacency_matrix`	Yes	N×N weighted adjacency matrix describing link strengths between entities.
`training_context`	Yes	2+ baseline context samples used for normalization.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The verb 'Estimate' aligns with readOnlyHint=true and idempotentHint=true annotations, confirming it's a calculation operation. However, adds no detail on execution time, convergence behavior, or what the resilience metric represents beyond the annotations' safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with action verb. No redundancy. However, extreme brevity (9 words) under-serves the tool's complexity (8 parameters including matrix math), preventing higher score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for identifying the tool's domain but incomplete for a complex graph-analysis tool with no output schema. Fails to describe return values (resilience scores? propagation paths? vulnerability rankings?) or expected computational cost.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing baseline 3. Description mentions 'adjacency-linked entities' which loosely maps to adjacency_matrix and entities parameters, but adds no syntax clarification (e.g., matrix dimension requirements) or semantic relationships between shock_indices and entities array.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Uses specific domain verbs ('Estimate') and resources ('shock propagation', 'resilience') with clear scope ('adjacency-linked entities'). Distinguishes general domain (graph cascade analysis) from siblings like wallet_profile or price_manipulation, though lacks explicit 'unlike X' differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus siblings like waveguard_instability, waveguard_counterfactual, or waveguard_trajectory_scan. No prerequisites or preconditions mentioned despite complex matrix input requirements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_compareA

Read-onlyIdempotent

Inspect

Compare two data items for structural similarity using physics-based fingerprints. Returns cosine similarity (0–1) and Euclidean distance. Use for duplicate detection, behavioral matching, drift analysis, or checking if two tokens/wallets/contracts are structurally similar.

Cosine similarity > 0.95 = very similar. < 0.80 = structurally different.

ParametersJSON Schema

Name	Required	Description
`data_a`	Yes	First data item to compare.
`data_b`	Yes	Second data item to compare (same type as data_a).
`encoder_type`	No	Data encoder. Omit to auto-detect.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations confirm read-only/idempotent safety, but description adds crucial output semantics: 'Returns cosine similarity (0–1) and Euclidean distance' and interpretation thresholds (>0.95 very similar, <0.80 structurally different). This behavioral context is absent from annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose and return values. Use cases and interpretation thresholds follow logically. No wasted words; each sentence provides distinct value (operation, return format, use cases, threshold interpretation).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so description appropriately documents return values (cosine similarity, Euclidean distance) and interpretation guidelines. Combined with rich annotations (readOnly, idempotent) and clear use cases, the description is complete for this complexity level.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage (baseline 3), but description adds valuable domain context: examples of comparable items (tokens/wallets/contracts) and the fingerprinting concept, which helps users understand what encoder_type selection implies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (compare) and resource (data items) using physics-based fingerprints, clearly distinguishing from siblings like waveguard_scan or waveguard_token_risk which analyze single entities or markets rather than pairwise structural similarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases: 'duplicate detection, behavioral matching, drift analysis, or checking if two tokens/wallets/contracts are structurally similar.' Lacks explicit exclusions or named alternatives, but use cases are specific enough for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_counterfactualB

Read-onlyIdempotent

Inspect

Run baseline plus counterfactual variants and measure verdict/score sensitivity.

ParametersJSON Schema

Name	Required	Description
`training`	Yes	2+ baseline normal samples used to build the reference profile.
`base_test`	Yes	Baseline candidate sample to evaluate before counterfactual perturbations.
`field_level`	No	0 = real scalar field (faster), 1 = complex field (richer phase dynamics).
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0). Higher values flag more aggressively.
`encoder_type`	No	Optional encoder override. Omit to auto-detect from input structure.
`counterfactual_tests`	Yes	1+ perturbed variants of base_test for sensitivity analysis.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations cover safety (readOnly, idempotent, non-destructive). Description adds that it produces 'verdict/score sensitivity' measurements, giving a hint about output semantics. However, it omits computational cost, potential timeouts, or how to interpret sensitivity values.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single efficient sentence with no filler. Front-loaded with action. However, given the complexity (6 parameters, counterfactual analysis concept), it borders on overly terse and could benefit from one clarifying clause about output format.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate given strong schema coverage and safety annotations, but lacks output description (what the sensitivity scores represent) or domain context (risk/anomaly detection) that would help interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description provides conceptual grouping (baseline vs counterfactual variants) that aligns with the base_test and counterfactual_tests parameters, but doesn't add syntax, format details, or constraints beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verbs (Run, measure) and resource (counterfactual variants). The mention of 'counterfactual' distinguishes it from siblings like waveguard_compare or waveguard_scan, though it doesn't explicitly state the WaveGuard risk-analysis domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this versus alternatives like waveguard_compare or waveguard_scan. No mention of prerequisites (e.g., when counterfactual analysis is preferable over direct comparison).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_fingerprintA

Read-onlyIdempotent

Inspect

Get a physics embedding of any data item (52-dim at Level 0, 62-dim at Level 1 with phase statistics). The fingerprint captures structural properties via wave-equation dynamics — useful for similarity search, clustering, baseline comparison, and drift detection. Works on JSON objects, token metrics, wallet activity, trading data, or any structured data.

Returns a deterministic vector with labeled dimensions (chi statistics, energy distribution, gradient patterns, and phase coherence at Level 1).

ParametersJSON Schema

Name	Required	Description
`data`	Yes	Any data item to fingerprint: JSON object, numeric array, string, or structured record.
`field_level`	No	0 = real scalar 52-dim (default), 1 = complex field 62-dim.
`encoder_type`	No	Data encoder. Omit to auto-detect.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds substantial value beyond annotations: specifies output dimensions (52-dim/62-dim), explains what vector components represent (chi statistics, energy distribution, phase coherence), confirms determinism, and lists compatible data types (JSON, wallet activity, trading data).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three tightly structured paragraphs with zero waste: immediate capability statement with dimension specs, use case enumeration, and return value description. Every sentence delivers distinct information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Compensates for missing output schema by detailing return vector structure and dimension labels. Adequately covers the physics domain complexity, though could briefly mention error conditions or constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage, description adds crucial domain context: field_level semantics (Level 0 vs 1 dimension counts and phase statistics), and contextualizes the data parameter with concrete examples (token metrics, wallet activity) that inform encoder_type selection.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb ('Get a physics embedding'), resource ('any data item'), and distinguishes from siblings via use cases (similarity search, clustering, drift detection) vs the risk analysis/scanning focus of other waveguard tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear positive guidance listing specific use cases (similarity search, clustering, baseline comparison, drift detection). Lacks explicit negative guidance or named sibling alternatives, but effectively communicates when to select this tool over risk-oriented siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_healthA

Read-onlyIdempotent

Inspect

Check WaveGuard API health, GPU availability, version, and engine status. No authentication required. Returns status, version, and GPU info.

ParametersJSON Schema

Name	Required	Description	Default
`verbose`	No	Return detailed health info including memory and uptime (default: false).

Tool Definition Quality

A4.1/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Strong: While annotations cover read-only/idempotent hints, the description valuably adds authentication requirements ('No authentication required') and discloses return values ('status, version, and GPU info') despite the absence of an output schema. No rate limit or error behavior mentioned but appropriate for tool complexity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent: Two sentences with zero waste. First sentence establishes purpose and scope; second covers prerequisites and return values. Front-loaded with key information, no redundant phrases.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete: For a simple health-check tool with one optional boolean parameter and no output schema, the description adequately covers functionality, prerequisites, and return structure. Annotations adequately cover behavioral hints. Nothing material is missing given the tool's limited complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Baseline: Schema coverage is 100%, so the schema fully documents the 'verbose' parameter. The description mentions no parameters, relying entirely on the structured schema. With complete schema coverage, score 3 is appropriate per rubric.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent: Uses specific verb 'Check' with explicit resources (API health, GPU availability, version, engine status). The scope clearly distinguishes this from analytic siblings (waveguard_scan, waveguard_risk, etc.) by focusing on infrastructure status rather than data analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Adequate: States 'No authentication required' which provides prerequisite context, but lacks explicit when-to-use guidance relative to siblings (e.g., 'use this before invoking analysis tools to verify service availability'). Usage is implied by the diagnostic nature of the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_instabilityB

Read-onlyIdempotent

Inspect

Estimate instability under controlled perturb-and-resolve trials.

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ candidate samples to stress-test with perturbation trials.
`trials`	No	Number of perturbation trials per sample.
`training`	Yes	2+ baseline normal samples for reference dynamics.
`field_level`	No	0 = real scalar field, 1 = complex field.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.
`perturbation_strength`	No	Relative perturbation amplitude applied during instability assay.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=true, idempotentHint=true, and destructiveHint=false. Description adds the 'perturb-and-resolve' methodology context not captured in annotations, explaining how the estimation occurs. However, omits domain-specific behavior (e.g.,whether this measures market volatility vs system fragility) and computational characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with verb and subject. Zero redundant words. However, given the tool's complexity (7 parameters with specific domains like field_level and encoder_type), the extreme brevity may underserve the agent's understanding despite being structurally efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers core operation but lacks domain context (financial/ML/physical systems) despite rich schema and many siblings. With no output schema and 100% input schema coverage, the description suffices for parameter mapping but misses opportunity to explain what 'instability' means in this ecosystem.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing baseline 3. Description mentions 'trials' and 'perturb' which semantically links to the trials and perturbation_strength parameters, reinforcing the schema descriptions. Does not add syntax details or parameter interdependencies beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (Estimate) and resource (instability) with methodology (controlled perturb-and-resolve trials). However, lacks differentiation from siblings like waveguard_cascade_risk or waveguard_action_surface regarding when each specific instability analysis applies.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus the 17 sibling waveguard tools. No mention of prerequisites, required data formats, or scenarios where this instability assay is preferred over scans or risk assessments.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_interaction_matrixB

Read-onlyIdempotent

Inspect

Compute pairwise interaction matrix and cluster decomposition for entities.

ParametersJSON Schema

Name	Required	Description
`entities`	Yes	2+ entities to evaluate for pairwise interaction effects.
`field_level`	No	Field representation level. Default 1 for interaction/phase features.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.
`training_context`	Yes	2+ baseline context samples used for normalization.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations cover safety (readOnlyHint, destructiveHint, idempotentHint) and extensibility (openWorldHint). The description adds algorithmic context ('cluster decomposition') implying grouped output structure, but omits computational cost, normalization methodology (despite training_context parameter), or how field_level affects the matrix calculation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely compact single sentence (9 words) with no redundancy. Front-loaded with the operative verb 'Compute'. However, given the domain complexity and 5-parameter schema, this brevity may underserve users needing contextual orientation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Sufficient for invocation given rich annotations (safety/idempotency) and complete input schema. However, lacks output expectations (no output schema provided) and domain context for 'waveguard' ecosystem tools, leaving operational completeness gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all 5 parameters. The tool description adds minimal semantic value beyond the schema—merely stating the high-level operation without explaining relationships between parameters (e.g., how training_context normalizes entities) or usage patterns for optional overrides like encoder_type.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states specific operation ('Compute pairwise interaction matrix and cluster decomposition') and target resource ('entities'), using clear technical verbs. However, it does not explicitly differentiate from siblings like 'waveguard_compare' or 'waveguard_scan', leaving ambiguity about when this matrix computation is preferred over direct comparison or scanning operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus the 17 sibling waveguard tools. Missing prerequisites (e.g., when training_context is required), exclusion criteria, or workflow positioning ('use this after scan, before action_surface').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_market_dataA

Read-onlyIdempotent

Inspect

Fetch live crypto market data from CoinGecko and DexScreener. No external data needed — WaveGuard pulls it for you.

Use 'coin_id' for CoinGecko (e.g. 'bitcoin', 'ethereum', 'solana'). Use 'contract_address' for DexScreener (any chain). Use 'search' to find token IDs by name/symbol.

Returns: price, volume, market cap, liquidity, price history, OHLC candles — ready to feed into waveguard_token_risk, waveguard_volume_check, or waveguard_price_manipulation.

ParametersJSON Schema

Name	Required	Description
`days`	No	Number of days of history (default: 90 for price_history, 30 for ohlc).
`count`	No	Number of results for top_coins (default: 25).
`query`	No	Search query. Required for search, dex_search.
`action`	Yes	What data to fetch: - token_data: full metrics for a CoinGecko coin - price_history: daily prices (for price_manipulation) - ohlc: OHLC candles (for volume_check) - top_coins: top N by market cap (training baseline) - search: find CoinGecko coin IDs - dex_token: DEX data by contract address - dex_search: search DEX pairs
`coin_id`	No	CoinGecko coin ID (e.g. 'bitcoin', 'ethereum'). Required for token_data, price_history, ohlc.
`contract_address`	No	Token contract address (any chain). Required for dex_token.

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations cover safety (readOnlyHint, destructiveHint, idempotentHint) and openWorldHint. Description adds value by specifying the external API sources (CoinGecko/DexScreener) and explaining that WaveGuard handles the external fetching ('pulls it for you'). Minor gap: does not mention rate limits, caching behavior, or failure modes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with three distinct information blocks: (1) purpose/sources, (2) parameter usage patterns, (3) return values and downstream integration. No filler text; every sentence conveys specific selection or invocation guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive for a multi-action data fetcher: covers both data sources (CoinGecko vs DexScreener), all action types, required vs optional parameter contexts, return data categories, and ecosystem integration. Compensates fully for missing output schema by enumerating return categories (price, volume, OHLC, etc.).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing baseline 3. Description adds significant semantic value by mapping parameters to data sources (coin_id→CoinGecko, contract_address→DEX) and cross-referencing action types with specific downstream analysis tools (e.g., price_history 'for price_manipulation'), which the schema enum descriptions do not cover.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent specificity: states it 'Fetch[es] live crypto market data' with named sources (CoinGecko, DexScreener). Clearly distinguishes from sibling analysis tools (risk, manipulation, etc.) by positioning as the data acquisition layer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance on parameter selection ('Use coin_id for CoinGecko', 'Use contract_address for DexScreener') and critical integration context naming three downstream sibling tools it feeds into (waveguard_token_risk, waveguard_volume_check, waveguard_price_manipulation), creating a clear workflow narrative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_mechanism_probeC

Read-onlyIdempotent

Inspect

Run targeted interventions and rank effect sizes.

ParametersJSON Schema

Name	Required	Description
`training`	Yes	2+ baseline normal samples used to construct the reference profile.
`base_test`	Yes	Baseline candidate sample before interventions.
`field_level`	No	0 = real scalar field, 1 = complex field.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.
`intervention_tests`	Yes	1+ intervention variants used to estimate effect sizes.
`intervention_labels`	No	Optional labels for intervention variants (same order as intervention_tests).

Tool Definition Quality

C2.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds the key behavioral detail that the tool ranks effect sizes, which annotations do not cover. However, given annotations already declare readOnlyHint=true and destructiveHint=false, the description should clarify that these 'interventions' are computational/simulated rather than external state changes. It also omits mention of idempotentHint and openWorldHint from annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient at 6 words, but verges on insufficient for a 7-parameter tool with complex statistical requirements (minItems constraints, variant labels). The brevity leaves critical gaps in agent comprehension that could be filled with one additional sentence on prerequisites or output format.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 100% schema coverage and rich annotations covering safety/idempotency, the description adequately covers the core operation conceptually. However, for a complex causal inference tool with no output schema, it minimally meets the threshold—missing elaboration on the training/testing paradigm, encoder selection logic, or how sensitivity tuning affects results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description provides no additional parameter-specific context (e.g., it does not explain the relationship between training/base_test/intervention_tests, or when to use field_level 0 vs 1), relying entirely on the schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose3/5

Does the description clearly state what the tool does and how it differs from similar tools?

Uses technical terms ('targeted interventions', 'rank effect sizes') that describe a causal analysis operation, which is more specific than tautology. However, lacks domain context (waveguard implies finance/crypto) and does not distinguish from causal siblings like 'waveguard_counterfactual' or 'waveguard_interaction_matrix'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no explicit guidance on when to select this tool versus the many causal/analysis siblings (e.g., when to probe mechanisms vs run counterfactuals). The phrase 'targeted interventions' implies usage context but offers no exclusion criteria or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_multi_horizon_outlookB

Read-onlyIdempotent

Inspect

Compute horizon-specific anomaly outlook and consistency across windows.

ParametersJSON Schema

Name	Required	Description
`horizons`	Yes	List of horizon lengths (in sequence steps) to evaluate.
`sequence`	Yes	Ordered sample sequence used for multi-horizon outlook analysis.
`training`	Yes	2+ baseline normal samples used to establish reference behavior.
`field_level`	No	0 = real scalar field, 1 = complex field.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description aligns by using 'Compute' (read-only operation). It adds minimal behavioral context ('consistency across windows') but omits details about what 'outlook' entails, computational complexity, or whether results are probabilistic vs deterministic.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, 9 words, front-loaded with action verb. No redundancy or filler. However, density of technical jargon ('horizon-specific', 'consistency across windows') without supporting context slightly hinders immediate comprehension for an agent selecting among 18 similar tools.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complex analytical tool with 6 parameters (including field types and sensitivity tuning) but no output schema. Annotations cover safety profile but description omits what the computation returns (scores, probabilities, binary flags) and how field_level affects behavior. Minimum viable for complexity level.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing detailed descriptions for all 6 parameters including field_level enums and array constraints. The description mentions 'horizon-specific' and 'windows' which loosely map to the horizons and sequence/training parameters, but adds no syntactic guidance or usage examples beyond the well-documented schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (compute) and domain object (anomaly outlook) with scope qualifier (horizon-specific). Mentions 'consistency across windows' which differentiates from simple point-in-time anomaly detection. However, lacks differentiation from sibling tools like waveguard_scan or waveguard_trajectory_scan which also analyze sequences.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage context ('multi-horizon') suggesting use for time-series prediction or forecasting scenarios. However, lacks explicit guidance on when to prefer this over waveguard_scan_timeseries or waveguard_trajectory_scan, and doesn't mention prerequisites like minimum data requirements despite the schema requiring minItems=2 for training.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_phase_coherenceB

Read-onlyIdempotent

Inspect

Measure coherence/entropy and collapse-risk indicators for candidate data.

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ candidate samples to evaluate for phase coherence and entropy.
`training`	Yes	2+ baseline normal samples for reference coherence metrics.
`field_level`	No	Field representation level. Default 1 for phase-aware analysis.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and openWorldHint. The description adds value by specifying what gets measured (coherence, entropy, collapse-risk) but omits methodology details, computational intensity, or failure modes that would help an agent understand tool behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with zero redundancy. However, given the tool's complexity (5 parameters, specialized domain) and numerous siblings, the extreme brevity sacrifices contextual guidance that could have been front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate given rich annotations and complete schema coverage, but minimal for the domain complexity. The description does not explain what 'phase coherence' entails, how training/test samples should relate, or what success/failure looks like. Usable but not illuminating.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the structured fields already document all 5 parameters clearly. The description adds conceptual framing ('candidate data') that maps to the 'test' parameter but does not augment parameter syntax or semantics beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific measurement actions (coherence/entropy) and target (collapse-risk indicators) with clear verb+resource structure. However, it does not differentiate from siblings like 'waveguard_cascade_risk' or 'waveguard_instability' which may overlap in risk-analysis domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus the 18 siblings, nor does it explain the training/test paradigm implied by the parameters. The term 'candidate data' hints at the test/train split but lacks explicit contextual guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_price_manipulationA

Read-onlyIdempotent

Inspect

Detect price manipulation in time-series data. Send a price or price+volume history as a numeric array. Early windows define 'normal' trading, recent windows are tested for manipulation patterns (pump-and-dump, spoofing, layering).

Example: Send 90 days of closing prices → detect manipulated windows.

ParametersJSON Schema

Name	Required	Description
`data`	Yes	Price time-series array (chronological). At least 20 data points.
`sensitivity`	No	Detection sensitivity (default: 1.5).
`window_size`	No	Window size (default: 10). Smaller = finer detection.
`test_windows`	No	Number of recent windows to test (default: half).

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds meaningful context beyond annotations: discloses specific manipulation patterns detected (pump-and-dump, spoofing, layering) and explains the comparative windowing methodology. Aligns perfectly with readOnly/idempotent annotations (pure analysis). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Optimal efficiency: three sentences total. Purpose front-loaded, zero redundancy, concrete example included without bloat.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive for input/output behavior despite missing output schema: explains what gets detected and the detection methodology. Annotations cover safety profile (readOnly, idempotent). Minor gap: no return structure documentation, but acceptable given pattern specificity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage (baseline 3), description adds valuable semantics: clarifies data can include volume ('price or price+volume'), explains windowing logic relevant to window_size/test_windows, and provides scale example (90 days).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent specificity: verb 'Detect' with clear resource 'price manipulation' and explicit pattern types (pump-and-dump, spoofing, layering) that distinguish it from generic analysis siblings like waveguard_scan_timeseries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear temporal methodology (early windows=normal, recent=tested) and input format guidance ('Send... as numeric array'). Lacks explicit sibling differentiation (e.g., when to use waveguard_scan_timeseries vs this), but the specific pattern list provides implicit context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_scanA

Read-onlyIdempotent

Inspect

Find outliers and anomalies in structured data — ideal as a second step after pulling records from Google Sheets, Airtable, Supabase, Notion databases, HubSpot, Financial APIs, GitHub, NPM, or any source that returns rows of JSON. Fully stateless: send known-good rows as training and suspect rows as test in ONE call. Returns per-row anomaly scores, confidence levels, and the top features explaining WHY each row was flagged.

Typical workflow: (1) Pull data from another tool (e.g. Google Sheets, Supabase query, HubSpot deals). (2) Pass the first N rows as training (normal baseline). (3) Pass remaining or new rows as test. (4) Report which rows are anomalous and why.

Works on JSON objects, numbers, text, arrays. No separate training step required.

Examples:

Spreadsheet QA: Pull 500 sales rows from Sheets → train on first 400 → test last 100 → flag outlier entries
Financial screening: Get ratios for 50 stocks from a financial API → find anomalous ones
CRM hygiene: Pull HubSpot deals → flag deals with unusual discount/value patterns
Dependency audit: Get NPM package metrics → flag packages with anomalous quality scores
Commit review: Pull GitHub commit metadata → flag unusual commit patterns

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ data points to check for anomalies — new entries, recent rows, or the subset you want validated. Same type/shape as training. Each sample is scored independently.
`training`	Yes	2+ examples of NORMAL/expected data — the known-good baseline. Typically the bulk of rows from a spreadsheet, database query, or API response. All samples should be the same type/shape. More samples = better baseline (10-100 is ideal for tabular data).
`field_level`	No	Physics field complexity. 0 = real scalar (default). 1 = complex field (phase-aware, 62-dim fingerprint).
`sensitivity`	No	Anomaly threshold multiplier (default: 2.0). Lower = more sensitive. Higher = less sensitive. Range: 0.5 to 5.0.
`encoder_type`	No	Data encoder type. Omit to auto-detect from data shape.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds critical behavioral context beyond annotations: explicitly states 'Fully stateless' (confirms idempotent/readOnly annotations), details return structure ('per-row anomaly scores, confidence levels, and the top features'), and clarifies auto-detection capability ('No separate training step'). Does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear information hierarchy: purpose → behavior → workflow → examples. Front-loaded with specific verbs. Lengthy due to five domain examples, but each earns its place by demonstrating different data sources. Single paragraph could be broken for scannability, but no waste detected.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Excellent completeness given no output schema exists. Description compensates by detailing return values ('anomaly scores, confidence levels, top features'). Covers data type support (JSON, text, arrays), statelessness, and integration patterns. Sufficient for an agent to predict outcomes and handle 5-parameter complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, baseline is 3. The description adds crucial semantic context for the training/test split: explaining that training represents 'known-good rows' and 'normal baseline', while test contains 'suspect rows'. This workflow intent ('first N rows as training') supplements the schema's technical definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description immediately states the core function ('Find outliers and anomalies in structured data') with specific verbs and resources. It distinguishes from siblings by emphasizing tabular/JSON row processing and integration with sheet/database tools, while the sibling 'waveguard_scan_timeseries' implies specialized temporal handling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit workflow guidance ('Typical workflow: 1...2...3...4') and positioning ('ideal as a second step after pulling records'). Lists diverse domain examples (Sheets, CRM, NPM, GitHub). Lacks explicit sibling comparison (e.g., when to use vs waveguard_scan_timeseries), but context clearly defines appropriate input shapes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_scan_timeseriesA

Read-onlyIdempotent

Inspect

Detect anomalies in time-series data — use after pulling numeric metrics from monitoring APIs, financial data sources, IoT sensors, or spreadsheet columns. Send a single numeric array and specify a window size. Early windows define 'normal', recent windows are tested for anomalies.

Typical workflow: (1) Pull a column of numbers from Sheets, a Supabase time-series table, or a metrics API. (2) Pass the array here. (3) Get back which time windows are anomalous.

Examples:

Revenue monitoring: Pull monthly revenue from Sheets → detect anomalous months
Stock screening: Pull 90 days of closing prices → find unusual price windows
Server health: Pull response-time metrics → identify degradation windows
Sensor QA: Pull temperature readings from IoT API → flag sensor drift

ParametersJSON Schema

Name	Required	Description
`data`	Yes	Numeric time-series array, ordered chronologically. Should have at least 3x window_size data points.
`sensitivity`	No	Anomaly sensitivity (default: 1.0). Higher = more sensitive.
`window_size`	No	Number of data points per window (default: 10). Smaller windows detect finer-grained anomalies.
`test_windows`	No	Number of most recent windows to test (default: half of total windows). The rest are used as training (normal baseline).

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With readOnlyHint=true and idempotentHint=true in annotations, description adds crucial algorithmic context: 'Early windows define normal, recent windows are tested' explains the train/test split. Clarifies return value 'Get back which time windows are anomalous' compensating for missing output schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well front-loaded with purpose-first structure. Workflow list and examples are appropriately detailed for an AI to recognize applicability patterns. Slightly redundant example format (all use 'Pull X → detect Y') but each illustrates distinct domains.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Excellent completeness for a 4-parameter analysis tool. Despite no output schema, explicitly describes return format ('which time windows are anomalous'). Mentions data volume requirements ('at least 3x window_size'). Annotations cover safety profile (readOnly/destructive), freeing description to focus on domain usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). Description adds conceptual framework linking parameters: explains 'window size' in context of 'Early windows' vs 'recent windows' (clarifying test_windows logic). Notes chronological ordering requirement for data array, adding semantics beyond schema's bare description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb+resource: 'Detect anomalies in time-series data'. Explicitly distinguishes from generic siblings (waveguard_scan, waveguard_trajectory_scan) by specifying time-series context, numeric arrays, and window-based analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear workflow (pull data → pass array → get results) and contextual trigger 'use after pulling numeric metrics'. Includes 4 concrete domain examples (revenue, stock, server, sensor). Lacks explicit comparison to sibling alternatives like waveguard_trajectory_scan.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_token_riskA

Read-onlyIdempotent

Inspect

Assess crypto token legitimacy risk. Send metrics from known-good tokens as training (price, volume, holders, liquidity, market_cap, age_days, etc.) and suspect tokens as test. Detects pump-and-dump patterns, fake metrics, and anomalous token profiles.

Example: Pull CoinGecko data for 20 established tokens → train. Test a new token → get risk score and which metrics are suspicious.

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ suspect token metric objects to evaluate.
`training`	Yes	3+ known-good token metric objects. Each should include fields like price, volume_24h, market_cap, holders, liquidity, age_days, etc.
`sensitivity`	No	Risk sensitivity (default: 1.5). Higher = more flags.

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and destructiveHint=false; the description adds valuable behavioral context by detailing what gets detected (pump-and-dump, fake metrics) and what outputs to expect (risk score and identification of suspicious metrics), which goes beyond the safety profile annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficiently structured with zero waste: first sentence establishes purpose, second clarifies the training/test methodology, third lists specific detection capabilities, and the final example sentence concrete-izes the workflow. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description adequately explains return values (risk score and suspicious metric identification). It sufficiently covers the complexity of the ML training requirements and expected data format for a tool of this type.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing a baseline of 3. The description adds significant value by enumerating concrete expected fields (price, volume, holders, liquidity, market_cap, age_days) within the training/test objects, helping users construct valid metric payloads beyond the generic schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'Assess crypto token legitimacy risk' with specific detection targets (pump-and-dump patterns, fake metrics, anomalous profiles). It distinguishes from siblings by emphasizing the unique training/test machine learning paradigm requiring known-good baseline tokens.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context through a concrete workflow example (pulling CoinGecko data for 20 tokens to train, then testing a new token), which implicitly guides when to use this comparative approach. Lacks explicit naming of sibling alternatives like waveguard_scan for non-comparative analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_trajectory_scanB

Read-onlyIdempotent

Inspect

Analyze sequence drift and regime shifts over ordered samples.

ParametersJSON Schema

Name	Required	Description
`sequence`	Yes	Ordered samples (time sequence) to scan for drift and regime shifts.
`training`	Yes	2+ baseline normal samples used to establish the reference regime.
`field_level`	No	0 = real scalar field, 1 = complex field.
`sensitivity`	No	Anomaly sensitivity multiplier (default: 1.0).
`encoder_type`	No	Optional encoder override. Omit to auto-detect.

Tool Definition Quality

B3.2/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds domain-specific context (drift detection, regime shift identification) beyond safety annotations. However, with no output schema, fails to describe what the analysis returns (e.g., change points, confidence scores) or computational intensity. Does not contradict readOnly/idempotent annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single efficient sentence with action verb front-loaded. No redundancy or schema repetition. However, extreme brevity leaves gaps in sibling differentiation and output description that an additional sentence could remedy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a read-only analytical tool with rich annotations and complete input schema. Missing: explicit return value description (no output schema exists), optimal use cases, or failure modes. Sibling waveguard_scan suggests need for differentiation that isn't provided.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, documenting all 5 parameters including field_level enum semantics and training data requirements. Description provides no additional parameter guidance, meeting baseline expectations where schema is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific analytical action (analyze) and domain concepts (sequence drift, regime shifts) over ordered samples. Mentions 'trajectory' implicitly via sequence focus. However, does not explicitly differentiate from similar sibling tools like waveguard_scan_timeseries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to select this tool versus alternatives (e.g., waveguard_scan vs waveguard_scan_timeseries). No mention of prerequisites or required parameter relationships beyond the schema constraints.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_volume_checkA

Read-onlyIdempotent

Inspect

Detect wash trading and fake volume in OHLCV candle data. Send known-legitimate candles as training and suspect candles as test. Detects artificial volume spikes, suspiciously regular patterns, and manipulated price-volume relationships.

Example: Send 100 candles from a liquid pair as baseline, test candles from a suspicious pair.

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ suspect candle objects to evaluate.
`training`	Yes	3+ OHLCV candle objects from known-legitimate trading. Fields: open, high, low, close, volume.
`sensitivity`	No	Detection sensitivity (default: 1.5).

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnly/idempotent/safe properties. Description adds specific detection capabilities beyond annotations: 'artificial volume spikes, suspiciously regular patterns, and manipulated price-volume relationships'. Includes concrete example (100 candles from liquid pair) illustrating expected input scale.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences total: purpose, mechanism, detection specifics, and concrete example. No waste. Front-loaded with core function. Example earns its place by clarifying expected data volume (100 candles).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Rich annotations cover safety profile. 100% schema coverage. Description explains detection targets adequately for a comparative analysis tool. No output schema but description implies classification/anomaly detection output sufficiently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage documenting legitimate vs suspect candles. Description reinforces semantics by mapping 'training' to 'known-legitimate' and 'test' to 'suspect', and adds practical example (100 candles) demonstrating expected parameter values, adding value beyond raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb 'Detect' + clear resource 'wash trading and fake volume in OHLCV candle data'. Distinguished from sibling waveguard_price_manipulation by focusing specifically on volume authenticity and wash trading patterns.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear usage pattern via 'Send known-legitimate candles as training and suspect candles as test', explaining the required input structure. Lacks explicit comparison to sibling alternatives (e.g., when to use vs waveguard_scan), but the training/test distinction provides strong implicit guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

waveguard_wallet_profileA

Read-onlyIdempotent

Inspect

Profile wallet behavior against baselines. Send normal wallet transaction patterns as training (tx_count, avg_value, unique_tokens, gas_spent, active_days, etc.) and suspect wallets as test. Detects bot activity, wash trading wallets, and sybil patterns.

Example: Profile 50 organic wallets → test 10 suspect addresses.

ParametersJSON Schema

Name	Required	Description
`test`	Yes	1+ suspect wallet profiles to evaluate.
`training`	Yes	3+ known-organic wallet activity profiles.
`sensitivity`	No	Detection sensitivity (default: 1.5).

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations establish read-only/idempotent safety profile. Description adds substantial value by disclosing what patterns are detected (bots, wash trading, sybils) and the training methodology mechanics. It also hints at expected data features (tx_count, avg_value, etc.), providing behavioral context beyond the safety annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: first states action, second explains methodology with feature examples, third provides concrete ratios. Every clause earns its place and the information is front-loaded appropriately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 100% schema coverage and rich annotations, the description successfully explains the tool's analytical purpose, input requirements, and detection scope. Minor gap in not describing output format (scores/classes), but complete for a profiling tool given strong param and annotation coverage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage establishing baselines, but description adds critical semantic meaning by explaining what constitutes 'training' (known-organic wallets with normal patterns) versus 'test' (suspect wallets) and listing example feature fields, clarifying the expected data shape beyond generic 'profiles'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb 'Profile' + resource 'wallet behavior' with explicit scope 'against baselines'. Distinguishes from siblings like waveguard_scan or waveguard_compare by specifying the training/test methodology and specific detection targets (bot activity, wash trading, sybil patterns).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides concrete usage pattern via the training/test paradigm explanation and explicit example (50 organic → 10 suspect). While it doesn't explicitly name sibling alternatives, the specific supervised-learning workflow clearly signals when to use this tool versus simple scanning or comparison tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Resources

Need Help?