WaveGuard
Server Quality Checklist
- Disambiguation 3/5
While waveguard_health is clearly distinct, waveguard_scan explicitly claims to work on 'time series' data, creating overlap with waveguard_scan_timeseries. An agent with sequential numeric data may struggle to determine whether to use the general scanner with manual windowing or the specialized tool with automatic windowing, despite the differing input schemas.
- Naming Consistency 4/5
All tools consistently use the 'waveguard_' prefix. The naming follows a readable pattern where 'health' indicates status checking and 'scan' indicates anomaly detection, with 'timeseries' serving as a clear modifier. Minor inconsistency exists between noun-based 'health' and verb-based 'scan' patterns, but overall predictability is high.
- Tool Count 4/5
Three tools is a minimal set, but one well-suited to a focused, stateless anomaly detection service. The set covers health verification, generic multi-format scanning, and optimized time-series processing without unnecessary bloat, though additional configuration or batch management tools could expand utility.
- Completeness 4/5
For a stateless anomaly detection API, the surface provides complete coverage of core operations: service health checks, arbitrary data scanning (JSON, text, numbers), and specialized time-series analysis. Since results are returned synchronously, no result-retrieval tools are needed. Minor gaps might include model configuration or parameter tuning endpoints.
Average 4.5/5 across 3 of 3 tools scored.
See the tool scores section below for per-tool breakdowns.
This repository includes a README.md file.
This repository includes a LICENSE file.
Latest release: v3.3.0
No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.
Tip: use the "Try in Browser" feature on the server page to seed initial usage.
Add a glama.json file to provide metadata about your server.
- This server provides 3 tools.
No known security issues or vulnerabilities reported.
This server has been verified by its author.
Add related servers to improve discoverability.
Tool Scores
waveguard_health
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It adds critical behavioral context: 'No authentication required.' However, it omits other behavioral traits such as read-only safety, rate limits, and error handling when the service is unhealthy, leaving gaps given the lack of annotation coverage.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences with zero waste: first defines scope, second states auth requirements, third provides usage guidance. Information is front-loaded with the core action, and every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a zero-parameter health check without output schema, the description adequately covers the essential pre-execution checks (auth, service status) and intended workflow. It lists what status components are checked, compensating partially for the missing output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, establishing baseline 4 per scoring rules. No parameters require semantic explanation, and the description correctly focuses on operational behavior rather than inventing parameter documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
States specific verb 'Check' and enumerates exact resources checked (API health, GPU availability, version, engine status). The phrase 'before scanning' effectively distinguishes this from sibling scanning tools (waveguard_scan, waveguard_scan_timeseries) by establishing the workflow order.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'verify the service is running before scanning.' This creates a clear prerequisite relationship with the sibling scan tools, though it stops short of explicitly naming the scan alternatives or stating 'do not use for scanning'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
waveguard_scan_timeseries
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully explains the GPU-accelerated computation, the automatic windowing logic, the training/test division (first N windows vs remaining), and the return structure (anomaly scores, confidence, p-values). It could improve by mentioning error conditions, computational constraints, or idempotency.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is excellently structured: purpose (sentence 1), input requirements (sentence 2), processing logic (sentence 3), outputs (sentence 4), and concrete example (paragraph 2). Every sentence conveys distinct information without redundancy, and critical details are front-loaded.
- Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex ML tool with 4 parameters and no output schema, the description is remarkably complete. It compensates for missing annotations by detailing the algorithmic behavior (overlapping windows, training baseline) and return values. Minor gap: lacks discussion of failure modes or minimum data requirements (though schema enforces minItems: 4).
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds significant value by explaining how parameters interact algorithmically—specifically how 'data' and 'window_size' determine training windows, and how 'test_windows' relates to the auto-split behavior. The example reinforces parameter semantics effectively.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description opens with a specific verb ('Detect'), clear resource ('anomalies in time-series data'), and distinct methodology ('GPU-accelerated wave physics simulation'). This clearly distinguishes it from siblings like 'waveguard_scan' and 'waveguard_health' by specifying the time-series domain and wave physics approach.
- Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage context through the concrete example (100 CPU-usage readings) and explains the internal training/test split mechanics. However, it lacks explicit guidance on when to choose this over sibling tools (e.g., 'use this for sequential numeric data vs waveguard_scan for general purposes').
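The automatic windowing and train/test split described for this tool can be sketched as follows. This is an assumption-laden illustration, not the server's actual implementation: the function name `split_windows`, the stride of 1 (fully overlapping windows), and the "earliest windows form the training baseline" convention are all inferred from the description, with `window_size` and `test_windows` mirroring the tool's documented parameters.

```python
# Hypothetical sketch of the windowing behavior the description implies.
# A stride of 1 (fully overlapping windows) is an assumption; the real
# server-side split may differ.

def split_windows(data, window_size, test_windows):
    """Slide a window of length `window_size` over `data`, then reserve
    the last `test_windows` windows for testing; the earlier windows
    form the training baseline."""
    windows = [
        data[i:i + window_size]
        for i in range(len(data) - window_size + 1)
    ]
    return windows[:-test_windows], windows[-test_windows:]

# 10 readings, windows of 4 -> 7 overlapping windows: 5 train, 2 test.
train, test = split_windows(list(range(10)), window_size=4, test_windows=2)
```

Note that the schema's `minItems: 4` floor makes sense under this reading: at least one full window must exist before any split can happen.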
waveguard_scan
- Behavior 5/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It successfully discloses statelessness ('Fully stateless'), computational characteristics ('GPU-accelerated'), output format ('Returns per-sample anomaly scores, confidence levels, and the top features'), and scope flexibility ('Works on any data type'). Nothing in it conflicts with the absent annotations.
- Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Six sentences, zero waste. Front-loaded with purpose and method, followed by operational constraints (stateless, one call), output contract (since no output schema exists), scope, workflow simplification, and concrete example. Every sentence provides unique information not redundant with the schema.
- Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description comprehensively compensates by detailing return values (scores, confidence, feature explanations), covering all data types supported, explaining the stateless architecture, and providing usage examples. Sufficient for an agent to invoke correctly without additional context.
- Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description adds value by providing concrete example values ('3-5 normal readings as training') and emphasizing the critical workflow constraint that both arrays must be sent in 'ONE call', which adds semantic context to how training and test parameters relate.
- Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Opens with specific verb ('Detect') and resource ('anomalies in data'), includes implementation detail ('GPU-accelerated wave physics simulation') that distinguishes methodology, and explicitly notes universality ('Works on any data type') which differentiates it from the sibling waveguard_scan_timeseries.
- Usage Guidelines 5/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit workflow guidance ('send training data... and test data... in ONE call'), clarifies prerequisites ('No separate training step required'), includes concrete example ('to check if server metrics are anomalous...'), and explains the stateless nature which informs when to use this versus stateful alternatives.
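The "training + test in ONE call" pattern this tool's description emphasizes can be illustrated with a hypothetical request payload. The field names `training_data` and `test_data`, and the helper itself, are invented for illustration; they are not the tool's actual input schema.

```python
# Illustrative single-call payload for a stateless scan. Field names
# are assumptions, not the tool's real schema.

def build_scan_request(training_data, test_data):
    """Bundle the training baseline and the samples to test into one
    request, since the tool is stateless and has no separate training
    step: every call must carry both arrays."""
    return {
        "training_data": list(training_data),
        "test_data": list(test_data),
    }

# E.g., checking whether a CPU reading looks anomalous given a few
# normal readings as the baseline (the description suggests 3-5):
request = build_scan_request(
    training_data=[0.31, 0.28, 0.33, 0.30],
    test_data=[0.97],
)
```

The design point the description makes is that, unlike stateful anomaly services, no model survives between calls, so the baseline must be resent each time.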
GitHub Badge
Glama performs regular codebase and documentation scans to:
- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.
Our badge communicates server capabilities, safety, and installation instructions.
How to claim the server?
If you are the author of the server, you simply need to authenticate using GitHub.
However, if the MCP server belongs to an organization, you need to first add glama.json to the root of your repository.
{
"$schema": "https://glama.ai/mcp/schemas/server.json",
"maintainers": [
"your-github-username"
]
}
Then, authenticate using GitHub.
Browse examples.
How to make a release?
A "release" on Glama is not the same as a GitHub release. To create a Glama release:
- Claim the server if you haven't already.
- Go to the Dockerfile admin page, configure the build spec, and click Deploy.
- Once the build test succeeds, click Make Release, enter a version, and publish.
This process allows Glama to run security checks on your server and enables users to deploy it.
How to add a LICENSE?
Please follow the instructions in the GitHub documentation.
Once GitHub recognizes the license, the system will automatically detect it within a few hours.
If the license does not appear on the server after some time, you can manually trigger a new scan using the MCP server admin interface.
How to sync the server with GitHub?
Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.
To manually sync the server, click the "Sync Server" button in the MCP server admin interface.
How is the quality score calculated?
The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).
Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool receives a Tool Definition Quality Score (TDQS) from 1–5 across six weighted dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).
Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
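As a worked sketch, the formula above can be applied to the dimension scores reported earlier for this server's three tools and its four coherence scores. Rounding and aggregation details on Glama's side may differ, so treat this as an approximation of the pipeline rather than its exact implementation.

```python
# Sketch of the published scoring formula using this server's scores.
DIMENSION_WEIGHTS = {
    "purpose": 0.25, "usage": 0.20, "behavior": 0.20,
    "parameters": 0.15, "conciseness": 0.10, "completeness": 0.10,
}

def tool_tdqs(scores):
    """Weighted 1-5 Tool Definition Quality Score for one tool."""
    return sum(DIMENSION_WEIGHTS[dim] * s for dim, s in scores.items())

def overall_score(per_tool_scores, coherence_scores):
    tdqs = [tool_tdqs(s) for s in per_tool_scores]
    # 60% mean + 40% minimum: one weak tool drags the whole server down.
    tdq = 0.6 * (sum(tdqs) / len(tdqs)) + 0.4 * min(tdqs)
    coherence = sum(coherence_scores) / len(coherence_scores)
    return 0.7 * tdq + 0.3 * coherence

def tier(score):
    for cutoff, letter in ((3.5, "A"), (3.0, "B"), (2.0, "C"), (1.0, "D")):
        if score >= cutoff:
            return letter
    return "F"

tools = [
    # waveguard_health, waveguard_scan_timeseries, waveguard_scan
    {"purpose": 5, "usage": 4, "behavior": 3, "parameters": 4, "conciseness": 5, "completeness": 4},
    {"purpose": 5, "usage": 4, "behavior": 4, "parameters": 4, "conciseness": 5, "completeness": 4},
    {"purpose": 5, "usage": 5, "behavior": 5, "parameters": 4, "conciseness": 5, "completeness": 5},
]
coherence = [3, 4, 4, 4]  # disambiguation, naming, tool count, completeness
score = overall_score(tools, coherence)  # ~4.16 -> tier "A"
```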
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/gpartin/WaveGuardClient'
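The same endpoint can be called from Python using only the standard library. The URL pattern is taken from the curl example above; the response fields are not documented here, so nothing is assumed about the payload shape beyond it being JSON.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1"

def server_url(owner, server):
    """Build the directory URL for one server, per the curl example."""
    return f"{API_BASE}/servers/{owner}/{server}"

def fetch_server(owner, server):
    """GET the server record and parse the JSON response."""
    with urllib.request.urlopen(server_url(owner, server)) as resp:
        return json.load(resp)

# fetch_server("gpartin", "WaveGuardClient")  # requires network access
```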
If you have feedback or need assistance with the MCP directory API, please join our Discord server.