GoldenCheck

Auto-discover validation rules from data — scan, profile, health-score. No rules to write.

Server Details

- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: benzsevern/goldencheck
- GitHub Stars: 1
Tool Definition Quality
Average 3.6/5 across 19 of 19 tools scored. Lowest: 2.9/5.
Most tools have distinct purposes, but some overlap exists: 'profile' and 'scan' both analyze data files for quality issues, though 'profile' focuses on column-level statistics and health scores while 'scan' returns findings with severity and confidence. 'explain_column' and 'get_column_detail' could be confused as both provide column-specific information, but 'explain_column' offers a natural-language narrative whereas 'get_column_detail' gives detailed profile data. Overall, descriptions help differentiate them, but minor ambiguity remains.
Tool names follow a highly consistent verb_noun pattern throughout, such as 'analyze_data', 'approve_reject', 'auto_configure', and 'compare_domains'. All tools use snake_case uniformly, with clear verbs like 'analyze', 'get', 'list', and 'validate'. There are no deviations in naming conventions, making the set predictable and easy to understand.
With 19 tools, the count is slightly high but reasonable for a data quality and validation server covering scanning, profiling, domain management, review workflows, and configuration. Each tool appears to serve a specific function in this domain, though some consolidation might be possible (e.g., 'profile' and 'scan'). It's well-scoped for the apparent purpose without being excessive.
The tool set provides comprehensive coverage for data quality workflows: scanning and profiling ('scan', 'profile'), analysis and explanation ('analyze_data', 'explain_column'), domain management ('list_domains', 'install_domain'), review processes ('review_queue', 'approve_reject'), configuration ('auto_configure', 'validate'), and output generation ('pipeline_handoff'). There are no obvious gaps; tools support the full lifecycle from initial detection to validation and attestation.
Available Tools
19 tools

analyze_data (Grade B)
Analyze a data file to detect its domain, profile columns, and recommend a scanning strategy. Returns domain detection, column count, row count, strategy decisions, and alternative approaches.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file (CSV, Parquet, Excel) | |
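For orientation, here is a minimal sketch of what invoking this tool might look like over MCP's JSON-RPC transport. Only `file_path` is documented above; the envelope fields follow the standard MCP `tools/call` shape, and the example file path is invented:

```python
import json

# Hypothetical tools/call request for analyze_data. The only documented
# argument is file_path; "data/orders.csv" is an illustrative placeholder.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "analyze_data",
        "arguments": {"file_path": "data/orders.csv"},
    },
}

# Round-trip through JSON, as a client library would serialize it.
decoded = json.loads(json.dumps(request))
print(decoded["params"]["name"])
```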
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It documents return values (domain detection, counts, strategies) but fails to state whether the operation is read-only or destructive, or whether it has side effects on the source file. 'Analyze' implies safety, but the description should say so explicitly.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two tightly constructed sentences: first declaring the action (analyze/detection/recommendation), second detailing returns. No redundancy or filler. Information density is high with zero waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool without an output schema, the description adequately compensates by enumerating return values (column count, row count, strategy decisions). Lacks only safety/permission context, which is minor given the analytical nature but relevant given zero annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with file_path fully documented as accepting CSV, Parquet, or Excel files. The description mentions analyzing a 'data file' but adds no semantic context beyond the schema regarding path formats, accessibility requirements, or file size constraints. Baseline score applies.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool analyzes data files to detect domains, profile columns, and recommend scanning strategies. It specifies distinct outputs (domain detection, strategy decisions) that differentiate it from simple profiling or scanning, though it doesn't explicitly contrast with siblings like 'profile' or 'scan'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'profile' (which also profiles columns) or 'scan' (which executes scanning). It mentions 'alternative approaches' in the return value but doesn't clarify selection criteria or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
approve_reject (Grade C)
Approve (pin) or reject (dismiss) a review queue item. Decision must be 'pin' or 'dismiss'.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional reason for the decision | |
| item_id | Yes | Review item ID to update | |
| decision | Yes | Decision: 'pin' (approve) or 'dismiss' (reject) | |
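Since the 'pin'/'dismiss' enum is the one constraint the description repeats, a client can validate it before calling and avoid a wasted round trip. A sketch — the helper and its argument shape are hypothetical, not part of the server:

```python
VALID_DECISIONS = {"pin", "dismiss"}  # 'pin' approves, 'dismiss' rejects

def build_decision_args(item_id, decision, reason=None):
    """Assemble approve_reject arguments, rejecting invalid decisions early."""
    if decision not in VALID_DECISIONS:
        raise ValueError(f"decision must be one of {sorted(VALID_DECISIONS)}")
    args = {"item_id": item_id, "decision": decision}
    if reason is not None:  # reason is optional per the schema above
        args["reason"] = reason
    return args

print(build_decision_args("item-42", "pin", "matches policy"))
```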
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden, yet it fails to disclose critical behavioral traits: it does not state whether the operation is destructive or irreversible, or what state changes occur to the item after pinning or dismissing. It adds only the semantic mapping of enum values ('pin' = approve, 'dismiss' = reject), which is already present in the schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately brief with two sentences. The first effectively captures the purpose upfront. The second states a constraint that, while redundant with the enum, reinforces critical valid values for the LLM.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 3-parameter schema with no output schema, the description meets minimum viability by identifying the target resource and action. However, gaps remain in behavioral transparency and usage context that would help an agent understand side effects.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds no additional semantic meaning beyond the schema (e.g., no format details for item_id, no examples for reason, no workflow context).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool performs approval or rejection ('pin' or 'dismiss') actions on review queue items. However, it does not explicitly distinguish from the sibling 'review_queue' tool, which likely retrieves items rather than acting on them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives (e.g., after fetching items from 'review_queue'), nor does it explain when to choose 'pin' versus 'dismiss' or prerequisites for invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
auto_configure (Grade A)
Scan a data file, triage findings by confidence, and generate goldencheck.yml content from the pinned findings. Optionally accepts constraints to filter or adjust the generated config.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file | |
| constraints | No | Optional constraints: {min_confidence, severity_filter, include_columns, exclude_columns} | |
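The constraint keys are listed in the schema but their semantics are not. A plausible sketch of how such constraints could filter findings — the key names come from the table above, while the filtering logic itself is an assumption:

```python
def filter_findings(findings, constraints):
    """Keep only findings that pass every constraint present.

    Key names (min_confidence, severity_filter, include_columns,
    exclude_columns) are from the schema above; the exact semantics
    shown here are assumed, not documented behavior.
    """
    min_conf = constraints.get("min_confidence", 0.0)
    severities = constraints.get("severity_filter")
    include = constraints.get("include_columns")
    exclude = set(constraints.get("exclude_columns", []))
    kept = []
    for finding in findings:
        if finding["confidence"] < min_conf:
            continue
        if severities and finding["severity"] not in severities:
            continue
        if include and finding["column"] not in include:
            continue
        if finding["column"] in exclude:
            continue
        kept.append(finding)
    return kept

findings = [
    {"column": "email", "severity": "high", "confidence": 0.95},
    {"column": "zip", "severity": "low", "confidence": 0.40},
]
kept = filter_findings(findings, {"min_confidence": 0.6, "exclude_columns": ["ssn"]})
print([f["column"] for f in kept])
```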
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It mentions 'triage findings by confidence' and 'pinned findings,' hinting at a multi-stage workflow, but fails to explain what 'pinned findings' are, whether the tool writes to disk or returns content, or if it overwrites existing files.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficiently constructed sentences with zero waste. The first covers the core workflow (scan → triage → generate), and the second addresses the optional parameter, appropriately front-loading the most critical information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter tool but missing critical behavioral context due to absent annotations. Without an output schema, the description should clarify whether the goldencheck.yml content is returned in the response or written to a file, and should define the prerequisite 'pinned findings' concept.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While the schema has 100% coverage and documents the constraints object structure, the description adds valuable semantic context that constraints are used to 'filter or adjust the generated config,' clarifying their purpose beyond the raw schema definition.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans data, triages by confidence, and generates 'goldencheck.yml' content—specific verb and output format. However, it does not explicitly differentiate this from the sibling 'scan' tool, which could cause confusion since both involve scanning data files.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage through the specific output (goldencheck.yml generation), suggesting when to use this versus general analysis tools. However, it lacks explicit when/when-not guidance or named alternatives to clarify the workflow sequence (e.g., where 'pinned findings' originate).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
compare_domains (Grade A)
Scan a file with every available domain pack (plus base/no-domain) and compare health scores. Recommends the best-fitting domain.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file | |
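Whatever shape compare_domains actually returns, its recommendation step reduces to an argmax over per-domain health scores. A sketch assuming a flat domain-to-score map, which is an illustrative format, not the documented return value:

```python
def best_domain(scores):
    """Return the domain pack with the highest health score (0-100).

    The flat {domain: score} map is an assumption; compare_domains'
    actual return structure is not documented.
    """
    return max(scores, key=scores.get)

scores = {"base": 71, "healthcare": 88, "finance": 64}
print(best_domain(scores))
```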
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It explains functional behavior (scans all domains, compares) but omits operational traits: whether the operation is read-only or destructive, the computational cost of scanning 'every' domain, idempotency, and return format details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two well-constructed sentences with zero waste. Front-loaded with the action ('Scan... and compare') followed by the outcome ('Recommends...'), creating clear information hierarchy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a single-parameter tool, but lacks specification of return value structure since no output schema exists. Description states it 'recommends' a domain but doesn't clarify if it returns a ranked list, a single winner, or detailed comparison metrics.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with 'file_path' described as 'Path to the data file'. Description mentions 'Scan a file' which aligns with the parameter, but adds no additional semantics regarding file format expectations, path conventions, or constraints beyond the schema's baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verbs ('scan', 'compare', 'recommends') and resources ('domain pack', 'health scores') clearly define the function. Explicitly mentions 'every available domain pack' which effectively distinguishes it from sibling tools like 'scan' or 'health_score' that likely operate on single domains.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies usage context through 'Recommends the best-fitting domain' (use when you need to identify the optimal domain), but lacks explicit when-not guidance or direct comparison to alternatives like 'scan' or 'health_score' for single-domain analysis.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_column (Grade B)
Get a natural-language health narrative for a specific column. Scans the file, profiles the column, and explains all findings.
| Name | Required | Description | Default |
|---|---|---|---|
| column | Yes | Column name to explain | |
| file_path | Yes | Path to the data file | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries full burden. It discloses that the tool 'scans the file' (I/O operation) and 'profiles' (computation), but omits critical operational details: whether it is read-only, performance implications on large files, caching behavior, or what constitutes 'health' in this context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences with zero waste. Front-loaded with the core value proposition ('Get a natural-language health narrative'), followed by the implementation details. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter tool without output schema or annotations, the description adequately compensates by describing the return value conceptually ('health narrative'). Missing only safety/guidance context to be fully complete for an AI agent's decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description mentions 'specific column' and implies file targeting but does not add semantic constraints (e.g., supported file formats, case sensitivity for column names) or usage patterns beyond what the schema already provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a 'natural-language health narrative' and describes the process (scan, profile, explain). However, it lacks explicit differentiation from siblings like 'explain_finding' or 'profile', leaving ambiguity about when this narrative approach is preferred over technical profiling.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance provided on when to use this tool versus alternatives like 'get_column_detail' (for raw metadata), 'profile' (for statistics), or 'explain_finding' (for specific issues). The description states what it does but not when to choose it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_finding (Grade B)
Explain a single finding in natural language. Requires the finding as a JSON dict and the file_path to load a profile for context.
| Name | Required | Description | Default |
|---|---|---|---|
| finding | Yes | Finding dict with keys: severity, column, check, message, affected_rows, confidence, sample_values | |
| file_path | Yes | Path to the data file (needed for profile context) | |
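Because the finding dict's required keys are enumerated in the schema above, a caller can check completeness before invoking the tool. A client-side sketch using exactly those keys (the validation helper itself is hypothetical):

```python
# Keys taken verbatim from the 'finding' parameter description above.
REQUIRED_KEYS = {
    "severity", "column", "check", "message",
    "affected_rows", "confidence", "sample_values",
}

def validate_finding(finding):
    """Verify a finding dict carries every key the schema lists."""
    missing = REQUIRED_KEYS - finding.keys()
    if missing:
        raise ValueError(f"finding is missing keys: {sorted(missing)}")

finding = {
    "severity": "high", "column": "email", "check": "format",
    "message": "3 values fail email format", "affected_rows": 3,
    "confidence": 0.9, "sample_values": ["a@", "b@"],
}
validate_finding(finding)  # passes silently when complete
```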
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds valuable behavioral context not in schema: output is 'natural language' and it 'load[s] a profile for context'. However, with no annotations provided, it omits safety profile (read-only vs destructive), error handling, and side effects beyond profile loading.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, zero waste. Front-loaded with purpose ('Explain...'), followed by requirements. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequate for a 2-parameter tool with complete schema coverage, mentioning output format ('natural language') compensating partially for missing output schema. However, gaps remain regarding relationship to 'profile' concept and differentiation from similar explanation tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing baseline 3. Description reinforces that parameters are required but adds minimal semantic value beyond schema (repeats 'JSON dict' for object type and 'profile context' which schema already mentions).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Explain') and resource ('single finding') with output format ('natural language'), but lacks explicit differentiation from sibling 'explain_column' despite similar naming patterns.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States prerequisites ('Requires the finding... and file_path') but provides no guidance on when to select this tool versus siblings like 'explain_column' or 'suggest_fix', nor when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_column_detail (Grade B)
Get detailed profile and findings for a specific column.
| Name | Required | Description | Default |
|---|---|---|---|
| column | Yes | Column name to inspect | |
| file_path | Yes | Path to the data file | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It adds valuable context by specifying the return content ('profile and findings'), but omits operational details such as read-only status, caching behavior, or error conditions (e.g., file not found).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single efficient sentence with no filler words. It is front-loaded with the action and object. While extremely brief (9 words), it avoids the tautology trap and every word serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple 2-parameter input schema with complete coverage, the description meets minimum needs but has gaps. Notably, there is no output schema, and while 'profile and findings' hints at the return value, structural details are missing. It also omits sibling differentiation despite a crowded tool namespace.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema fully documents both parameters. The description mentions 'specific column' which aligns with the column parameter, but adds no semantic detail beyond what the schema already provides (e.g., no guidance on path formats or column naming conventions). Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get') and clearly identifies the resource ('detailed profile and findings for a specific column'). However, it fails to distinguish from similar sibling tools like 'explain_column', 'profile', or 'analyze_data', which could cause selection ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites (e.g., file must exist, column must be valid) or exclusions, leaving the agent without decision criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_domain_info (Grade A)
Get detailed info about a specific domain pack — lists all semantic types, their name hints, and suppression rules.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain pack name (e.g., healthcare, finance, ecommerce) | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses what data is returned (semantic types, hints, suppression rules), compensating for the missing output schema. However, it omits operational details such as error behavior when a domain is not found, caching, or permission requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, efficient sentence that front-loads the action ('Get detailed info'). Every clause earns its place by specifying either the resource or the detailed output content. No redundant or filler text is present.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one required string parameter) and lack of output schema, the description adequately explains what the tool returns. It could be improved by mentioning error conditions or authentication requirements, but the core functionality and output structure are sufficiently documented.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with a clear description and examples for the 'domain' parameter. The description references 'a specific domain pack' which aligns with the parameter, but does not add semantic meaning, validation rules, or format guidance beyond what the schema already provides. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get detailed info') and resource ('specific domain pack'), and distinguishes itself from siblings like list_domains or compare_domains by specifying the exact content returned ('semantic types, name hints, and suppression rules'). However, it does not explicitly name sibling tools to contrast against.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool by detailing what information it retrieves (semantic types and suppression rules), suggesting use when inspecting domain configuration details. However, it lacks explicit guidance on when NOT to use it or which sibling tools (like list_domains or compare_domains) serve different purposes.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
health_score (Grade A)
Get the health score (A-F, 0-100) for a data file. Quick summary of overall data quality.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file | |
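The description gives both scales (A-F and 0-100) without stating how they map onto each other. A plausible sketch with assumed, conventional grade cutoffs — the server's actual banding is not documented:

```python
def letter_grade(score):
    """Map a 0-100 health score onto an A-F grade.

    Both scales appear in the tool description above; the cutoffs
    here (90/80/70/60) are an assumption, not documented behavior.
    """
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return grade
    return "F"

print(letter_grade(83))
```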
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It discloses the return format (A-F, 0-100 scale) and characterizes the operation as a quick summary, but omits safety details (read-only status), error behaviors, and performance characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. First sentence delivers the core action and output format; second sentence provides the value proposition. Front-loaded and appropriately sized for the tool's simplicity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a single-parameter tool with no output schema, the description adequately covers the return value format and purpose. Missing annotations for safety/destructiveness slightly lowers the score, but the description compensates sufficiently for a simple getter tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage with file_path fully described. The description adds no explicit parameter guidance, meeting the baseline score of 3 for well-documented schemas where the description focuses on purpose rather than parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb ('Get') and resource ('health score'), with specific output format (A-F, 0-100). The phrase 'Quick summary' provides some differentiation from detailed analysis siblings like analyze_data or profile, though it doesn't explicitly name alternatives.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context ('Quick summary of overall data quality'), suggesting when to use it (for quick overviews), but lacks explicit when-not-to-use guidance or named alternatives from the sibling list.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
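The 0-100 score to A-F letter mapping the description promises can be sketched as below. The cutoffs are assumptions chosen for illustration; the server does not document its actual thresholds in this listing.

```python
def grade(score: int) -> str:
    # Illustrative thresholds only; the server's real A-F cutoffs
    # are not documented here.
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= cutoff:
            return letter
    return "F"
```
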
install_domain (grade: A)
Download a community domain pack from the goldencheck-types repository and save it for use in future scans.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | Domain pack name to install | |
| output_path | No | Output path (default: goldencheck_domain.yaml) | goldencheck_domain.yaml |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses the external source (goldencheck-types repository) and side effect (saving to filesystem). However, it omits critical behavioral details like idempotency, overwrite behavior, network requirements, or authentication needs for the download.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The single-sentence description is perfectly front-loaded with the action, contains zero redundancy, and every phrase ('community domain pack', 'goldencheck-types repository', 'future scans') adds essential context about scope, source, and intent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter tool with no output schema, the description adequately covers the core workflow. It appropriately explains the 'why' (future scans) and 'where from' (goldencheck-types). Minor gap regarding file overwrite behavior prevents a 5.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
While the schema has 100% coverage, the description adds valuable semantic context by specifying that the 'domain' parameter refers to a 'community domain pack' from a specific repository ('goldencheck-types'), enriching the agent's understanding beyond the schema's generic 'Domain pack name to install'.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs ('Download', 'save') and clearly identifies the resource (community domain pack from goldencheck-types repository) and its purpose (future scans). It effectively distinguishes this installation tool from sibling query tools like list_domains or get_domain_info.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides implied usage context ('for use in future scans'), suggesting when to use the tool in a workflow. However, it lacks explicit guidance on when not to use it, prerequisites (e.g., checking if already installed), or differentiation from siblings like list_domains.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_checks (grade: A)
List all available profiler checks and what they detect. No arguments needed.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full disclosure burden. Adds valuable context that results include detection details ('what they detect'), but omits safety characteristics (read-only status), return format structure, or performance considerations that annotations would typically cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste: first establishes purpose, second clarifies parameter requirements. Front-loaded with action and resource. No redundant or speculative content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Appropriate for a simple discovery tool with no parameters. Mentions content scope ('what they detect') compensating partially for missing output schema. Would benefit from indicating return type (list of check metadata), but sufficient for correct invocation given low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Zero parameters present (baseline 4). Description confirms the empty schema with explicit statement 'No arguments needed', validating the input requirements without redundancy.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Specific verb 'List' with clear resource 'profiler checks' and scope clarification 'what they detect'. Distinct from execution siblings like 'profile', 'scan', and 'analyze_data' through the list-vs-execute semantic distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States 'No arguments needed' providing invocation guidance, but lacks explicit selection guidance contrasting when to use this discovery tool versus execution alternatives like 'profile' or 'scan'. Usage relative to siblings is implied but not stated.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
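Since the tool takes no arguments, the whole call is an empty payload. A defensive wrapper (the call-envelope shape here is a generic sketch, not a verified wire format) can reject stray arguments before they reach the server:

```python
def build_list_checks_call(**kwargs) -> dict:
    # list_checks declares no parameters, so anything passed is a caller bug.
    if kwargs:
        raise TypeError(f"list_checks takes no arguments, got: {sorted(kwargs)}")
    return {"name": "list_checks", "arguments": {}}
```
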
list_domains (grade: A)
List all available domain packs (healthcare, finance, ecommerce, etc.). Domain packs provide specialized semantic type definitions for specific data domains.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It adds valuable context explaining that domain packs provide 'specialized semantic type definitions,' but omits behavioral details like pagination, caching, whether the operation is read-only, or what specific fields are returned for each domain pack.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. The first sentence front-loads the action and scope with helpful examples; the second sentence adds conceptual context about domain packs that aids agent understanding without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a parameter-less list tool with no output schema, the description adequately explains what is returned conceptually (domain packs) and their purpose. It could be improved by describing the return structure (e.g., array of names vs. metadata objects), but the examples and domain explanation provide sufficient context for tool selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema contains zero parameters. According to scoring guidelines, zero parameters establishes a baseline of 4. The description appropriately requires no parameter explanation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'List' with resource 'domain packs' and provides concrete examples (healthcare, finance, ecommerce). It clearly distinguishes from siblings like get_domain_info (singular retrieval) or install_domain (mutation) by emphasizing 'all available' and the catalog nature of the operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage through examples and by explaining what domain packs are (specialized semantic type definitions), suggesting use when browsing available domains. However, it lacks explicit when-to-use guidance versus siblings like compare_domains or get_domain_info, and doesn't state prerequisites like needing to call this before install_domain.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
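The assessment notes that an agent would sensibly call list_domains before install_domain. A sketch of that workflow, assuming a hypothetical client wrapper with a `call(tool_name, arguments)` method and assuming the response is a list of pack names (neither shape is documented in this listing):

```python
def install_if_available(client, wanted: str):
    # Hypothetical client interface: call(tool_name, arguments) -> result.
    packs = client.call("list_domains", {})  # assumed: returns pack names
    if wanted not in packs:
        raise ValueError(f"{wanted!r} is not an available domain pack")
    return client.call("install_domain", {"domain": wanted})
```
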
pipeline_handoff (grade: B)
Generate a structured quality attestation JSON for a data file. Includes health score, findings summary, pinned rules, and attestation status (PASS, PASS_WITH_WARNINGS, REVIEW_REQUIRED, FAIL).
| Name | Required | Description | Default |
|---|---|---|---|
| job_name | Yes | Job name for the handoff record | |
| file_path | Yes | Path to the data file | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full disclosure burden. It valuably enumerates the possible attestation statuses (PASS, PASS_WITH_WARNINGS, REVIEW_REQUIRED, FAIL) and output JSON structure, but fails to clarify side effects (does it persist to disk? create a record?), idempotency, or error behaviors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two efficient sentences with zero waste: first sentence establishes the core action and resource, second sentence details the output contents. Information is front-loaded and appropriately dense.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description compensates effectively by detailing the JSON structure and possible status values. With only two well-documented input parameters, the description provides adequate completeness for tool selection, though it could clarify the persistence model.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100% for both required parameters ('file_path' and 'job_name'), establishing a baseline of 3. The description mentions 'data file' which loosely maps to the file_path parameter, but does not add semantic context beyond what the schema already provides (e.g., acceptable file formats, path conventions, or job_name semantics).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'Generate[s] a structured quality attestation JSON for a data file' with specific outputs (health score, findings summary, pinned rules, attestation status). However, it does not explicitly differentiate from siblings like 'health_score' or 'validate', though the mention of 'attestation' format provides implicit distinction.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'health_score', 'analyze_data', 'validate', or 'scan'. It does not mention prerequisites, required states, or workflow positioning within the pipeline context implied by the tool name.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
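The four attestation statuses the description enumerates lend themselves to a pipeline gate. The pass/block policy below is illustrative; only the status values themselves come from the tool description:

```python
ATTESTATION_STATUSES = {"PASS", "PASS_WITH_WARNINGS", "REVIEW_REQUIRED", "FAIL"}

def should_proceed(status: str) -> bool:
    # Illustrative gate: warnings are acceptable, everything else blocks.
    if status not in ATTESTATION_STATUSES:
        raise ValueError(f"unknown attestation status: {status}")
    return status in {"PASS", "PASS_WITH_WARNINGS"}
```
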
profile (grade: B)
Profile a data file and return column-level statistics: type, null%, unique%, min/max, top values, detected formats. Also returns a health score (A-F) based on finding severity.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file | |
| sample_size | No | Max rows to sample (default 100000) | 100000 |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It implies read-only behavior via 'profile' and 'return statistics', and the sampling behavior is inferable from the sample_size parameter, but it lacks explicit safety guarantees, supported file formats, or performance characteristics (e.g., handling of large files).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is optimally concise at two sentences (30 words). It is front-loaded with the core action ('Profile a data file') and every clause conveys specific return values or data points without filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the absence of an output schema, the description adequately compensates by enumerating the specific statistics and health score format (A-F) returned. However, it omits the response structure (e.g., whether results are nested by column) and error scenarios.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description does not add parameter-specific semantics beyond what the schema already documents for 'file_path' and 'sample_size', nor does it clarify the interaction between sampling and the returned statistics.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool profiles data files and specifies the exact statistics returned (type, null%, unique%, min/max, etc.). However, it does not explicitly differentiate from sibling tools like 'analyze_data' or clarify when profiling is preferred over scanning or validation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'analyze_data', 'scan', or 'health_score' (which it notably also returns). It lacks prerequisites (e.g., file format requirements) or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
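Downstream use of the column statistics might look like the sketch below. The response shape is an assumption (a mapping of column name to a stats dict with a `null_pct` key); the assessment above notes the actual nesting is undocumented:

```python
def flag_sparse_columns(columns: dict, null_threshold: float = 50.0) -> list:
    # "columns" is assumed to map column name -> stats dict containing
    # a "null_pct" key; the real response structure is not documented here.
    return sorted(
        name for name, stats in columns.items()
        if stats["null_pct"] > null_threshold
    )
```
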
review_queue (grade: A)
List all pending review items for a given job. Returns items that need human decision (medium-confidence findings).
| Name | Required | Description | Default |
|---|---|---|---|
| job_name | Yes | Job name to filter review items | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden and successfully adds critical behavioral context: it specifies the returned items are 'medium-confidence findings' requiring 'human decision.' This domain-specific filtering logic is valuable behavioral disclosure. It could be enhanced with details on pagination, item locking, or empty result handling, but the core behavioral trait (confidence level) is well-documented.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero waste. The first sentence front-loads the action and resource ('List all pending review items'), while the second adds essential qualifying context ('medium-confidence findings'). Every word earns its place; no redundancy or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single required parameter, no output schema), the description is appropriately complete. It compensates for the missing output schema by explaining what gets returned (items needing human decision). For a list-retrieval tool, this level of detail is sufficient, though mentioning the relationship to 'approve_reject' would strengthen workflow completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema description coverage ('Job name to filter review items'), the baseline is 3. The description references 'for a given job,' which aligns with the job_name parameter, but does not add semantic nuance beyond the schema (e.g., whether job_name accepts wildcards, case sensitivity, or format constraints).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists pending review items with specific scope ('for a given job') and distinguishes this retrieval function from action-oriented siblings like 'approve_reject' through the verb choice 'List.' However, it does not explicitly differentiate from 'review_stats,' which likely returns aggregate data rather than individual items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies the use case through 'items that need human decision (medium-confidence findings),' suggesting when to invoke the tool (when seeking reviewable items). However, it lacks explicit guidance on when NOT to use it or which sibling tool (likely 'approve_reject') should handle the actual approval/rejection actions after retrieval.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
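The retrieve-then-decide workflow implied here (review_queue feeding approve_reject) can be sketched with a hypothetical client; the `id` field on each queue item is an assumption, since no output schema is published:

```python
def pending_item_ids(client, job_name: str) -> list:
    # Hypothetical client and response shape: each queue item is assumed
    # to carry an "id" that a later approve_reject call could reference.
    items = client.call("review_queue", {"job_name": job_name})
    return [item["id"] for item in items]
```
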
review_stats (grade: A)
Get review queue statistics for a job — counts of pending, pinned, and dismissed items.
| Name | Required | Description | Default |
|---|---|---|---|
| job_name | Yes | Job name to get stats for | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden. It successfully discloses what data is returned (counts of pending, pinned, and dismissed items), but omits safety characteristics (read-only nature), error behaviors, or rate limiting that would help an agent understand operational constraints.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single efficient sentence using an em-dash to separate the action from the specific return values. Every clause earns its place: first half defines the operation, second half specifies the statistical breakdown. Perfectly front-loaded and appropriately sized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple single-parameter tool without output schema, the description adequately compensates by enumerating what statistics are returned (pending, pinned, dismissed). However, it stops short of describing the return structure format or error conditions when a job is not found.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with the 'job_name' parameter fully described in the schema. The description references 'for a job' which aligns with the parameter, but adds no additional semantic context like format constraints, valid job name patterns, or examples. Baseline 3 is appropriate given complete schema documentation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verb 'Get' with resource 'review queue statistics' and scope 'for a job'. It effectively distinguishes from sibling tool 'review_queue' by emphasizing 'statistics' and 'counts' rather than the queue items themselves, clarifying this returns aggregated metrics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies this is for statistical overview (counts) versus retrieving actual items, it lacks explicit guidance on when to use this versus 'review_queue' or prerequisites like job existence. The usage is inferred from 'statistics' but not stated explicitly.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
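The three counts the description promises are enough to derive a review-progress ratio. The flat dict shape below is an assumption, since the return structure is undocumented:

```python
def review_progress(stats: dict) -> float:
    # stats assumed to be {"pending": int, "pinned": int, "dismissed": int},
    # matching the counts named in the description.
    decided = stats["pinned"] + stats["dismissed"]
    total = decided + stats["pending"]
    return decided / total if total else 1.0
```
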
scan (grade: A)
Scan a data file (CSV, Parquet, Excel) for data quality issues. Returns findings with severity, confidence, affected rows, and sample values. No configuration needed — rules are discovered from the data.
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file (CSV, Parquet, Excel) | |
| llm_boost | No | Enable LLM enhancement (requires API key env var) | |
| sample_size | No | Max rows to sample (default 100000) | |
| llm_provider | No | LLM provider: 'anthropic' or 'openai' | anthropic |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the burden of behavioral disclosure. It successfully describes return values (severity, confidence, affected rows) and the auto-discovery mechanism, but omits explicit safety traits (read-only status) or performance characteristics that annotations would typically cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three well-structured sentences with zero waste. Front-loaded with core action and supported formats, followed by return value description, and closing with key behavioral trait (auto-discovery). Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Compensates effectively for the missing output schema by detailing return structure (findings with severity, confidence, etc.). Covers the auto-discovery paradigm and supported formats. Deducted one point only for lack of explicit safety/destructiveness disclosure given zero annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, establishing a baseline of 3. The description mentions supported formats (CSV, Parquet, Excel) which adds context to file_path, but does not elaborate on llm_boost, sample_size semantics, or provide guidance beyond the schema descriptions for optional parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description provides specific verb (scan), resource (data file), and scope (data quality issues). It effectively distinguishes from siblings like 'validate' and 'list_checks' by emphasizing that 'rules are discovered from the data,' highlighting its zero-configuration auto-discovery approach.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides implicit usage context via 'No configuration needed,' suggesting use when automatic detection is preferred over manual rule setup. However, it lacks explicit comparison to siblings (e.g., 'use this instead of validate when you lack predefined rules') or specific prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
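An argument builder that enforces the table's constraints might look like this. Only the constraints shown in the table are encoded; the name of the required API-key environment variable is not documented, so it is not checked here:

```python
def scan_args(file_path, llm_boost=False, llm_provider="anthropic", sample_size=None):
    # llm_provider must be 'anthropic' or 'openai' per the parameter table;
    # llm_boost additionally requires an API-key env var (name undocumented).
    if llm_provider not in ("anthropic", "openai"):
        raise ValueError(f"unsupported llm_provider: {llm_provider}")
    args = {"file_path": file_path, "llm_boost": llm_boost, "llm_provider": llm_provider}
    if sample_size is not None:  # server default is 100000 rows
        args["sample_size"] = sample_size
    return args
```
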
suggest_fix (grade: A)
Preview fixes for a data file without applying them. Shows what would change (columns, fix types, rows affected, before/after samples).
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Fix mode: 'safe' (default) or 'aggressive' | safe |
| file_path | Yes | Path to the data file | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full behavioral disclosure burden. It successfully communicates the non-destructive 'preview' nature and details the specific preview content (columns, fix types, rows affected, before/after samples). However, it omits error handling, authentication requirements, or performance characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two highly efficient sentences with zero redundancy. The first sentence establishes the core function, and the second elaborates on the preview content. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a 2-parameter tool with simple types and no output schema, the description adequately covers purpose and behavior. However, given the absence of an output schema, it could benefit from a brief note on the return format (e.g., 'returns a preview object' or similar) to complete the agent's understanding of the tool's output.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, providing clear documentation for both 'file_path' and 'mode' parameters. The description mentions 'fix types' which loosely contextualizes the 'mode' enum (safe/aggressive), but does not add specific parameter guidance beyond what the schema already provides. Baseline 3 is appropriate for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'Preview fixes for a data file' with the specific verb 'Preview' and resource 'data file'. The phrase 'without applying them' effectively distinguishes this as a dry-run/suggestion tool, differentiating it from execution-oriented siblings like 'approve_reject' or 'auto_configure'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
While the description implies a workflow (preview before applying), it lacks explicit guidance on when to use this tool versus analytical alternatives like 'analyze_data' or 'validate'. No 'when-not-to-use' criteria or explicit sibling comparisons are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
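The mode enum from the table is the only constraint a caller must respect. A minimal, hedged builder:

```python
def suggest_fix_args(file_path: str, mode: str = "safe") -> dict:
    # mode is 'safe' (the default) or 'aggressive' per the parameter table;
    # the semantic difference between the two modes is not documented here.
    if mode not in ("safe", "aggressive"):
        raise ValueError(f"invalid mode: {mode}")
    return {"file_path": file_path, "mode": mode}
```
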
validate (grade: B)
Validate a data file against pinned rules in goldencheck.yml. Returns validation findings (existence, required, unique, enum, range checks).
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the data file | |
| config_path | No | Path to goldencheck.yml (default: ./goldencheck.yml) | goldencheck.yml |
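The validate tool reads pinned rules from goldencheck.yml. The file's actual schema is not documented on this page; purely as a hypothetical sketch, a config covering the five check types the description enumerates (existence, required, unique, enum, range) might look like this — every field name below is illustrative, not the real format:

```yaml
# Hypothetical goldencheck.yml — field names are illustrative only.
rules:
  order_id:
    exists: true          # existence: column must be present in the file
    required: true        # required: no null/missing values
    unique: true          # unique: no duplicate values
  status:
    enum: [pending, shipped, delivered]   # enum: allowed values
  quantity:
    range: {min: 1, max: 1000}            # range: numeric bounds
```

A file like this would live at ./goldencheck.yml by default, matching the `config_path` parameter's documented default.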
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully enumerates the specific validation checks performed (existence, required, unique, enum, range), but fails to explicitly state whether the operation is read-only or destructive, or disclose any side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description consists of two dense, well-structured sentences with zero redundancy. The first sentence establishes the operation and target, while the second details the return value, front-loading the most critical information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 parameters, no nested objects) and lack of output schema, the description adequately covers the validation types returned. However, it lacks necessary safety disclosure (read-only vs. destructive) that would normally be provided by annotations, leaving a gap for agent decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, establishing a baseline of 3. The description adds semantic context by referring to 'pinned rules' in goldencheck.yml, which reinforces the purpose of config_path, but does not elaborate on parameter formats, constraints, or interaction between parameters beyond the schema definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the core action ('Validate a data file') and the specific mechanism ('against pinned rules in goldencheck.yml'), distinguishing it from generic validation or analysis siblings like 'scan' or 'analyze_data'. However, it does not explicitly contrast its use case with these similar siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to select this tool versus similar siblings like 'scan', 'analyze_data', or 'profile', nor does it mention prerequisites (e.g., goldencheck.yml must exist) or when to avoid using it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
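Before publishing, it can help to sanity-check the manifest locally. The sketch below is an unofficial pre-flight check based only on the structure shown above; Glama's actual verification may check more than this (in particular, that the email matches your account).

```python
import json

def check_glama_manifest(doc: dict) -> list[str]:
    """Return a list of problems with a /.well-known/glama.json document.

    Unofficial check mirroring the example structure above; not Glama's
    real validator.
    """
    problems = []
    if doc.get("$schema") != "https://glama.ai/mcp/schemas/connector.json":
        problems.append("unexpected or missing $schema")
    maintainers = doc.get("maintainers")
    if not isinstance(maintainers, list) or not maintainers:
        problems.append("maintainers must be a non-empty list")
    else:
        for m in maintainers:
            if not isinstance(m, dict) or "@" not in m.get("email", ""):
                problems.append(f"invalid maintainer entry: {m!r}")
    return problems

manifest = json.loads("""
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
""")
print(check_glama_manifest(manifest))  # → []
```

An empty list means the document matches the published structure; any string in the list names a field to fix before serving the file from your domain.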
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.