governance-platform

by ai.geodesiclabs

Server Details

Deterministic AI governance platform. Validates agent outputs, discovers patterns, solves math.

Status: Healthy
Last Tested: 2026-05-20 16:31
Transport: Streamable HTTP
Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A (3.7/5.0)

Tool Descriptions: A

Average 4.2/5 across 31 of 31 tools scored. Lowest: 3/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have clearly distinct purposes with detailed descriptions that explain their specific use cases. There is slight overlap between validate, validate_and_repair, and get_execution_trace, but descriptions clarify the differences. Overall, an agent can reliably distinguish between tools.

Naming Consistency: 4/5

The majority of tools follow a verb_noun pattern (e.g., create_blueprint, list_api_keys). A few tools use noun phrases (account_status, structural_types) or are compound verbs (validate_and_repair), but the pattern is largely consistent and predictable.

Tool Count: 3/5

With 31 tools, the server covers a wide range of governance operations, including blueprint management, validation, repair, anomaly detection, key management, and chain execution. This is on the higher end but still within reason for a complex platform. Some functionality could be split into separate servers.

Completeness: 4/5

The tool surface covers the full lifecycle of governance—creating and managing blueprints, validating, repairing, discovering patterns, checking anomalies and drift, and auditing. There are minor gaps, such as chain status retrieval, but overall it is thorough for the intended domain.

Available Tools

31 tools

account_status (grade A)

Report the calling account's plan, key usage, and limits.

Use this to introspect what the caller is allowed to do. Agents that hit
rate limits or key-count caps can call this to explain the limit to the
human and suggest upgrading if needed.

Args:
    api_key: GeodesicAI API key (starts with gai_)

Returns:
    plan: The user's current plan — one of pilot, trial, tier1, tier2, beta, enterprise
    plan_label: Human-readable plan name (e.g. "Personal Access", "Small Business")
    account_key_count: Number of account-level API keys currently issued
    account_key_limit: Maximum account keys allowed on this plan
    blueprint_count: Number of Blueprints owned by this user
    blueprint_limit: Maximum Blueprints allowed on this plan
    email: The user's email address (for reference in support)
    user_id: Stable user identifier
    trial_days_remaining: Days left on trial, if plan == "trial"; else null
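As a rough sketch of how an agent might turn these return fields into an explanation for the human, the snippet below computes remaining headroom from the documented count and limit fields. The sample response is hypothetical (every value is invented), and the helper names `limit_headroom` and `explain_limits` are illustrative, not part of the platform.

```python
from typing import Any


def limit_headroom(status: dict[str, Any]) -> dict[str, int]:
    """Remaining account keys and Blueprints, using the documented return fields."""
    return {
        "account_keys": status["account_key_limit"] - status["account_key_count"],
        "blueprints": status["blueprint_limit"] - status["blueprint_count"],
    }


def explain_limits(status: dict[str, Any]) -> str:
    """Build a short message an agent could relay to the human."""
    exhausted = [name for name, left in limit_headroom(status).items() if left <= 0]
    if not exhausted:
        return f"Plan '{status['plan_label']}' is within its limits."
    message = f"Plan '{status['plan_label']}' has reached its limit for: {', '.join(exhausted)}."
    if status.get("trial_days_remaining") is not None:
        message += f" Trial days remaining: {status['trial_days_remaining']}."
    return message + " Consider upgrading."


# Hypothetical response: every value below is invented for illustration.
sample_status = {
    "plan": "trial",
    "plan_label": "Trial",
    "account_key_count": 1,
    "account_key_limit": 2,
    "blueprint_count": 3,
    "blueprint_limit": 3,
    "email": "user@example.com",
    "user_id": "user_123",
    "trial_days_remaining": 5,
}

print(explain_limits(sample_status))
```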

Parameters

Name	Required	Description	Default
`api_key`	Yes

Tool Definition Quality

A (4.7/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and effectively discloses behavioral traits: it describes a read-only introspection operation (implied by 'report'), specifies the context for usage (rate limits, key caps), and outlines the return structure with detailed field explanations, though it doesn't mention authentication requirements beyond the api_key parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with purpose and usage, followed by clear sections for Args and Returns. Every sentence adds value without redundancy, making it efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (account introspection with multiple return fields), no annotations, and no output schema, the description is complete: it covers purpose, usage guidelines, parameter details, and a comprehensive return value breakdown, leaving no gaps for the agent to understand and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds significant meaning beyond the input schema, which has 0% coverage. It explains the api_key parameter ('GeodesicAI API key (starts with gai_)'), providing format details not in the schema, and thoroughly documents all return fields with semantics, compensating for the lack of output schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('report', 'introspect') and resources ('account's plan, key usage, and limits'), distinguishing it from sibling tools that focus on analysis, creation, validation, or repair operations rather than account introspection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance is provided on when to use this tool ('to introspect what the caller is allowed to do', 'Agents that hit rate limits or key-count caps can call this to explain the limit to the human and suggest upgrading if needed'), including specific scenarios and alternative actions (suggesting upgrades).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_anomaly (grade A)

Deep anomaly analysis with geometric proof. No Blueprint required.

Explains WHY data is anomalous using three independent methods:
1. Structural fingerprinting — distance from the learned manifold
2. Cluster analysis — deviation from structural type centroids
3. Twist-compression obstruction — fundamental constraint conflicts

Returns a human-readable geometric proof of anomaly, not just a flag.

Args:
    structured_data: The data to analyze for anomalies
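For orientation, here is a minimal sketch of the request an MCP client would send to invoke this tool. MCP tool invocations are JSON-RPC 2.0 messages using the `tools/call` method; the helper name `build_tool_call`, the placeholder key, and the payload fields are assumptions made up for illustration. Note that the parameter table lists `api_key` as required even though the Args section omits it.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


request = build_tool_call(
    1,
    "analyze_anomaly",
    {
        "api_key": "gai_EXAMPLE",  # placeholder, not a real key
        # Invented payload; the expected shape of structured_data is not documented.
        "structured_data": {"invoice_total": 120.0, "line_items": 3},
    },
)
print(request)
```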

Parameters

Name	Required	Description	Default
`api_key`	Yes
`structured_data`	Yes

Tool Definition Quality

A (3.8/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and succeeds in disclosing the three specific analytical methods employed (structural fingerprinting, cluster analysis, twist-compression obstruction) and the output format (a human-readable geometric proof). Minor gap: it does not explicitly confirm read-only safety or the computational cost of 'deep' analysis.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear front-loading of purpose ('Deep anomaly analysis with geometric proof'), followed by methodology enumeration and return value description. The 'Args:' section is somewhat abrupt and isolated from the narrative flow, slightly disrupting readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema exists, the description adequately explains return values (human-readable geometric proof). Addresses the complex analytical nature of the tool. Minor deduction for failing to mention authentication requirements (api_key) or expected structure of the nested structured_data input.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, requiring the description to fully compensate. It documents 'structured_data' in the Args section but completely omits 'api_key', leaving 50% of parameters undocumented. The nested nature of structured_data (additionalProperties: true) is also not explained.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (deep anomaly analysis) with distinctive methodology (geometric proof) and output format. Explicitly distinguishes from simpler flagging tools via 'not just a flag' and differentiates from blueprint-dependent siblings with 'No Blueprint required.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage context (when you need explanatory proof vs. simple detection) and notes prerequisite absence (no Blueprint). However, lacks explicit comparison to similar siblings like check_drift, validate, or check_feasibility regarding when to prefer this deeper analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

approve_rule (grade A)

Promote a discovered rule into Blueprint-compatible format.

After running discover_patterns, use this to approve high-confidence
rules for inclusion in a Blueprint.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    rule_id: ID of the discovered rule (from discover_patterns results)
    blueprint: Discovery session namespace
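The discover-then-approve workflow could be sketched as follows. The shape of the discover_patterns results is not documented here, so the `rule_id` and `confidence` keys, the 0.9 threshold, and the helper name `approval_calls` are all assumptions; the snippet only builds the argument sets for approve_rule and does not call the server.

```python
def approval_calls(discovered, api_key, blueprint="default", min_confidence=0.9):
    """Build approve_rule arguments for rules at or above a confidence threshold."""
    return [
        {"api_key": api_key, "rule_id": rule["rule_id"], "blueprint": blueprint}
        for rule in discovered
        if rule["confidence"] >= min_confidence
    ]


# Hypothetical discover_patterns output; keys and values are invented.
discovered_rules = [
    {"rule_id": "r1", "confidence": 0.97},
    {"rule_id": "r2", "confidence": 0.62},
    {"rule_id": "r3", "confidence": 0.91},
]

calls = approval_calls(discovered_rules, api_key="gai_EXAMPLE")
print([call["rule_id"] for call in calls])
```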

Parameters

Name	Required	Default
`api_key`	Yes
`rule_id`	Yes
`blueprint`	No	default

Tool Definition Quality

A (4.3/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It adds valuable context about format conversion ('Blueprint-compatible format') and the side effect ('inclusion in a Blueprint'). However, it lacks disclosure on idempotency, reversibility, or what happens if the rule is already approved.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with clear purpose statement, followed by workflow context, then structured Args documentation. No redundant or filler text; every sentence provides actionable information for tool selection and invocation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero annotations and no output schema, the description successfully documents the parameter semantics and workflow integration. Minor gap: does not describe return values or success/failure behavior, which would help the agent handle the response.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the Args section fully compensates by documenting all 3 parameters with rich semantics: api_key includes format hint ('starts with gai_'), rule_id specifies provenance ('from discover_patterns results'), and blueprint clarifies concept ('Discovery session namespace').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action ('Promote into Blueprint-compatible format') and resource ('discovered rule'). Clearly positions this as an approval action that follows pattern discovery, implicitly distinguishing it from sibling 'reject_rule' and fitting it into the 'discover_patterns' workflow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('After running discover_patterns') and selection criteria ('high-confidence rules'). References the prerequisite sibling tool. Lacks explicit mention of 'reject_rule' as the alternative for low-confidence rules, though this is implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

authorize_execution (grade A)

Decide whether an action should be allowed to proceed.

Runs full validation, then applies the Blueprint's execution gate.
Returns a simple allow/block decision with reasoning.

Use this when your agent is about to take a real-world action (payment,
filing, API call, data write) and needs a deterministic go/no-go.

Different from validate: validate says "is this data correct?"
authorize_execution says "should this action happen?"

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data associated with the action
    blueprint: Blueprint governing this action type
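One way an agent could wire this go/no-go into its own code is a fail-closed wrapper, sketched below. The response keys `decision` and `reasoning` are assumptions (the description only promises an allow/block decision with reasoning), and `fake_authorize` is a local stand-in for the real tool call with an invented threshold.

```python
from typing import Any, Callable


class ActionBlocked(Exception):
    """Raised when the gate does not return an explicit allow."""


def run_if_authorized(
    authorize: Callable[..., dict[str, Any]],
    action: Callable[[dict[str, Any]], Any],
    structured_data: dict[str, Any],
    blueprint: str,
) -> Any:
    verdict = authorize(structured_data=structured_data, blueprint=blueprint)
    # Fail closed: anything other than an explicit "allow" blocks the action.
    if verdict.get("decision") != "allow":
        raise ActionBlocked(verdict.get("reasoning", "no reasoning returned"))
    return action(structured_data)


def fake_authorize(structured_data, blueprint):
    """Stand-in for the real tool call; the 1000 threshold is invented."""
    if structured_data["amount"] > 1000:
        return {"decision": "block", "reasoning": "amount exceeds gate"}
    return {"decision": "allow", "reasoning": "within gate"}


def send_payment(data):
    return f"paid {data['amount']}"


print(run_if_authorized(fake_authorize, send_payment, {"amount": 250}, "payments"))
```

Treating a missing or unexpected decision as a block keeps a malformed response from silently letting a real-world action through.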

Parameters

Name	Required	Description	Default
`api_key`	Yes
`blueprint`	Yes
`structured_data`	Yes

Tool Definition Quality

A (4.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses internal behavior ('Runs full validation, then applies the Blueprint's execution gate') and output format ('Returns a simple allow/block decision with reasoning'). Minor gap: does not clarify if calling this tool has side effects (e.g., logging the decision) or if it is idempotent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: purpose statement, behavioral details, usage context with concrete examples, sibling differentiation, and parameter documentation. Every sentence provides distinct value; no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Strong completeness given constraints (no annotations, no output schema, 0% schema coverage). Describes return value ('allow/block decision with reasoning') despite lack of output schema. Could improve by mentioning error conditions (e.g., invalid blueprint) or whether the decision is recorded.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, requiring full compensation. The 'Args' section documents all three parameters: api_key includes format hint ('starts with gai_'), structured_data explains it contains 'the data associated with the action', and blueprint notes it 'governs this action type'. Deducted one point because structured_data could clarify expected schema/shape.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific action ('Decide whether an action should be allowed to proceed'), identifies the resource (Blueprint's execution gate), and explicitly distinguishes from sibling tool 'validate' ('Different from validate: validate says 'is this data correct?' authorize_execution says 'should this action happen?'').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('when your agent is about to take a real-world action (payment, filing, API call, data write) and needs a deterministic go/no-go') and provides clear differentiation from the 'validate' alternative, framing the decision as validation vs. authorization.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_drift (grade A)

Check whether the data pattern has shifted since previous observations.

Works with or without a Blueprint. Monitors structural stability across
a stream of data. Detects regime changes when the data's geometric
embedding moves to a different region of the constraint space —
indicating the source data's structure has fundamentally changed.

Call this repeatedly as new data arrives to track drift over time.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: New data point to check for drift
    blueprint: Blueprint for geometric embedding context
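The "call this repeatedly" pattern might look like the loop below. The return format of check_drift is not documented, so the `drift_detected` key is an assumption, and `FakeDriftChecker` is a toy stand-in (it flags a value above three times the running mean) that says nothing about the platform's actual geometric method.

```python
from typing import Any, Callable, Iterable, Optional


def first_drift_index(
    check_drift: Callable[..., dict[str, Any]],
    stream: Iterable[dict[str, Any]],
    blueprint: str = "default",
) -> Optional[int]:
    """Feed data points one at a time; return the index of the first reported drift."""
    for index, point in enumerate(stream):
        result = check_drift(structured_data=point, blueprint=blueprint)
        if result.get("drift_detected"):
            return index
    return None


class FakeDriftChecker:
    """Toy stand-in that keeps state across calls, as a drift monitor must."""

    def __init__(self) -> None:
        self.history: list[float] = []

    def __call__(self, structured_data, blueprint):
        value = structured_data["amount"]
        mean = sum(self.history) / len(self.history) if self.history else None
        drifted = mean is not None and value > 3 * mean
        self.history.append(value)
        return {"drift_detected": drifted}


stream = [{"amount": a} for a in (10, 11, 9, 50)]
print(first_drift_index(FakeDriftChecker(), stream))
```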

Parameters

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`structured_data`	Yes

Tool Definition Quality

A (3.7/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully explains what constitutes drift ('geometric embedding moves to a different region'), but fails to disclose operational traits: whether the tool is read-only, idempotent, has side effects, or what errors might occur. The repeated call suggestion implies safety but doesn't explicitly state it.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with purpose front-loaded, conceptual explanation in the middle, and parameter documentation at the end. The geometric embedding explanation, while technical, earns its place by defining the detection mechanism. Only minor verbosity prevents a 5.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description should explain return values (e.g., drift detected boolean, confidence score), but it doesn't. While parameters are well-covered, the absence of output documentation and error handling details leaves a significant gap for a monitoring tool intended to be called repeatedly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the Args section comprehensively documents all three parameters. It adds crucial semantic context: api_key format ('starts with gai_'), structured_data's role ('New data point'), and blueprint's purpose ('geometric embedding context'). This fully compensates for the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks for data pattern shifts and specifically mentions 'since previous observations' and 'track drift over time,' which distinguishes it from point-in-time analysis tools like analyze_anomaly. It defines the specific domain (geometric embedding, constraint space) though could more explicitly contrast with siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance ('Call this repeatedly as new data arrives') and notes flexibility ('Works with or without a Blueprint'). However, it lacks explicit 'when not to use' guidance or named alternatives, leaving the agent to infer when to prefer this over analyze_anomaly or validate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_feasibility (grade A)

Check whether a set of constraints can be simultaneously satisfied.

Uses the twist-compression operator to detect structural obstructions —
fundamental conflicts in the constraint system that cannot be resolved
by adjusting any single field. Returns obstruction type, magnitude,
and the specific constraint interactions causing the conflict.

Fast pre-check before running full validation.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    constraints: List of derivation rules and formal constraints to check
    field_values: Current field values (numeric key-value pairs)
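Since field_values is described as numeric key-value pairs, a caller may want a local sanity check before sending the request. The sketch below is one possible pre-flight check; the helper name `non_numeric_fields` and the sample fields are invented, and whether the server accepts booleans or numeric strings is not documented, so this check conservatively rejects both.

```python
from numbers import Real


def non_numeric_fields(field_values: dict) -> list[str]:
    """Names of fields whose values are not plain numbers (bools are rejected too)."""
    return sorted(
        name
        for name, value in field_values.items()
        if isinstance(value, bool) or not isinstance(value, Real)
    )


# Invented example fields.
candidate = {"subtotal": 100.0, "tax": 8, "total": "108", "paid": True}
print(non_numeric_fields(candidate))
```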

Parameters

Name	Required	Description	Default
`api_key`	Yes
`constraints`	Yes
`field_values`	Yes

Tool Definition Quality

A (4.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully explains the detection algorithm ('twist-compression operator'), what constitutes a conflict ('fundamental conflicts...cannot be resolved by adjusting any single field'), and the return structure ('obstruction type, magnitude, and specific constraint interactions'). It lacks operational details like rate limits or error states, but covers the core behavioral logic well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with zero waste: a clear purpose statement, technical method explanation, return value specification, and usage context—all in four sentences. The Args section follows as structured documentation. Every sentence earns its place by adding distinct information not present in the schema or title.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool with no output schema and no annotations, the description achieves completeness by documenting all inputs (via Args), explaining the algorithmic behavior, and describing the conceptual return values ('obstruction type, magnitude') despite the lack of formal output schema. It provides sufficient information for an agent to invoke the tool correctly and interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the Args section provides critical compensatory documentation: it adds format constraints ('starts with gai_'), semantic typing ('derivation rules and formal constraints'), and value constraints ('numeric key-value pairs') that are completely absent from the schema. This is exemplary parameter documentation for low-coverage schemas.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb-resource pair ('Check whether a set of constraints can be simultaneously satisfied') and distinguishes itself from siblings like 'validate' and 'validate_and_repair' by positioning this as a 'Fast pre-check before running full validation.' The technical specificity ('twist-compression operator', 'structural obstructions') further clarifies its unique role in the toolset.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear temporal context ('Fast pre-check before running full validation') indicating when to use it in a workflow. However, while it implies the existence of full validation alternatives, it does not explicitly name sibling tools like 'validate' or describe when NOT to use this tool (e.g., when you need actual solutions rather than feasibility checks).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_realization (grade A)

Run structural realization analysis on a payload.

Embeds the payload via the Blueprint's declared embedding schema,
projects it onto the Blueprint's reference subspace, and returns
a realization score, residual, projection angle, and full report.

The Blueprint must include a `realization` configuration block
(see RealizationConfig in Platform_Agent.realization.schema). If
no realization config is present, the report status is "skipped"
and the payload is treated as unconstrained by the realization
layer.

For Blueprints using basis mode "auto", the first N payloads
bootstrap the reference subspace; until the bootstrap pool is
full, the report status is "skipped". After bootstrap, every
subsequent payload is projected against the locked subspace and
receives a real score.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: payload to analyze
    blueprint: Blueprint name (defaults to "default")

Returns:
    dict with keys:
      status:              "pass" | "review" | "skipped"
      realization_score:   float in [0, 1], higher = better fit
      residual:            ||v - P_U(v)||
      angle_degrees:       angle between v and P_U(v)
      in_subspace:         bool — residual < tolerance
      basis_mode:          "vectors" | "auto" | "uninitialized"
      basis_dimension:     k of the reference subspace
      vector_dimension:    D of the embedded payload
      report:              full RealizationReport dict (may include
                           invariance/stability stacks if enabled)
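To make the geometric fields concrete, the sketch below computes the residual ||v - P_U(v)|| and the angle between v and P_U(v) for a small example. It assumes the reference subspace U is given by an orthonormal basis and uses an arbitrary tolerance of 1e-9; the platform's embedding, tolerance, and the formula behind realization_score are not documented here, so the score is deliberately not computed.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(v, basis):
    """Orthogonal projection P_U(v), assuming `basis` is orthonormal."""
    p = [0.0] * len(v)
    for u in basis:
        coeff = dot(v, u)
        p = [p_i + coeff * u_i for p_i, u_i in zip(p, u)]
    return p


def residual_and_angle(v, basis, tolerance=1e-9):
    p = project(v, basis)
    residual = math.sqrt(sum((v_i - p_i) ** 2 for v_i, p_i in zip(v, p)))
    norm_p = math.sqrt(dot(p, p))
    # v = P_U(v) + r with r orthogonal to U, so tan(angle) = ||r|| / ||P_U(v)||.
    angle = math.degrees(math.atan2(residual, norm_p))
    return {
        "residual": residual,
        "angle_degrees": angle,
        "in_subspace": residual < tolerance,
        "basis_dimension": len(basis),
        "vector_dimension": len(v),
    }


# U is the xy-plane in 3D; v sits halfway between the plane and the z-axis.
xy_plane = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
report = residual_and_angle([1.0, 0.0, 1.0], xy_plane)
print(report)
```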

Parameters

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`structured_data`	Yes

Tool Definition Quality

A (4.6/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully explains the algorithm (embed, project, return scores), edge cases (bootstrap, missing config), and return values. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-organized with clear sections and bullet points for return values. It is slightly verbose but front-loads the main purpose and provides necessary detail efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description comprehensively addresses the tool's complexity: 3 parameters (including nested objects), no output schema, and no annotations. It explains all return fields, edge cases, and the projection algorithm, leaving no critical gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description adds meaning to each parameter: api_key format, structured_data as payload, blueprint default. It lacks details on structured_data structure but provides sufficient context for usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it runs structural realization analysis on a payload and details the process and outputs. It is distinct from siblings like check_drift or check_feasibility by focusing on subspace projection and realization scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains conditions under which results are skipped (no realization config, bootstrap mode), providing context for when the tool will return meaningful results. However, it does not explicitly compare to alternatives or state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

counterfactual (grade A)

Compare outcomes under different rule sets.

Given the same data, runs trajectory analysis under two different
sets of rules/constraints and shows how the valid state space differs.
Useful for what-if analysis: "what happens if I change this rule?"

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: Data to analyze
    blueprint: Primary Blueprint (rule set A)
    rules_b: Alternative derivation rules (rule set B)
    constraints_b: Alternative constraints (rule set B)
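A minimal sketch of how a client might package these arguments into an MCP `tools/call` request. The shapes of `rules_b` and `constraints_b` are assumptions borrowed from the rule and constraint types listed under create_blueprint; the key names `target`, `a`, and `b`, the Blueprint name, and the API key are placeholders, not confirmed by the tool description.

```python
import json


def build_counterfactual_request(api_key, structured_data, blueprint="default",
                                 rules_b=None, constraints_b=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the counterfactual tool."""
    if not api_key.startswith("gai_"):
        raise ValueError("api_key is expected to start with 'gai_'")
    arguments = {
        "api_key": api_key,
        "structured_data": structured_data,
        "blueprint": blueprint,  # rule set A
    }
    # Rule set B is optional; only send what the caller supplied.
    if rules_b is not None:
        arguments["rules_b"] = rules_b
    if constraints_b is not None:
        arguments["constraints_b"] = constraints_b
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "counterfactual", "arguments": arguments},
    }


request = build_counterfactual_request(
    api_key="gai_example_placeholder",
    structured_data={"qty": 4, "unit_cost": 2.5, "subtotal": 10.0},
    blueprint="invoice_v1",  # hypothetical Blueprint name
    # Hypothetical rule shape: the key names are assumed for illustration.
    rules_b=[{"type": "multiply", "target": "subtotal", "a": "qty", "b": "unit_cost"}],
    constraints_b=[{"type": "magnitude_anchor", "field": "subtotal", "min": 0, "max": 1000}],
)
print(json.dumps(request, indent=2))
```

Omitting `rules_b` and `constraints_b` leaves them out of the request entirely, which matches their optional status in the parameter table.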

Parameters (JSON Schema)

Name	Required	Default
`api_key`	Yes
`rules_b`	No
`blueprint`	No	default
`constraints_b`	No
`structured_data`	Yes

Tool Definition Quality

A3.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that the tool performs 'trajectory analysis' and compares 'valid state space,' which hints at the computational nature of the operation. However, it fails to specify whether this is read-only (implied by 'compare' but not stated), expected runtime, rate limits, or the specific structure of the output since no output schema exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured with a front-loaded purpose statement ('Compare outcomes under different rule sets') followed by behavioral details and a structured Args list. No sentences are wasted; the 'what-if analysis' phrase immediately signals the tool's analytical nature.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 5 parameters, nested objects, and no output schema, the description adequately covers all input parameters and explains the comparative logic. It conceptually describes the output ('shows how the valid state space differs'). It could be improved by describing the actual return format or structure since no output schema is available, but it meets the minimum threshold for invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the Args section fully compensates by documenting all five parameters. It adds crucial semantic context that the schema lacks, particularly the relationship between 'blueprint' (rule set A) and 'rules_b'/'constraints_b' (rule set B), and the api_key format hint ('starts with gai_').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs comparative trajectory analysis under two rule sets to show differences in valid state space. It effectively distinguishes this as a 'what-if' analysis tool, though it could more explicitly differentiate from siblings like 'check_feasibility' or 'validate' which operate on single rule sets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides an implied usage pattern through the 'what happens if I change this rule?' example, suggesting it's for comparing alternative configurations. However, it lacks explicit guidance on when to use this versus 'check_feasibility' or 'validate', and doesn't specify prerequisites or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_blueprint (A)

Create a Blueprint — a governance contract that defines validation rules.

A Blueprint tells the platform what "correct" means for your data: which
fields exist, what math must hold between them, and what value ranges
are acceptable. Without a Blueprint, the platform has nothing to validate against.

If you don't know what rules to define, use load_rule_pack to start from
a prebuilt template, or use discover_patterns to find rules from your data.

Use the blueprint_guide prompt for the complete reference of all available
rule types, constraint types, and configuration options.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    customer_name: Organization or project name (used for folder naming)
    workflow_name: Unique identifier for this Blueprint (used as the 'blueprint' parameter in validate)
    mode: "observe" (platform checks agent's work) or "enforce" (platform computes derived fields)
    extracted_fields: Fields the agent extracts from source data (e.g. ["vendor", "qty", "unit_cost"])
    derived_fields: Fields computed from other fields (e.g. ["subtotal", "total"])
    derivation_rules: Math rules defining field relationships. Available types: "add" (target = a + b), "subtract" (target = a - b), "multiply" (target = a × b), "divide" (target = a ÷ b), "round" (round field to N places), "copy" (copy source to target), "items_multiply" (per-item a × b in a list), "items_sum" (sum a field across list items). Each rule requires "type" and the relevant fields. See blueprint_guide prompt for full schema.
    formal_constraints: Value bounds and ratio constraints. Available types: "magnitude_anchor" (field within min/max range, requires "field", "min", "max"), "relative_anchor" (ratio a/b within tolerance, requires "a", "b", "expected_ratio", "tolerance"), "max_action_threshold" (trigger action if field exceeds threshold, requires "field", "threshold", "action"). See blueprint_guide prompt for full schema.
    semantic_checks: Domain-specific validation checks
    require_math: Validate mathematical relationships (default true)
    require_consistency: Check internal consistency (default true)
    require_coherence: Check structural coherence (default true)
    require_provenance: Require agents to report extraction source locations
    require_high_assurance: Strictest validation — feasibility, spectral, and global consistency required
    enable_anomaly_detection: Geometric fingerprinting to detect structural outliers
    enable_drift_tracking: Monitor pattern stability across batches
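To make the rule and constraint types concrete, here is a small local sketch that evaluates a few of them against one record. It is not the platform's validator. The rule key names (`target`, `a`, `b`), the `tax` field, and the reading of `tolerance` as an absolute difference on the ratio are assumptions for illustration; the blueprint_guide prompt is the stated reference for the full schema.

```python
def apply_rule(rule, record):
    """Return the value a derivation rule expects for its target field."""
    kind = rule["type"]
    if kind == "add":
        return record[rule["a"]] + record[rule["b"]]
    if kind == "subtract":
        return record[rule["a"]] - record[rule["b"]]
    if kind == "multiply":
        return record[rule["a"]] * record[rule["b"]]
    if kind == "divide":
        return record[rule["a"]] / record[rule["b"]]
    raise ValueError(f"rule type not covered by this sketch: {kind}")


def check_record(record, derivation_rules, formal_constraints, tol=1e-9):
    """Collect human-readable violations for one record."""
    violations = []
    for rule in derivation_rules:
        expected = apply_rule(rule, record)
        actual = record[rule["target"]]
        if abs(actual - expected) > tol:
            violations.append(f"{rule['target']}: expected {expected}, got {actual}")
    for c in formal_constraints:
        if c["type"] == "magnitude_anchor":
            if not (c["min"] <= record[c["field"]] <= c["max"]):
                violations.append(f"{c['field']}: outside [{c['min']}, {c['max']}]")
        elif c["type"] == "relative_anchor":
            ratio = record[c["a"]] / record[c["b"]]
            # Assumption: tolerance is an absolute bound on the ratio difference.
            if abs(ratio - c["expected_ratio"]) > c["tolerance"]:
                violations.append(f"{c['a']}/{c['b']}: ratio {ratio} out of tolerance")
    return violations


derivation_rules = [
    {"type": "multiply", "target": "subtotal", "a": "qty", "b": "unit_cost"},
    {"type": "add", "target": "total", "a": "subtotal", "b": "tax"},
]
formal_constraints = [
    {"type": "magnitude_anchor", "field": "total", "min": 0, "max": 10000},
    {"type": "relative_anchor", "a": "tax", "b": "subtotal",
     "expected_ratio": 0.1, "tolerance": 0.01},
]

good = {"qty": 4, "unit_cost": 25.0, "subtotal": 100.0, "tax": 10.0, "total": 110.0}
bad = dict(good, total=120.0)  # breaks the "add" rule for total

print(check_record(good, derivation_rules, formal_constraints))
print(check_record(bad, derivation_rules, formal_constraints))
```

The same two lists could be passed as `derivation_rules` and `formal_constraints` when creating a Blueprint, provided the real schema uses these key names.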

Parameters (JSON Schema)

Name	Required	Default
`mode`	No	observe
`api_key`	Yes
`require_math`	No
`customer_name`	Yes
`workflow_name`	Yes
`derived_fields`	No
`semantic_checks`	No
`derivation_rules`	No
`extracted_fields`	No
`require_coherence`	No
`formal_constraints`	No
`require_provenance`	No
`require_consistency`	No
`enable_drift_tracking`	No
`require_high_assurance`	No
`enable_anomaly_detection`	No

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Explains critical behavioral modes ('observe' vs 'enforce'), describes what each boolean flag controls (anomaly detection, drift tracking, high assurance), and clarifies that the platform validates against these rules. Missing: return value description, idempotency behavior, or error conditions for duplicate workflow_name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Length is substantial (16 params with 0% schema coverage necessitate detail), but structure is logical: concept → prerequisites → alternatives → detailed args. Each sentence adds value, though the Args section is necessarily dense. No wasted words or repetition of schema titles.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Excellent coverage of 16 input parameters, but tool has no output schema and description fails to specify return values (e.g., does it return the blueprint ID, a confirmation object, or nothing?). For a creation operation, this omission leaves a critical gap in contextual completeness despite strong input documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the Args section compensates extensively. Provides examples for extracted_fields and derived_fields, enumerates and explains all derivation_rules types (add, subtract, items_multiply, etc.) and formal_constraints types (magnitude_anchor, relative_anchor, etc.), and clarifies boolean defaults and semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Opens with specific verb+resource ('Create a Blueprint') and immediately defines it as 'a governance contract that defines validation rules.' Distinguishes from siblings by referencing load_rule_pack and discover_patterns as alternatives for different use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use alternatives: 'If you don't know what rules to define, use load_rule_pack... or use discover_patterns.' References blueprint_guide prompt for complete schema reference, providing clear decision criteria for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_chain (A)

Create a multi-agent sequential execution chain.

Defines a pipeline where multiple agents process data in sequence.
Each stage is validated against the Blueprint before the next stage
can proceed. Repair suggestions propagate forward through the chain.

Different from submit_chain_stage: create_chain defines the
pipeline (stages, agent names, TTL); submit_chain_stage executes
one stage of an already-created chain. Different from
handoff_audit: handoff_audit verifies a transition between stages
after they have been submitted.

The Blueprint must already exist (use create_blueprint or
load_rule_pack first). Each stage in the chain validates against
the same Blueprint.

Sibling tools: create_chain (define the pipeline), submit_chain_stage
(advance through it), handoff_audit (verify between stages).

Args:
    api_key:  GeodesicAI API key (starts with gai_)
    blueprint: workflow_name of the Blueprint governing all stages
    stages:    List of stage definitions, e.g.
               [{"stage_name": "extract", "agent_name": "PDF Scanner"},
                {"stage_name": "validate", "agent_name": "QA Agent"}]
               Minimum 2 stages required.
    ttl:       Chain timeout in seconds. After this many seconds
               without progress, the chain expires and submitted
               stages cannot be advanced. Default: 3600 (1 hour).

Returns:
    chain_id:    unique identifier for the chain (use this with
                 submit_chain_stage and handoff_audit)
    status:      initial chain status (typically "active")
    stages:      list of stage records with stage_name, agent_name,
                 and per-stage status
    next_stage:  name of the first stage to submit
    ttl_seconds: echoed back for caller reference
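A short sketch of assembling the create_chain arguments on the client side, checking the two constraints the description states (at least 2 stages, default TTL of 3600 seconds). The helper name and the requirement that every stage carry both `stage_name` and `agent_name` are assumptions based on the example above.

```python
DEFAULT_TTL_SECONDS = 3600  # default stated in the tool description


def build_create_chain_args(api_key, blueprint, stages, ttl=DEFAULT_TTL_SECONDS):
    """Validate the stage list locally and return the tool arguments."""
    if len(stages) < 2:
        raise ValueError("create_chain requires at least 2 stages")
    for stage in stages:
        # Assumption: each stage needs both keys shown in the description's example.
        missing = {"stage_name", "agent_name"} - set(stage)
        if missing:
            raise ValueError(f"stage is missing keys: {sorted(missing)}")
    return {"api_key": api_key, "blueprint": blueprint, "stages": stages, "ttl": ttl}


args = build_create_chain_args(
    api_key="gai_example_placeholder",
    blueprint="invoice_v1",  # hypothetical workflow_name of an existing Blueprint
    stages=[
        {"stage_name": "extract", "agent_name": "PDF Scanner"},
        {"stage_name": "validate", "agent_name": "QA Agent"},
    ],
)
print(args["ttl"], [s["stage_name"] for s in args["stages"]])
```

Checking the stage count before the call avoids a round trip for a request the description says will not be accepted.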

Parameters (JSON Schema)

Name	Required	Description	Default
`ttl`	No
`stages`	Yes
`api_key`	Yes
`blueprint`	Yes

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description discloses important behavioral traits including the validation checkpoint between stages and forward propagation of repair suggestions. It omits critical execution details such as the return value structure, error handling behavior, and whether the operation is synchronous.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description follows a logical structure with a front-loaded purpose statement followed by behavioral elaboration and an Args section. The text is appropriately sized with no redundant sentences, efficiently using vertical space for the parameter examples.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex orchestration tool with no output schema, the description adequately covers the conceptual model and most input parameters but leaves significant gaps regarding the return value (e.g., chain ID, status) and the undocumented `api_key` parameter.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema coverage, the description effectively compensates by providing semantic meaning for three parameters: blueprint governance, stage definitions with JSON examples, and TTL semantics. It notably omits any description of the `api_key` parameter, leaving authentication requirements undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb and resource ('Create a multi-agent sequential execution chain') and clarifies the mechanism ('pipeline where multiple agents process data in sequence'). However, it does not explicitly differentiate this tool from siblings like `create_blueprint` or `submit_chain_stage`.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implicit usage guidance by describing validation behavior and repair propagation, suggesting appropriate contexts for multi-stage workflows. It lacks explicit statements about prerequisites (e.g., blueprint must exist first) or when to prefer single-stage alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

decompose_failure (A)

Hodge-style decomposition of validation failures.

Splits the error between original and corrected values into three
orthogonal components:
- Exact: direct rule violations (a field breaks a specific math rule)
- Co-exact: constraint boundary violations (a field is at the edge
  of valid range)
- Harmonic: systemic structural errors (the overall data shape
  is wrong)

Different from analyze_anomaly: analyze_anomaly explains why a
single payload is anomalous given the platform's structural
fingerprints. decompose_failure compares two known states (an
original and its correction) and tells you which kind of failure
accounts for the difference. Use decompose_failure when you have
a known-correct version to diff against; use analyze_anomaly when
you only have the suspicious payload.

Use this after a validation failure to understand the structural
nature of the error, not just which fields differ.

Args:
    api_key:           GeodesicAI API key (starts with gai_)
    original_values:   The original field values (numeric
                       key-value pairs)
    corrected_values:  The corrected/expected field values
    derivation_rules:  Math rules. Optional if blueprint provided.
    formal_constraints: Constraints. Optional if blueprint provided.
    blueprint:         Load rules from this Blueprint instead of
                       passing them inline. Caller must own the
                       Blueprint.

Returns:
    primary_cause:          one of "exact" / "co_exact" / "harmonic"
    exact:                  details of direct rule violations
    co_exact:               details of constraint boundary violations
    harmonic:               details of systemic structural errors
    contributing_fields:    map of field_name → contribution magnitude
    total_error_magnitude:  scalar L2 norm of the full error
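The sketch below illustrates only the last two return fields: a per-field error magnitude and the L2 norm of the full error between the original and corrected values. It does not reproduce the split into exact, co-exact, and harmonic components, which depends on the rules and constraints. Treating a field's contribution as its absolute difference is an assumption.

```python
import math


def error_summary(original_values, corrected_values):
    """Per-field error magnitudes and their L2 norm, for fields present in both maps."""
    fields = sorted(set(original_values) & set(corrected_values))
    contributions = {
        f: abs(corrected_values[f] - original_values[f]) for f in fields
    }
    total = math.sqrt(sum(d * d for d in contributions.values()))
    return {"contributing_fields": contributions, "total_error_magnitude": total}


summary = error_summary(
    {"subtotal": 100.0, "tax": 10.0, "total": 110.0},
    {"subtotal": 103.0, "tax": 10.0, "total": 114.0},
)
print(summary)
```

Here the differences are 3, 0, and 4, so the total magnitude is 5.0, with `subtotal` and `total` as the only contributing fields.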

Parameters (JSON Schema)

Name	Required	Description	Default
`api_key`	Yes
`blueprint`	No
`original_values`	Yes
`corrected_values`	Yes
`derivation_rules`	No
`formal_constraints`	No

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It adequately explains the conceptual decomposition model (three orthogonal error components) but lacks operational transparency regarding side effects, idempotency, whether results are cached, or rate limiting implications of the GeodesicAI API dependency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately structured with the conceptual overview front-loaded, followed by the Args documentation. The technical terminology is dense but necessary for the domain. The Args list format is efficient, though the 'Args:' prefix style slightly duplicates structured schema information without adding value beyond the list items themselves.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of an output schema and the complexity of the mathematical operation, the description is insufficient regarding return values. While it states the tool 'Identifies the primary failure type and contributing fields,' it does not describe the structure of the decomposition result (e.g., whether it returns objects keyed by 'exact', 'co-exact', 'harmonic' or another format).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Excellent compensation for 0% schema description coverage. The Args section provides semantic meaning for all six parameters, including critical format hints (api_key starts with 'gai_'), data type expectations ('numeric key-value pairs'), and conditional logic ('optional if blueprint provided') that the schema completely lacks.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool's specialized purpose using domain-specific terminology ('Hodge-style decomposition') and enumerates the three specific error components (Exact, Co-exact, Harmonic) it identifies. However, it does not explicitly differentiate this from siblings like `analyze_anomaly` or clarify its position in a validation workflow relative to `validate`.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like `validate`, `analyze_anomaly`, or `repair`. It does not specify prerequisites (e.g., 'use this after validation fails') or contraindications for its use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_api_key (A)

Permanently delete one of the caller's API keys.

DESTRUCTIVE — agents using the deleted key will receive auth
errors immediately. The Blueprint a key was tied to (if any) is
NOT affected; only the credential is revoked. To delete a
Blueprint and all its keys, use delete_blueprint.

The target key can be specified two ways:
  - As the full key string (gai_...).
  - As a key_id (SHA-256 hash from list_api_keys).

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
    key_to_delete: Either the full API key string OR the key_id
        from list_api_keys.
    confirm: Must be true to actually delete. If false, returns
        a preview without deleting. Default: false.

Returns:
    status:   "ok" | "preview" | "ERROR"
    deleted:  metadata about the deleted key (only on ok)
    message:  human-readable summary
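A small sketch of preparing the arguments for this tool. The helper names are hypothetical. The only rule taken from the description is that a target starting with `gai_` is a full key string and anything else is treated as a key_id from list_api_keys.

```python
def describe_key_target(key_to_delete):
    """Classify the target as one of the two forms the description allows."""
    return "full_key" if key_to_delete.startswith("gai_") else "key_id"


def build_delete_api_key_args(api_key, key_to_delete, confirm=False):
    """Return the tool arguments; confirm defaults to False, which yields a preview."""
    return {"api_key": api_key, "key_to_delete": key_to_delete, "confirm": confirm}


preview_args = build_delete_api_key_args(
    api_key="gai_account_placeholder",
    key_to_delete="gai_agent_placeholder",
)
print(describe_key_target(preview_args["key_to_delete"]), preview_args["confirm"])
```

Leaving `confirm` at its default keeps the first call non-destructive; a second call with `confirm=True` is what actually revokes the key.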

Parameters (JSON Schema)

Name	Required	Description	Default
`api_key`	Yes
`confirm`	No
`key_to_delete`	Yes

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the destructive behavior, the immediate auth errors, recoverability, the effect on the Blueprint, and the confirm parameter's behavior, and it includes the return format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is relatively long but every sentence adds value. Structured with clear sections for args and returns. Slightly verbose but efficient for complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all aspects: purpose, parameters, behavior, return values, recovery, and sibling differentiation. No gaps given no output schema and no annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but description documents all three parameters with format constraints, defaults, and semantic meaning beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool permanently deletes an API key, with specific verb 'delete' and resource 'API key'. It distinguishes from sibling tools like rotate_api_key and delete_blueprint.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly compares with rotate_api_key and delete_blueprint, explaining when to use each. Also notes recovery steps and prerequisites (api_key must be account-level).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_blueprint (A)

Permanently delete a Blueprint and all of its API keys.

DESTRUCTIVE — cannot be undone. Cascading effects:
- The Blueprint's template_config.json is removed from disk.
- All Blueprint-scoped API keys for this workflow are deleted.
  Any agents using those keys will start receiving auth errors
  on their next call.
- The Blueprint is removed from the platform's template registry.

Account-level keys are NOT affected. Only the per-Blueprint keys
minted at create time (or via this Blueprint's UI) are revoked.

Use list_blueprints first to confirm the workflow_name. The caller
must own the Blueprint — cross-account deletion is rejected.

Different from update_blueprint: update_blueprint replaces the
config in place and keeps the API keys; delete_blueprint removes
everything.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    workflow_name: Name of the Blueprint to delete (the same value
        used as 'blueprint' in validate)
    confirm: Must be set to true to actually delete. If false, the
        tool returns a preview of what would be deleted without
        performing the deletion. Default: false.

Returns:
    status:           "ok" | "preview" | "ERROR"
    deleted:          workflow_name that was removed (only on ok)
    keys_revoked:     number of Blueprint API keys revoked
    message:          human-readable summary
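The preview-then-confirm pattern described above can be sketched as a two-step call. `call_tool` is a hypothetical transport function supplied by the caller, and `fake_call_tool` is a stand-in whose responses are invented for illustration; the description does not say which fields a preview response carries beyond `status` and `message`.

```python
def delete_blueprint_safely(call_tool, api_key, workflow_name, approve):
    """Preview first; delete only if `approve(preview)` returns True."""
    base = {"api_key": api_key, "workflow_name": workflow_name}
    preview = call_tool("delete_blueprint", {**base, "confirm": False})
    if preview.get("status") != "preview":
        return preview  # for example status "ERROR": surface it unchanged
    if not approve(preview):
        return preview  # caller declined, nothing was deleted
    return call_tool("delete_blueprint", {**base, "confirm": True})


# Stand-in transport for illustration only; a real client would send the
# call to the MCP server instead of fabricating a response.
def fake_call_tool(name, arguments):
    if arguments["confirm"]:
        return {"status": "ok", "deleted": arguments["workflow_name"],
                "keys_revoked": 2, "message": "deleted"}
    return {"status": "preview", "keys_revoked": 2, "message": "would delete"}


result = delete_blueprint_safely(
    fake_call_tool,
    api_key="gai_example_placeholder",
    workflow_name="invoice_v1",  # hypothetical Blueprint name
    approve=lambda preview: preview.get("keys_revoked", 0) <= 5,
)
print(result["status"])
```

Routing the decision through `approve` keeps the destructive second call behind an explicit check, whether that check is a human prompt or a policy rule.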

Parameters (JSON Schema)

Name	Required	Description	Default
`api_key`	Yes
`confirm`	No
`workflow_name`	Yes

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries the full transparency burden. It thoroughly discloses the destructive nature, cascading effects (template_config.json removal, API key revocation, registry removal), and what is not affected (account-level keys). The confirm parameter behavior is also detailed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Destructive, Different from update_blueprint, Args, Returns) and front-loaded with the main purpose. However, it is somewhat lengthy, with some redundancy (e.g., effects listed twice in different sections). Still, it earns its length with valuable detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, destructive, optional preview, multiple cascading effects), the description is very complete. It covers prerequisites, side effects, parameter behavior, and return values. No output schema exists, but returns are described in detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the tool description compensates by explaining each parameter: api_key starts with 'gai_', workflow_name is the same as used in 'validate', confirm defaults to false and when true performs deletion, otherwise gives a preview. This adds significant meaning beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Permanently delete a Blueprint and all of its API keys.' It uses a specific verb ('delete'), names the resource ('Blueprint'), and adds scope ('all of its API keys'). It distinguishes from 'update_blueprint' by contrasting what each does.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly gives when-to-use guidance: 'Use list_blueprints first to confirm the workflow_name.' It also specifies prerequisites ('The caller must own the Blueprint') and contrasts with 'update_blueprint' to indicate when not to use this tool. The confirm parameter's dual behavior (preview vs deletion) is explained.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_patterns (A)

Feed a batch of structured data to discover patterns deterministically.
No Blueprint required — discovers rules from the data itself.

Analyzes the batch using motif discovery, structural routing, and
geometric fingerprinting. Returns candidate validation rules with
confidence scores, structural type classifications, and anomaly counts.
Source data is never stored — only statistical summaries persist.

Discovered rules can be promoted into Blueprints using approve_rule.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    documents: List of structured data objects to analyze
    blueprint: Namespace for this discovery session
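As a toy illustration of what a candidate rule with a confidence score could look like, the sketch below measures how often one multiplicative relation holds across a batch. This is not the platform's motif discovery, structural routing, or geometric fingerprinting. The function name, the field names, and the definition of confidence as a simple hit fraction are all assumptions.

```python
def multiply_rule_confidence(documents, target, a, b, tol=1e-9):
    """Fraction of usable documents where target == a * b (toy confidence score)."""
    usable = [d for d in documents if all(k in d for k in (target, a, b))]
    if not usable:
        return 0.0
    hits = sum(1 for d in usable if abs(d[target] - d[a] * d[b]) <= tol)
    return hits / len(usable)


documents = [
    {"qty": 2, "unit_cost": 5.0, "subtotal": 10.0},
    {"qty": 3, "unit_cost": 4.0, "subtotal": 12.0},
    {"qty": 1, "unit_cost": 7.5, "subtotal": 7.5},
    {"qty": 4, "unit_cost": 2.0, "subtotal": 9.0},  # does not satisfy the relation
]

confidence = multiply_rule_confidence(documents, "subtotal", "qty", "unit_cost")
print(confidence)
```

Three of the four documents satisfy `subtotal = qty * unit_cost`, so the toy score is 0.75. The same `documents` list is the kind of batch the `documents` argument expects.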

Parameters (JSON Schema)

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`documents`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries significant weight. It crucially discloses 'Source data is never stored — only statistical summaries persist' and explains the deterministic nature of the analysis. It also details the return structure. Missing minor operational details like rate limits or execution duration.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The narrative portion is efficiently front-loaded with key constraints (no Blueprint needed) and privacy guarantees. However, the Args section contains a redundant duplicate entry for 'api_key', which wastes tokens and creates minor confusion.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description adequately covers the return value structure, workflow integration (approve_rule), data handling policies, and parameter meanings. Sufficient for an agent to invoke correctly, though error handling details would improve it further.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description compensates effectively by explaining all three parameters: api_key format (starts with gai_), documents semantics (structured data objects), and blueprint purpose (namespace). Deduction for redundant listing of api_key in the Args section.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool 'discovers patterns deterministically' using specific techniques (motif discovery, structural routing, geometric fingerprinting) and defines the output (candidate validation rules with confidence scores). It clearly distinguishes from siblings by stating 'No Blueprint required' and referencing the specific promotion path via 'approve_rule'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear workflow context by stating no Blueprint is required upfront and explicitly naming 'approve_rule' as the mechanism to promote discovered rules. However, it lacks explicit guidance on when to choose this over similar analysis tools like 'analyze_anomaly' or 'structural_types'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forecast (A)

Deterministic forward reasoning — show what valid states are reachable.

Given the current data state, generates candidate next states by
applying Blueprint rules, then ranks them by structural quality
(lowest drift, highest confidence, shortest path, or lowest risk).

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: Current data state
    blueprint: Blueprint to use for rules and constraints
    max_depth: How many steps ahead to forecast (1-10)
    max_branches: Maximum branches per step (1-10)
    rank_by: Ranking criterion — "drift" (most stable), "confidence" (highest geometric confidence), "shortest" (fewest changes), "risk" (lowest risk)
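
The ranges and ranking options above lend themselves to a client-side check before the call is made. The sketch below is one possible way to do that in Python; `build_forecast_args` is a hypothetical helper, and leaving out `max_depth` and `max_branches` when unset is an assumption, since the listing does not state their server-side defaults.

```python
RANK_BY_OPTIONS = {"drift", "confidence", "shortest", "risk"}


def build_forecast_args(api_key, structured_data, blueprint="default",
                        max_depth=None, max_branches=None, rank_by="drift"):
    """Validate and assemble arguments for a forecast call (hypothetical helper)."""
    if not api_key.startswith("gai_"):
        raise ValueError("api_key is expected to start with 'gai_'")
    if rank_by not in RANK_BY_OPTIONS:
        raise ValueError(f"rank_by must be one of {sorted(RANK_BY_OPTIONS)}")

    args = {
        "api_key": api_key,
        "structured_data": structured_data,
        "blueprint": blueprint,
        "rank_by": rank_by,
    }
    # Both limits are documented as 1-10; unset values are omitted so the
    # server can apply whatever default it uses (not stated in the listing).
    for name, value in (("max_depth", max_depth), ("max_branches", max_branches)):
        if value is None:
            continue
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
        args[name] = value
    return args


forecast_args = build_forecast_args("gai_example_key", {"total": 108}, max_depth=3)
```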

Parameters

Name	Required	Default
`api_key`	Yes
`rank_by`	No	drift
`blueprint`	No	default
`max_depth`	No
`max_branches`	No
`structured_data`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses deterministic nature and ranking algorithm (drift, confidence, shortest path, risk). With no annotations provided, the description carries full burden but omits whether the operation is read-only or has side effects, and doesn't mention error conditions or rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with conceptual summary first, then mechanism, then parameter details. Minor deduction for duplicate 'api_key' entry in Args section. Given zero schema coverage, the length is appropriate and every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex 6-parameter tool with no annotations, no output schema, and many siblings, the description adequately covers the conceptual model, parameter semantics, and ranking behavior. Minor gap in not describing the return structure or error scenarios despite lacking output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the Args section comprehensively compensates by documenting all 6 parameters including format hints ('starts with gai_'), valid ranges ('1-10'), and enum semantics for rank_by options. Exceptional compensation for schema deficiency.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent clarity with 'Deterministic forward reasoning' and 'generates candidate next states by applying Blueprint rules.' Specifically contrasts with sibling tools like 'counterfactual' (hypothetical/backward) and 'repair' (fixing current state) by emphasizing forward/reachable state generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage context ('Given the current data state') and prerequisites (Blueprint rules, API key), but lacks explicit when-to-use vs siblings like 'counterfactual' or 'solve'. No guidance on when forecasting is preferred over other prediction tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

geometric_confidence (A)

Compute a composite geometric confidence score from validation
signals. No Blueprint required — works on any validate result.

Combines six weighted signals into a single confidence score:
- Surface distance (how close to the constraint manifold)
- Geometric health (projection quality, regulator, closure)
- Anomaly score (structural fingerprint deviation)
- Stability score (batch drift)
- Motif compliance (pattern violations)
- Motif gate (enforcement decision)

Returns confidence level (high/medium/low) and a recommendation.

Different from analyze_anomaly and check_drift: those tools
perform new analysis on raw data. geometric_confidence is
post-hoc — it digests an already-computed state_vector and
returns a single confidence number. Use this when you have a
state_vector from a prior validate / get_execution_trace call
and want a one-line summary of structural quality. Use
analyze_anomaly when you need to know why something is
anomalous; use check_drift when you need to compare against
historical observations.

Args:
    api_key:      GeodesicAI API key (starts with gai_)
    state_vector: State vector dictionary, typically the
                  state_vector field from a validate or
                  get_execution_trace result.

Returns:
    confidence:      float in [0, 1]
    level:           "high" / "medium" / "low"
    recommendation:  text recommendation for the caller
    signals:         per-signal contribution breakdown
    signal_weights:  weights used in the composite
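
To make the idea of a weighted composite concrete, here is a minimal Python sketch of combining six signals into one score and mapping it to a level. Everything numeric is an assumption: the listing does not publish the actual weights, the level thresholds, or how each signal is normalized. The sketch assumes every signal has already been scaled to [0, 1] with 1.0 meaning best.

```python
# Hypothetical weights; the platform's real signal_weights are not listed.
SIGNAL_WEIGHTS = {
    "surface_distance": 0.25,
    "geometric_health": 0.20,
    "anomaly_score": 0.20,
    "stability_score": 0.15,
    "motif_compliance": 0.10,
    "motif_gate": 0.10,
}


def composite_confidence(signals, weights=SIGNAL_WEIGHTS):
    """Weighted average of per-signal scores, clamped to [0, 1]."""
    total_weight = sum(weights.values())
    score = sum(weights[name] * signals[name] for name in weights) / total_weight
    return min(1.0, max(0.0, score))


def confidence_level(score, high=0.8, medium=0.5):
    """Map a score to a level; both thresholds are illustrative guesses."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


perfect = composite_confidence({name: 1.0 for name in SIGNAL_WEIGHTS})
worst = composite_confidence({name: 0.0 for name in SIGNAL_WEIGHTS})
```

Dividing by the total weight keeps the result in range even if the weights do not sum exactly to one.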

Parameters

Name	Required	Description	Default
`api_key`	Yes
`state_vector`	Yes

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It explains the algorithmic behavior (six weighted signals with detailed descriptions) and return format (confidence level + recommendation), but omits operational details like error conditions, rate limits, or idempotency that would be necessary for full transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a one-line summary, constraint note, detailed bullet points for the six signals, return value description, and parameter documentation. The bullet points earn their place by explaining the composite score components, though the Args section repeats information that could be inferred.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description documents the return values (confidence level high/medium/low and recommendation). It covers the tool's purpose, input requirements, internal methodology, and typical usage context, making it sufficient for an agent to invoke correctly despite the minimal schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, but the description compensates effectively via the Args section. It documents the api_key format ('starts with gai_') and crucial context for state_vector ('from validate results'), providing the semantic meaning missing from the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes a 'composite geometric confidence score' and lists six specific validation signals it combines. The 'No Blueprint required' note helps distinguish it from blueprint-dependent siblings like create_blueprint or validate_and_repair.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides context that the tool 'works on any validate result' and is 'typically called with the state_vector from a validate result,' establishing a workflow relationship with the validate tool. However, it lacks explicit when-not-to-use guidance or comparison to similar analysis tools like analyze_anomaly or check_drift.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_execution_trace (A)

Run validation and return the detailed execution trace.

Shows the exact sequence of validation nodes that ran, whether
each was deterministic, and the runtime of each node. Use for
debugging, compliance audits, or understanding exactly what the
platform checked.

Different from validate: validate returns the verdict (PASS / FAIL
/ REVIEW) and the state vector summary. get_execution_trace returns
everything validate does PLUS the per-node trace records. Use
validate for normal operation; use get_execution_trace when you
need to see inside the pipeline (debugging, audit prep, latency
analysis).

The trace is the same whether validation passes or fails — every
node that ran is recorded with its inputs, outputs, and timing.

Args:
    api_key:         GeodesicAI API key (starts with gai_)
    structured_data: The data to trace validation for
    blueprint:       Blueprint to validate against. Caller must
                     own the Blueprint.

Returns:
    status:              "PASS" / "FAIL" / "REVIEW" / "ERROR"
    determinism_hash:    cryptographic hash of inputs + rules
    trace:               ordered list of node records, each with:
                           node_name, node_type, deterministic (bool),
                           runtime_ms, inputs, outputs
    node_count:          number of nodes in the trace
    deterministic_count: how many nodes were deterministic
    state_vector:        same state_vector validate returns
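
For latency analysis or audit prep, a result with the fields listed above can be reduced to a few summary numbers. The Python sketch below assumes the result arrives as a plain dictionary with exactly those field names; the sample trace, including its node names and node types, is invented for illustration.

```python
def summarize_trace(result):
    """Summarize the per-node records of a get_execution_trace result."""
    trace = result["trace"]
    slowest = max(trace, key=lambda node: node["runtime_ms"]) if trace else None
    return {
        "node_count": len(trace),
        "deterministic_count": sum(1 for node in trace if node["deterministic"]),
        "total_runtime_ms": sum(node["runtime_ms"] for node in trace),
        "slowest_node": slowest["node_name"] if slowest else None,
    }


# Invented sample; real node names and types come from the platform.
sample_result = {
    "status": "PASS",
    "trace": [
        {"node_name": "schema_check", "node_type": "validator",
         "deterministic": True, "runtime_ms": 2.0, "inputs": {}, "outputs": {}},
        {"node_name": "derive_total", "node_type": "rule",
         "deterministic": True, "runtime_ms": 5.5, "inputs": {}, "outputs": {}},
        {"node_name": "external_lookup", "node_type": "enrichment",
         "deterministic": False, "runtime_ms": 1.5, "inputs": {}, "outputs": {}},
    ],
}

trace_summary = summarize_trace(sample_result)
```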

Parameters

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`structured_data`	Yes

Tool Definition Quality

A3.5/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries the full burden. It successfully describes the output content (execution sequence, determinism flags, runtime metrics) but fails to disclose operational traits like whether this consumes API quota, writes audit logs, or is safe to call repeatedly (read-only vs side effects).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear paragraph breaks separating the action, output details, usage guidance, and parameter documentation. It avoids fluff while conveying necessary information, though the Args section could benefit from noting parameter optionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description adequately explains what the trace contains but leaves gaps regarding the return structure format, error handling behavior (what happens if the API key is invalid?), and whether the trace is returned synchronously or async.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the Args section compensates effectively by documenting all three parameters with useful context (API key format 'starts with gai_', purpose of structured_data and blueprint). Docked one point for failing to mention that 'blueprint' is optional with a default value of 'default'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'Run[s] validation and return[s] the detailed execution trace' and specifies what the trace contains (sequence, determinism, runtime). However, it could more explicitly distinguish this from the sibling 'validate' tool by clarifying that this returns node-level debugging details versus a simple pass/fail result.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides specific use cases ('Use for debugging, compliance audits, or understanding exactly what the platform checked'), but lacks explicit guidance on when to use this versus the sibling 'validate' tool, or whether this should be used in production workflows versus troubleshooting only.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

handoff_audit (A)

Audit a handoff between two chain stages. Returns a context capsule
with verified facts from the prior stage and checks structural
compatibility of proposed data for the next stage.

Use this between chain stages to ensure Agent B receives only
verified data from Agent A, and that nothing was mutated in transit.

Sibling tools: create_chain (define the pipeline), submit_chain_stage
(advance through it), handoff_audit (verify between stages).

The context capsule contains:
- Verified fields and their values from the prior stage
- Determinism hash proving the prior stage's results
- Blueprint constraints the next stage must satisfy
- Compatibility verdict if proposed_data is provided

Args:
    api_key: GeodesicAI API key (starts with gai_)
    chain_id: Chain identifier from create_chain
    from_stage: Stage name that completed (Agent A)
    to_stage: Stage name about to start (Agent B)
    proposed_data: Optional data Agent B intends to submit — checked for compatibility
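
The "nothing was mutated in transit" idea can be approximated on the client before calling the tool. The Python sketch below compares a set of verified fields against proposed data and reports any differences. It is a simplified local pre-check written for illustration, not the platform's compatibility logic, and `find_mutations` and the field names are hypothetical.

```python
def find_mutations(verified_fields, proposed_data):
    """Return fields whose proposed value differs from the verified value."""
    return {
        name: {"verified": value, "proposed": proposed_data[name]}
        for name, value in verified_fields.items()
        if name in proposed_data and proposed_data[name] != value
    }


# Invented example: Agent B changed the total it received from Agent A.
verified = {"invoice_id": "INV-1", "total": 108}
proposed = {"invoice_id": "INV-1", "total": 118, "approver": "agent_b"}

mutations = find_mutations(verified, proposed)
```

Fields that appear only in the proposed data, such as `approver` here, are not flagged, because they add information rather than alter a verified value.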

Parameters

Name	Required	Description	Default
`api_key`	Yes
`chain_id`	Yes
`to_stage`	Yes
`from_stage`	Yes
`proposed_data`	No

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well by explaining the tool's behavior: it returns a context capsule with specific contents (verified fields, determinism hash, blueprint constraints, compatibility verdict). It clarifies that proposed_data is optional and checked for compatibility. However, it doesn't mention authentication requirements beyond the api_key parameter or any rate limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with purpose statement, usage guidance, capsule contents, and parameter explanations. Every sentence adds value, though the parameter section could be slightly more concise. The information is front-loaded with the core purpose first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 5 parameters, 0% schema coverage, no annotations, and no output schema, the description does an excellent job explaining purpose, usage, behavior, and parameters. The main gap is lack of output format details beyond listing capsule contents - without an output schema, the agent doesn't know the exact structure of the returned data.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by explaining all 5 parameters in the Args section. It provides meaningful context for each parameter: api_key specifies format requirements, chain_id references another tool, from_stage/to_stage explain their roles in the handoff, and proposed_data clarifies its optional nature and purpose.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('audit', 'returns', 'checks') and resources ('handoff between two chain stages', 'context capsule', 'verified facts', 'structural compatibility'). It distinguishes from siblings by focusing on inter-stage verification rather than creation, analysis, or execution tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'Use this between chain stages to ensure Agent B receives only verified data from Agent A, and that nothing was mutated in transit.' It provides clear context about the workflow timing and purpose, distinguishing it from tools that operate within stages rather than between them.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_api_keys (A)

List all API keys owned by the calling account.

Returns a masked representation of each key plus a stable key_id
(SHA-256 hash) that can be used with rotate_api_key and
delete_api_key. Full key strings are NEVER returned by this tool.

Each entry includes:
  - api_key: masked key string (e.g. "gai_***...REe0")
  - key_id: SHA-256 hash, usable as the target for rotate/delete
  - type: "account" or "blueprint"
  - intent: "All tools (account key)" or the Blueprint workflow_name
  - customer: associated customer/project name
  - created: ISO timestamp

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
        Blueprint-scoped keys cannot list keys.

Returns:
    status: "ok" | "ERROR"
    keys: list of key records with masked api_key and key_id
    total: number of keys returned
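
The masked key and hash-based key_id described above can be pictured with a short Python sketch. Two details are assumptions: that key_id is the hex SHA-256 digest of the full key string encoded as UTF-8, and that masking keeps only the last four characters. The listing states only that key_id is a SHA-256 hash and shows one masked example.

```python
import hashlib


def key_id_for(api_key):
    """Hex SHA-256 digest of the key string (assumed derivation of key_id)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def mask_key(api_key, visible_suffix=4):
    """Hide everything except the prefix and the last few characters."""
    return f"gai_***...{api_key[-visible_suffix:]}"


example_key = "gai_abcdefREe0"  # made-up key for illustration
masked = mask_key(example_key)
key_id = key_id_for(example_key)
```

Because a hash is stable for a given input, the same key always maps to the same identifier, which is what makes it usable as a target for rotate_api_key and delete_api_key without exposing the key itself.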

Parameters

Name	Required	Description	Default
`api_key`	Yes

Tool Definition Quality

A4.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes the returned data structure and the condition that Blueprint-scoped keys appear only for owned Blueprints. Lacks an explicit statement that the tool is read-only, though this is implied. No annotations are provided, so the description carries the burden and does so adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise and well-structured with bullet points for return fields and separate sections for args/returns. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple list tool with one parameter and no output schema, the description covers purpose, input requirements, return structure, and usage context completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The only parameter api_key is fully described: it's a GeodesicAI API key starting with 'gai_', must be account-level. This adds substantial meaning beyond the schema's bare title.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it lists all API keys owned by the calling account, with explicit details on what fields are returned. Distinguishes from sibling tools like rotate/delete.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (to audit keys before rotation/deletion) and constraints (api_key must be account-level). Provides clear alternatives: rotate_api_key and delete_api_key.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_blueprints (A)

List Blueprints owned by the calling account.

Returns each Blueprint's name, workflow identifier, mode, and
field/rule/constraint counts. Use the workflow_name as the
'blueprint' parameter when calling validate.

Blueprint modes — important for agents that summarize results:
  - "observe" (Observation mode): the platform validates data
    against the Blueprint's rules and returns PASS, FAIL, or
    REVIEW with repair suggestions. Validation is detection-only;
    the caller decides what to do with the result.
  - "enforce" (Execution mode): the platform performs the same
    validation AND authorizes downstream side effects. Side
    effects are blocked if validation fails. In this mode, rules
    that compute derived values (like totals) publish the
    platform's computed result as authoritative, so downstream
    consumers see the canonical value rather than the agent's.

Both modes run the same rule checks. The difference is what
happens after validation, not during it.

Scope: only Blueprints created by the calling account are returned.
Use validate / repair / etc. with the workflow_name to operate on them.

Args:
    api_key: GeodesicAI API key (starts with gai_)
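
The difference between the two Blueprint modes described above can be sketched as a small decision function in Python. This is an illustration of the stated behaviour, not platform code: `side_effect_decision` is a hypothetical name, and treating REVIEW the same as FAIL in enforce mode is an assumption, since the listing says only that side effects are blocked if validation fails.

```python
def side_effect_decision(mode, status):
    """Describe what happens to downstream side effects after validation."""
    if mode == "observe":
        # Detection-only: the platform reports, the caller decides.
        return "caller_decides"
    if mode == "enforce":
        # Assumption: only a PASS verdict authorizes side effects.
        return "authorized" if status == "PASS" else "blocked"
    raise ValueError(f"unknown Blueprint mode: {mode!r}")


decisions = {
    (mode, status): side_effect_decision(mode, status)
    for mode in ("observe", "enforce")
    for status in ("PASS", "FAIL", "REVIEW")
}
```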

Parameters

Name	Required	Description	Default
`api_key`	Yes

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses return structure ('name, workflow identifier, mode, and field/rule/constraint counts') despite lack of output schema. Documents auth format ('starts with gai_'). Does not explicitly confirm read-only status or pagination behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Efficient four-sentence structure: purpose, return values, usage guidance, and parameter documentation. Every sentence earns its place. Minor deduction for the embedded 'Args:' formatting, which is slightly informal but still clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple listing tool with 1 parameter and no output schema, the description is complete. It explains the return payload, documents the single parameter, and provides integration context with sibling tools. No significant gaps remain given the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage (only title present). Description compensates by documenting the api_key parameter's purpose ('GeodesicAI API key') and format hint ('starts with gai_'), adding essential semantic meaning missing from the structured schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific verb ('List') + resource ('Blueprints') + scope ('all available governance configurations'). Distinguishes from sibling 'create_blueprint' by contrasting list vs. create operations, and references 'validate' tool to clarify output usage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit integration guidance: 'Use the workflow_name as the 'blueprint' parameter when calling validate.' This clarifies how to use output with a specific sibling tool. However, lacks explicit 'when not to use' guidance or filtering alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

load_rule_pack (A)

Load a prebuilt Blueprint template for fast onboarding.

Rule Packs are ready-made governance configurations for common use cases.
Call with no pack_id to list all available packs. Call with a pack_id
to load the full configuration, then use create_blueprint to save it.

Available packs include templates for: invoice governance, timecard/payroll
governance, legal document governance, purchase order governance, and
insurance claims governance. Each includes field definitions, derivation
rules, constraints, and agent conditioning instructions.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    pack_id: ID of the rule pack to load. Omit to list available packs.
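
The two call shapes, listing packs versus loading one, can be shown as MCP `tools/call` requests. The Python sketch below builds the JSON-RPC payloads only and does not send them; the key and the `example_pack` identifier are placeholders, since the listing does not give actual pack IDs.

```python
import json


def tools_call_request(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request body for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# No pack_id: ask the server to list the available packs.
list_packs = tools_call_request(
    1, "load_rule_pack", {"api_key": "gai_example_key"})

# With pack_id: load one pack's full configuration (placeholder ID).
load_pack = tools_call_request(
    2, "load_rule_pack",
    {"api_key": "gai_example_key", "pack_id": "example_pack"})

print(json.dumps(list_packs, indent=2))
```

The loaded configuration is not persisted by this tool; per the description, saving it is a separate create_blueprint call.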

Parameters

Name	Required	Description	Default
`api_key`	Yes
`pack_id`	No

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the dual-mode behavior (list vs. load), clarifies that this tool does not persist data ('then use create_blueprint to save it'), and details the content structure of returned packs ('field definitions, derivation rules, constraints'). Minor gap: does not explicitly state read-only nature or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with purpose front-loaded, followed by concept explanation, usage patterns, examples, and parameter details. Minor redundancy exists ('Omit to list available packs' appears twice), but information density remains high with no wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a governance-platform tool with no output schema, the description provides excellent domain context by enumerating specific pack types (invoice, payroll, legal, etc.) and explaining pack contents. It adequately prepares the agent to understand return values even without a formal output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the description fully compensates by providing crucial semantic details: api_key format hint ('starts with gai_') and pack_id behavioral semantics ('Omit to list available packs'). It transforms the schema's bare types into actionable parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb-resource pair ('Load a prebuilt Blueprint template') and immediately distinguishes the domain ('fast onboarding', 'governance configurations'). It effectively differentiates from sibling tool create_blueprint by stating this tool only loads configurations while create_blueprint is needed to save them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly documents the two operational modes: 'Call with no pack_id to list all available packs' versus 'Call with a pack_id to load the full configuration.' It also establishes the workflow sequence by directing users to use create_blueprint as the next step after loading, providing clear when-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reject_rule (grade: B)

Reject a discovered candidate rule so it will not be promoted into a Blueprint.

Use after running discover_patterns when reviewing the candidate rules
it produced. Pair with approve_rule: approve high-confidence rules you
want enforced, reject the rest. Rejected rules are marked in the
discovery session and will not appear in subsequent approval calls or
be promotable into Blueprints. The action is recorded against the
discovery session namespace; it does not affect any existing Blueprint.

Use this when:
  - A discovered rule is mathematically valid but business-irrelevant
  - The rule has low confidence and you don't want it surfacing again
  - You want to clean up the candidate list before promoting approvals

Args:
    api_key: GeodesicAI API key (starts with gai_)
    rule_id: ID of the discovered rule (from discover_patterns results)
    blueprint: Discovery session namespace (must match the one used for discover_patterns)

Returns:
    status: "rejected" on success
    rule_id: The rejected rule's ID
    blueprint: The discovery session namespace
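
The approve/reject pairing described above can be sketched as a triage step over the candidate list. This is a minimal illustration, not the platform's client: the candidate shape (`rule_id`, `confidence`), the 0.9 threshold, and the helper name `triage_candidates` are assumptions, and it assumes approve_rule takes the same three arguments as reject_rule. Only the tool names and the argument names `api_key`, `rule_id`, and `blueprint` come from the description.

```python
from typing import Dict, List


def triage_candidates(candidates: List[Dict], api_key: str,
                      blueprint: str = "default",
                      threshold: float = 0.9) -> List[Dict]:
    """Build one tool call per candidate rule.

    Rules at or above the threshold go to approve_rule, the rest to
    reject_rule. Nothing is sent over the network here.
    """
    calls = []
    for rule in candidates:
        tool = "approve_rule" if rule["confidence"] >= threshold else "reject_rule"
        calls.append({
            "name": tool,
            "arguments": {
                "api_key": api_key,
                "rule_id": rule["rule_id"],
                # Must match the namespace used for discover_patterns.
                "blueprint": blueprint,
            },
        })
    return calls


# Hypothetical candidates; real ones come from discover_patterns results.
candidates = [
    {"rule_id": "r1", "confidence": 0.97},
    {"rule_id": "r2", "confidence": 0.41},
]
calls = triage_candidates(candidates, api_key="gai_example", blueprint="invoices")
for call in calls:
    print(call["name"], call["arguments"]["rule_id"])
```

Building the calls first and sending them afterwards keeps the decision reviewable before any rule is rejected.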

Parameters

Name	Required	Default
`api_key`	Yes
`rule_id`	Yes
`blueprint`	No	default

Tool Definition Quality

B (3/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. While 'Reject' implies a state change, the description does not clarify if this is permanent, reversible, or what the side effects are (e.g., if rejected rules are deleted or archived). No auth requirements beyond the api_key parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise with no filler. The 'Args:' format efficiently delivers parameter documentation, though front-loading behavioral context before the argument list would improve structure slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a state-changing operation with no annotations and no output schema, the description lacks critical behavioral context (what happens upon rejection, error conditions). The parameter documentation partially compensates for poor schema coverage but is insufficient for a complete operational picture.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description effectively compensates by documenting all three parameters: api_key includes format hint ('starts with gai_'), rule_id clarifies it targets a 'discovered rule', and blueprint adds semantic context ('Discovery session namespace').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Reject') and target resource ('discovered candidate rule'). However, it does not explicitly differentiate from sibling tool 'approve_rule' or explain the workflow context of candidate rules.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance provided on when to use this tool versus 'approve_rule' or other alternatives. No prerequisites or workflow steps mentioned (e.g., whether discovery must be complete first).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repair (grade: A)

Compute the nearest valid point on the constraint manifold.

Given structured data with errors, missing values, or inconsistencies,
returns field-by-field repair suggestions with geometric confidence scores.
Uses algebraic projection for linear constraints and iterative geodesic
projection for nonlinear constraints.

Can accept rules directly or load them from a Blueprint.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to repair (key-value pairs)
    derivation_rules: Math rules (optional if blueprint is provided)
    formal_constraints: Constraints (optional if blueprint is provided)
    blueprint: Load rules from this Blueprint instead of specifying them directly
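
To make "algebraic projection for linear constraints" concrete, the sketch below projects a record onto a single linear constraint using the standard closed-form orthogonal projection. It illustrates the general technique only: the unweighted Euclidean metric, the helper name `project_onto_linear_constraint`, and the sample invoice fields are assumptions, and the platform's actual metric and confidence scoring are not documented here.

```python
from typing import Dict


def project_onto_linear_constraint(data: Dict[str, float],
                                   coeffs: Dict[str, float],
                                   rhs: float = 0.0) -> Dict[str, float]:
    """Return the Euclidean-nearest point with sum(coeffs[k] * x[k]) == rhs.

    Closed form: x' = x - a * (a.x - rhs) / (a.a)
    """
    residual = sum(c * data[k] for k, c in coeffs.items()) - rhs
    norm_sq = sum(c * c for c in coeffs.values())
    if norm_sq == 0:
        raise ValueError("constraint has no non-zero coefficients")
    step = residual / norm_sq
    repaired = dict(data)  # fields outside the constraint stay untouched
    for k, c in coeffs.items():
        repaired[k] = data[k] - step * c
    return repaired


# Hypothetical record that violates subtotal + tax - total = 0.
record = {"subtotal": 100.0, "tax": 8.0, "total": 110.0, "currency_rate": 1.0}
constraint = {"subtotal": 1.0, "tax": 1.0, "total": -1.0}

repaired = project_onto_linear_constraint(record, constraint)
for field in constraint:
    print(field, round(repaired[field], 3), "change:", round(repaired[field] - record[field], 3))
```

Because the projection is orthogonal, the correction is spread across all fields in the constraint in proportion to their coefficients, which is why a per-field suggestion falls out naturally.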

Parameters

Name	Required	Description	Default
`api_key`	Yes
`blueprint`	No
`structured_data`	Yes
`derivation_rules`	No
`formal_constraints`	No

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full behavioral disclosure burden. It successfully explains the algorithmic approach ('algebraic projection for linear constraints and iterative geodesic projection for nonlinear constraints') and output format. However, it omits operational details such as error handling, idempotency, or side effects of the repair computation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is structured logically with a high-level mathematical definition, functional explanation, algorithmic details, and parameter documentation. Every sentence contributes value. It is slightly dense with technical terminology ('constraint manifold,' 'geodesic projection') but appropriate for the domain complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, mathematical domain) and lack of output schema, the description adequately covers the core functionality and parameter semantics. It mentions the API key requirement implying authentication needs. It could be improved by describing the output structure or error scenarios, but the 'field-by-field repair suggestions' provides essential return value context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, but the description compensates excellently through the 'Args:' section. It adds critical semantic details: api_key format ('starts with gai_'), structured_data structure ('key-value pairs'), and conditional requirement logic ('optional if blueprint is provided'). This fully compensates for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes 'the nearest valid point on the constraint manifold' and handles 'structured data with errors, missing values, or inconsistencies.' It specifies the output as 'field-by-field repair suggestions with geometric confidence scores.' However, it does not explicitly differentiate from the sibling tool 'validate_and_repair'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implicit usage guidance by explaining that derivation_rules/formal_constraints are 'optional if blueprint is provided,' indicating mutual exclusivity between direct rules and blueprint loading. However, it lacks explicit guidance on when to use this tool versus siblings like 'validate_and_repair' or 'validate,' and provides no 'when-not-to-use' criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repair_path (grade: A)

Find the shortest path from an invalid state to a valid one.

Given data that fails validation, computes a sequence of minimal
field changes that would bring the data into compliance with the
Blueprint's rules and constraints. Returns the path as an ordered
list of intermediate states.

Different from repair: repair returns the single nearest valid
point on the constraint manifold (a one-shot projection).
repair_path returns the trajectory between invalid and valid
states — useful when you want to see what's changing step by
step, or when there are multiple ways to repair the data and you
want to evaluate alternatives. Use repair when you just need the
fix; use repair_path when you need to explain or audit the fix.

Args:
    api_key:         GeodesicAI API key (starts with gai_)
    structured_data: Current (invalid) data state
    blueprint:       Blueprint defining the valid constraint space.
                     Caller must own the Blueprint.
    max_depth:       Maximum repair steps to search (1-10).
                     Higher values find paths through more complex
                     repairs but take longer.
    rank_by:         Ranking criterion for selecting among valid
                     paths. One of:
                       "shortest"    — fewest changes
                       "drift"       — most stable trajectory
                       "confidence"  — highest geometric confidence
                       "risk"        — lowest risk

Returns:
    path:            ordered list of intermediate states, each
                     with field values and structural quality metrics
    path_length:     number of steps in the returned path
    terminal_state:  the final valid state at the end of the path
    rank_score:      score of the chosen path under rank_by
    alternatives:    summary of paths considered but not chosen
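
The argument constraints above (a `max_depth` between 1 and 10, and four allowed `rank_by` values) can be checked client-side before the call is made. The following is a minimal sketch under that reading of the description; the helper name `build_repair_path_arguments` is hypothetical, and the defaults mirror the parameter table (`rank_by` "shortest", `blueprint` "default", no default for `max_depth`).

```python
from typing import Any, Dict, Optional

RANK_BY_OPTIONS = ("shortest", "drift", "confidence", "risk")


def build_repair_path_arguments(api_key: str,
                                structured_data: Dict[str, Any],
                                blueprint: str = "default",
                                max_depth: Optional[int] = None,
                                rank_by: str = "shortest") -> Dict[str, Any]:
    """Validate and assemble the arguments for a repair_path call."""
    if rank_by not in RANK_BY_OPTIONS:
        raise ValueError(f"rank_by must be one of {RANK_BY_OPTIONS}, got {rank_by!r}")
    if max_depth is not None and not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")
    arguments: Dict[str, Any] = {
        "api_key": api_key,
        "structured_data": structured_data,
        "blueprint": blueprint,
        "rank_by": rank_by,
    }
    if max_depth is not None:
        # Left out entirely when unset so the server-side default applies.
        arguments["max_depth"] = max_depth
    return arguments


example = build_repair_path_arguments(
    "gai_example", {"subtotal": 100, "tax": 8, "total": 110},
    blueprint="invoices", max_depth=3, rank_by="confidence",
)
print(sorted(example))
```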

Parameters

Name	Required	Default
`api_key`	Yes
`rank_by`	No	shortest
`blueprint`	No	default
`max_depth`	No
`structured_data`	Yes

Tool Definition Quality

A (3.6/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses behavioral constraints like max_depth range (1-10) and rank_by options, and mentions 'minimal' changes implying optimization. However, it omits safety-critical details like whether this performs mutations or is read-only analysis, and what happens when no path exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with a front-loaded summary sentence, contextual elaboration, and a clean Args section. No redundant or wasted text; every sentence contributes to understanding the tool's function or parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for the input side given the parameter documentation, but incomplete regarding outputs. With no output schema provided and a complex return type implied (a path/sequence), the description should specify what the tool returns (e.g., a repair plan or sequence of operations).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Excellent compensation for 0% schema description coverage. The Args section documents 4 of 5 parameters with meaningful semantics: structured_data (invalid state), blueprint (constraint space), max_depth (search limit with valid range), and rank_by (criterion with enumerated options). Only api_key is undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool finds a 'shortest path' and computes a 'sequence of minimal field changes' to fix invalid data. Strong verb+resource combination. However, it does not explicitly differentiate from the sibling 'repair' tool, which could cause selection ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context ('Given data that fails validation') establishing when to use it, but lacks explicit guidance on when to choose this over siblings like 'repair' or 'validate_and_repair', and does not state prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rotate_api_key (grade: A)

Rotate one of the caller's API keys. The old key stops working
immediately; a new key with the same scope is issued.

The new key inherits the old key's scope:
  - An account-level key rotates to another account-level key.
  - A Blueprint-scoped key rotates to a key for the same Blueprint.

The target key can be specified two ways:
  - As the full key string (gai_...) — useful if you just received it.
  - As a key_id (SHA-256 hash from list_api_keys) — useful when you
    only have the masked version. Either is accepted.

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
        Must be an account key.
    key_to_rotate: Either the full API key string OR the key_id
        (SHA-256 hex hash) returned by list_api_keys.

Returns:
    status:    "ok" | "ERROR"
    api_key:   the new key string (copy this immediately)
    type:      "account" | "blueprint"
    intent:    Blueprint folder name, or empty for account keys
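
Since `key_to_rotate` accepts two forms, a client may want to tell them apart before calling, for example to avoid logging a full key. The sketch below relies only on what the description states: full keys start with `gai_`, and a key_id is a SHA-256 hex hash, which is 64 hexadecimal characters. The helper name `classify_key_reference` is hypothetical, and how the platform derives a key_id from a key is not stated, so the sample hash is only a stand-in with the right shape.

```python
import hashlib
import re

_SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def classify_key_reference(value: str) -> str:
    """Return "full_key" or "key_id" for a key_to_rotate value."""
    if value.startswith("gai_"):
        return "full_key"
    if _SHA256_HEX.fullmatch(value):
        return "key_id"
    raise ValueError("expected a gai_ key string or a 64-character SHA-256 hex hash")


# Stand-in identifier: any SHA-256 hex digest has the expected shape.
sample_key_id = hashlib.sha256(b"placeholder").hexdigest()

print(classify_key_reference("gai_example_key"))
print(classify_key_reference(sample_key_id))
```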

Parameters

Name	Required	Description	Default
`api_key`	Yes
`key_to_rotate`	Yes

Tool Definition Quality

A (4.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that old key stops working immediately, new key inherits same scope, and agents will get auth errors if not updated. Missing details on permissions, rate limits, or reversibility, but adequate given no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with Args and Returns sections, front-loaded purpose, no redundant sentences. Efficient yet thorough.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers behavior, parameters, return values, and usage context. Includes warning about immediate deactivation and need to copy new key. Lacks error scenarios or idempotency, but sufficient for a key rotation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds crucial meaning beyond schema: api_key must be account-level (starts with gai_), can be same as key_to_rotate; key_to_rotate can be any owned key. Explains scope inheritance. Compensates for 0% schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (rotate), the resource (API key), and distinguishes from siblings like delete_api_key and list_api_keys by specifying rotation while preserving scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use cases (compromised key, scheduled rotation) and warns about immediate old key deactivation and agent updates. Lacks explicit 'when not to use' or alternatives but is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

structural_types (grade: B)

Get auto-discovered structural type classifications from a discovery session.

After running discover_patterns, returns the structural categories the
platform identified in the data — without being told what categories
exist. Each category includes document count, distinguishing fields,
and domain hints inferred from the data shape.

This is a read-only retrieval. If discover_patterns has not been run
against the given blueprint namespace (or the session has expired),
returns an empty type list with status="no_session".

Use after discover_patterns when you want to understand how the
platform grouped your data before deciding which patterns to promote
via approve_rule.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    blueprint: Discovery session namespace (must match the namespace
               used in discover_patterns)

Returns:
    status: "ok" or "no_session"
    structural_types: list of {type_id, document_count, distinguishing_fields, domain_hint}
    total_documents: total document count across all types
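
A caller has to handle both documented statuses, "ok" and "no_session". The sketch below shows one way to do that, assuming the response arrives as a plain dictionary with exactly the fields listed under Returns. The helper name `summarize_structural_types` and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def summarize_structural_types(response: Dict[str, Any]) -> List[str]:
    """Turn a structural_types response into one summary line per type."""
    if response.get("status") == "no_session":
        return ["no discovery session: run discover_patterns first"]
    total = response.get("total_documents", 0)
    lines = []
    for entry in response.get("structural_types", []):
        share = entry["document_count"] / total if total else 0.0
        lines.append(
            f"{entry['type_id']}: {entry['document_count']} docs "
            f"({share:.0%}), hint={entry['domain_hint']}"
        )
    return lines


# Hypothetical response; field names follow the Returns section above.
sample = {
    "status": "ok",
    "structural_types": [
        {"type_id": "T1", "document_count": 30,
         "distinguishing_fields": ["invoice_number"], "domain_hint": "invoice"},
        {"type_id": "T2", "document_count": 10,
         "distinguishing_fields": ["employee_id"], "domain_hint": "payroll"},
    ],
    "total_documents": 40,
}
for line in summarize_structural_types(sample):
    print(line)
```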

Parameters

Name	Required	Description	Default
`api_key`	Yes
`blueprint`	No		default

Tool Definition Quality

B (3.4/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully describes the return payload structure (categories with counts, fields, hints) and the unsupervised nature of the discovery ('without being told what categories exist'). However, it lacks disclosure on mutability, caching behavior, or error states when the blueprint session is invalid.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized at four sentences, front-loaded with the action ('Get'), followed by workflow context, return value details, and parameter info. The Args section is informal but efficient given the lack of schema descriptions. No redundant or wasteful text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description adequately describes what the tool returns. However, with 0% schema coverage and two parameters (one required), documenting only one parameter leaves the tool under-specified. The workflow relationship with discover_patterns is well-covered, but parameter documentation is incomplete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage. The description compensates partially by documenting 'blueprint' as 'Discovery session namespace' in the Args section, but completely omits the required 'api_key' parameter. With zero schema coverage and a required parameter undocumented, this is a significant gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'auto-discovered structural type classifications' and distinguishes itself from the sibling 'discover_patterns' by specifying this is used after that step to get results. It specifies the resource (structural categories) and what attributes are returned (document counts, distinguishing fields, domain hints).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear workflow context with 'After running discover_patterns,' establishing the prerequisite sequence. It implies this tool is for retrieval of previously discovered data rather than performing discovery itself, though it could explicitly state what happens if called before discovery completes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_chain_stage (grade: A)

Submit data for a chain stage. The platform validates the data
using the chain's Blueprint, then advances the chain if validation passes.

The response includes the next stage info and any accumulated
repair suggestions from prior stages.

Sibling tools: create_chain (define the pipeline), submit_chain_stage
(advance through it), handoff_audit (verify between stages).

Args:
    api_key: GeodesicAI API key (starts with gai_)
    chain_id: Chain identifier from create_chain
    stage: Stage name to submit for
    structured_data: Data for this stage

Parameters

Name	Required	Description	Default
`stage`	Yes
`api_key`	Yes
`chain_id`	Yes
`structured_data`	Yes

Tool Definition Quality

A (3.7/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It successfully discloses the validation logic, conditional state change (advancement only if validation passes), and response contents (next stage info, accumulated repair suggestions). Missing error handling details and idempotency/safety characteristics.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured docstring format with clear separation between behavioral description (2 sentences) and Args documentation. Zero redundant text; every sentence conveys distinct information about the tool's operation or parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description admirably documents return value structure ('next stage info', 'accumulated repair suggestions'). References prior stages and chain workflow context. Would benefit from noting what happens when validation fails.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Excellent compensation for 0% schema description coverage. All 4 parameters are documented: api_key includes format hint ('starts with gai_'), chain_id references sibling tool create_chain for provenance, and structured_data is contextualized as the data being validated.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb-resource combination ('Submit data for a chain stage') with specific behavioral elaboration (validation against Blueprint, conditional advancement). Distinguishes from validation-only tools by emphasizing the advancement mechanism.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies workflow position through 'advance the chain' and 'from prior stages', plus references to create_chain for chain_id provenance. However, lacks explicit contrast with validate or repair siblings (e.g., 'use this instead of validate when you need to progress the chain').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_blueprint (grade: A)

Update an existing Blueprint's configuration in place.

Only fields you pass are updated; fields you omit keep their
current values. To clear a list field (e.g. remove all rules),
pass an explicit empty list [].

Existing API keys for this Blueprint are preserved — agents using
those keys continue working after the update. Ownership stamps
are also preserved; you cannot transfer Blueprint ownership.

The workflow_name itself cannot be renamed. To rename, create a
new Blueprint with the new name and delete the old one.

Different from create_blueprint: create_blueprint creates a new
Blueprint and mints a fresh API key. update_blueprint modifies
an existing one and returns no new key.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    workflow_name: Name of the Blueprint to update (must already exist)
    customer_name: New customer/project name. Pass None to keep current.
    mode: "observe" or "enforce". Pass None to keep current.
    extracted_fields: New list of agent-extracted fields. Pass None to
        keep current; pass [] to clear.
    derived_fields: New list of platform-derived fields. None or [].
    derivation_rules: New list of derivation rules. See blueprint_guide
        prompt for schema. None or [].
    formal_constraints: New list of constraints. See blueprint_guide
        prompt for schema. None or [].
    semantic_checks: New list of semantic checks. None or [].
    require_math: Override math validation flag. None to keep current.
    require_consistency: Override consistency flag. None to keep.
    require_coherence: Override coherence flag. None to keep.
    require_provenance: Override provenance flag. None to keep.
    require_high_assurance: Override high-assurance flag. None to keep.
    enable_anomaly_detection: Override anomaly flag. None to keep.
    enable_drift_tracking: Override drift flag. None to keep.

Returns:
    status:            "ok" | "ERROR"
    blueprint:         workflow_name that was updated
    fields_changed:    list of config keys that were modified
    field_count:       new total of extracted + derived fields
    rule_count:        new total of derivation rules
    constraint_count:  new total of formal constraints
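
The partial-update rule (None keeps the current value, an explicit empty list clears it) is easy to get wrong when arguments are filtered by truthiness, because `[]` is falsy in Python. The sketch below shows the distinction; the helper name `build_update_arguments` is hypothetical and only the argument names come from the description.

```python
from typing import Any, Dict


def build_update_arguments(api_key: str, workflow_name: str,
                           **changes: Any) -> Dict[str, Any]:
    """Assemble update_blueprint arguments.

    None means "keep current" and is dropped; [] means "clear" and is kept.
    """
    arguments: Dict[str, Any] = {"api_key": api_key, "workflow_name": workflow_name}
    for field, value in changes.items():
        if value is None:  # compare against None, not truthiness
            continue
        arguments[field] = value
    return arguments


update = build_update_arguments(
    "gai_example", "invoices",
    mode="enforce",            # changed
    customer_name=None,        # kept as-is on the server
    derivation_rules=[],       # explicitly cleared
    require_math=False,        # falsy but still a real change
)
print(update)
```

Testing with `is None` rather than `if not value` is what preserves both the empty list and the `False` flag.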

Parameters

Name	Required	Description	Default
`mode`	No
`api_key`	Yes
`require_math`	No
`customer_name`	No
`workflow_name`	Yes
`derived_fields`	No
`semantic_checks`	No
`derivation_rules`	No
`extracted_fields`	No
`require_coherence`	No
`formal_constraints`	No
`require_provenance`	No
`require_consistency`	No
`enable_drift_tracking`	No
`require_high_assurance`	No
`enable_anomaly_detection`	No

Tool Definition Quality

A (4.9/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavioral traits: partial update, list clearing, preservation of keys/ownership, no rename capability, and return of no new key. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured with clear sections and front-loaded purpose. Each sentence adds value, though the parameter list is verbose. Slightly more conciseness could be achieved, but it is justified by the parameter count.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 16 parameters, no output schema, and no annotations, the description is remarkably complete: it explains return fields, behavioral nuances, and parameter usage. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description documents every parameter with clear semantics: what each field does, how to keep current value, and for lists how to clear vs keep. This adds substantial value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Update an existing Blueprint's configuration in place' and explicitly distinguishes from create_blueprint by contrasting behavior regarding API key generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains partial update semantics, how to clear list fields, preservation of API keys and ownership, the prohibition of renaming workflow_name, and contrasts with create_blueprint.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate (grade A)

Validate structured data against a Blueprint's rules. Returns PASS, FAIL, or REVIEW.

The platform checks mathematical accuracy (do the numbers add up?),
structural consistency (do the fields satisfy all constraints?), and
semantic plausibility (do the values make sense in context?).

Every result includes a determinism hash — the same input with the same
Blueprint always produces the same result. Auditable, replayable, legally defensible.

A Blueprint is required for meaningful validation. Without one, use
create_blueprint or load_rule_pack to define your governance rules first.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to validate (key-value pairs)
    blueprint: Name of the Blueprint to validate against. Use list_blueprints to see options.

Parameters

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`structured_data`	Yes
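
As a rough illustration of how a client might call this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind MCP clients send. The argument names (`api_key`, `structured_data`, `blueprint`) and the `default` Blueprint come from the listing above; the helper name `build_validate_request`, the example key, and the invoice fields are hypothetical, and the sketch does not send anything over the network.

```python
import json


def build_validate_request(api_key, structured_data, blueprint="default", request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the `validate` tool."""
    # The listing says keys start with "gai_"; fail early if this one does not.
    if not api_key.startswith("gai_"):
        raise ValueError("api_key is expected to start with 'gai_'")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "validate",
            "arguments": {
                "api_key": api_key,
                "structured_data": structured_data,
                "blueprint": blueprint,
            },
        },
    }


# Hypothetical payload: the field names below are illustrative only.
request = build_validate_request(
    api_key="gai_example_key",
    structured_data={"subtotal": 100.0, "tax": 8.0, "total": 108.0},
)
print(json.dumps(request, indent=2))
```

Omitting `blueprint` falls back to `default`, mirroring the Default column in the parameter table.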

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden. It successfully explains the three validation dimensions (mathematical accuracy, structural consistency, semantic plausibility), return values (PASS/FAIL/REVIEW), and determinism guarantees (hash for auditability). Minor gap: doesn't mention rate limits, error handling, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: purpose, validation types, determinism guarantee, and prerequisites. The determinism paragraph ('Auditable, replayable, legally defensible') conveys compliance-critical behavior efficiently. Slight verbosity in the determinism explanation, but justified given legal/defensibility context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description documents return values (PASS/FAIL/REVIEW) and the determinism hash. It contextualizes the tool within the broader Blueprint workflow (referencing create_blueprint, load_rule_pack, list_blueprints). Minor gap: doesn't describe error response format or partial failure behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% (no parameter descriptions in JSON schema), but the description fully compensates via the Args section. It documents all three parameters: api_key includes format hint (starts with gai_), structured_data clarifies structure (key-value pairs), and blueprint explains how to discover valid values (use list_blueprints).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a precise action statement: 'Validate structured data against a Blueprint's rules.' It specifies the verb (validate), target resource (structured data), and scope (Blueprint rules). It distinguishes itself from siblings like repair or analyze_anomaly by focusing on rule-based validation rather than anomaly detection or remediation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states prerequisites and alternatives: 'A Blueprint is required for meaningful validation. Without one, use create_blueprint or load_rule_pack to define your governance rules first.' Also references list_blueprints for discovering valid blueprint values, creating a clear decision tree for the agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_repair (grade A)

Validate structured data and automatically compute repairs if it fails.
Single call that combines validate + repair.

Different from validate: validate returns only the verdict; if the
data fails, you'd then call repair separately. validate_repair
returns the verdict AND the repaired payload in one call. Different
from repair: repair always returns repair suggestions regardless
of whether the input was valid; validate_repair only computes
repairs when validation actually fails.

If PASS: returns the validated data with determinism hash.
If FAIL: returns the failure details AND a repaired payload with
field-by-field corrections and confidence scores. The agent can
inspect the repairs and resubmit the corrected data.
If REVIEW: returns the flagged data with review reasoning.

This is the recommended starting point for most agent integrations.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to validate (key-value pairs)
    blueprint: Name of the Blueprint to validate against. Use list_blueprints to see options.

Parameters

Name	Required	Default
`api_key`	Yes
`blueprint`	No	default
`structured_data`	Yes
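
The three outcomes described above (PASS, FAIL, REVIEW) suggest a simple branching pattern on the agent side. The sketch below is one possible shape, not the platform's documented response format: the key names `verdict` and `repaired_payload` are assumptions, since the listing gives no output schema, and the action labels are hypothetical.

```python
def next_action(response):
    """Decide what an agent does next from a validate_repair-style response.

    ASSUMPTION: `response` is a dict with a "verdict" key holding "PASS",
    "FAIL", or "REVIEW", and, on FAIL, a "repaired_payload" key. These key
    names are hypothetical; the listing does not specify them.
    """
    verdict = response.get("verdict")
    if verdict == "PASS":
        # Data validated; nothing to repair.
        return ("accept", None)
    if verdict == "FAIL":
        # Inspect the proposed repairs, then resubmit the corrected data.
        return ("resubmit", response.get("repaired_payload"))
    if verdict == "REVIEW":
        # Flagged data: hand it to a human or a review queue.
        return ("escalate", None)
    raise ValueError(f"unexpected verdict: {verdict!r}")


# Illustrative response only; real responses may be structured differently.
action, payload = next_action(
    {"verdict": "FAIL", "repaired_payload": {"total": 108.0}}
)
print(action, payload)
```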

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: returns verdict, repaired payload, field-by-field corrections, confidence scores, and determinism hash. It explains all three outcomes without omissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, with short paragraphs and one line per outcome, and it front-loads the purpose. It is moderately long, but every sentence is informative. Minor redundancy (e.g., repetition of the 'validate_repair' name) could be trimmed, but overall it is concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers return values for all outcomes, parameter details, and related tools. It does not mention error handling or rate limits, but for a validation/repair tool with no output schema, it provides sufficient context for an agent to use it effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema coverage, the description adds meaning for all parameters: api_key format ('starts with gai_'), structured_data as 'key-value pairs', blueprint as 'Name of the Blueprint' with a hint to list_blueprints. This goes well beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Validate structured data and automatically compute repairs if it fails' and contrasts with sibling tools 'validate' and 'repair', clearly distinguishing its combined behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides clear guidance on when to use the tool ('recommended starting point'), explains the three possible outcomes (PASS, FAIL, REVIEW), and tells the agent what to do with the results (inspect repairs, resubmit). It also differentiates the tool from related ones.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_replay (grade A)

Verify that two execution replay contracts represent the same deterministic result.

This is the programmatic proof of GeodesicAI's core promise: same input + same
rules = same result, every time. Given two replay contracts (e.g. from the
original execution and a re-run), this tool compares all component hashes and
reports whether the executions are byte-identical.

Use this to:
- Prove to an auditor that a decision from March 3rd matches a re-run today.
- Detect when a rule change has altered execution behavior (input hash matches
  but canonical trace hash differs → the rules diverged).
- Confirm a Blueprint migration didn't change any observable outcomes.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    contract_a: A replay contract dict (the `replay_contract` field from a
                prior validate/execute_task response)
    contract_b: Another replay contract dict to compare against contract_a

Returns:
    replay_match: bool — True if the top-level replay_hash matches (fully identical)
    contract_version_match: bool
    matches: dict of field_name → value, for every field that agreed
    mismatches: dict of field_name → {expected, actual}, for every field that disagreed
    summary: plain-English one-liner describing the result

Interpretation of mismatches:
    - input_payload_hash: the two runs were fed different data
    - template_version: the Blueprint was upgraded between runs
    - solver_registry_hash: the platform itself changed between runs
    - canonical_trace_hash: same inputs and rules but different execution path
        (should never happen under determinism; indicates a platform bug)
    - graph_hash: DAG topology changed between runs

Parameters

Name	Required	Description	Default
`api_key`	Yes
`contract_a`	Yes
`contract_b`	Yes
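
The "Interpretation of mismatches" list above can be turned into a small lookup on the client side. The sketch below assumes the tool's result arrives as a dict with the `replay_match` and `mismatches` fields named under Returns; the helper name `explain_replay`, the fallback message, and the sample values are hypothetical.

```python
# Meanings copied from the "Interpretation of mismatches" list in the description.
MISMATCH_MEANINGS = {
    "input_payload_hash": "the two runs were fed different data",
    "template_version": "the Blueprint was upgraded between runs",
    "solver_registry_hash": "the platform itself changed between runs",
    "canonical_trace_hash": (
        "same inputs and rules but different execution path "
        "(should never happen under determinism; indicates a platform bug)"
    ),
    "graph_hash": "DAG topology changed between runs",
}


def explain_replay(result):
    """Return one human-readable note per mismatched field."""
    if result.get("replay_match"):
        return ["executions are identical"]
    notes = []
    for field in sorted(result.get("mismatches", {})):
        # Fields outside the documented list get a neutral fallback.
        meaning = MISMATCH_MEANINGS.get(field, "no documented interpretation")
        notes.append(f"{field}: {meaning}")
    return notes


# Illustrative result only; the version strings are made up.
sample = {
    "replay_match": False,
    "mismatches": {"template_version": {"expected": "1.0", "actual": "1.1"}},
}
for note in explain_replay(sample):
    print(note)
```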

Tool Definition Quality

A4.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes what the tool does (compares hashes, reports matches/mismatches) and includes an 'Interpretation of mismatches' section that explains behavioral outcomes (e.g., platform bugs, input differences). However, it lacks details on rate limits, error handling, or authentication beyond the api_key parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded, starting with a clear purpose statement followed by usage guidelines, parameter details, return values, and mismatch interpretation. Every sentence adds value without redundancy, making it efficient and easy to parse for an AI agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description comprehensively covers return values (listing all fields and their types) and provides deep context on interpreting results. Given the tool's complexity (3 parameters with nested objects) and lack of annotations, the description is complete enough for an agent to understand inputs, outputs, and behavioral implications.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the description fully compensates by providing detailed semantics for all parameters. It explains api_key format ('starts with gai_'), defines contract_a and contract_b as replay contract dicts from prior responses, and clarifies their roles in comparison. This adds significant value beyond the minimal schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('verify', 'compare') and resources ('two execution replay contracts'), distinguishing it from siblings like 'check_drift' or 'validate'. It explicitly defines the tool as a programmatic proof of determinism, making its function unambiguous and distinct from other tools in the server.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage scenarios in a bulleted list ('Use this to:'), including specific cases like proving to auditors, detecting rule changes, and confirming migrations. It clearly indicates when to use this tool versus alternatives by focusing on deterministic verification, unlike siblings that handle tasks like anomaly analysis or rule approval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
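
If the site is served from a local directory, the claim file can be generated with a few lines of Python. This is a minimal sketch: the helper name `write_glama_claim` and the idea of a local `site_root` directory are assumptions about your setup, while the file path and JSON structure follow the snippet above.

```python
import json
from pathlib import Path


def write_glama_claim(site_root, email):
    """Write <site_root>/.well-known/glama.json and return its path."""
    target = Path(site_root) / ".well-known" / "glama.json"
    # Create the .well-known directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    doc = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(doc, indent=2) + "\n", encoding="utf-8")
    return target
```

Calling `write_glama_claim("public", "your-email@example.com")` would create `public/.well-known/glama.json`; the file still has to be reachable at `/.well-known/glama.json` on the server's domain.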
