Glama

Server Details

Deterministic AI governance platform. Validates agent outputs, discovers patterns, solves math.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.5/5 across 31 of 31 tools scored. Lowest: 3.7/5.

Server Coherence: A

Disambiguation: 5/5

Each tool has a clearly distinct purpose, with detailed descriptions that differentiate even similar operations like validate, validate_repair, and get_execution_trace. There is no ambiguity in intent.

Naming Consistency: 5/5

All tool names follow a consistent verb_noun snake_case pattern (e.g., create_blueprint, list_blueprints, delete_api_key). No mix of conventions or unclear naming.

Tool Count: 2/5

With 31 tools, the count exceeds the 25+ threshold for 'too many' per the rubric. While each tool has a defined role, the surface is overloaded compared to typical server scopes.

Completeness: 5/5

The tool set covers the full lifecycle of governance: blueprint creation, validation, repair, anomaly analysis, pattern discovery, chain execution, and account management. No obvious gaps for the stated domain.

Available Tools

31 tools
account_status: A
Report the calling account's plan, key usage, and limits.

Use this to introspect what the caller is allowed to do. Agents that hit
rate limits or key-count caps can call this to explain the limit to the
human and suggest upgrading if needed.

Args:
    api_key: GeodesicAI API key (starts with gai_)

Returns:
    plan: The user's current plan — one of pilot, trial, tier1, tier2, beta, enterprise
    plan_label: Human-readable plan name (e.g. "Personal Access", "Small Business")
    account_key_count: Number of account-level API keys currently issued
    account_key_limit: Maximum account keys allowed on this plan
    blueprint_count: Number of Blueprints owned by this user
    blueprint_limit: Maximum Blueprints allowed on this plan
    email: The user's email address (for reference in support)
    user_id: Stable user identifier
    trial_days_remaining: Days left on trial, if plan == "trial"; else null
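
As a rough illustration of how an agent might use these fields, the sketch below builds an MCP `tools/call` request for `account_status` and turns a response into a sentence for the human. The request envelope follows the MCP JSON-RPC convention. The helper names, the placeholder key, and the sample response values are assumptions for illustration; only the field names come from the Returns list above.

```python
import json


def build_account_status_request(api_key: str, request_id: int = 1) -> dict:
    """Build an MCP tools/call request for the account_status tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "account_status", "arguments": {"api_key": api_key}},
    }


def summarize_limits(status: dict) -> str:
    """Explain key and Blueprint usage using the documented return fields."""
    parts = [
        f"Plan: {status['plan_label']} ({status['plan']})",
        f"API keys: {status['account_key_count']}/{status['account_key_limit']}",
        f"Blueprints: {status['blueprint_count']}/{status['blueprint_limit']}",
    ]
    # trial_days_remaining is null (None) unless plan == "trial".
    if status.get("trial_days_remaining") is not None:
        parts.append(f"Trial days left: {status['trial_days_remaining']}")
    if status["blueprint_count"] >= status["blueprint_limit"]:
        parts.append("Blueprint limit reached; an upgrade may be needed.")
    return "; ".join(parts)


# Hypothetical values for illustration, not real account data.
sample_status = {
    "plan": "trial",
    "plan_label": "Trial",
    "account_key_count": 1,
    "account_key_limit": 2,
    "blueprint_count": 3,
    "blueprint_limit": 3,
    "email": "user@example.com",
    "user_id": "user-123",
    "trial_days_remaining": 5,
}

request = build_account_status_request("gai_example_key")
print(json.dumps(request, indent=2))
print(summarize_limits(sample_status))
```

How the response is actually wrapped by the MCP client is not specified here, so `summarize_limits` takes the already-decoded result dictionary.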
Parameters
Name | Required | Description | Default
api_key | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description must carry burden. It discloses that it is a reporting tool (read-only) and describes return fields. But it doesn't explicitly state non-destructive nature or authentication requirements beyond the api_key parameter. Adequate but not rich.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Very concise: one-sentence purpose, one usage sentence, then Args and Returns sections. Front-loaded with purpose. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, description fully documents return fields with explanations. Tool is simple (1 param). Description covers purpose, usage, parameters, and returns completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% for the single parameter api_key. Description adds value by specifying 'starts with gai_' and that it is a GeodesicAI API key, which is not in schema. Compensates well for low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states verb 'Report' and resource 'calling account's plan, key usage, and limits'. It also says 'introspect what the caller is allowed to do'. Differentiates from siblings as no other tool does account status reporting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this to introspect what the caller is allowed to do' and suggests using when hitting rate limits or key-count caps to explain and suggest upgrades. Provides clear context for when to use, though no explicit exclusions or alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_anomaly: A
Deep anomaly analysis with geometric proof. No Blueprint required.

Explains WHY data is anomalous using three independent methods:
1. Structural fingerprinting — distance from the learned manifold
2. Cluster analysis — deviation from structural type centroids
3. Twist-compression obstruction — fundamental constraint conflicts

Returns a human-readable geometric proof of anomaly, not just a flag.

Args:
    structured_data: The data to analyze for anomalies
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
structured_data | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, description carries full burden and succeeds in disclosing the three specific analytical methods employed (structural fingerprinting, cluster analysis, twist-compression obstruction) and output format (human-readable geometric proof). Minor gap: doesn't explicitly confirm read-only safety or computational cost of 'deep' analysis.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear front-loading of purpose ('Deep anomaly analysis with geometric proof'), followed by methodology enumeration and return value description. The 'Args:' section is somewhat abrupt and isolated from the narrative flow, slightly disrupting readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema exists, the description adequately explains return values (human-readable geometric proof). Addresses the complex analytical nature of the tool. Minor deduction for failing to mention authentication requirements (api_key) or expected structure of the nested structured_data input.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, requiring the description to fully compensate. It documents 'structured_data' in the Args section but completely omits 'api_key', leaving 50% of parameters undocumented. The nested nature of structured_data (additionalProperties: true) is also not explained.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action (deep anomaly analysis) with distinctive methodology (geometric proof) and output format. Explicitly distinguishes from simpler flagging tools via 'not just a flag' and differentiates from blueprint-dependent siblings with 'No Blueprint required.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage context (when you need explanatory proof vs. simple detection) and notes prerequisite absence (no Blueprint). However, lacks explicit comparison to similar siblings like check_drift, validate, or check_feasibility regarding when to prefer this deeper analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

approve_rule: A
Promote a discovered rule into Blueprint-compatible format.

After running discover_patterns, use this to approve high-confidence
rules for inclusion in a Blueprint.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    rule_id: ID of the discovered rule (from discover_patterns results)
    blueprint: Discovery session namespace
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
rule_id | Yes | - | -
blueprint | No | - | default

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Adds valuable context about format conversion ('Blueprint-compatible format') and side effect ('inclusion in a Blueprint'). However, lacks disclosure on idempotency, reversibility, or what happens if the rule is already approved.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with clear purpose statement, followed by workflow context, then structured Args documentation. No redundant or filler text; every sentence provides actionable information for tool selection and invocation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given zero annotations and no output schema, the description successfully documents the parameter semantics and workflow integration. Minor gap: does not describe return values or success/failure behavior, which would help the agent handle the response.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the Args section fully compensates by documenting all 3 parameters with rich semantics: api_key includes format hint ('starts with gai_'), rule_id specifies provenance ('from discover_patterns results'), and blueprint clarifies concept ('Discovery session namespace').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States specific action ('Promote into Blueprint-compatible format') and resource ('discovered rule'). Clearly positions this as an approval action that follows pattern discovery, implicitly distinguishing it from sibling 'reject_rule' and fitting it into the 'discover_patterns' workflow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('After running discover_patterns') and selection criteria ('high-confidence rules'). References the prerequisite sibling tool. Lacks explicit mention of 'reject_rule' as the alternative for low-confidence rules, though this is implied.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

authorize_execution: A
Decide whether an action should be allowed to proceed.

Runs full validation, then applies the Blueprint's execution gate.
Returns a simple allow/block decision with reasoning.

Use this when your agent is about to take a real-world action (payment,
filing, API call, data write) and needs a deterministic go/no-go.

Different from validate: validate says "is this data correct?"
authorize_execution says "should this action happen?"

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data associated with the action
    blueprint: Blueprint governing this action type
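
One way an agent could act on the go/no-go result is to fail closed: run the real-world action only when the decision is explicitly "allow". The sketch below shows that pattern. The response field names `decision` and `reasoning` are assumptions, since the description only says the tool returns an allow/block decision with reasoning, and `run_if_allowed` is a hypothetical helper.

```python
from typing import Any, Callable


def run_if_allowed(decision: dict, action: Callable[[], Any]) -> Any:
    """Run `action` only when the decision explicitly allows it (fail closed)."""
    # "decision" and "reasoning" are assumed field names, not documented ones.
    if decision.get("decision") != "allow":
        reason = decision.get("reasoning", "no reasoning provided")
        raise PermissionError(f"Action blocked: {reason}")
    return action()


# Hypothetical decisions standing in for authorize_execution responses.
allowed = {"decision": "allow", "reasoning": "all checks passed"}
blocked = {"decision": "block", "reasoning": "amount exceeds limit"}

print(run_if_allowed(allowed, lambda: "payment sent"))
try:
    run_if_allowed(blocked, lambda: "payment sent")
except PermissionError as exc:
    print(exc)
```

Treating anything other than an explicit "allow" as a block means a malformed or missing response never triggers the action.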
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
blueprint | Yes | - | -
structured_data | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses internal behavior ('Runs full validation, then applies the Blueprint's execution gate') and output format ('Returns a simple allow/block decision with reasoning'). Minor gap: does not clarify if calling this tool has side effects (e.g., logging the decision) or if it is idempotent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: purpose statement, behavioral details, usage context with concrete examples, sibling differentiation, and parameter documentation. Every sentence provides distinct value; no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Strong completeness given constraints (no annotations, no output schema, 0% schema coverage). Describes return value ('allow/block decision with reasoning') despite lack of output schema. Could improve by mentioning error conditions (e.g., invalid blueprint) or whether the decision is recorded.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, requiring full compensation. The 'Args' section documents all three parameters: api_key includes format hint ('starts with gai_'), structured_data explains it contains 'the data associated with the action', and blueprint notes it 'governs this action type'. Deducted one point because structured_data could clarify expected schema/shape.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific action ('Decide whether an action should be allowed to proceed'), identifies the resource (Blueprint's execution gate), and explicitly distinguishes from sibling tool 'validate' ('Different from validate: validate says 'is this data correct?' authorize_execution says 'should this action happen?'').

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('when your agent is about to take a real-world action (payment, filing, API call, data write) and needs a deterministic go/no-go') and provides clear differentiation from the 'validate' alternative, framing the decision as validation vs. authorization.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_drift: A
Check whether the data pattern has shifted since previous observations.

Works with or without a Blueprint. Monitors structural stability across
a stream of data. Detects regime changes when the data's geometric
embedding moves to a different region of the constraint space —
indicating the source data's structure has fundamentally changed.

Call this repeatedly as new data arrives to track drift over time.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: New data point to check for drift
    blueprint: Blueprint for geometric embedding context
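
Because the tool is meant to be called once per incoming data point, a client might generate one MCP `tools/call` request per point, as in this sketch. The helper name `drift_requests`, the placeholder key, and the sample readings are assumptions for illustration. The optional `blueprint` argument is left out when not supplied, which matches "Works with or without a Blueprint".

```python
from itertools import count
from typing import Iterable, Iterator, Optional


def drift_requests(
    api_key: str,
    data_points: Iterable[dict],
    blueprint: Optional[str] = None,
) -> Iterator[dict]:
    """Yield one tools/call request for check_drift per data point."""
    request_ids = count(1)  # JSON-RPC ids: 1, 2, 3, ...
    for point in data_points:
        arguments = {"api_key": api_key, "structured_data": point}
        if blueprint is not None:
            arguments["blueprint"] = blueprint
        yield {
            "jsonrpc": "2.0",
            "id": next(request_ids),
            "method": "tools/call",
            "params": {"name": "check_drift", "arguments": arguments},
        }


# Hypothetical stream of readings.
stream = [{"total": 100.0}, {"total": 101.5}, {"total": 250.0}]
requests = list(drift_requests("gai_example_key", stream))
for req in requests:
    print(req["id"], req["params"]["arguments"]["structured_data"])
```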
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
blueprint | No | - | default
structured_data | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully explains what constitutes drift ('geometric embedding moves to a different region'), but fails to disclose operational traits: whether the tool is read-only, idempotent, has side effects, or what errors might occur. The repeated call suggestion implies safety but doesn't explicitly state it.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with purpose front-loaded, conceptual explanation in the middle, and parameter documentation at the end. The geometric embedding explanation, while technical, earns its place by defining the detection mechanism. Only minor verbosity prevents a 5.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and annotations, the description should explain return values (e.g., drift detected boolean, confidence score), but it doesn't. While parameters are well-covered, the absence of output documentation and error handling details leaves a significant gap for a monitoring tool intended to be called repeatedly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the Args section comprehensively documents all three parameters. It adds crucial semantic context: api_key format ('starts with gai_'), structured_data's role ('New data point'), and blueprint's purpose ('geometric embedding context'). This fully compensates for the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks for data pattern shifts and specifically mentions 'since previous observations' and 'track drift over time,' which distinguishes it from point-in-time analysis tools like analyze_anomaly. It defines the specific domain (geometric embedding, constraint space) though could more explicitly contrast with siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance ('Call this repeatedly as new data arrives') and notes flexibility ('Works with or without a Blueprint'). However, it lacks explicit 'when not to use' guidance or named alternatives, leaving the agent to infer when to prefer this over analyze_anomaly or validate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_feasibility: A
Check whether a set of constraints can be simultaneously satisfied.

Uses the twist-compression operator to detect structural obstructions —
fundamental conflicts in the constraint system that cannot be resolved
by adjusting any single field. Returns obstruction type, magnitude,
and the specific constraint interactions causing the conflict.

Fast pre-check before running full validation.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    constraints: List of derivation rules and formal constraints to check
    field_values: Current field values (numeric key-value pairs)
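
The Args above imply two client-side checks that are cheap to run before sending a request: the key prefix and the numeric-only `field_values`. The sketch below assembles the arguments with those checks. The helper name is hypothetical, and writing each constraint as a plain string such as "total = subtotal + tax" is an assumption, since the description does not specify the constraint syntax.

```python
from numbers import Real


def build_feasibility_arguments(
    api_key: str, constraints: list, field_values: dict
) -> dict:
    """Assemble check_feasibility arguments after basic local checks."""
    if not api_key.startswith("gai_"):
        raise ValueError("api_key should start with 'gai_'")
    # field_values is documented as numeric key-value pairs; bool is excluded
    # explicitly because Python treats it as a kind of int.
    bad = [
        name
        for name, value in field_values.items()
        if isinstance(value, bool) or not isinstance(value, Real)
    ]
    if bad:
        raise ValueError(f"non-numeric field values: {sorted(bad)}")
    return {
        "api_key": api_key,
        "constraints": list(constraints),
        "field_values": dict(field_values),
    }


# Hypothetical constraint strings and values.
arguments = build_feasibility_arguments(
    "gai_example_key",
    ["total = subtotal + tax", "tax = 0.1 * subtotal"],
    {"subtotal": 100.0, "tax": 10.0, "total": 110.0},
)
print(arguments["constraints"])
```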
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
constraints | Yes | - | -
field_values | Yes | - | -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully explains the detection algorithm ('twist-compression operator'), what constitutes a conflict ('fundamental conflicts...cannot be resolved by adjusting any single field'), and the return structure ('obstruction type, magnitude, and specific constraint interactions'). It lacks operational details like rate limits or error states, but covers the core behavioral logic well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with zero waste: a clear purpose statement, technical method explanation, return value specification, and usage context—all in four sentences. The Args section follows as structured documentation. Every sentence earns its place by adding distinct information not present in the schema or title.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter tool with no output schema and no annotations, the description achieves completeness by documenting all inputs (via Args), explaining the algorithmic behavior, and describing the conceptual return values ('obstruction type, magnitude') despite the lack of formal output schema. It provides sufficient information for an agent to invoke the tool correctly and interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the Args section provides critical compensatory documentation: it adds format constraints ('starts with gai_'), semantic typing ('derivation rules and formal constraints'), and value constraints ('numeric key-value pairs') that are completely absent from the schema. This is exemplary parameter documentation for low-coverage schemas.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb-resource pair ('Check whether a set of constraints can be simultaneously satisfied') and distinguishes itself from siblings like 'validate' and 'validate_and_repair' by positioning this as a 'Fast pre-check before running full validation.' The technical specificity ('twist-compression operator', 'structural obstructions') further clarifies its unique role in the toolset.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear temporal context ('Fast pre-check before running full validation') indicating when to use it in a workflow. However, while it implies the existence of full validation alternatives, it does not explicitly name sibling tools like 'validate' or describe when NOT to use this tool (e.g., when you need actual solutions rather than feasibility checks).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_realization: A
Run structural realization analysis on a payload.

Embeds the payload via the Blueprint's declared embedding schema,
projects it onto the Blueprint's reference subspace, and returns
a realization score, residual, projection angle, and full report.

The Blueprint must include a `realization` configuration block
(see RealizationConfig in Platform_Agent.realization.schema). If
no realization config is present, the report status is "skipped"
and the payload is treated as unconstrained by the realization
layer.

For Blueprints using basis mode "auto", the first N payloads
bootstrap the reference subspace; until the bootstrap pool is
full, the report status is "skipped". After bootstrap, every
subsequent payload is projected against the locked subspace and
receives a real score.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: payload to analyze
    blueprint: Blueprint name (defaults to "default")

Returns:
    dict with keys:
      status:              "pass" | "review" | "skipped"
      realization_score:   float in [0, 1], higher = better fit
      residual:            ||v - P_U(v)||
      angle_degrees:       angle between v and P_U(v)
      in_subspace:         bool — residual < tolerance
      basis_mode:          "vectors" | "auto" | "uninitialized"
      basis_dimension:     k of the reference subspace
      vector_dimension:    D of the embedded payload
      report:              full RealizationReport dict (may include
                           invariance/stability stacks if enabled)
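
To make the geometric fields concrete, here is a small sketch of the standard orthogonal-projection arithmetic behind `residual`, `angle_degrees`, and `in_subspace`. It assumes the reference subspace is given as an orthonormal basis and uses an arbitrary tolerance. It is not the platform's implementation, and it does not compute `realization_score`, whose formula is not given above.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def project(v: Sequence[float], basis: Sequence[Sequence[float]]) -> List[float]:
    """Orthogonal projection P_U(v); basis vectors must be orthonormal."""
    projected = [0.0] * len(v)
    for u in basis:
        coeff = dot(v, u)
        projected = [p + coeff * ui for p, ui in zip(projected, u)]
    return projected


def realization_geometry(v, basis, tolerance: float = 1e-6) -> dict:
    """Compute residual ||v - P_U(v)|| and the angle between v and P_U(v)."""
    p = project(v, basis)
    residual = math.sqrt(sum((vi - pi) ** 2 for vi, pi in zip(v, p)))
    norm_v, norm_p = math.sqrt(dot(v, v)), math.sqrt(dot(p, p))
    if norm_v == 0.0:
        angle = 0.0  # convention chosen here for the zero vector
    elif norm_p == 0.0:
        angle = 90.0  # v is orthogonal to the whole subspace
    else:
        cos_angle = max(-1.0, min(1.0, dot(v, p) / (norm_v * norm_p)))
        angle = math.degrees(math.acos(cos_angle))
    return {
        "residual": residual,
        "angle_degrees": angle,
        "in_subspace": residual < tolerance,  # tolerance value is an assumption
        "basis_dimension": len(basis),  # k
        "vector_dimension": len(v),  # D
    }


# The x-y plane inside 3-D space; v has one unit sticking out along z.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
result = realization_geometry([1.0, 0.0, 1.0], basis)
print(result)
```

For this input the projection is (1, 0, 0), so the residual is 1 and the angle is about 45 degrees; a vector lying in the plane would give a residual of 0 and `in_subspace` true.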
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
blueprint | No | - | default
structured_data | Yes | - | -

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description fully explains the algorithm (embed, project, return scores), edge cases (bootstrap, missing config), and return values. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-organized with clear sections and bullet points for return values. It is slightly verbose but front-loads the main purpose and provides necessary detail efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description comprehensively addresses the tool's complexity: 3 parameters (including nested objects), no output schema, and no annotations. It explains all return fields, edge cases, and the projection algorithm, leaving no critical gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description adds meaning to each parameter: api_key format, structured_data as payload, blueprint default. It lacks details on structured_data structure but provides sufficient context for usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it runs structural realization analysis on a payload and details the process and outputs. It is distinct from siblings like check_drift or check_feasibility by focusing on subspace projection and realization scoring.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains conditions under which results are skipped (no realization config, bootstrap mode), providing context for when the tool will return meaningful results. However, it does not explicitly compare to alternatives or state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

counterfactual: A
Compare outcomes under different rule sets.

Given the same data, runs trajectory analysis under two different
sets of rules/constraints and shows how the valid state space differs.
Useful for what-if analysis: "what happens if I change this rule?"

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: Data to analyze
    blueprint: Primary Blueprint (rule set A)
    rules_b: Alternative derivation rules (rule set B)
    constraints_b: Alternative constraints (rule set B)
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
rules_b | No | - | -
blueprint | No | - | default
constraints_b | No | - | -
structured_data | Yes | - | -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that the tool performs 'trajectory analysis' and compares 'valid state space,' which hints at the computational nature of the operation. However, it fails to specify whether this is read-only (implied by 'compare' but not stated), expected runtime, rate limits, or the specific structure of the output since no output schema exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured with a front-loaded purpose statement ('Compare outcomes under different rule sets') followed by behavioral details and a structured Args list. No sentences are wasted; the 'what-if analysis' phrase immediately signals the tool's analytical nature.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 5 parameters, nested objects, and no output schema, the description adequately covers all input parameters and explains the comparative logic. It conceptually describes the output ('shows how the valid state space differs'). It could be improved by describing the actual return format or structure since no output schema is available, but it meets the minimum threshold for invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the Args section fully compensates by documenting all five parameters. It adds crucial semantic context that the schema lacks, particularly the relationship between 'blueprint' (rule set A) and 'rules_b'/'constraints_b' (rule set B), and the api_key format hint ('starts with gai_').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs comparative trajectory analysis under two rule sets to show differences in valid state space. It effectively distinguishes this as a 'what-if' analysis tool, though it could more explicitly differentiate from siblings like 'check_feasibility' or 'validate' which operate on single rule sets.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides an implied usage pattern through the 'what happens if I change this rule?' example, suggesting it's for comparing alternative configurations. However, it lacks explicit guidance on when to use this versus 'check_feasibility' or 'validate', and doesn't specify prerequisites or when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_blueprint (grade A)
Create a Blueprint — a governance contract that defines validation rules.

A Blueprint tells the platform what "correct" means for your data: which
fields exist, what math must hold between them, and what value ranges
are acceptable. Without a Blueprint, the platform has nothing to validate against.

If you don't know what rules to define, use load_rule_pack to start from
a prebuilt template, or use discover_patterns to find rules from your data.

Use the blueprint_guide prompt for the complete reference of all available
rule types, constraint types, and configuration options.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    customer_name: Organization or project name (used for folder naming)
    workflow_name: Unique identifier for this Blueprint (used as the 'blueprint' parameter in validate)
    mode: "observe" (platform checks agent's work) or "enforce" (platform computes derived fields)
    extracted_fields: Fields the agent extracts from source data (e.g. ["vendor", "qty", "unit_cost"])
    derived_fields: Fields computed from other fields (e.g. ["subtotal", "total"])
    derivation_rules: Math rules defining field relationships. Available types: "add" (target = a + b), "subtract" (target = a - b), "multiply" (target = a × b), "divide" (target = a ÷ b), "round" (round field to N places), "copy" (copy source to target), "items_multiply" (per-item a × b in a list), "items_sum" (sum a field across list items). Each rule requires "type" and the relevant fields. See blueprint_guide prompt for full schema.
    formal_constraints: Value bounds and ratio constraints. Available types: "magnitude_anchor" (field within min/max range, requires "field", "min", "max"), "relative_anchor" (ratio a/b within tolerance, requires "a", "b", "expected_ratio", "tolerance"), "max_action_threshold" (trigger action if field exceeds threshold, requires "field", "threshold", "action"). See blueprint_guide prompt for full schema.
    semantic_checks: Domain-specific validation checks
    require_math: Validate mathematical relationships (default true)
    require_consistency: Check internal consistency (default true)
    require_coherence: Check structural coherence (default true)
    require_provenance: Require agents to report extraction source locations
    require_high_assurance: Strictest validation — feasibility, spectral, and global consistency required
    enable_anomaly_detection: Geometric fingerprinting to detect structural outliers
    enable_drift_tracking: Monitor pattern stability across batches
Parameters
Name                      Required  Description  Default
mode                      No        -            observe
api_key                   Yes       -            -
require_math              No        -            -
customer_name             Yes       -            -
workflow_name             Yes       -            -
derived_fields            No        -            -
semantic_checks           No        -            -
derivation_rules          No        -            -
extracted_fields          No        -            -
require_coherence         No        -            -
formal_constraints        No        -            -
require_provenance        No        -            -
require_consistency       No        -            -
enable_drift_tracking     No        -            -
require_high_assurance    No        -            -
enable_anomaly_detection  No        -            -
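
To make the Args section more concrete, here is a minimal sketch of what a `create_blueprint` argument set might look like, together with a small local check of the arithmetic the rules describe. The rule keys `"target"`, `"a"`, and `"b"` and the `"type"` key on the constraint are assumptions inferred from the phrasing "target = a + b"; the `blueprint_guide` prompt is the stated source for the real schema. The customer name, workflow name, key, and `check_rules` helper are hypothetical.

```python
import operator

# Hypothetical argument set; field names follow the examples in the Args section.
blueprint_args = {
    "api_key": "gai_EXAMPLE",          # placeholder, not a real key
    "customer_name": "acme",           # hypothetical
    "workflow_name": "invoice_check",  # hypothetical
    "mode": "observe",
    "extracted_fields": ["qty", "unit_cost", "tax"],
    "derived_fields": ["subtotal", "total"],
    "derivation_rules": [
        {"type": "multiply", "target": "subtotal", "a": "qty", "b": "unit_cost"},
        {"type": "add", "target": "total", "a": "subtotal", "b": "tax"},
    ],
    "formal_constraints": [
        {"type": "magnitude_anchor", "field": "qty", "min": 1, "max": 1000},
    ],
}

# Only the four binary rule types are modelled in this sketch.
OPS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}


def check_rules(record, rules, tol=1e-9):
    """Return the targets whose reported value disagrees with its rule."""
    mismatched = []
    for rule in rules:
        expected = OPS[rule["type"]](record[rule["a"]], record[rule["b"]])
        if abs(record[rule["target"]] - expected) > tol:
            mismatched.append(rule["target"])
    return mismatched


# An agent-reported record whose total is off by one.
record = {"qty": 3, "unit_cost": 20.0, "tax": 5.0, "subtotal": 60.0, "total": 66.0}
print(check_rules(record, blueprint_args["derivation_rules"]))  # ['total']
```

This mirrors the "observe" reading of the description, where the platform checks values an agent already computed; it does not attempt the "enforce" behavior of computing derived fields.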
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Explains critical behavioral modes ('observe' vs 'enforce'), describes what each boolean flag controls (anomaly detection, drift tracking, high assurance), and clarifies that the platform validates against these rules. Missing: return value description, idempotency behavior, or error conditions for duplicate workflow_name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Length is substantial (16 params with 0% schema coverage necessitate detail), but structure is logical: concept → prerequisites → alternatives → detailed args. Each sentence adds value, though the Args section is necessarily dense. No wasted words or repetition of schema titles.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Excellent coverage of 16 input parameters, but tool has no output schema and description fails to specify return values (e.g., does it return the blueprint ID, a confirmation object, or nothing?). For a creation operation, this omission leaves a critical gap in contextual completeness despite strong input documentation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the Args section compensates extensively. Provides examples for extracted_fields and derived_fields, enumerates and explains all derivation_rules types (add, subtract, items_multiply, etc.) and formal_constraints types (magnitude_anchor, relative_anchor, etc.), and clarifies boolean defaults and semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Opens with specific verb+resource ('Create a Blueprint') and immediately defines it as 'a governance contract that defines validation rules.' Distinguishes from siblings by referencing load_rule_pack and discover_patterns as alternatives for different use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use alternatives: 'If you don't know what rules to define, use load_rule_pack... or use discover_patterns.' References blueprint_guide prompt for complete schema reference, providing clear decision criteria for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

create_chain (grade A)
Create a multi-agent sequential execution chain.

Defines a pipeline where multiple agents process data in sequence.
Each stage is validated against the Blueprint before the next stage
can proceed. Repair suggestions propagate forward through the chain.

Different from submit_chain_stage: create_chain defines the
pipeline (stages, agent names, TTL); submit_chain_stage executes
one stage of an already-created chain. Different from
handoff_audit: handoff_audit verifies a transition between stages
after they have been submitted.

The Blueprint must already exist (use create_blueprint or
load_rule_pack first). Each stage in the chain validates against
the same Blueprint.

Sibling tools: create_chain (define the pipeline), submit_chain_stage
(advance through it), handoff_audit (verify between stages).

Args:
    api_key:  GeodesicAI API key (starts with gai_)
    blueprint: workflow_name of the Blueprint governing all stages
    stages:    List of stage definitions, e.g.
               [{"stage_name": "extract", "agent_name": "PDF Scanner"},
                {"stage_name": "validate", "agent_name": "QA Agent"}]
               Minimum 2 stages required.
    ttl:       Chain timeout in seconds. After this many seconds
               without progress, the chain expires and submitted
               stages cannot be advanced. Default: 3600 (1 hour).

Returns:
    chain_id:    unique identifier for the chain (use this with
                 submit_chain_stage and handoff_audit)
    status:      initial chain status (typically "active")
    stages:      list of stage records with stage_name, agent_name,
                 and per-stage status
    next_stage:  name of the first stage to submit
    ttl_seconds: echoed back for caller reference
Parameters
Name       Required  Description  Default
ttl        No        -            -
stages     Yes       -            -
api_key    Yes       -            -
blueprint  Yes       -            -
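
The constraints stated in the Args section (at least two stages, a default TTL of 3600 seconds) can be checked on the client before the call is made. The following is a sketch under that assumption; `build_create_chain_args` is a hypothetical helper, and the blueprint name and key are placeholders.

```python
def build_create_chain_args(api_key, blueprint, stages, ttl=3600):
    """Validate the documented constraints locally and return tool arguments."""
    if len(stages) < 2:
        raise ValueError("create_chain needs at least 2 stages")
    for stage in stages:
        missing = {"stage_name", "agent_name"} - set(stage)
        if missing:
            raise ValueError(f"stage is missing {sorted(missing)}")
    return {"api_key": api_key, "blueprint": blueprint, "stages": stages, "ttl": ttl}


# Stage definitions copied from the example in the Args section.
stages = [
    {"stage_name": "extract", "agent_name": "PDF Scanner"},
    {"stage_name": "validate", "agent_name": "QA Agent"},
]
chain_args = build_create_chain_args("gai_EXAMPLE", "invoice_check", stages)
print(chain_args["ttl"])  # 3600
```

Per the Returns section, the response should carry a `chain_id` and a `next_stage`, which are the values to pass on to `submit_chain_stage`.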
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description bears the full burden. It discloses key behaviors: each stage validates against the Blueprint before proceeding, repair suggestions propagate forward, and TTL causes expiry after inactivity. However, it does not detail authentication behavior beyond mentioning api_key format, and error handling is not described. Still, it provides substantial transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the purpose. It uses clear sections for differentiation and parameters. While it is relatively long, every sentence adds value. A slight reduction in verbosity could improve conciseness, but currently it is efficient and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, lack of output schema, and zero parameter descriptions in the schema, the description is thorough. It explains the return structure (chain_id, status, stages, next_stage, ttl_seconds) and covers prerequisites, parameter details, and differentiation from siblings. The description leaves no critical gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage, so the description must compensate fully. It does: for api_key, it notes 'starts with gai_'; for blueprint, 'workflow_name of the Blueprint'; for stages, gives an example and notes minimum 2; for ttl, explains default and purpose. This adds significant meaning beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Create a multi-agent sequential execution chain.' It uses a specific verb (create) and resource (chain), and distinguishes itself from sibling tools like submit_chain_stage and handoff_audit by explaining their different roles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool versus alternatives: 'Different from submit_chain_stage: create_chain defines the pipeline... submit_chain_stage executes one stage... handoff_audit verifies a transition...' It also provides prerequisite: 'The Blueprint must already exist (use create_blueprint or load_rule_pack first).' This gives clear guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

decompose_failure (grade A)
Hodge-style decomposition of validation failures.

Splits the error between original and corrected values into three
orthogonal components:
- Exact: direct rule violations (a field breaks a specific math rule)
- Co-exact: constraint boundary violations (a field is at the edge
  of valid range)
- Harmonic: systemic structural errors (the overall data shape
  is wrong)

Different from analyze_anomaly: analyze_anomaly explains why a
single payload is anomalous given the platform's structural
fingerprints. decompose_failure compares two known states (an
original and its correction) and tells you which kind of failure
accounts for the difference. Use decompose_failure when you have
a known-correct version to diff against; use analyze_anomaly when
you only have the suspicious payload.

Use this after a validation failure to understand the structural
nature of the error, not just which fields differ.

Args:
    api_key:           GeodesicAI API key (starts with gai_)
    original_values:   The original field values (numeric
                       key-value pairs)
    corrected_values:  The corrected/expected field values
    derivation_rules:  Math rules. Optional if blueprint provided.
    formal_constraints: Constraints. Optional if blueprint provided.
    blueprint:         Load rules from this Blueprint instead of
                       passing them inline. Caller must own the
                       Blueprint.

Returns:
    primary_cause:          one of "exact" / "co_exact" / "harmonic"
    exact:                  details of direct rule violations
    co_exact:               details of constraint boundary violations
    harmonic:               details of systemic structural errors
    contributing_fields:    map of field_name → contribution magnitude
    total_error_magnitude:  scalar L2 norm of the full error
Parameters
Name                Required  Description  Default
api_key             Yes       -            -
blueprint           No        -            -
original_values     Yes       -            -
corrected_values    Yes       -            -
derivation_rules    No        -            -
formal_constraints  No        -            -
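
The Returns section describes `total_error_magnitude` as the L2 norm of the full error. The sketch below computes that norm locally from an original and a corrected record. It does not reproduce the three-way exact / co-exact / harmonic split, and treating each field's absolute difference as its "contribution magnitude" is an assumption, as are the helper name and sample values.

```python
import math


def error_summary(original_values, corrected_values):
    """Return (per-field absolute differences, L2 norm of the difference vector)."""
    shared = [k for k in corrected_values if k in original_values]
    deltas = {k: original_values[k] - corrected_values[k] for k in shared}
    # Assumption: a field's contribution is the size of its own difference.
    contributing = {k: abs(d) for k, d in deltas.items() if d != 0}
    total = math.sqrt(sum(d * d for d in deltas.values()))
    return contributing, total


original = {"subtotal": 103.0, "tax": 12.0, "qty": 2.0}   # hypothetical failing record
corrected = {"subtotal": 100.0, "tax": 8.0, "qty": 2.0}   # hypothetical known-correct version

contributing, total = error_summary(original, corrected)
print(contributing)  # {'subtotal': 3.0, 'tax': 4.0}
print(total)         # 5.0
```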
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description must carry full burden. It discloses that the tool returns three orthogonal components and requires caller ownership of Blueprint. It does not explicitly state if the tool is read-only, but the analytical nature and return details suggest no side effects. This is adequate but could mention idempotency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear lead sentence, component list, differentiation, usage guidance, parameter list, and return description. It is appropriately sized for the tool's complexity, though the enumerated component list could be slightly more concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description provides a detailed return section covering all result fields. It covers all six parameters, nested objects, usage context, and prerequisite (blueprint ownership). The description is complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description fully explains each parameter in the Args section, including types, defaults, and optionality. For example, it notes that derivation_rules and formal_constraints are optional if blueprint provided. This compensates completely for the missing schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'Hodge-style decomposition of validation failures' with a specific verb (decompose) and resource (validation failures). It explicitly distinguishes from sibling tool analyze_anomaly, eliminating ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance: 'Use decompose_failure when you have a known-correct version to diff against; use analyze_anomaly when you only have the suspicious payload.' It also recommends use after a validation failure, leaving no ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_api_key (grade A)
Permanently delete one of the caller's API keys.

DESTRUCTIVE — agents using the deleted key will receive auth
errors immediately. The Blueprint a key was tied to (if any) is
NOT affected; only the credential is revoked. To delete a
Blueprint and all its keys, use delete_blueprint.

The target key can be specified two ways:
  - As the full key string (gai_...).
  - As a key_id (SHA-256 hash from list_api_keys).

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
    key_to_delete: Either the full API key string OR the key_id
        from list_api_keys.
    confirm: Must be true to actually delete. If false, returns
        a preview without deleting. Default: false.

Returns:
    status:   "ok" | "preview" | "ERROR"
    deleted:  metadata about the deleted key (only on ok)
    message:  human-readable summary
Parameters
Name           Required  Description  Default
api_key        Yes       -            -
confirm        No        -            -
key_to_delete  Yes       -            -
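
Because `confirm` defaults to false and returns a preview, a cautious caller can run the tool twice: once to preview, then again with `confirm` set to true. The sketch below shows that flow with an injected `call_tool` function, which is a hypothetical stand-in for whatever MCP client is in use; the stub response and key strings are placeholders, and only the `status` values come from the Returns section.

```python
from typing import Any, Callable

# Hypothetical client signature: (tool_name, arguments) -> result dict.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def delete_key_with_preview(call_tool: ToolCaller, api_key: str, key_to_delete: str,
                            approve: Callable[[dict[str, Any]], bool]) -> dict[str, Any]:
    """Preview the deletion first; delete only if the preview is approved."""
    base = {"api_key": api_key, "key_to_delete": key_to_delete}
    preview = call_tool("delete_api_key", {**base, "confirm": False})
    if preview.get("status") != "preview" or not approve(preview):
        return preview  # error, or the caller declined; nothing was deleted
    return call_tool("delete_api_key", {**base, "confirm": True})


# Stub client so the sketch runs on its own; it records each call it receives.
calls = []


def fake_call_tool(name, arguments):
    calls.append((name, arguments["confirm"]))
    status = "ok" if arguments["confirm"] else "preview"
    return {"status": status, "message": "stub response"}


result = delete_key_with_preview(fake_call_tool, "gai_EXAMPLE", "gai_OLD_KEY", lambda p: True)
print(result["status"])  # ok
```

Passing an `approve` callback keeps the destructive second call behind an explicit decision, which suits a tool documented as permanent.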
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses destructive behavior: immediate auth errors for agents using the deleted key, that the Blueprint is unaffected, and the return structure (status, deleted metadata, message). This goes beyond what annotations would typically provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections and bullet points. The main purpose is front-loaded. Every sentence adds value, and the length is appropriate for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool and that no output schema is provided, the description covers input parameters, behavior, return values, and alternatives. It is complete for an agent to understand and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description explains all three parameters in detail: api_key (account-level key starting with gai_), key_to_delete (full key or key_id), confirm (must be true to delete, default false). Also documents the return format, compensating for the lack of schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the action: 'Permanently delete one of the caller's API keys.' Distinguishes from sibling tool delete_blueprint by noting that the Blueprint itself is not affected. The verb 'delete' and resource 'API key' are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use (to delete an API key), when-not-to-use (if you want to delete a Blueprint and its keys, use delete_blueprint), and alternative methods for specifying the target (full key string or key_id). Also explains the confirm parameter for preview mode.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

delete_blueprint (grade A)
Permanently delete a Blueprint and all of its API keys.

DESTRUCTIVE — cannot be undone. Cascading effects:
- The Blueprint's template_config.json is removed from disk.
- All Blueprint-scoped API keys for this workflow are deleted.
  Any agents using those keys will start receiving auth errors
  on their next call.
- The Blueprint is removed from the platform's template registry.

Account-level keys are NOT affected. Only the per-Blueprint keys
minted at create time (or via this Blueprint's UI) are revoked.

Use list_blueprints first to confirm the workflow_name. The caller
must own the Blueprint — cross-account deletion is rejected.

Different from update_blueprint: update_blueprint replaces the
config in place and keeps the API keys; delete_blueprint removes
everything.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    workflow_name: Name of the Blueprint to delete (the same value
        used as 'blueprint' in validate)
    confirm: Must be set to true to actually delete. If false, the
        tool returns a preview of what would be deleted without
        performing the deletion. Default: false.

Returns:
    status:           "ok" | "preview" | "ERROR"
    deleted:          workflow_name that was removed (only on ok)
    keys_revoked:     number of Blueprint API keys revoked
    message:          human-readable summary
Parameters
Name           Required  Description  Default
api_key        Yes       -            -
confirm        No        -            -
workflow_name  Yes       -            -
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description carries the full transparency burden. It thoroughly discloses the destructive nature, cascading effects (template_config.json removal, API key revocation, registry removal), and what is not affected (account-level keys). The confirm parameter behavior is also detailed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Destructive, Different from update_blueprint, Args, Returns) and front-loaded with the main purpose. However, it is somewhat lengthy, with some redundancy (e.g., effects listed twice in different sections). Still, it earns its length with valuable detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, destructive, optional preview, multiple cascading effects), the description is very complete. It covers prerequisites, side effects, parameter behavior, and return values. No output schema exists, but returns are described in detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the tool description compensates by explaining each parameter: api_key starts with 'gai_', workflow_name is the same as used in 'validate', confirm defaults to false and when true performs deletion, otherwise gives a preview. This adds significant meaning beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Permanently delete a Blueprint and all of its API keys.' It uses a specific verb ('delete'), names the resource ('Blueprint'), and adds scope ('all of its API keys'). It distinguishes from 'update_blueprint' by contrasting what each does.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly gives when-to-use guidance: 'Use list_blueprints first to confirm the workflow_name.' It also specifies prerequisites ('The caller must own the Blueprint') and contrasts with 'update_blueprint' to indicate when not to use this tool. The confirm parameter's dual behavior (preview vs deletion) is explained.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_patterns (grade A)
Feed a batch of structured data to discover patterns deterministically.
No Blueprint required — discovers rules from the data itself.

Analyzes the batch using motif discovery, structural routing, and
geometric fingerprinting. Returns candidate validation rules with
confidence scores, structural type classifications, and anomaly counts.
Source data is never stored — only statistical summaries persist.

Discovered rules can be promoted into Blueprints using approve_rule.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    documents: List of structured data objects to analyze
    blueprint: Namespace for this discovery session
Parameters
Name       Required  Description  Default
api_key    Yes       -            -
blueprint  No        -            default
documents  Yes       -            -
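
To give a feel for "candidate validation rules with confidence scores", the toy sketch below tests a single hand-picked candidate (`total = subtotal + tax`) against a small batch and reports how often it holds. This is not the platform's motif discovery, structural routing, or geometric fingerprinting; the helper, the sample documents, and the use of a simple hit ratio as "confidence" are all assumptions for illustration.

```python
def rule_support(documents, target, a, b, tol=1e-9):
    """Return (fraction of documents where target == a + b, count that break it)."""
    usable = [d for d in documents if all(k in d for k in (target, a, b))]
    if not usable:
        return 0.0, 0
    holds = sum(1 for d in usable if abs(d[target] - (d[a] + d[b])) <= tol)
    return holds / len(usable), len(usable) - holds


# Hypothetical batch of structured records.
documents = [
    {"subtotal": 100.0, "tax": 8.0, "total": 108.0},
    {"subtotal": 50.0, "tax": 4.0, "total": 54.0},
    {"subtotal": 20.0, "tax": 2.0, "total": 22.0},
    {"subtotal": 10.0, "tax": 1.0, "total": 12.0},  # breaks the pattern
]

confidence, anomalies = rule_support(documents, "total", "subtotal", "tax")
print(confidence, anomalies)  # 0.75 1

# Arguments for the tool itself, using the documented parameter names.
discover_args = {"api_key": "gai_EXAMPLE", "documents": documents, "blueprint": "default"}
```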
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses that data is not stored and that only statistical summaries persist. It also describes the analysis techniques employed. However, it does not cover potential risks or boundaries like rate limits or authentication failures beyond mentioning the API key format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is reasonably structured but somewhat verbose with multiple paragraphs. It could be more concise, though the bullet-pointed 'Args' section helps readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity and lack of output schema, the description is fairly complete. It covers what the tool does, how it works, its data handling, and how results are used. Missing details are minor (e.g., example inputs).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds meaning to all three parameters: api_key ('starts with gai_'), documents ('List of structured data objects to analyze'), and blueprint ('Namespace for this discovery session'). Since schema coverage is 0%, this is essential and well-done.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Feed a batch of structured data to discover patterns deterministically' and 'discovers rules from the data itself.' It lists specific analysis techniques and distinguishes from sibling tools like approve_rule, which promotes discovered rules.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context on when to use the tool, including that no Blueprint is required and that source data is never stored. It also mentions that discovered rules can be promoted using approve_rule, indicating an alternate tool. However, it does not explicitly state when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

forecast (grade A)
Deterministic forward reasoning — show what valid states are reachable.

Given the current data state, generates candidate next states by
applying Blueprint rules, then ranks them by structural quality
(lowest drift, highest confidence, shortest path, or lowest risk).

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: Current data state
    blueprint: Blueprint to use for rules and constraints
    max_depth: How many steps ahead to forecast (1-10)
    max_branches: Maximum branches per step (1-10)
    rank_by: Ranking criterion — "drift" (most stable), "confidence" (highest geometric confidence), "shortest" (fewest changes), "risk" (lowest risk)
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
rank_by | No | - | drift
blueprint | No | - | default
max_depth | No | - | -
max_branches | No | - | -
structured_data | Yes | - | -
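The documented ranges and options can be checked on the client before calling forecast. The following is a minimal sketch under that assumption: `build_forecast_args` is a hypothetical helper, not part of the server, and whether the server itself rejects out-of-range values is not stated in the description.

```python
from typing import Any, Optional

# Ranking criteria listed in the forecast description.
RANK_BY_OPTIONS = ("drift", "confidence", "shortest", "risk")


def build_forecast_args(
    api_key: str,
    structured_data: dict[str, Any],
    blueprint: str = "default",
    max_depth: Optional[int] = None,
    max_branches: Optional[int] = None,
    rank_by: str = "drift",
) -> dict[str, Any]:
    """Hypothetical helper: validate and assemble forecast arguments."""
    if not api_key.startswith("gai_"):
        raise ValueError("api_key must start with 'gai_'")
    if rank_by not in RANK_BY_OPTIONS:
        raise ValueError(f"rank_by must be one of {RANK_BY_OPTIONS}")
    args: dict[str, Any] = {
        "api_key": api_key,
        "structured_data": structured_data,
        "blueprint": blueprint,
        "rank_by": rank_by,
    }
    # max_depth and max_branches have no documented default, so they are
    # only sent when the caller sets them; both are documented as 1-10.
    for name, value in (("max_depth", max_depth), ("max_branches", max_branches)):
        if value is None:
            continue
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
        args[name] = value
    return args
```

The resulting dictionary would be passed as the tool arguments by whatever MCP client is in use; the field values in `structured_data` depend on the Blueprint.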
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description must fully disclose behavior. It mentions deterministic forward reasoning and ranking, but does not specify whether the tool is read-only or has side effects. The description partially covers behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, with a clear introductory sentence, a brief explanation of the process, and a well-structured parameter list. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 6 parameters and no output schema. While parameters are well explained, the description lacks information about the return value or how to interpret the forecast results. This is a gap for a forecasting tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description's 'Args' section explains all six parameters with constraints and options (e.g., api_key starts with 'gai_', max_depth 1-10, rank_by options). This adds significant meaning beyond the schema's titles and defaults.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'deterministic forward reasoning' and explains the process: generating candidate states from current data using Blueprint rules and ranking them. It distinguishes itself from siblings like 'counterfactual' and 'check_feasibility' by specifying its forward-reasoning nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains what the tool does but does not provide explicit guidance on when to use it over alternatives. There is no mention of prerequisites or conditions where this tool is preferable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

geometric_confidence (grade A)
Compute a composite geometric confidence score from validation
signals. No Blueprint required — works on any validate result.

Combines six weighted signals into a single confidence score:
- Surface distance (how close to the constraint manifold)
- Geometric health (projection quality, regulator, closure)
- Anomaly score (structural fingerprint deviation)
- Stability score (batch drift)
- Motif compliance (pattern violations)
- Motif gate (enforcement decision)

Returns confidence level (high/medium/low) and a recommendation.

Different from analyze_anomaly and check_drift: those tools
perform new analysis on raw data. geometric_confidence is
post-hoc — it digests an already-computed state_vector and
returns a single confidence number. Use this when you have a
state_vector from a prior validate / get_execution_trace call
and want a one-line summary of structural quality. Use
analyze_anomaly when you need to know why something is
anomalous; use check_drift when you need to compare against
historical observations.

Args:
    api_key:      GeodesicAI API key (starts with gai_)
    state_vector: State vector dictionary, typically the
                  state_vector field from a validate or
                  get_execution_trace result.

Returns:
    confidence:      float in [0, 1]
    level:           "high" / "medium" / "low"
    recommendation:  text recommendation for the caller
    signals:         per-signal contribution breakdown
    signal_weights:  weights used in the composite
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
state_vector | Yes | - | -
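The description says six weighted signals are combined into one score and mapped to a level, but it does not give the weights, the combination formula, or the level cut-offs. The sketch below assumes a plain weighted average and illustrative thresholds of 0.8 and 0.5; the function names and all numbers are hypothetical.

```python
def composite_confidence(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of signal scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights[name] for name in signals)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(signals[name] * weights[name] for name in signals) / total_weight


def confidence_level(score: float, high: float = 0.8, medium: float = 0.5) -> str:
    """Map a score to a level; the 0.8 and 0.5 cut-offs are assumptions."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


# Illustrative inputs only. The server reports the weights it actually
# used in the signal_weights field of its response.
example_signals = {
    "surface_distance": 1.0,
    "geometric_health": 1.0,
    "anomaly_score": 0.5,
    "stability_score": 0.5,
    "motif_compliance": 1.0,
    "motif_gate": 1.0,
}
example_weights = {name: 1.0 for name in example_signals}

score = composite_confidence(example_signals, example_weights)
print(score, confidence_level(score))
```

Comparing a locally recomputed score against the returned `signals` and `signal_weights` is one way to sanity-check how the composite was formed, provided the server really uses a weighted average.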
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, but the description clearly explains the tool's behavior: it combines six weighted signals into a score, is post-hoc (no new analysis), and returns confidence level and recommendation. Discloses that it digests an already-computed state_vector, which implies no side effects. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (signals list, differentiation, Args/Returns). Slightly verbose but every sentence adds value. Could be trimmed slightly (e.g., signal details could be condensed) but overall effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description explicitly lists returned fields (confidence, level, recommendation, signals, signal_weights). Explains input context (works on validate results) and differentiates from siblings. Complete coverage for a tool with 2 parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description adds significant value: explains api_key format (starts with gai_) and specifies state_vector as the dictionary from validate/get_execution_trace results. This fully compensates for the lack of schema parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'Compute' and resource 'composite geometric confidence score', clearly distinguishing it from sibling tools analyze_anomaly and check_drift by stating it is post-hoc on an existing state_vector. It lists the six signals and the output format.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'No Blueprint required — works on any validate result' and contrasts with analyze_anomaly and check_drift, providing clear when-to-use guidance: use this when you have a state_vector from validate/get_execution_trace, use analyze_anomaly for anomaly reasons, check_drift for historical comparison.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_execution_trace (grade A)
Run validation and return the detailed execution trace.

Shows the exact sequence of validation nodes that ran, whether
each was deterministic, and the runtime of each node. Use for
debugging, compliance audits, or understanding exactly what the
platform checked.

Different from validate: validate returns the verdict (PASS / FAIL
/ REVIEW) and the state vector summary. get_execution_trace returns
everything validate does PLUS the per-node trace records. Use
validate for normal operation; use get_execution_trace when you
need to see inside the pipeline (debugging, audit prep, latency
analysis).

The trace is the same whether validation passes or fails — every
node that ran is recorded with its inputs, outputs, and timing.

Args:
    api_key:         GeodesicAI API key (starts with gai_)
    structured_data: The data to trace validation for
    blueprint:       Blueprint to validate against. Caller must
                     own the Blueprint.

Returns:
    status:              "PASS" / "FAIL" / "REVIEW" / "ERROR"
    determinism_hash:    cryptographic hash of inputs + rules
    trace:               ordered list of node records, each with:
                           node_name, node_type, deterministic (bool),
                           runtime_ms, inputs, outputs
    node_count:          number of nodes in the trace
    deterministic_count: how many nodes were deterministic
    state_vector:        same state_vector validate returns
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
blueprint | No | - | default
structured_data | Yes | - | -
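For the latency-analysis use case, the per-node records can be summarized on the client. This sketch relies only on the documented record fields `node_name`, `deterministic`, and `runtime_ms`; `summarize_trace` is a hypothetical helper and the sample records are invented for illustration.

```python
from typing import Any, Optional


def summarize_trace(trace: list[dict[str, Any]]) -> dict[str, Any]:
    """Recompute trace counts and find the slowest node."""
    slowest: Optional[str] = None
    if trace:
        slowest = max(trace, key=lambda node: node["runtime_ms"])["node_name"]
    return {
        "node_count": len(trace),
        "deterministic_count": sum(1 for node in trace if node["deterministic"]),
        "total_runtime_ms": sum(node["runtime_ms"] for node in trace),
        "slowest_node": slowest,
    }


# Invented sample records; real node names come from the server.
sample_trace = [
    {"node_name": "schema_check", "deterministic": True, "runtime_ms": 2},
    {"node_name": "rule_eval", "deterministic": True, "runtime_ms": 7},
    {"node_name": "anomaly_scan", "deterministic": False, "runtime_ms": 5},
]
print(summarize_trace(sample_trace))
```

The recomputed `node_count` and `deterministic_count` should match the values the tool returns, which gives a quick consistency check during audit preparation.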
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It transparently states that the trace is the same whether validation passes or fails, and enumerates the return fields including node records. It does not mention prerequisites or side effects, but for a trace/read operation this is sufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections for purpose, usage comparison, and parameter details. It is slightly verbose but every sentence adds value. Could be tightened, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no annotations, no output schema, and returns nested objects, the description is remarkably complete. It details the return structure (status, determinism_hash, trace with node records, node_count, etc.) and explains the relationship to validate. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaningful context: api_key starts with 'gai_', structured_data is 'the data to trace validation for', and blueprint indicates 'caller must own the Blueprint'. This goes beyond the schema property titles.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it runs validation and returns a detailed execution trace. It explicitly distinguishes itself from validate by noting that validate returns only verdict and state vector, while this returns those plus per-node trace records. This provides specific verb+resource and differentiates from siblings.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool versus validate: for debugging, compliance audits, or understanding what the platform checked; use validate for normal operation. It also names the alternative, giving clear guidance on context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

handoff_audit (grade A)
Audit a handoff between two chain stages. Returns a context capsule
with verified facts from the prior stage and checks structural
compatibility of proposed data for the next stage.

Use this between chain stages to ensure Agent B receives only
verified data from Agent A, and that nothing was mutated in transit.

Sibling tools: create_chain (define the pipeline), submit_chain_stage
(advance through it), handoff_audit (verify between stages).

The context capsule contains:
- Verified fields and their values from the prior stage
- Determinism hash proving the prior stage's results
- Blueprint constraints the next stage must satisfy
- Compatibility verdict if proposed_data is provided

Args:
    api_key: GeodesicAI API key (starts with gai_)
    chain_id: Chain identifier from create_chain
    from_stage: Stage name that completed (Agent A)
    to_stage: Stage name about to start (Agent B)
    proposed_data: Optional data Agent B intends to submit — checked for compatibility
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
chain_id | Yes | - | -
to_stage | Yes | - | -
from_stage | Yes | - | -
proposed_data | No | - | -
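The "nothing was mutated in transit" check can be pictured as a comparison between the verified fields in the context capsule and the data Agent B proposes. The sketch below is a client-side illustration only: the capsule's actual field layout is not documented here, so the two plain dictionaries and the helper name `find_mutated_fields` are assumptions, and the server's compatibility verdict may apply further checks.

```python
from typing import Any


def find_mutated_fields(verified: dict[str, Any], proposed: dict[str, Any]) -> list[str]:
    """Return verified field names that are missing or changed in proposed."""
    return sorted(
        name
        for name, value in verified.items()
        if name not in proposed or proposed[name] != value
    )


# Invented example values for two chain stages.
verified_fields = {"invoice_total": 100, "currency": "USD"}
proposed_data = {"invoice_total": 100, "currency": "EUR", "note": "approved"}
print(find_mutated_fields(verified_fields, proposed_data))
```

Fields that appear only in the proposed data (such as `note` above) are not flagged, since the description only promises that verified values from the prior stage stay intact.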
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full responsibility. It describes the output but does not explicitly state side effects (e.g., read-only, authentication requirements beyond the api_key parameter). While it implies non-destructive behavior, it lacks explicit behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (purpose, usage, siblings, output, args). Every sentence adds value without repetition or fluff. It is appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters, no output schema, and nested objects, the description covers input semantics, output structure (four bullet points), and usage context. It provides enough information for an agent to correctly select and invoke the tool, including when to use it and what to expect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description's 'Args' section adds significant meaning: it describes each parameter, including the api_key format (starts with gai_), and clarifies the optional proposed_data usage. This fully compensates for the schema gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: auditing a handoff between chain stages, verifying facts and structural compatibility. It explicitly distinguishes from sibling tools by listing their roles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool ('between chain stages') and why (to ensure verified data and no mutation). It also lists sibling tools with their purposes, providing clear alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_api_keys (grade A)
List all API keys owned by the calling account.

Returns a masked representation of each key plus a stable key_id
(SHA-256 hash) that can be used with rotate_api_key and
delete_api_key. Full key strings are NEVER returned by this tool.

Each entry includes:
  - api_key: masked key string (e.g. "gai_***...REe0")
  - key_id: SHA-256 hash, usable as the target for rotate/delete
  - type: "account" or "blueprint"
  - intent: "All tools (account key)" or the Blueprint workflow_name
  - customer: associated customer/project name
  - created: ISO timestamp

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
        Blueprint-scoped keys cannot list keys.

Returns:
    status: "ok" | "ERROR"
    keys: list of key records with masked api_key and key_id
    total: number of keys returned
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
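The pairing of a masked key with a SHA-256 key_id can be illustrated with the standard library. This is a sketch under assumptions: the description does not say exactly what input is hashed or how many characters the mask keeps, so hashing the full key string and showing the last four characters are guesses, and both helper names are hypothetical.

```python
import hashlib


def key_id_for(api_key: str) -> str:
    """SHA-256 hex digest of the key string (assumed hashing input)."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def mask_key(api_key: str, visible: int = 4) -> str:
    """Mask a key in the style of the documented example, keeping the tail."""
    return f"gai_***...{api_key[-visible:]}"


example_key = "gai_abcdefREe0"  # invented key, not a real credential
print(mask_key(example_key))
print(len(key_id_for(example_key)))
```

Because a hash of this kind cannot be reversed to recover the key, a key_id can be passed to rotate_api_key or delete_api_key without exposing the secret itself.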
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It fully discloses behavior: it returns masked keys only, never the full string; provides a key_id for subsequent operations; and lists all returned fields. The tool has no side effects to disclose, and the description states its limitations transparently.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured with bullet points and clear sections. Each sentence adds value: purpose, behavior, field details, and parameter constraints. No waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description explains return format (status, keys, total) and each key field. It covers usage, parameter, and output comprehensively. For a simple list tool, it is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has one parameter with only a title. Description adds critical context: 'GeodesicAI account-level API key (starts with gai_). Blueprint-scoped keys cannot list keys.' Schema coverage is 0%, but description fully compensates.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool's purpose: 'List all API keys owned by the calling account.' It specifies that it returns masked keys and a key_id for use with rotate/delete, distinguishing it from sibling tools like delete_api_key and rotate_api_key. No other sibling tool lists API keys, which further clarifies its unique role.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description provides explicit usage guidelines: 'Args: api_key: GeodesicAI account-level API key (starts with gai_). Blueprint-scoped keys cannot list keys.' This tells when to use (account-level) and when not (blueprint-scoped), and implies it is for listing only.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_blueprints (grade A)
List Blueprints owned by the calling account.

Returns each Blueprint's name, workflow identifier, mode, and
field/rule/constraint counts. Use the workflow_name as the
'blueprint' parameter when calling validate.

Blueprint modes — important for agents that summarize results:
  - "observe" (Observation mode): the platform validates data
    against the Blueprint's rules and returns PASS, FAIL, or
    REVIEW with repair suggestions. Validation is detection-only;
    the caller decides what to do with the result.
  - "enforce" (Execution mode): the platform performs the same
    validation AND authorizes downstream side effects. Side
    effects are blocked if validation fails. In this mode, rules
    that compute derived values (like totals) publish the
    platform's computed result as authoritative, so downstream
    consumers see the canonical value rather than the agent's.

Both modes run the same rule checks. The difference is what
happens after validation, not during it.

Scope: only Blueprints created by the calling account are returned.
Use validate / repair / etc. with the workflow_name to operate on them.

Args:
    api_key: GeodesicAI API key (starts with gai_)
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
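The difference between the two modes lies in what happens after validation, which can be sketched as a small decision function. `post_validation_action` and its return strings are hypothetical. The description only says side effects are blocked when validation fails, so treating REVIEW as blocked in enforce mode is an assumption.

```python
def post_validation_action(mode: str, status: str) -> str:
    """Describe what follows a validation verdict for a Blueprint mode."""
    if mode == "observe":
        # Detection-only: the platform reports, the caller decides.
        return "caller_decides"
    if mode == "enforce":
        # Assumption: only PASS authorizes downstream side effects.
        if status == "PASS":
            return "side_effects_authorized"
        return "side_effects_blocked"
    raise ValueError(f"unknown Blueprint mode: {mode!r}")


for mode in ("observe", "enforce"):
    for status in ("PASS", "FAIL"):
        print(mode, status, post_validation_action(mode, status))
```

An agent summarizing results could use the `mode` value returned by list_blueprints to decide whether a FAIL is advisory or actually blocked something downstream.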
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses scope (only owned blueprints), return structure, mode behaviors (observe vs enforce), and side effects (enforce mode authorizes downstream actions, derived values authoritative). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, with a clear lead sentence, a bullet-like enumeration of return fields, and separate mode details. The explanation of modes could be more concise, but overall it conveys the necessary information without excessive verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description covers return fields and behavioral aspects (scope, modes, usage with validate). Lacks example usage or error handling, but provides sufficient context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (api_key) with 0% schema description coverage. Description adds format guidance (starts with gai_) and context (GeodesicAI API key), compensating well for the lack of schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists Blueprints owned by the calling account, specifies returned fields (name, workflow identifier, mode, field/rule/constraint counts), and distinguishes it from sibling tools by clarifying scope (owned blueprints) and usage with validate.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explains when to use (listing own blueprints) and provides guidance on using the result (workflow_name as blueprint parameter for validate). It details modes (observe vs enforce) to help agents decide which context applies, but does not explicitly mention when NOT to use or list alternatives beyond validate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

load_rule_pack (grade A)
Load a prebuilt Blueprint template for fast onboarding.

Rule Packs are ready-made governance configurations for common use cases.
Call with no pack_id to list all available packs. Call with a pack_id
to load the full configuration, then use create_blueprint to save it.

Available packs include templates for: invoice governance, timecard/payroll
governance, legal document governance, purchase order governance, and
insurance claims governance. Each includes field definitions, derivation
rules, constraints, and agent conditioning instructions.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    pack_id: ID of the rule pack to load. Omit to list available packs.
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
pack_id | No | - | -
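The two call shapes (list versus load) differ only in whether pack_id is present. A minimal sketch, assuming the arguments are sent as a plain dictionary: `rule_pack_args` is a hypothetical helper and `"invoice"` is an invented pack ID, since the real IDs come from the listing call.

```python
from typing import Any, Optional


def rule_pack_args(api_key: str, pack_id: Optional[str] = None) -> dict[str, Any]:
    """Build load_rule_pack arguments; omit pack_id to list available packs."""
    args: dict[str, Any] = {"api_key": api_key}
    if pack_id is not None:
        args["pack_id"] = pack_id
    return args


list_call = rule_pack_args("gai_example")             # lists available packs
load_call = rule_pack_args("gai_example", "invoice")  # loads one pack (invented ID)
print(list_call)
print(load_call)
```

Leaving the key out entirely, rather than sending a null value, follows the description's wording ("Omit to list available packs").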
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the dual-mode behavior (list vs. load), clarifies that this tool does not persist data ('then use create_blueprint to save it'), and details the content structure of returned packs ('field definitions, derivation rules, constraints'). Minor gap: does not explicitly state read-only nature or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured, with the purpose front-loaded, followed by concept explanation, usage patterns, examples, and parameter details. Minor redundancy exists (the instruction to omit pack_id to list available packs appears twice), but information density remains high.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a governance-platform tool with no output schema, the description provides excellent domain context by enumerating specific pack types (invoice, payroll, legal, etc.) and explaining pack contents. It adequately prepares the agent to understand return values even without a formal output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Given 0% schema description coverage, the description fully compensates by providing crucial semantic details: api_key format hint ('starts with gai_') and pack_id behavioral semantics ('Omit to list available packs'). It transforms the schema's bare types into actionable parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific verb-resource pair ('Load a prebuilt Blueprint template') and immediately distinguishes the domain ('fast onboarding', 'governance configurations'). It effectively differentiates from sibling tool create_blueprint by stating this tool only loads configurations while create_blueprint is needed to save them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly documents the two operational modes: 'Call with no pack_id to list all available packs' versus 'Call with a pack_id to load the full configuration.' It also establishes the workflow sequence by directing users to use create_blueprint as the next step after loading, providing clear when-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

reject_rule (grade A)
Reject a discovered candidate rule so it will not be promoted into a Blueprint.

Use after running discover_patterns when reviewing the candidate rules
it produced. Pair with approve_rule: approve high-confidence rules you
want enforced, reject the rest. Rejected rules are marked in the
discovery session and will not appear in subsequent approval calls or
be promotable into Blueprints. The action is recorded against the
discovery session namespace; it does not affect any existing Blueprint.

Use this when:
  - A discovered rule is mathematically valid but business-irrelevant
  - The rule has low confidence and you don't want it surfacing again
  - You want to clean up the candidate list before promoting approvals

Args:
    api_key: GeodesicAI API key (starts with gai_)
    rule_id: ID of the discovered rule (from discover_patterns results)
    blueprint: Discovery session namespace (must match the one used for discover_patterns)

Returns:
    status: "rejected" on success
    rule_id: The rejected rule's ID
    blueprint: The discovery session namespace
Parameters
Name | Required | Description | Default
api_key | Yes | - | -
rule_id | Yes | - | -
blueprint | No | - | default
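The approve/reject pairing can be sketched as a triage step over the candidate rules. The shape of a candidate record is an assumption: `rule_id` appears in the description, but a numeric `confidence` field, the 0.9 threshold, and the helper name `triage_rules` are hypothetical.

```python
from typing import Any


def triage_rules(
    candidates: list[dict[str, Any]], threshold: float = 0.9
) -> tuple[list[str], list[str]]:
    """Split candidate rule IDs into (approve, reject) by confidence."""
    approve = [r["rule_id"] for r in candidates if r["confidence"] >= threshold]
    reject = [r["rule_id"] for r in candidates if r["confidence"] < threshold]
    return approve, reject


# Invented candidates standing in for discover_patterns results.
candidates = [
    {"rule_id": "r1", "confidence": 0.97},
    {"rule_id": "r2", "confidence": 0.42},
    {"rule_id": "r3", "confidence": 0.90},
]
to_approve, to_reject = triage_rules(candidates)
print(to_approve, to_reject)
```

Each ID in the reject list would then be passed to reject_rule with the same `blueprint` namespace used for discover_patterns. A confidence cut-off alone will not catch rules that are valid but business-irrelevant, which still need a manual decision.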
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: rejected rules are marked in the session, excluded from future approvals, not promotable, and recorded against namespace without affecting existing Blueprints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Structured with clear sections (purpose, usage, args, returns) but slightly verbose. Every sentence adds value, though could be tightened slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description includes return values (status, rule_id, blueprint) and explains the tool's role in the discovery process, making it complete for the use case.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description adds essential meaning: api_key starts with 'gai_', rule_id comes from discover_patterns, blueprint is the discovery session namespace. These details are not in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the verb 'reject' and the resource 'discovered candidate rule'. Differentiates from sibling tools like approve_rule and discover_patterns by specifying the context of use.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use the tool (after discover_patterns, for irrelevant or low-confidence rules) and names approve_rule as the complementary tool for rules that should be enforced.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repair (A)
Compute the nearest valid point on the constraint manifold.

Given structured data with errors, missing values, or inconsistencies,
returns field-by-field repair suggestions with geometric confidence scores.
Uses algebraic projection for linear constraints and iterative geodesic
projection for nonlinear constraints.

Can accept rules directly or load them from a Blueprint.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to repair (key-value pairs)
    derivation_rules: Math rules (optional if blueprint is provided)
    formal_constraints: Constraints (optional if blueprint is provided)
    blueprint: Load rules from this Blueprint instead of specifying them directly
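
To make "nearest valid point" concrete for the linear case, the sketch below applies the textbook orthogonal projection onto a single linear constraint a . x = b. It is an illustration of the idea only and is not claimed to be the platform's implementation; the field ordering (subtotal, tax, total) and the helper name are assumptions.

```python
from typing import List


def project_onto_hyperplane(x: List[float], a: List[float], b: float) -> List[float]:
    """Return the point closest to x (Euclidean distance) with a . x = b."""
    norm_sq = sum(ai * ai for ai in a)
    if norm_sq == 0:
        raise ValueError("constraint normal must be non-zero")
    # How far x is from satisfying the constraint.
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    # Move x along the constraint normal just far enough to cancel the residual.
    return [xi - residual * ai / norm_sq for ai, xi in zip(a, x)]


# Hypothetical constraint: subtotal + tax - total = 0,
# with the fields ordered as (subtotal, tax, total).
repaired = project_onto_hyperplane([100.0, 8.0, 111.0], [1.0, 1.0, -1.0], 0.0)
print(repaired)
```

Here the violation of 3 is spread evenly across the three fields, which is what minimising Euclidean distance implies; a real repair engine may weight fields differently.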
Parameters
Name                Required  Description  Default
api_key             Yes       -            -
blueprint           No        -            -
structured_data     Yes       -            -
derivation_rules    No        -            -
formal_constraints  No        -            -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full behavioral disclosure burden. It successfully explains the algorithmic approach ('algebraic projection for linear constraints and iterative geodesic projection for nonlinear constraints') and output format. However, it omits operational details such as error handling, idempotency, or side effects of the repair computation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is structured logically with a high-level mathematical definition, functional explanation, algorithmic details, and parameter documentation. Every sentence contributes value. It is slightly dense with technical terminology ('constraint manifold,' 'geodesic projection') but appropriate for the domain complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, mathematical domain) and lack of output schema, the description adequately covers the core functionality and parameter semantics. It mentions the API key requirement, which implies that calls are authenticated. It could be improved by describing the output structure or error scenarios, but the phrase 'field-by-field repair suggestions' provides essential context about the return value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, but the description compensates excellently through the 'Args:' section. It adds critical semantic details: api_key format ('starts with gai_'), structured_data structure ('key-value pairs'), and conditional requirement logic ('optional if blueprint is provided'). This fully compensates for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes 'the nearest valid point on the constraint manifold' and handles 'structured data with errors, missing values, or inconsistencies.' It specifies the output as 'field-by-field repair suggestions with geometric confidence scores.' However, it does not explicitly differentiate from the sibling tool 'validate_repair'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implicit usage guidance by explaining that derivation_rules/formal_constraints are 'optional if blueprint is provided,' indicating that rules can either be passed directly or loaded from a Blueprint. However, it lacks explicit guidance on when to use this tool versus siblings like 'validate_repair' or 'validate,' and provides no 'when-not-to-use' criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repair_path (A)
Find the shortest path from an invalid state to a valid one.

Given data that fails validation, computes a sequence of minimal
field changes that would bring the data into compliance with the
Blueprint's rules and constraints. Returns the path as an ordered
list of intermediate states.

Different from repair: repair returns the single nearest valid
point on the constraint manifold (a one-shot projection).
repair_path returns the trajectory between invalid and valid
states — useful when you want to see what's changing step by
step, or when there are multiple ways to repair the data and you
want to evaluate alternatives. Use repair when you just need the
fix; use repair_path when you need to explain or audit the fix.

Args:
    api_key:         GeodesicAI API key (starts with gai_)
    structured_data: Current (invalid) data state
    blueprint:       Blueprint defining the valid constraint space.
                     Caller must own the Blueprint.
    max_depth:       Maximum repair steps to search (1-10).
                     Higher values find paths through more complex
                     repairs but take longer.
    rank_by:         Ranking criterion for selecting among valid
                     paths. One of:
                       "shortest"    — fewest changes
                       "drift"       — most stable trajectory
                       "confidence"  — highest geometric confidence
                       "risk"        — lowest risk

Returns:
    path:            ordered list of intermediate states, each
                     with field values and structural quality metrics
    path_length:     number of steps in the returned path
    terminal_state:  the final valid state at the end of the path
    rank_score:      score of the chosen path under rank_by
    alternatives:    summary of paths considered but not chosen
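
The documented limits on max_depth and rank_by can be checked on the client before a call is made. The helper below is a minimal sketch: `build_repair_path_args` is a hypothetical name, and the defaults for `blueprint` and `rank_by` mirror the values shown in the parameter listing rather than any confirmed server behavior.

```python
from typing import Any, Dict

# The four ranking criteria listed in the Args section above.
RANK_BY_OPTIONS = ("shortest", "drift", "confidence", "risk")


def build_repair_path_args(
    api_key: str,
    structured_data: Dict[str, Any],
    max_depth: int,
    blueprint: str = "default",
    rank_by: str = "shortest",
) -> Dict[str, Any]:
    """Validate and assemble the arguments for a repair_path call."""
    if not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")
    if rank_by not in RANK_BY_OPTIONS:
        raise ValueError(f"rank_by must be one of {RANK_BY_OPTIONS}")
    return {
        "api_key": api_key,
        "structured_data": structured_data,
        "blueprint": blueprint,
        "max_depth": max_depth,
        "rank_by": rank_by,
    }


# "gai_example" is a placeholder, not a real key.
args = build_repair_path_args("gai_example", {"total": 111.0}, max_depth=5)
print(args["rank_by"], args["max_depth"])
```

Rejecting bad values locally avoids spending a call on a request the server would presumably refuse.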
Parameters
Name             Required  Description  Default
api_key          Yes       -            -
rank_by          No        -            shortest
blueprint        No        -            default
max_depth        No        -            -
structured_data  Yes       -            -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations, so description carries full burden. It details the algorithm (computes sequence of minimal changes), explains parameters like max_depth and rank_by, and mentions authorization. However, it does not address error handling or edge cases (e.g., already valid data).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured: one-line summary, elaboration, differentiation, then Args and Returns sections. No fluff, efficient use of language.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description includes a detailed Returns section covering all return fields. With 5 parameters all explained, the description is fully self-contained for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but the description's Args section covers all 5 parameters with constraints (e.g., rank_by enum values, max_depth range 1-10) and context. This adds significant meaning beyond the raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a clear verb+resource ('Find the shortest path') and elaborates on the function. It explicitly distinguishes from the sibling tool 'repair', making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool vs 'repair' ('Use repair when you just need the fix; use repair_path when you need to explain or audit the fix.') and includes a prerequisite ('Caller must own the Blueprint').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rotate_api_key (A)
Rotate one of the caller's API keys. The old key stops working
immediately; a new key with the same scope is issued.

The new key inherits the old key's scope:
  - An account-level key rotates to another account-level key.
  - A Blueprint-scoped key rotates to a key for the same Blueprint.

The target key can be specified in either of two ways:
  - As the full key string (gai_...): useful if you just received it.
  - As a key_id (SHA-256 hash from list_api_keys): useful when you
    only have the masked version.

Args:
    api_key: GeodesicAI account-level API key (starts with gai_).
        Must be an account key.
    key_to_rotate: Either the full API key string OR the key_id
        (SHA-256 hex hash) returned by list_api_keys.

Returns:
    status:    "ok" | "ERROR"
    api_key:   the new key string (copy this immediately)
    type:      "account" | "blueprint"
    intent:    Blueprint folder name, or empty for account keys
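
Because key_to_rotate accepts two formats, a client may want to check which one it is holding before calling. The sketch below relies only on what the description states (full keys start with gai_, and a SHA-256 hex digest is 64 hexadecimal characters); the helper name `classify_key_reference` is hypothetical.

```python
import re

# A SHA-256 digest rendered as hex is exactly 64 hexadecimal characters.
_KEY_ID_PATTERN = re.compile(r"[0-9a-fA-F]{64}")


def classify_key_reference(key_to_rotate: str) -> str:
    """Return "full_key" or "key_id" depending on the shape of the input."""
    if key_to_rotate.startswith("gai_"):
        return "full_key"
    if _KEY_ID_PATTERN.fullmatch(key_to_rotate):
        return "key_id"
    raise ValueError(
        "expected a gai_ key string or a 64-character SHA-256 hex digest"
    )


# Placeholder values for illustration only.
print(classify_key_reference("gai_example"))
print(classify_key_reference("ab" * 32))
```

Since the old key stops working immediately, catching a malformed reference before the call is cheaper than discovering it afterwards.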
Parameters
Name           Required  Description  Default
api_key        Yes       -            -
key_to_rotate  Yes       -            -

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses behavior: immediate deactivation of old key, scope inheritance, and two input options. It also avoids contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured (sections, bullet points) and informative. It could be slightly more concise, but the clarity outweighs minor redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all necessary aspects: behavior, input formats, and return fields. No output schema exists, but returns are documented. Sibling tools are distinct, making this complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description compensates fully by explaining both parameters: api_key must be an account key, key_to_rotate accepts either full key string or key_id. It adds constraints and usage details beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Rotate one of the caller's API keys' and explains the effect (old key stops, new key issued). It distinguishes from siblings like delete_api_key and list_api_keys by focusing on rotation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use (to rotate a key) and how to specify the target key (full key or key_id). It does not explicitly mention when not to use, but the sibling names suggest alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

structural_types (A)
Get auto-discovered structural type classifications from a discovery session.

After running discover_patterns, returns the structural categories the
platform identified in the data — without being told what categories
exist. Each category includes document count, distinguishing fields,
and domain hints inferred from the data shape.

This is a read-only retrieval. If discover_patterns has not been run
against the given blueprint namespace (or the session has expired),
returns an empty type list with status="no_session".

Use after discover_patterns when you want to understand how the
platform grouped your data before deciding which patterns to promote
via approve_rule.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    blueprint: Discovery session namespace (must match the namespace
               used in discover_patterns)

Returns:
    status: "ok" or "no_session"
    structural_types: list of {type_id, document_count, distinguishing_fields, domain_hint}
    total_documents: total document count across all types
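
A caller needs to handle the no_session status before reading the type list. The sketch below shows one way to do that using the field names given in the Returns section; the sample response values and the helper name `summarize_structural_types` are made up for illustration.

```python
from typing import Any, Dict, List


def summarize_structural_types(response: Dict[str, Any]) -> List[str]:
    """Turn a structural_types response into short human-readable lines."""
    if response.get("status") == "no_session":
        return ["no discovery session: run discover_patterns first"]
    lines = []
    for entry in response.get("structural_types", []):
        fields = ", ".join(entry["distinguishing_fields"])
        lines.append(
            f"{entry['type_id']}: {entry['document_count']} docs ({fields})"
        )
    return lines


# Hypothetical response, shaped like the documented return value.
sample = {
    "status": "ok",
    "structural_types": [
        {
            "type_id": "t1",
            "document_count": 12,
            "distinguishing_fields": ["invoice_no", "total"],
            "domain_hint": "invoice",
        }
    ],
    "total_documents": 12,
}
summary = summarize_structural_types(sample)
print(summary)
```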
Parameters
Name       Required  Description  Default
api_key    Yes       -            -
blueprint  No        -            default

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, but the description explicitly states 'read-only retrieval' and describes the behavior when discover_patterns has not been run or the session has expired. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise and front-loaded with purpose, then context, usage, parameter details, and return format. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description fully explains return structure and includes edge cases (no session). Complete for a retrieval tool with 2 parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage 0%, but description adds value: api_key format hint and blueprint explanation. Complements schema for both parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it gets auto-discovered structural type classifications from a discovery session, with specific verb and resource. Distinguishes from siblings like discover_patterns and approve_rule.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use after discover_patterns' and 'before deciding which patterns to promote via approve_rule'. Also explains behavior when session is missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_chain_stage (A)
Submit data for a chain stage. The platform validates the data
using the chain's Blueprint, then advances the chain if validation passes.

The response includes the next stage info and any accumulated
repair suggestions from prior stages.

Related tools: create_chain (define the pipeline) and handoff_audit
(verify between stages); submit_chain_stage advances through the pipeline.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    chain_id: Chain identifier from create_chain
    stage: Stage name to submit for
    structured_data: Data for this stage
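
Over MCP, a tool like this is invoked with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the request body and does not send it; the chain ID, stage name, and data values are placeholders, and `build_tool_call` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def build_tool_call(
    request_id: int, tool_name: str, arguments: Dict[str, Any]
) -> Dict[str, Any]:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# All argument values below are placeholders for illustration.
request = build_tool_call(
    1,
    "submit_chain_stage",
    {
        "api_key": "gai_example",
        "chain_id": "chain-123",
        "stage": "extract",
        "structured_data": {"invoice_total": 110.0},
    },
)
payload = json.dumps(request)
print(payload)
```

In practice an MCP client library would normally construct and transmit this message; building it by hand is mainly useful for debugging or logging.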
Parameters
Name             Required  Description  Default
stage            Yes       -            -
api_key          Yes       -            -
chain_id         Yes       -            -
structured_data  Yes       -            -

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses key behaviors: validation using the chain's Blueprint, chain advancement on success, and response contents. But it lacks details on failure modes, idempotency, or authentication requirements beyond the API key.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four sentences plus a parameter list. It front-loads the purpose, explains the process, mentions the response, and provides sibling context with no extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 params, nested object, no output schema, no annotations), the description covers the main workflow and parameters but omits important completeness aspects: error handling, state prerequisites, and a precise specification of the response structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description compensates well by explaining all four parameters, including the API key format, chain ID source, stage purpose, and that structured_data is data. Some details are missing (e.g., valid stage names), but overall adds significant meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Submit data for a chain stage' and explains the validation and advancement process. It distinguishes itself from siblings like 'create_chain' and 'handoff_audit', providing a clear role in the pipeline.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions sibling tools and their roles, giving context on when to use this tool relative to others. However, it does not explicitly state when to use or avoid this tool, nor does it outline prerequisites or error conditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

update_blueprint (A)
Update an existing Blueprint's configuration in place.

Only fields you pass are updated; fields you omit keep their
current values. To clear a list field (e.g. remove all rules),
pass an explicit empty list [].

Existing API keys for this Blueprint are preserved — agents using
those keys continue working after the update. Ownership stamps
are also preserved; you cannot transfer Blueprint ownership.

The workflow_name cannot be changed. To rename a Blueprint, create a
new Blueprint with the new name and delete the old one.

Different from create_blueprint: create_blueprint creates a new
Blueprint and mints a fresh API key. update_blueprint modifies
an existing one and returns no new key.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    workflow_name: Name of the Blueprint to update (must already exist)
    customer_name: New customer/project name. Pass None to keep current.
    mode: "observe" or "enforce". Pass None to keep current.
    extracted_fields: New list of agent-extracted fields. Pass None to
        keep current; pass [] to clear.
    derived_fields: New list of platform-derived fields. Pass None to
        keep current; pass [] to clear.
    derivation_rules: New list of derivation rules. See blueprint_guide
        prompt for schema. Pass None to keep current; pass [] to clear.
    formal_constraints: New list of constraints. See blueprint_guide
        prompt for schema. Pass None to keep current; pass [] to clear.
    semantic_checks: New list of semantic checks. Pass None to keep
        current; pass [] to clear.
    require_math: Override math validation flag. None to keep current.
    require_consistency: Override consistency flag. None to keep.
    require_coherence: Override coherence flag. None to keep.
    require_provenance: Override provenance flag. None to keep.
    require_high_assurance: Override high-assurance flag. None to keep.
    enable_anomaly_detection: Override anomaly flag. None to keep.
    enable_drift_tracking: Override drift flag. None to keep.

Returns:
    status:            "ok" | "ERROR"
    blueprint:         workflow_name that was updated
    fields_changed:    list of config keys that were modified
    field_count:       new total of extracted + derived fields
    rule_count:        new total of derivation rules
    constraint_count:  new total of formal constraints
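
The "None keeps, [] clears" rule is easy to get wrong, so the sketch below models it locally on a plain dictionary. It is not the server's code: `apply_partial_update` is a hypothetical helper, and reporting only keys whose value actually changed is an assumption about how fields_changed behaves.

```python
from typing import Any, Dict, List, Tuple


def apply_partial_update(
    current: Dict[str, Any], **changes: Any
) -> Tuple[Dict[str, Any], List[str]]:
    """Merge changes into a copy of current using partial-update semantics.

    None means "leave this field alone"; an empty list is a real value
    and therefore clears a list field.
    """
    updated = dict(current)
    fields_changed: List[str] = []
    for key, value in changes.items():
        if value is None:
            continue  # omitted or None: keep the existing value
        if updated.get(key) != value:
            updated[key] = value
            fields_changed.append(key)
    return updated, fields_changed


# Made-up Blueprint configuration for illustration.
current = {
    "mode": "observe",
    "customer_name": "Acme",
    "derivation_rules": ["total = subtotal + tax"],
}
updated, changed = apply_partial_update(
    current, mode="enforce", customer_name=None, derivation_rules=[]
)
print(updated, changed)
```

The check is `value is None` rather than a truthiness test, which is exactly what lets an empty list through as a deliberate "clear".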
Parameters
Name                      Required  Description  Default
mode                      No        -            -
api_key                   Yes       -            -
require_math              No        -            -
customer_name             No        -            -
workflow_name             Yes       -            -
derived_fields            No        -            -
semantic_checks           No        -            -
derivation_rules          No        -            -
extracted_fields          No        -            -
require_coherence         No        -            -
formal_constraints        No        -            -
require_provenance        No        -            -
require_consistency       No        -            -
enable_drift_tracking     No        -            -
require_high_assurance    No        -            -
enable_anomaly_detection  No        -            -

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavioral traits: partial update, list clearing, preservation of keys/ownership, no rename capability, and return of no new key. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-structured with clear sections and front-loaded purpose. Each sentence adds value, though the parameter list is verbose. Slightly more conciseness could be achieved, but it is justified by the parameter count.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 16 parameters, no output schema, and no annotations, the description is remarkably complete: it explains return fields, behavioral nuances, and parameter usage. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema description coverage, the description documents every parameter with clear semantics: what each field does, how to keep current value, and for lists how to clear vs keep. This adds substantial value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Update an existing Blueprint's configuration in place' and explicitly distinguishes from create_blueprint by contrasting behavior regarding API key generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains partial update semantics, how to clear list fields, preservation of API keys and ownership, the prohibition of renaming workflow_name, and contrasts with create_blueprint.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate (A)
Validate structured data against a Blueprint's rules. Returns PASS, FAIL, or REVIEW.

The platform checks mathematical accuracy (do the numbers add up?),
structural consistency (do the fields satisfy all constraints?), and
semantic plausibility (do the values make sense in context?).

Every result includes a determinism hash — the same input with the same
Blueprint always produces the same result. Auditable, replayable, legally defensible.

A Blueprint is required for meaningful validation. Without one, use
create_blueprint or load_rule_pack to define your governance rules first.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to validate (key-value pairs)
    blueprint: Name of the Blueprint to validate against. Use list_blueprints to see options.
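
The determinism property can be illustrated with a hash over a canonical serialization of the input. The platform's actual hashing scheme is not documented here, so the sketch below is only one plausible construction (sorted-key JSON hashed with SHA-256), and the function name is hypothetical.

```python
import hashlib
import json
from typing import Any, Dict


def determinism_hash(structured_data: Dict[str, Any], blueprint: str) -> str:
    """Hash the input and Blueprint name in a key-order-independent way."""
    canonical = json.dumps(
        {"blueprint": blueprint, "data": structured_data},
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # fixed separators, no incidental whitespace
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same content, different key order: the hashes should match.
h1 = determinism_hash({"subtotal": 100, "tax": 10}, "invoices")
h2 = determinism_hash({"tax": 10, "subtotal": 100}, "invoices")
print(h1 == h2)
```

A real scheme would presumably also cover the Blueprint's rule contents or version, since the same name with different rules should not produce the same hash.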
Parameters
Name             Required  Description  Default
api_key          Yes       -            -
blueprint        No        -            default
structured_data  Yes       -            -

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden. It successfully explains the three validation dimensions (mathematical accuracy, structural consistency, semantic plausibility), return values (PASS/FAIL/REVIEW), and determinism guarantees (hash for auditability). Minor gap: doesn't mention rate limits, error handling, or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: purpose, validation types, determinism guarantee, and prerequisites. The determinism paragraph ('Auditable, replayable, legally defensible') conveys compliance-critical behavior efficiently. Slight verbosity in the determinism explanation, but justified given legal/defensibility context.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description documents return values (PASS/FAIL/REVIEW) and the determinism hash. It contextualizes the tool within the broader Blueprint workflow (referencing create_blueprint, load_rule_pack, list_blueprints). Minor gap: doesn't describe error response format or partial failure behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% (no parameter descriptions in JSON schema), but the description fully compensates via the Args section. It documents all three parameters: api_key includes format hint (starts with gai_), structured_data clarifies structure (key-value pairs), and blueprint explains how to discover valid values (use list_blueprints).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a precise action-statement: 'Validate structured data against a Blueprint's rules.' It specifies the verb (validate), target resource (structured data), and scope (Blueprint rules). It distinguishes from siblings like repair or analyze_anomaly by focusing on rule-based validation rather than anomaly detection or remediation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states prerequisites and alternatives: 'A Blueprint is required for meaningful validation. Without one, use create_blueprint or load_rule_pack to define your governance rules first.' Also references list_blueprints for discovering valid blueprint values, creating a clear decision tree for the agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

validate_repair: A
Validate structured data and automatically compute repairs if it fails.
Single call that combines validate + repair.

Different from validate: validate returns only the verdict; if the
data fails, you'd then call repair separately. validate_repair
returns the verdict AND the repaired payload in one call. Different
from repair: repair always returns repair suggestions regardless
of whether the input was valid; validate_repair only computes
repairs when validation actually fails.

If PASS: returns the validated data with determinism hash.
If FAIL: returns the failure details AND a repaired payload with
field-by-field corrections and confidence scores. The agent can
inspect the repairs and resubmit the corrected data.
If REVIEW: returns the flagged data with review reasoning.

This is the recommended starting point for most agent integrations.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    structured_data: The data to validate (key-value pairs)
    blueprint: Name of the Blueprint to validate against. Use list_blueprints to see options.
Parameters (JSON Schema)
Name | Required | Description | Default
api_key | Yes | - | -
blueprint | No | - | default
structured_data | Yes | - | -
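
As a rough illustration of how the three parameters above fit together, the sketch below builds a standard MCP `tools/call` JSON-RPC request for validate_repair. The helper name `build_validate_repair_call` is hypothetical, and treating `"default"` as the fallback blueprint is only a reading of the parameter table; nothing here is taken from GeodesicAI's own client code.

```python
import json


def build_validate_repair_call(api_key, structured_data, blueprint="default", request_id=1):
    """Build an MCP tools/call request body for validate_repair.

    Hypothetical helper: the argument names come from the tool's Args
    section, the envelope is the generic MCP JSON-RPC shape.
    """
    # The description says keys start with "gai_", so fail early otherwise.
    if not api_key.startswith("gai_"):
        raise ValueError("api_key is expected to start with 'gai_'")
    # structured_data is described as key-value pairs, i.e. a mapping.
    if not isinstance(structured_data, dict):
        raise TypeError("structured_data should be a dict of key-value pairs")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "validate_repair",
            "arguments": {
                "api_key": api_key,
                "structured_data": structured_data,
                "blueprint": blueprint,  # assumed default, per the table
            },
        },
    }


if __name__ == "__main__":
    # Example payload; the invoice fields are made up for illustration.
    request = build_validate_repair_call("gai_example", {"invoice_total": 120.5})
    print(json.dumps(request, indent=2))
```

The request still has to be sent over the server's Streamable HTTP transport by an MCP client; that part is left out because the endpoint URL is not shown on this page.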
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: returns verdict, repaired payload, field-by-field corrections, confidence scores, and determinism hash. It explains all three outcomes without omissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well structured, with short paragraphs and an Args section, and is front-loaded with the purpose. It is moderately long, but every sentence is informative. Minor redundancy (e.g., repeating the 'validate_repair' name) could be trimmed, but overall it is concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers return values for all outcomes, parameter details, and related tools. It does not document error handling or rate limits, but for a validation/repair tool with no output schema it provides sufficient context for an agent to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema coverage, the description adds meaning for all parameters: api_key format ('starts with gai_'), structured_data as 'key-value pairs', blueprint as 'Name of the Blueprint' with a hint to list_blueprints. This goes well beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Validate structured data and automatically compute repairs if it fails' and contrasts with sibling tools 'validate' and 'repair', clearly distinguishing its combined behavior.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides clear guidance on when to use ('recommended starting point'), explains the three possible outcomes (PASS, FAIL, REVIEW), and tells the agent what to do with the results (inspect repairs, resubmit). It also differentiates from related tools.
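
The three outcomes lend themselves to a small dispatch routine on the agent side. The sketch below is one possible shape, not the platform's documented response format: the keys `verdict`, `data`, `repaired_payload`, `repairs`, `field`, `confidence`, and `review_reasoning` are all assumed names, and the confidence threshold is an illustrative policy choice rather than anything the description prescribes.

```python
def next_action(response, min_confidence=0.9):
    """Decide what an agent does with a validate_repair response.

    All response keys used here are assumptions made for illustration.
    Returns a (action, detail) tuple.
    """
    verdict = response.get("verdict")

    if verdict == "PASS":
        # Validated data can be used as-is.
        return ("accept", response.get("data"))

    if verdict == "FAIL":
        repairs = response.get("repairs", [])
        # Collect fields whose suggested correction is below the threshold.
        low_confidence = [
            r["field"] for r in repairs
            if r.get("confidence", 0.0) < min_confidence
        ]
        if not repairs or low_confidence:
            # Nothing to apply, or some repairs are too uncertain to trust.
            return ("escalate", low_confidence)
        # Every repair is confident enough: resubmit the corrected payload.
        return ("resubmit", response.get("repaired_payload"))

    if verdict == "REVIEW":
        # Flagged data goes to a human along with the stated reasoning.
        return ("escalate", response.get("review_reasoning"))

    raise ValueError(f"unexpected verdict: {verdict!r}")


if __name__ == "__main__":
    example = {
        "verdict": "FAIL",
        "repaired_payload": {"currency": "USD"},
        "repairs": [{"field": "currency", "confidence": 0.97}],
    }
    print(next_action(example))
```

Inspecting repairs before resubmitting follows the description's own advice; an agent that resubmits blindly would lose the benefit of the per-field confidence scores.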

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

verify_replay: A
Verify that two execution replay contracts represent the same deterministic result.

This is the programmatic proof of GeodesicAI's core promise: same input + same
rules = same result, every time. Given two replay contracts (e.g. from the
original execution and a re-run), this tool compares all component hashes and
reports whether the executions are byte-identical.

Use this to:
- Prove to an auditor that a decision from March 3rd matches a re-run today.
- Detect when a rule change has altered execution behavior (input hash matches
  but canonical trace hash differs → the rules diverged).
- Confirm a Blueprint migration didn't change any observable outcomes.

Args:
    api_key: GeodesicAI API key (starts with gai_)
    contract_a: A replay contract dict (the `replay_contract` field from a
                prior validate/execute_task response)
    contract_b: Another replay contract dict to compare against contract_a

Returns:
    replay_match: bool — True if the top-level replay_hash matches (fully identical)
    contract_version_match: bool
    matches: dict of field_name → value, for every field that agreed
    mismatches: dict of field_name → {expected, actual}, for every field that disagreed
    summary: plain-English one-liner describing the result

Interpretation of mismatches:
    - data_merkle_root: the two runs were fed different data
        (data_field_diff localizes exactly which fields changed)
    - rules_hash: the Blueprint's rules/constraints/thresholds differ
    - template_version: the Blueprint was upgraded between runs
    - solver_registry_hash: the platform itself changed between runs
    - canonical_trace_hash: same inputs and rules but different execution path
        (should never happen under determinism; indicates a platform bug)
    - graph_hash: DAG topology changed between runs
Parameters (JSON Schema)
Name | Required | Description | Default
api_key | Yes | - | -
contract_a | Yes | - | -
contract_b | Yes | - | -
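
The mismatch interpretations above can be turned into a lookup that an agent applies to a verify_replay result. The sketch below assumes the result is a plain dict with the documented `replay_match` and `mismatches` fields; the function name `explain_mismatches` and the fallback wording for unlisted fields are hypothetical.

```python
# Meanings copied from the "Interpretation of mismatches" section.
MISMATCH_MEANING = {
    "data_merkle_root": "the two runs were fed different data",
    "rules_hash": "the Blueprint's rules/constraints/thresholds differ",
    "template_version": "the Blueprint was upgraded between runs",
    "solver_registry_hash": "the platform itself changed between runs",
    "canonical_trace_hash": "same inputs and rules but different execution path",
    "graph_hash": "DAG topology changed between runs",
}


def explain_mismatches(result):
    """Return human-readable notes for a verify_replay result dict."""
    if result.get("replay_match"):
        return ["replay contracts match"]

    mismatches = result.get("mismatches", {})
    notes = [
        f"{field}: {MISMATCH_MEANING.get(field, 'no documented interpretation')}"
        for field in sorted(mismatches)
    ]

    # A trace mismatch with identical data and rules should not occur under
    # determinism, so surface it separately as a possible platform bug.
    if ("canonical_trace_hash" in mismatches
            and "data_merkle_root" not in mismatches
            and "rules_hash" not in mismatches):
        notes.append("possible platform bug: trace differs with same data and rules")
    return notes


if __name__ == "__main__":
    # Hash values below are placeholders, not real contract hashes.
    result = {
        "replay_match": False,
        "mismatches": {"rules_hash": {"expected": "abc", "actual": "def"}},
    }
    for note in explain_mismatches(result):
        print(note)
```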
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It describes the comparison logic, return fields, and mismatch interpretations. It does not mention auth validation or error conditions, but it is otherwise transparent about the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well structured, with sections for usage, parameters, returns, and interpretation. Each sentence adds value, although the whole is somewhat verbose. It remains effective for a complex tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (3 parameters, nested objects, no output schema), the description is thorough. It explains return structure, fields like replay_match, and provides detailed mismatch interpretation covering all key fields.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description documents every parameter: api_key starts with gai_, and contract_a and contract_b are dicts taken from the replay_contract field of a prior validate or execute_task response. This adds significant value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Verify that two execution replay contracts represent the same deterministic result.' It explains the concept of determinism and distinguishes the tool from siblings by its focus on replay contracts and byte-identical comparison.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases, such as proving a result to an auditor or detecting rule changes, and interprets mismatches so the agent knows what each result means. It does not say when not to use the tool, but the listed use cases are clear and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Resources