Flatland

by com.flatlandfi

Ownership verified

Server Details

Financial modeling engine for AI agents. Typed P&Ls, scenario analysis, and Excel export.

Status: Healthy
Last Tested: 2026-06-23 01:41
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

A (3.7/5.0)

Tool Descriptions: A

Average 4.1/5 across 32 of 32 tools scored. Lowest: 2.9/5.

Server Coherence: A

Disambiguation: 4/5

Most tools have distinct purposes, but 'compile' and 'compile_scenario' are ambiguous, and 'add_driver' could be confused with 'add_computed' or 'add_cumulative' despite clarified scopes.

Naming Consistency: 4/5

Follows verb_noun pattern with 'flatland_' prefix, but there are minor deviations like 'flatland_sensitivity' (noun) and 'flatland_compile' vs 'flatland_compile_scenario'.

Tool Count: 3/5

32 tools is on the heavy side for a typical server. The complexity of financial modeling justifies many specialized operations, though some redundancy exists.

Completeness: 4/5

Covers CRUD for models, scenarios, and drivers, plus advanced features like sensitivity analysis, linting, and exports. Minor gaps, such as missing rename or merge operations, can be worked around.

Available Tools

35 tools

flatland_add_alias: Add Alias (grade A)

Add an alias driver — a named pointer to another driver (NAMESPACE-002).

Aliases enable cross-namespace composition. An alias has no value and no formula; it resolves at compile time to its target's computed value.

Use cases:

A namespace (e.g. ndo.finance) needs to reference drivers owned by another namespace (e.g. ndo.rd.headcount). Declare an alias in the consuming namespace; use the bare alias name in local formulas.
Create a stable external name for an internal implementation driver.

The alias's depends_on is one-hop (the target), not transitive. Chains of aliases (alias → alias → concrete) resolve flat via the toposort.
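
As a rough illustration of the chain behavior described above, the sketch below follows alias hops one at a time until it reaches a concrete driver. The dictionary layout and the `resolve_alias` helper are hypothetical and are not part of Flatland's API; the engine itself is described as resolving aliases through its toposort at compile time.

```python
# Hypothetical in-memory layout (an assumption, not Flatland's data model):
# aliases map an alias key to its direct target, values hold concrete drivers.
aliases = {
    "ndo.finance.headcount": "ndo.rd.headcount_alias",
    "ndo.rd.headcount_alias": "ndo.rd.headcount",
}
values = {"ndo.rd.headcount": 42}


def resolve_alias(key):
    """Follow alias hops until a concrete driver is reached."""
    seen = set()
    while key in aliases:
        if key in seen:
            raise ValueError(f"alias cycle detected at {key!r}")
        seen.add(key)
        key = aliases[key]  # one hop: an alias depends only on its direct target
    return values[key]


print(resolve_alias("ndo.finance.headcount"))  # 42
```

Each alias records only its direct target, which mirrors the one-hop depends_on rule; the chain flattens because every hop is resolved in turn.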

Parameters

Name	Required	Description
`name`	Yes	Local identifier for the alias driver in the consuming namespace (used as a bare name in local formulas).
`target`	Yes	Canonical key of the driver this alias points to (e.g. 'ndo.rd.headcount'); resolves to the target's computed value at compile time.
`namespace`	No	Optional dotted-path ownership tag for the alias itself (e.g. 'ndo.finance'); the alias's canonical key becomes '{namespace}.{name}'.
`description`	No	Optional free-text description of the alias.

Output Schema

No output parameters

Tool Definition Quality

A (4.3/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With only a title annotation (no readOnlyHint, destructiveHint), the description fully discloses behavioral traits: alias has no value/formula, resolves at compile time, depends_on is one-hop non-transitive, and chains resolve via toposort. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with a clear definition, followed by bullet-pointed use cases. It is sufficiently detailed without being verbose. Slight reduction in the second paragraph could improve conciseness, but overall structure is effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of alias semantics and the presence of an output schema, the description covers the core behavior and use cases adequately. However, it lacks prerequisites (e.g., target must exist, namespace validity) and error handling details. The sibling list is large, but the description sets clear expectations for this specific tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and each parameter already has detailed semantic descriptions (e.g., name as local identifier, target as canonical key). The tool description reinforces use cases but does not add new semantic detail beyond the schema. Per guidelines, baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool adds an alias driver as a named pointer to another driver, distinguishing it from sibling tools like flatland_add_driver by emphasizing that an alias has no value or formula and resolves at compile time. The use cases reinforce the core purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides two explicit use cases that guide when to use an alias (cross-namespace composition, stable external naming). It implies when not to use (e.g., for drivers with values/formulas) but does not explicitly contrast with alternatives like flatland_add_computed. The one-hop depends_on note adds context, but direct 'when not to use' statements are missing.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_add_computed: Add Computed Driver (grade A)

Add a computed driver to the active model. The formula is parsed and dependency edges are extracted. Assertions must be objects with 'condition' (e.g. '>= 0', '< 1000000') and 'label' (description) keys.

NDO-DOGFOOD pre-flight #1 (2026-05-06): namespace is the dotted-path ownership tag. When set, the canonical key must be f"{namespace}.{local}". Within-namespace bare formula refs resolve through _rebuild_edges (see NAMESPACE-001 / NAMESPACE-001.5).

Parameters

Name	Required	Description
`name`	Yes	Driver identifier; the local segment must match [A-Za-z_][A-Za-z0-9_]* (no digit-leading names).
`tags`	No	Optional list of free-text tags for categorizing the driver.
`type`	Yes	Flatland type for the computed result: a known type (Currency, Percentage, Ratio, Count, Duration, Rate) or any non-empty open-enum string.
`label`	No	Optional human-readable display label; defaults to the driver name when empty.
`formula`	Yes	Formula referencing other drivers by name; supports + - * / ** , if(), min, max, abs, round, sum (e.g. 'revenue - costs').
`namespace`	No	Optional dotted-path ownership tag (e.g. 'ndo.rd'); the canonical key becomes '{namespace}.{name}' and bare formula refs resolve within it.
`assertions`	No	Optional guardrails; each an object with 'condition' (e.g. '>= 0', '< 1000000') and 'label' (description).
`description`	No	Optional free-text description of what this computed driver represents.
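
The naming rule and the canonical-key convention from the table above can be sketched as follows. The `canonical_key` helper is hypothetical and only illustrates the stated pattern and the '{namespace}.{name}' form; how Flatland validates names internally is not shown in the source.

```python
import re
from typing import Optional

# Pattern quoted in the `name` parameter description.
LOCAL_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def canonical_key(name: str, namespace: Optional[str] = None) -> str:
    """Hypothetical helper: check the local name, then build '{namespace}.{name}'."""
    if not LOCAL_NAME.fullmatch(name):
        raise ValueError(f"invalid driver name: {name!r}")
    # Without a namespace the bare name is used (legacy un-namespaced behavior).
    return f"{namespace}.{name}" if namespace else name


print(canonical_key("gross_margin", "ndo.rd"))  # ndo.rd.gross_margin
```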

Output Schema

No output parameters

Tool Definition Quality

A (3.8/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses several behavioral details: formula parsing and dependency edge extraction, assertions format requirements, and namespace handling (canonical key, within-namespace resolution). Given the lack of readOnlyHint or destructiveHint annotations, this adds significant transparency. However, it does not cover error conditions, return values, or side effects like model saving.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately concise, with two sentences for the main purpose and a paragraph for namespace specifics. The NDO-DOGFOOD note adds necessary technical detail but could be considered verbose. It is front-loaded with the core action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 8 parameters (3 required) and existing output schema, the description covers formula evaluation, dependency extraction, assertions format, and namespace behavior. It assumes an 'active model' which is context-dependent but acceptable. No mention of prerequisites or error handling; however, the output schema exists for return values.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% parameter descriptions. The description adds extra context beyond the schema, especially for the 'namespace' parameter (explaining dotted-path ownership and canonical key) and 'assertions' (clarifying condition/label format). This enhances parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Add a computed driver to the active model,' clearly identifying the action (add) and resource (computed driver). The term 'computed' distinguishes it from sibling tools like flatland_add_alias or flatland_add_driver, but it does not explicitly contrast them. Overall, the purpose is clear and specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for computed drivers (formula-based), but does not explicitly state when to use this tool versus alternatives like flatland_add_alias or flatland_add_cumulative. No when-not or exclusion criteria are provided. The guidance is implied rather than explicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_add_cumulative: Add Cumulative Line (grade A)

Declare a cumulative / running line ONCE and let Flatland lower it.

This is the single authoring surface for cumulative math (running cash, retained earnings, runway). The author states what (a running total of a flow); Flatland generates the verbose per-period scalar chain. Never hand-roll period-suffixed drivers — declare the line here.

Two kinds (engine-agnostic authoring tokens, NOT runtime functions):

kind="cumsum" — a running balance: name = cumsum(flow). Build the per-period flow as the reserved family {flow}__p1 … {flow}__pN first, then declare the line. Optional opening= adds an opening balance into period 1. Optional flow_sign ("positive"/"negative") auto-attaches an advisory sign assertion on the terminal cumulative. The declared name resolves to the ending balance (terminal period).
kind="runway" — first period a cumulative goes below zero: name = first_period_below_zero(over). over is the declared name of an existing cumsum line. A companion boolean {name}__exceeds_horizon (1 = never ran out, 0 = ran out) is generated so you can distinguish "out at month N" from "never out".

The generated interior drivers ({name}__p{k}) are engine internals — never reference them in your own formulas (the macro rejects it). Reference only the declared name. This black-box property is what lets a future native period-axis cumsum replace the lowering with zero rework.

Never feed a stock (a cumsum output) into another cumsum — that double-accumulates a balance and is refused at ingestion.
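
To make the two kinds concrete, here is a minimal sketch of the arithmetic they describe, written in plain Python rather than Flatland's own lowering. The function names are hypothetical, and returning `None` when the balance never goes below zero is an assumption; the source only specifies the companion flag (1 = never ran out, 0 = ran out).

```python
from itertools import accumulate
from typing import List, Optional, Tuple


def cumsum_line(flows: List[float], opening: float = 0.0) -> List[float]:
    """Running balance per period; the opening balance is added into period 1."""
    # accumulate(..., initial=opening) yields the opening first, so drop it.
    return list(accumulate(flows, initial=opening))[1:]


def runway(balances: List[float]) -> Tuple[Optional[int], int]:
    """Return (first 1-based period below zero, exceeds_horizon flag)."""
    for period, balance in enumerate(balances, start=1):
        if balance < 0:
            return period, 0  # ran out in this period
    return None, 1  # never ran out within the horizon (None is an assumption)


balances = cumsum_line([-40, -40, -40], opening=100)
print(balances)          # [60, 20, -20]
print(balances[-1])      # -20, the terminal (ending) balance
print(runway(balances))  # (3, 0)
```

The declared cumsum name corresponds to the last element of the list, and the flag is what distinguishes "out at period N" from "never out".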

Parameters

Name	Required	Description	Default
`flow`	No	For kind='cumsum': base name of the per-period flow family ('{flow}__p1 … {flow}__pN') to accumulate. Required for cumsum.
`kind`	Yes	The cumulative kind: 'cumsum' (running balance of a flow) or 'runway' (first period a cumulative goes below zero).
`name`	Yes	Identifier for the declared cumulative line; resolves to the terminal (ending) period value. Must not use the reserved '<name>__p<k>' interior shape.
`over`	No	For kind='runway': the declared name of an existing cumsum line to measure runway over. Required for runway.
`type`	No	Flatland type for the generated drivers (e.g. 'Currency'). Defaults to 'Currency'.	Currency
`label`	No	Optional human-readable display label for the cumulative line.
`opening`	No	For kind='cumsum': optional name of a driver supplying an opening balance added into period 1.
`flow_sign`	No	For kind='cumsum': optional 'positive' or 'negative' to auto-attach an advisory sign assertion on the terminal cumulative.
`description`	No	Optional free-text description of the cumulative line.
`terminal_assertion`	No	Optional assertion on the terminal value; an object with 'condition' (e.g. '> 0') and 'label' (e.g. a solvency floor).

Output Schema

No output parameters

Tool Definition Quality

A (4.9/5.0)

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With minimal annotations (only title), the description fully discloses that the tool generates per-period scalar chains, resolves name to ending balance, and that interior drivers are engine internals never to be referenced.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and each sentence adds value, but it is somewhat lengthy. Front-loading the main purpose and using clear sections offsets this.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (10 parameters, two kinds) and the presence of an output schema, the description is fully comprehensive, covering all usage rules, constraints, and future-proofing considerations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Even though schema coverage is 100%, the description adds significant meaning beyond parameter descriptions, explaining the semantic differences between cumsum and runway, the role of flow, opening, flow_sign, and over.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool declares a cumulative/running line and contrasts with hand-rolling period-suffixed drivers, distinguishing it from siblings like flatland_add_driver.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance (declaring cumulative math) and when-not-to-use guidance (never feed a cumsum into another cumsum, which double-accumulates), and names hand-rolling period-suffixed drivers as the alternative to avoid.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_add_driver: Add Driver (grade A)

Add an assumption driver to the active model. Assertions must be objects with 'condition' (e.g. '> 0', '>= 0.10', '< 1.0') and 'label' (description) keys.

NDO-DOGFOOD pre-flight #1 (2026-05-06): namespace is the dotted-path ownership tag for multi-agent models (e.g. "ndo.rd"). When set, the canonical key must be f"{namespace}.{local}". namespace=None keeps legacy un-namespaced behavior (default).

Parameters

Name	Required	Description
`name`	Yes	Driver identifier; the local segment must match [A-Za-z_][A-Za-z0-9_]* (no digit-leading names like '3pl_fee').
`tags`	No	Optional list of free-text tags for categorizing the driver.
`type`	Yes	Flatland type for the value: a known type (Currency, Percentage, Ratio, Count, Duration, Rate) or any non-empty open-enum string (e.g. 'Mass(kg)').
`label`	No	Optional human-readable display label; defaults to the driver name when empty.
`value`	Yes	The assumption's literal value, validated against the declared type (e.g. a number for Currency/Count).
`namespace`	No	Optional dotted-path ownership tag for multi-agent models (e.g. 'ndo.rd'); the canonical key becomes '{namespace}.{name}'.
`assertions`	No	Optional guardrails; each an object with 'condition' (e.g. '> 0', '>= 0.10', '< 1.0') and 'label' (description).
`description`	No	Optional free-text description of what this assumption represents.
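
The assertion shape described above (an object with 'condition' and 'label' keys) could be checked along these lines. This is only a sketch: the `check_assertion` helper and its small condition grammar (a comparison operator followed by a number) are assumptions inferred from the examples '> 0', '>= 0.10', and '< 1.0', not Flatland's actual parser.

```python
import operator
import re

# Assumed grammar: comparison operator followed by a numeric bound.
# Only >, >= and < appear in the source examples; <= is included by analogy.
OPS = {">=": operator.ge, "<=": operator.le, ">": operator.gt, "<": operator.lt}
CONDITION = re.compile(r"\s*(>=|<=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def check_assertion(value: float, assertion: dict) -> bool:
    """Hypothetical check: does `value` satisfy the assertion's condition?"""
    match = CONDITION.fullmatch(assertion["condition"])
    if match is None:
        raise ValueError(f"unsupported condition: {assertion['condition']!r}")
    op, bound = match.groups()
    return OPS[op](value, float(bound))


growth_floor = {"condition": ">= 0.10", "label": "growth rate at least 10%"}
print(check_assertion(0.25, growth_floor))  # True
```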

Output Schema

No output parameters

Tool Definition Quality

A (3.6/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description adds details about namespace and assertions beyond annotations (which are minimal). However, it lacks information on side effects, permissions, or what happens if a driver already exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two paragraphs, first sentence states purpose, then details. Could be slightly more concise, but overall well-structured and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite 8 parameters and high schema coverage, description misses context about the 'active model', prerequisites, and does not leverage the output schema. Some gaps remain for a comprehensive understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds value by specifying the structure of assertions ('condition' and 'label' keys) and clarifying namespace behavior (e.g., 'namespace=None keeps legacy un-namespaced behavior').

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Add an assumption driver to the active model', specifying the verb (add) and resource (assumption driver). This distinguishes it from sibling tools like flatland_add_alias, flatland_add_computed, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. No exclusion criteria, prerequisites, or context about the 'active model' are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_bulk_add: Bulk Add Drivers (grade A)

Add multiple drivers (assumptions and computed) atomically. If any fail validation, none are added. Each driver dict needs: name, category ('assumption' or 'computed'), type. Assumptions need 'value'. Computed need 'formula'. Optional: label, assertions (list of {condition, label}), tags, description, namespace.

NDO-DOGFOOD pre-flight #1 (2026-05-06): Each spec may carry an optional namespace (dotted-path ownership tag). When set, the spec's canonical key must be f"{namespace}.{local}". Bare formula refs resolve to f"{namespace}.{ref}" if the qualified key exists in the batch or the existing model — same tie-break as add_computed.
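
The all-or-nothing behavior can be pictured as "validate everything, then commit". The sketch below is a hypothetical stand-in that uses a plain dict as the model; it checks only the required fields listed above and does not model namespace resolution or duplicate handling, neither of which the source specifies in enough detail.

```python
def bulk_add(model: dict, specs: list) -> None:
    """Hypothetical atomic add: stage every spec, commit only if all are valid."""
    staged = {}
    for spec in specs:
        missing = {"name", "category", "type"} - spec.keys()
        if missing:
            raise ValueError(f"spec missing fields: {sorted(missing)}")
        if spec["category"] == "assumption":
            required = "value"
        elif spec["category"] == "computed":
            required = "formula"
        else:
            raise ValueError(f"unknown category: {spec['category']!r}")
        if required not in spec:
            raise ValueError(f"{spec['name']}: {spec['category']} needs {required!r}")
        staged[spec["name"]] = spec
    # Reached only when every spec passed validation, so the model is never
    # left partially updated.
    model.update(staged)


model = {}
bulk_add(model, [
    {"name": "revenue", "category": "assumption", "type": "Currency", "value": 100},
    {"name": "margin", "category": "computed", "type": "Currency",
     "formula": "revenue - costs"},
])
print(sorted(model))  # ['margin', 'revenue']
```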

Parameters

Name	Required	Description	Default
`drivers`	Yes	List of driver specs added atomically; each needs name, category ('assumption' or 'computed'), type. Assumptions need 'value'; computed need 'formula'. Optional: label, assertions ({condition, label}), tags, description, namespace.

Output Schema

No output parameters

Tool Definition Quality

A (4.1/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses atomic rollback on validation failure. With no annotations, it partially clarifies behavior, but omits details on side effects, overwrite semantics, or performance implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with a clear opening. The pre-flight note adds length but provides necessary context; overall it balances detail and brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has an output schema (not shown), so return values need not be described. The description covers validation behavior, required/optional fields, and namespace handling, making it fairly complete for its purpose.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds structure (required fields per driver, optional fields) and namespace resolution details, providing value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool adds multiple drivers atomically, distinguishing it from sibling single-add tools like flatland_add_driver. Title 'Bulk Add Drivers' reinforces this.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains atomic behavior and required fields, implying it's for batch operations. However, it does not explicitly contrast with single-driver tools or state when to prefer this over individual adds.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_compile: Compile Model (grade A)

Read-only

Compile the model under a scenario. Returns all values, assertions, and warnings.

Parameters

Name	Required	Description	Default
`scenario`	No	Name of the scenario to compile under; 'base' for the unmodified model. Defaults to 'base'.	base

Output Schema

No output parameters

Tool Definition Quality

A (3.5/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the safe, read-only nature is clear. The description adds useful detail about the return content (all values, assertions, and warnings), which goes beyond annotations but does not disclose other behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a concise two sentences, front-loading the purpose and return information with no extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only tool with one parameter, the description covers the basics. However, given the existence of a similarly named sibling (flatland_compile_scenario), the description is incomplete as it fails to differentiate usage contexts.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with a clear description for the 'scenario' parameter. The tool description does not add new meaning beyond the schema, so a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool compiles a model under a scenario and returns values, assertions, and warnings. However, it does not differentiate from the sibling tool 'flatland_compile_scenario', which has a similar name and likely overlapping functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like flatland_compile_scenario, flatland_validate, or flatland_sensitivity. The agent receives no context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_compile_scenario: Compile Scenario (grade A)

Read-only

Compile a specific scenario. Apply overlay, recompute, return full output.
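
One way to read "apply overlay, recompute, return full output" is sketched below. Everything here is an assumption made for illustration: the overlay is modeled as a dict of replacement assumption values, formulas as Python callables listed in dependency order, and the base inputs are copied rather than modified, which would be consistent with the read-only annotation. Flatland's actual overlay format and formula engine are not shown in the source.

```python
def compile_with_overlay(assumptions: dict, formulas: dict, overlay: dict) -> dict:
    """Hypothetical compile: overlay the scenario, recompute, return all values."""
    values = {**assumptions, **overlay}  # new dict; base assumptions stay untouched
    for name, formula in formulas.items():  # assumes dependency order
        values[name] = formula(values)
    return values


base = {"revenue": 100.0, "costs": 60.0}
formulas = {"profit": lambda v: v["revenue"] - v["costs"]}

output = compile_with_overlay(base, formulas, overlay={"costs": 80.0})
print(output["profit"])  # 20.0
print(base["costs"])     # 60.0, the base model is not mutated
```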

Parameters

Name	Required	Description	Default
`name`	Yes	Name of the existing scenario to apply and compile; returns the full recomputed output.

Output Schema

No output parameters

Tool Definition Quality

A (3.5/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true, implying no state mutation, but description says 'recompute' which could imply mutation. No clarification on side effects, permissions, or whether output is saved. Contradiction noted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with immediate verb-resource identification, followed by concise details. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a one-parameter tool with output schema, the description is sufficient but lacks explanation of 'overlay' and whether the operation is persisted. Could be more complete about return value structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear param description. The tool description adds process context ('apply overlay, recompute'), complementing the schema well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Compile a specific scenario' with specific verb and resource, and adds details 'Apply overlay, recompute, return full output,' distinguishing it from siblings like flatland_compile and flatland_create_scenario.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives such as flatland_compile. No prerequisites, exclusions, or context provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_create_model: Create Model (grade C)

Create a new empty financial model.

Parameters

Name	Required	Description	Default
`name`	Yes	Display name for the new model (e.g. 'NDO Operating Model'); may contain spaces.
`grain`	No	Period granularity, e.g. 'monthly'. Defaults to 'monthly'.	monthly
`currency`	No	ISO currency code for Currency-typed drivers (e.g. 'USD', 'EUR'). Defaults to 'USD'.	USD
`period_end`	No	Last period of the model horizon as 'YYYY-MM' (e.g. '2026-12').	2026-12
`description`	No	Optional free-text description of the model's purpose.
`period_start`	No	First period of the model horizon as 'YYYY-MM' (e.g. '2026-01').	2026-01
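
For orientation, the default horizon in the table ('2026-01' through '2026-12' at monthly grain) spans twelve periods if both endpoints are counted. The helper below is a hypothetical sketch of that expansion; treating `period_end` as inclusive is an assumption based on its description as the "last period" of the horizon.

```python
def monthly_periods(period_start: str, period_end: str) -> list:
    """Hypothetical expansion of a 'YYYY-MM' horizon into monthly period labels."""
    start_year, start_month = map(int, period_start.split("-"))
    end_year, end_month = map(int, period_end.split("-"))
    # Inclusive count of months between the two endpoints (assumed inclusive).
    count = (end_year - start_year) * 12 + (end_month - start_month) + 1
    if count < 1:
        raise ValueError("period_end must not precede period_start")
    periods = []
    for i in range(count):
        offset = start_month - 1 + i  # zero-based month offset from January
        periods.append(f"{start_year + offset // 12:04d}-{offset % 12 + 1:02d}")
    return periods


print(len(monthly_periods("2026-01", "2026-12")))  # 12
```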

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations beyond the title, the description bears full responsibility for behavioral disclosure. It only states the action ('Create'), leaving out critical details such as whether the operation is destructive, requires permissions, or has side effects. The description does not add context beyond the tool name.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no redundant words. It is front-loaded with the action and resource. However, it is overly terse for a tool with six parameters and may benefit from more detail without adding significant length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (6 parameters, no description of the output, and a creation action), the description is incomplete. It does not explain return values, constraints (e.g., name uniqueness), or any setup requirements. The presence of an output schema does not mitigate the lack of contextual information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so all parameters are documented in the input schema. The description adds no additional semantic meaning beyond what the schema provides. According to the guidelines, a baseline of 3 is appropriate when description coverage is high.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Create') and the resource ('a new empty financial model'), distinguishing it from sibling tools that create scenarios or perform other operations. However, it lacks specificity compared to high-scoring examples like the 'get_calls' description that explicitly mentions scope and alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidelines are provided on when to use this tool versus alternatives (e.g., flatland_create_scenario, flatland_load_model). There is no mention of prerequisites, such as the need for initialization or that the model name must be unique. The guidance is entirely implicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_create_scenario: Create Scenario (B)

Create a named scenario with driver value overrides.

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Name for the new scenario (e.g. 'downside', 'aggressive_growth').
`overrides`	Yes	Sparse map of assumption driver canonical key -> overriding value; only assumptions (not computed drivers) may be overridden.
`description`	No	Optional free-text description of the scenario.
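
To make the "sparse map" idea concrete, here is a minimal sketch of how an overlay of overrides could be applied to a base set of assumptions. It is an illustration only: `apply_overrides`, the driver names, and the error types are hypothetical, and the real server may validate differently.

```python
def apply_overrides(assumptions, computed_keys, overrides):
    """Return a new assumption map with the sparse overrides applied."""
    for key in overrides:
        # Per the parameter description, only assumptions may be overridden.
        if key in computed_keys:
            raise ValueError(f"{key!r} is a computed driver and cannot be overridden")
        if key not in assumptions:
            raise KeyError(f"unknown assumption driver {key!r}")
    # Sparse: keys that are not mentioned keep their base values.
    return {**assumptions, **overrides}

base = {"price": 100.0, "units": 50.0, "churn": 0.02}
computed = {"revenue"}
downside = apply_overrides(base, computed, {"units": 40.0})
print(downside)  # {'price': 100.0, 'units': 40.0, 'churn': 0.02}
```

The base map is left untouched, which mirrors the idea that a scenario is an overlay on the model and not a copy of it.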

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

B3.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (only title) so description carries full burden. It does not disclose side effects, permissions, or other behavioral traits beyond 'create'. For a write operation, more context like whether it overwrites or requires specific access would be helpful.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise, a single sentence that front-loads the core purpose. There is no redundancy, but it could slightly expand on key constraints without losing brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and 100% parameter documentation, the description is adequate but minimal. It does not cover potential pitfalls like naming conflicts or side effects on existing scenarios. Average completeness for a creation tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds little beyond what the schema provides. It echoes 'name' and 'overrides' but does not explain constraints on the overrides object beyond what is in the schema. Baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'create' and the resource 'scenario' with the specific aspect of 'driver value overrides'. It distinguishes from sibling tools like flatland_compile_scenario and flatland_delete_scenario.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description does not provide explicit guidance on when to use this tool versus alternatives, nor does it mention prerequisites or conditions for use. The usage is implied but not elaborated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_delete_model: Delete Model (A)

Destructive

Delete a saved model from disk.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Name (file stem) of the saved model to permanently delete from disk.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide destructiveHint=true. Description adds 'from disk' but does not explain irreversibility, permissions, or side effects beyond what annotations convey.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded verb and object, no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple destructive tool with one parameter and an output schema, the description is complete enough. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with description for 'name'. Description adds no additional meaning beyond the schema's parameter description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action (delete) and resource (saved model) with specificity. It distinguishes from siblings like flatland_delete_scenario by focusing on models.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like flatland_save_model or flatland_list_models. Agent must infer context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_delete_scenario: Delete Scenario (B)

Destructive

Delete a named scenario.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Name of the scenario to delete; the 'base' scenario cannot be deleted.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

B3.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide destructiveHint: true, and the description simply states 'Delete'. No additional behavioral traits (e.g., irreversibility, permissions, side effects) are disclosed beyond the annotation. Minimal added value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, front-loaded, and concise. However, it could include more context without becoming verbose. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple destructive tool with one parameter and an output schema, the description is adequate but lacks context about consequences or return value. More detail on post-deletion state would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the parameter 'name' having a clarifying note about the 'base' scenario. The overall description adds no extra meaning beyond what the schema provides, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Delete a named scenario' clearly states the verb (delete) and resource (named scenario), distinguishing it from siblings like flatland_create_scenario. It is specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. The constraint that 'base' scenario cannot be deleted is mentioned in the schema, but there is no context on prerequisites or when-not-to-use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_diff_scenarios: Diff Scenarios (A)

Read-only

Compare two scenarios. Returns changed assumptions, output deltas, and optional attribution.

ParametersJSON Schema

Name	Required	Description
`target`	No	Optional canonical key of an output driver; when given, the diff includes attribution of its delta to changed assumptions.
`to_scenario`	Yes	Name of the scenario to compare to.
`from_scenario`	Yes	Name of the baseline scenario to compare from.
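
The two required parts of the diff (changed assumptions and output deltas) can be pictured with a toy model. The sketch below is an assumption-laden illustration: `compile_outputs`, the driver names, and the result keys are hypothetical, and attribution for a `target` driver is deliberately left out because the source does not say how it is computed.

```python
def compile_outputs(a):
    """Hypothetical toy model: two computed drivers from three assumptions."""
    revenue = a["price"] * a["units"]
    ebitda = revenue - a["opex"]
    return {"revenue": revenue, "ebitda": ebitda}

def diff_scenarios(from_assumptions, to_assumptions):
    """Compare two assumption sets and the outputs they produce."""
    changed = {k: (from_assumptions[k], to_assumptions[k])
               for k in from_assumptions
               if from_assumptions[k] != to_assumptions[k]}
    out_from = compile_outputs(from_assumptions)
    out_to = compile_outputs(to_assumptions)
    deltas = {k: out_to[k] - out_from[k] for k in out_from}
    return {"changed_assumptions": changed, "output_deltas": deltas}

base = {"price": 100.0, "units": 50.0, "opex": 3000.0}
downside = {"price": 100.0, "units": 40.0, "opex": 3000.0}
result = diff_scenarios(base, downside)
print(result["changed_assumptions"])  # {'units': (50.0, 40.0)}
print(result["output_deltas"])        # {'revenue': -1000.0, 'ebitda': -1000.0}
```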

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations declare readOnlyHint=true, indicating a safe read operation. The description adds value by explaining the output: 'changed assumptions, output deltas, and optional attribution,' which goes beyond the annotations. There is no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences (11 words) that front-load the key action and output. Every word earns its place with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has 3 parameters and an output schema (indicated by context). The description covers basic purpose and output but lacks details like prerequisites (scenarios must exist), error handling, or when to use this over other scenario tools. Some gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with all three parameters described in the schema. The description mentions 'optional attribution' relevant to the 'target' parameter but adds minimal new meaning beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Compare two scenarios. Returns changed assumptions, output deltas, and optional attribution.' It uses a specific verb (compare) and identifies the resources (two scenarios), distinguishing it from sibling tools like flatland_compile or flatland_sensitivity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, exclusions, or context for choosing this over sibling tools like flatland_compile or flatland_sensitivity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_disable_driver: Disable Driver (A)

Destructive

Disable a driver by setting its value to 0 (assumptions) or marking it. Downstream formulas still reference it but it contributes nothing. Useful when you want to neutralize a driver without breaking the graph.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Canonical key of the driver to neutralize (assumption value set to 0, or computed formula replaced with 0) while keeping it in the graph.
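
A small sketch may help show what "neutralize without breaking the graph" means in practice. The toy evaluator below is hypothetical (the driver names, `evaluate`, and the lambda-based formulas are assumptions for illustration); it follows the parameter description by zeroing an assumption or replacing a computed formula with 0 while keeping the key in place.

```python
def evaluate(assumptions, formulas, order):
    """Evaluate computed drivers in the given dependency order."""
    values = dict(assumptions)
    for key in order:
        values[key] = formulas[key](values)
    return values

def disable_driver(assumptions, formulas, name):
    """Neutralize a driver but keep its key so downstream references resolve."""
    if name in assumptions:
        assumptions[name] = 0
    elif name in formulas:
        formulas[name] = lambda values: 0
    else:
        raise KeyError(name)

assumptions = {"subscriptions": 900.0, "services": 100.0, "opex": 400.0}
formulas = {
    "revenue": lambda v: v["subscriptions"] + v["services"],
    "ebitda": lambda v: v["revenue"] - v["opex"],
}
order = ["revenue", "ebitda"]

before = evaluate(assumptions, formulas, order)["ebitda"]
disable_driver(assumptions, formulas, "services")
after = evaluate(assumptions, formulas, order)["ebitda"]
print(before, after)  # 600.0 500.0
```

Removing `services` outright would make the `revenue` formula fail on a missing key; zeroing it keeps every downstream formula evaluable.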

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes that downstream formulas still reference the driver but it contributes nothing, adding context beyond the destructiveHint annotation. Explains two possible behaviors (set to 0 or marking). No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the action; each one adds value. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of output schema and high schema coverage, the description is complete. It explains the effect on the graph and downstream formulas, leaving no gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already describes the parameter 'name' well. The description does not add extra semantics beyond the schema, meeting the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the verb 'disable', the resource 'driver', and the effect of neutralizing it without breaking the graph. Distinguishes from siblings like 'remove_driver'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a clear usage context: 'Useful when you want to neutralize a driver without breaking the graph.' Does not explicitly mention when not to use, but implies alternative via sibling context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_explain: Explain Driver (A)

Read-only

Trace the upstream compute chain for a driver. Returns the DAG path (values, types, relationship labels) that feeds into it. Free — does not count against your answer quota.

ParametersJSON Schema

Name	Required	Description	Default
`driver`	Yes	Canonical key of the driver to explain (e.g. 'ebitda', 'ndo.rd.velocity').
`max_depth`	No	Maximum hops upstream to trace. Clamped server-side to 10.
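
The interaction between `driver` and `max_depth` can be sketched as a bounded breadth-first walk over a driver's inputs. This is a simplified illustration: `trace_upstream` and the `inputs_of` adjacency map are hypothetical, and the sketch returns only keys and hop counts, whereas the tool is described as also returning values, types, and relationship labels.

```python
from collections import deque

MAX_DEPTH_CAP = 10  # the parameter table says max_depth is clamped to 10

def trace_upstream(inputs_of, driver, max_depth=MAX_DEPTH_CAP):
    """Breadth-first walk over a driver's inputs, up to max_depth hops."""
    depth_limit = max(0, min(max_depth, MAX_DEPTH_CAP))
    seen = {driver}
    queue = deque([(driver, 0)])
    path = []
    while queue:
        node, depth = queue.popleft()
        if depth == depth_limit:
            continue  # do not expand beyond the hop limit
        for parent in inputs_of.get(node, []):
            if parent not in seen:
                seen.add(parent)
                path.append((parent, depth + 1))
                queue.append((parent, depth + 1))
    return path

inputs_of = {
    "ebitda": ["revenue", "opex"],
    "revenue": ["price", "units"],
}
print(trace_upstream(inputs_of, "ebitda", max_depth=1))
# [('revenue', 1), ('opex', 1)]
print(trace_upstream(inputs_of, "ebitda", max_depth=50))
# [('revenue', 1), ('opex', 1), ('price', 2), ('units', 2)]
```

Requesting 50 hops behaves the same as requesting 10, which is what a server-side clamp implies for callers.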

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.9/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so safety is covered. Description adds value by disclosing the tool is free and does not count against answer quota, which is important cost behavior beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences, front-loaded with purpose. Concise but could be slightly more structured. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists and the tool is simple, the description is sufficient. It mentions the return type and that the tool is free. No major gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameter descriptions. Description adds no extra meaning beyond what schema already provides for 'driver' and 'max_depth'. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states that the tool traces the upstream compute chain for a driver and specifies that the return includes the DAG path with values, types, and relationship labels. Distinguishes from siblings like flatland_trace_upstream by emphasizing 'for a driver' and the free quota note.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or when-not-to-use guidance. The free quota hint implies cost context but does not differentiate from similar sibling tools (flatland_trace_upstream, flatland_trace_downstream). Usage context is implied but lacks exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_export: Export to HTML (A)

Export the active model as a self-contained HTML report.

The HTML file is investor-ready, works offline in any browser, and requires no Excel license. It includes:

- Key financial outputs (terminal computed drivers)
- Assertion health (pass/fail guardrails)
- Sensitivity tornado (top drivers by impact on the primary KPI)
- Scenario comparison (if scenarios exist)
- Key assumptions (ranked by sensitivity)
- Full driver table (collapsible)
- Embedded IR JSON for machine consumption

The sensitivity analysis is run automatically (unbilled — this is a final-mile deliverable). The target KPI is auto-detected from standard financial driver names (ebitda, net_income, mrr, runway, etc.).

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Optional file stem (alphanumeric/underscore/hyphen) for the self-contained .html written to ~/.flatland/exports/; omit to use a safe slug of the model's display name.
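
The file-stem rule for `name` can be sketched as follows. Both the ASCII-only reading of "alphanumeric" and the `slugify` rule are assumptions; the source says only that a "safe slug" of the display name is used when `name` is omitted, so the exact slug the server produces may differ.

```python
import re

STEM_RE = re.compile(r"[A-Za-z0-9_-]+")  # assumed ASCII-only reading of the rule

def slugify(display_name):
    """Assumed slug rule: lowercase, runs of other characters become '-'."""
    slug = re.sub(r"[^A-Za-z0-9_-]+", "-", display_name).strip("-").lower()
    return slug or "model"

def export_stem(display_name, name=None):
    """Pick the file stem: an explicit valid name, else a slug of the display name."""
    if name is None:
        return slugify(display_name)
    if not STEM_RE.fullmatch(name):
        raise ValueError("name may contain only letters, digits, '_' and '-'")
    return name

print(export_stem("NDO Operating Model"))              # ndo-operating-model
print(export_stem("NDO Operating Model", "q3_board"))  # q3_board
```

Rejecting anything outside the allowed character set also rules out path separators, which matters because the stem ends up in a path under ~/.flatland/exports/.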

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (only title), so description carries burden. Discloses automatic sensitivity analysis and KPI detection, but does not explicitly state read-only nature or file overwrite behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with bullet points, front-loaded main purpose, but slightly verbose with extensive list of contents.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers output contents and key behaviors well, but lacks mention of prerequisites (e.g., model must be loaded) or error handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds meaning beyond schema by specifying output location (~/.flatland/exports/) and default file name behavior for the single optional parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Export the active model as a self-contained HTML report' with specific verb and resource, and distinguishes from sibling flatland_export_excel.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implicitly suggests use for investor-ready, offline reports without Excel license, but lacks explicit when-to-use vs alternatives or when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_export_excel: Export to Excel (A)

Export the active model as an institutional-grade .xlsx file.

Note: flatland_export (HTML) is the recommended export for sharing with investors, boards, or team members — it works in any browser without Excel. Use flatland_export_excel when you specifically need live Excel formulas for manual iteration in a spreadsheet application.

The exporter is a deterministic compiler: it walks the IR DAG and emits one Excel cell per IR driver, with computed drivers rendered as live Excel formulas (not values) that reference workbook-scope defined names. The resulting file recalculates correctly in Excel/Numbers/Google Sheets/LibreOffice with no manual rewiring.
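
As a very small picture of "one cell per driver, formulas not values", the sketch below walks a toy dependency graph in topological order and emits a literal for each input and a formula string for each computed driver. Everything here is hypothetical (the `emit_cells` helper, the driver names, and the formula strings); it is not Flatland's exporter and writes no workbook.

```python
from graphlib import TopologicalSorter

def emit_cells(assumptions, formulas, deps):
    """Return (defined_name, cell_content) pairs, inputs before dependents."""
    cells = []
    # deps maps each computed driver to the drivers it reads from.
    for key in TopologicalSorter(deps).static_order():
        if key in assumptions:
            cells.append((key, assumptions[key]))     # input: literal value
        else:
            cells.append((key, "=" + formulas[key]))  # computed: live formula
    return cells

assumptions = {"price": 100.0, "units": 50.0, "opex": 3000.0}
formulas = {"revenue": "price*units", "ebitda": "revenue-opex"}
deps = {"revenue": ["price", "units"], "ebitda": ["revenue", "opex"]}

cells = emit_cells(assumptions, formulas, deps)
for name, content in cells:
    print(f"{name:8} {content}")
```

Because formulas refer to names and not cell addresses, the order in which cells are laid out does not affect correctness; the topological order is used here only so that inputs are listed before the drivers that depend on them.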

The file follows institutional conventions: blue font + light-blue fill for inputs, black font + light-gray fill for formulas, currency formatted as $#,##0 with parens for negatives and dash for zero, percentages as 0.0%, all assertions surfaced as PASS/FAIL rows in a dedicated sheet, and a Checks sheet that verifies compilation, assertions, and formula transpilation succeeded.
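
The number conventions in the paragraph above can be approximated in plain Python to show what a reader should expect to see in the cells. This is a display sketch under stated assumptions, not the exporter's code: the helper names are hypothetical, and the dash is applied only to exact zero, which is how spreadsheet zero-format sections usually behave.

```python
def format_currency(value):
    """Whole-dollar display: parens for negatives, dash for exact zero."""
    if value == 0:
        return "-"
    # Note: Python's round() rounds exact halves to the nearest even integer,
    # which may differ from spreadsheet rounding on values like 2.5.
    text = f"${round(abs(value)):,}"
    return f"({text})" if value < 0 else text

def format_percent(value):
    """One decimal place, mirroring the 0.0% convention."""
    return f"{value:.1%}"

print(format_currency(1234567))  # $1,234,567
print(format_currency(-1234.4))  # ($1,234)
print(format_currency(0))        # -
print(format_percent(0.125))     # 12.5%
```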

Every export is auto-scored against Flatland's frozen v1 institutional rubric (see experiments/excel-export-skill/rubric.md). The rubric_score field in the response is a 0-100 integer; an institutional-grade output scores 90+.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Optional file stem (alphanumeric/underscore/hyphen) for the .xlsx written to ~/.flatland/exports/; omit to use a safe slug of the model's display name.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.6/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (only title), so the description carries full burden. It details deterministic compilation, IR DAG walking, formula rendering, workbook-scope names, cross-app recalculation, visual conventions for inputs/formulas, assertions sheet, checks sheet, and auto-scoring rubric. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with purpose and usage guidelines, then detailed behavior. It is well-structured but somewhat lengthy; every sentence is relevant.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists (not shown but noted), so return values are covered. Description covers behavior, conventions, scoring, and response field rubric_score. Complete for a complex export tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Single optional parameter 'name' with 100% schema coverage. The description adds that omitting it uses a safe slug of the model's display name, which is implicit from the schema's default and description. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool exports the active model as an institutional-grade .xlsx file. It distinguishes from the sibling flatland_export (HTML) by specifying that the HTML version is recommended for sharing, while this one is for when live Excel formulas are needed.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance: use flatland_export (HTML) for sharing with investors/boards/team members, and flatland_export_excel when you need live Excel formulas for manual iteration. Provides clear context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_export_pdf: Export to PDF (A)

Export the active model as an investor-ready PDF (deterministic, no LLM).

The PDF is a faithful projection of the HTML narrative report — same headline outputs, assumptions, scenarios, sensitivity, "so what" close, and proof layer (content-hash receipt + assertions-pass + causal appendix) — in a print-optimized light layout that partners can forward.

Free: the PDF path consumes the already-compiled IR and never recompiles on export (CFO metering). The only metered work is the same single sensitivity answer the HTML report runs, routed through the billing gate (degrades if the quota is exhausted; the export itself never 402s).

Requires a headless renderer (LibreOffice) on the server. When unavailable, returns a renderer_unavailable error rather than failing silently.
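
Because the PDF path can return a renderer_unavailable error, a caller may want a fallback. The sketch below shows one possible pattern, falling back to the HTML export. The `call_tool` callable and the `{"error": ...}` response shape are assumptions made for illustration; the source names the error but does not document the response format.

```python
def export_report(call_tool, name=None):
    """Try the PDF export; fall back to HTML if the renderer is missing.

    `call_tool(tool_name, arguments)` is a hypothetical client function, and
    the {"error": ...} response shape is an assumption for illustration.
    """
    arguments = {} if name is None else {"name": name}
    result = call_tool("flatland_export_pdf", arguments)
    if result.get("error") == "renderer_unavailable":
        return call_tool("flatland_export", arguments)
    return result

def fake_call_tool(tool_name, arguments):
    """Stand-in server with no headless renderer installed."""
    if tool_name == "flatland_export_pdf":
        return {"error": "renderer_unavailable"}
    return {"tool": tool_name, "arguments": arguments}

print(export_report(fake_call_tool, name="q3_board"))
# {'tool': 'flatland_export', 'arguments': {'name': 'q3_board'}}
```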

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Optional file stem (alphanumeric/underscore/hyphen) for the .pdf written to ~/.flatland/exports/; omit to use a safe slug of the model's display name.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.2/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description provides extensive behavioral details beyond minimal annotations (only title). It discloses determinism, no LLM, no recompilation, CFO metering only for sensitivity, dependency on headless renderer, and error behavior (renderer_unavailable error). This fully informs the agent about side effects and constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but each sentence provides valuable information (e.g., content, cost, dependencies, error handling). It is front-loaded with the core action and key traits. Minor redundancy exists with the parameter description, but overall it is efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (export with cost model, renderer dependency, error cases) and the presence of an output schema, the description covers all critical aspects: what the PDF contains, when it's free, when it fails, and requirements. It is comprehensive enough for safe invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'name' is already well-described in the input schema (coverage 100%). The tool description says nothing further about the file stem or the default slug, so it adds no new meaning. Baseline 3 is appropriate as the schema carries the semantic load.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool exports the active model as an investor-ready PDF, specifying it is deterministic and involves no LLM. It distinguishes itself from sibling tools like flatland_export and flatland_export_excel by highlighting the PDF-specific, investor-ready, deterministic nature.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies the tool is used when an investor-ready, deterministic, print-optimized PDF is needed, and mentions it's free but requires a headless renderer. However, it does not explicitly state when not to use it or compare to alternatives like flatland_export_excel or flatland_export, leaving the agent to infer usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_get_driver: Get Driver (A)

Read-only

Get a single driver's full spec including upstream/downstream.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Canonical key of the driver to inspect; returns its full spec plus upstream/downstream edges.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true. Description adds that it returns 'full spec including upstream/downstream', providing useful context beyond the annotation. No destructive behavior hinted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single, front-loaded sentence with no wasted words. Perfectly concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present, description need not detail return values. It adequately describes the tool's purpose and output scope. No mention of errors, but not required for a simple get.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Parameter 'name' has 100% schema coverage, and the description explains it returns the driver's full spec plus edges, adding meaning beyond the schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states verb 'Get' and resource 'single driver' with scope 'full spec including upstream/downstream'. Distinguishes from siblings like flatland_list_drivers or trace tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or when-not-to-use guidance, but purpose implies use for inspecting a specific driver's full spec. Lacks exclusions or alternative tool references.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_get_graph: Get Graph (A)

Read-only

Return the full dependency structure as an adjacency list with node metadata.
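
As a rough illustration of what an agent can do with an adjacency list, the sketch below walks one breadth-first to collect everything downstream of a driver. The response shape (`nodes` and `edges` keys) and the driver names are hypothetical; the listing does not show the tool's real output schema.

```python
from collections import deque

# Hypothetical response shape: the listing does not show the real output schema.
graph = {
    "nodes": {
        "price": {"type": "input"},
        "units": {"type": "input"},
        "revenue": {"type": "computed"},
        "gross_profit": {"type": "computed"},
    },
    # Each key maps to the drivers that directly depend on it.
    "edges": {
        "price": ["revenue"],
        "units": ["revenue"],
        "revenue": ["gross_profit"],
        "gross_profit": [],
    },
}


def downstream(edges, start):
    """Return every driver reachable from `start`, in breadth-first order."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


print(downstream(graph["edges"], "price"))  # ['revenue', 'gross_profit']
```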

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already include readOnlyHint=true, so the description does not need to restate read-only behavior. It adds context about the return format (adjacency list with node metadata), but does not disclose potential size limits, pagination, or other behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single sentence of 12 words that communicates the essential purpose and output format without any extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple nature (no parameters, read-only), the description adequately covers the tool's purpose. The presence of an output schema (not shown in input) likely describes the return format further, so the description is sufficient for selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters with 100% coverage, so the baseline is 4. The description does not need to add parameter details, and it correctly focuses on the return value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'Return' and clearly states the resource 'full dependency structure as an adjacency list with node metadata'. This effectively distinguishes it from sibling tools that perform mutations or different queries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like flatland_trace_upstream or flatland_trace_downstream. The sibling tool list is long, and the description provides no context about which tool fits different use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_impact_preview: Impact Preview (A)

Preview the exact downstream impact of proposed driver changes without persisting them. Metered — counts as 1 answer.

ParametersJSON Schema

Name	Required	Description	Default
`target_drivers`	No	Optional list of output driver keys to include in output_deltas. If omitted, all changed drivers are returned.
`proposed_changes`	Yes	List of {driver, new_value} pairs to preview. The model is NOT mutated.
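
A minimal sketch of assembling the arguments described in the table above. The helper name and the driver keys (`monthly_price`, `churn_rate`, `arr`) are hypothetical; only the parameter names and the `{driver, new_value}` pair shape come from the schema.

```python
def build_impact_preview_args(proposed_changes, target_drivers=None):
    """Assemble arguments for flatland_impact_preview (hypothetical helper)."""
    changes = []
    for change in proposed_changes:
        # Each entry must carry both keys named in the schema description.
        missing = {"driver", "new_value"} - set(change)
        if missing:
            raise ValueError(f"change is missing keys: {sorted(missing)}")
        changes.append({"driver": change["driver"], "new_value": change["new_value"]})
    arguments = {"proposed_changes": changes}
    if target_drivers is not None:
        # Omitting target_drivers asks for all changed drivers.
        arguments["target_drivers"] = list(target_drivers)
    return arguments


args = build_impact_preview_args(
    [
        {"driver": "monthly_price", "new_value": 59.0},
        {"driver": "churn_rate", "new_value": 0.02},
    ],
    target_drivers=["arr"],
)
print(args)
```

Since each call is metered as one answer, batching several proposed changes into a single request is likely cheaper than previewing them one at a time, assuming the combined effect is what the agent wants to see.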

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations lack readOnlyHint or destructiveHint, so the description carries full burden. It states non-persistence and metering, providing some behavioral context, but does not cover error handling, auth needs, or limits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise, front-loaded sentences with no filler. Every word adds value: purpose, non-persistence, and metering.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given output schema exists, the description adequately covers purpose and metering. However, it could better differentiate from sibling tools like flatland_trace_downstream, which also explores downstream effects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions for both parameters. The tool description does not add additional meaning beyond what the schema already provides, warranting the baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool previews exact downstream impact of proposed driver changes without persisting them, distinguishing it from mutation tools. The metering note adds specificity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates this is for previewing changes without persistence and mentions metering, implying use before committing changes. However, it does not explicitly contrast with sibling tools like flatland_trace_downstream.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_init: Initialize Session (A)

Read-only

Initialize a Flatland session. MUST be called before any other tool.

Returns skills (workflow instructions), templates, config schema, and best practices that teach the AI agent how to use the engine effectively.
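
One way a client could enforce the call-order requirement locally is a thin wrapper that refuses other tools until flatland_init has succeeded. This is a sketch under assumptions: `FlatlandSession` and `call_tool` are hypothetical names, and the server may enforce ordering on its own.

```python
class FlatlandSession:
    """Thin wrapper that refuses to call other tools before flatland_init."""

    def __init__(self, call_tool):
        # Hypothetical callable: (tool_name, arguments) -> result
        self._call_tool = call_tool
        self._initialized = False

    def call(self, tool_name, arguments=None):
        if tool_name != "flatland_init" and not self._initialized:
            raise RuntimeError(f"call flatland_init before {tool_name}")
        result = self._call_tool(tool_name, arguments or {})
        if tool_name == "flatland_init":
            # Only mark the session ready once the init call has returned.
            self._initialized = True
        return result


# Stub client that records which tools were invoked.
calls = []


def fake_call_tool(tool_name, arguments):
    calls.append(tool_name)
    return {"tool": tool_name}


session = FlatlandSession(fake_call_tool)
session.call("flatland_init")
session.call("flatland_list_models")
print(calls)  # ['flatland_init', 'flatland_list_models']
```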

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true (consistent with initialization). Description adds value by specifying its return: skills, templates, config schema, and best practices. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: the first states the purpose, the second the call-order requirement, and the third lists the return contents. No waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, read-only) and presence of an output schema, the description fully covers what the agent needs to know about initialization.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters in the schema, so description doesn't need to add parameter info. Schema coverage is 100%, baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool initializes a Flatland session and must be called before any other tool. This distinguishes it from all sibling tools that operate after initialization.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'MUST be called before any other tool,' providing unambiguous when-to-use guidance. No exclusions needed.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_list_drivers: List Drivers (A)

Read-only

List all drivers in the active model with their types, categories, values/formulas. Lightweight inspection without compiling.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true, and the description adds value by explaining it is lightweight and does not require compiling. It also specifies the returned information (types, categories, values/formulas). This goes beyond annotations to provide context about cost and behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short sentences that efficiently convey the tool's purpose, scope, and key characteristic (lightweight, no compile). Every word contributes value, and it is appropriately front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple listing tool with no parameters and an existing output schema, the description covers all necessary aspects: what it does, what it returns, and its non-compile nature. It is complete and sufficient for an agent to select and invoke the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, so the description does not need to explain them. The input schema is empty with 100% coverage. The description adds meaning about the output, which is sufficient for a parameter-less tool. Baseline 4 applies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists all drivers in the active model, specifying the included details (types, categories, values/formulas). It distinguishes from siblings like flatland_get_driver (single driver) and compile tools by emphasizing 'lightweight inspection without compiling.'

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for quick inspection, but does not explicitly state when to use this tool versus alternatives (e.g., flatland_get_driver for a specific driver). It provides context but lacks explicit when-to-use or when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_list_models: List Models (A)

Read-only

List all saved models.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true; description adds no extra behavioral context beyond listing, but does not contradict annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded, zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter, read-only list tool, the description is sufficient; output schema exists but is not detailed in the description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist; schema coverage is 100% (empty), so description carries no burden for parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'List' and resource 'saved models', clearly distinguishing from sibling tools like flatland_create_model, flatland_delete_model.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use vs alternatives, but usage is implied as a straightforward list operation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_list_scenarios: List Scenarios (A)

Read-only

List all scenarios with their override counts.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
`result`	Yes

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the agent knows it's safe. The description adds behavioral context by specifying that the output includes 'override counts', which is beyond the annotation. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded sentence with no wasted words. It conveys purpose and key detail efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (zero parameters, read-only, output schema exists), the description is sufficient. It clearly states what the tool returns. However, it could mention if there is a limit on the number of scenarios returned or ordering, which would be helpful for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters, so schema coverage is 100%. The description does not need to explain parameters. Baseline score of 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'List all scenarios with their override counts' clearly states the action (list), resource (scenarios), and a distinguishing detail (override counts). It effectively differentiates from sibling tools like flatland_create_scenario, flatland_delete_scenario, and flatland_diff_scenarios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when needing to list scenarios, but does not explicitly state when not to use or mention alternatives like flatland_list_models. However, the context of siblings makes the intended use clear, so it is nearly complete.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_load_model: Load Model (A)

Load an existing model from disk by name.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Name (file stem) of the saved model to load and make active; as returned by flatland_list_models.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The parameter description notes that loading makes the model active, which is a behavioral effect; the tool description itself does not. Neither discloses potential side effects such as overwriting unsaved changes, or whether permissions are required. Annotations are minimal and not contradicted.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no unnecessary words. It is front-loaded and conveys the core purpose efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, output schema present), the description is largely complete. Could mention that the model becomes active for subsequent operations, but current coverage is adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage for the single parameter 'name', and the description adds valuable context by linking it to the output of flatland_list_models, clarifying the expected value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (load), resource (model), and source (from disk by name). It effectively distinguishes from siblings like flatland_create_model (create) and flatland_save_model (save).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (when a saved model exists and needs to become active), but does not provide explicit guidance on when not to use or alternatives like creating a new model. Sibling tools exist but no reference is made.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_remove_driver: Remove Driver (A)

Destructive

Remove a driver. By default fails if other drivers reference it. Set cascade=True to also remove all downstream dependents.

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Canonical key of the driver to remove.
`cascade`	No	When True, also remove all downstream dependents; required to remove a driver other drivers reference. Defaults to False.
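
The default-fail and cascade rules can be illustrated on a toy reference map. This is a local simulation of the documented behavior, not the engine's implementation; the `refs` structure and driver names are hypothetical.

```python
def remove_driver(refs, name, cascade=False):
    """Return a copy of `refs` with `name` removed, following the documented rule.

    `refs` maps each driver key to the set of driver keys its formula references.
    """
    if name not in refs:
        raise KeyError(name)
    to_remove, frontier = {name}, [name]
    while frontier:
        current = frontier.pop()
        dependents = [d for d, ups in refs.items() if current in ups and d not in to_remove]
        if dependents and not cascade:
            # Default behavior: refuse to remove a driver that others reference.
            raise ValueError(f"{name} is referenced by {sorted(dependents)}; pass cascade=True")
        to_remove.update(dependents)
        frontier.extend(dependents)
    return {d: ups for d, ups in refs.items() if d not in to_remove}


model = {
    "price": set(),
    "units": set(),
    "revenue": {"price", "units"},
    "gross_profit": {"revenue"},
}

print(sorted(remove_driver(model, "gross_profit")))         # ['price', 'revenue', 'units']
print(sorted(remove_driver(model, "price", cascade=True)))  # ['units']
```

In this toy model, cascading from `price` also removes `revenue` and `gross_profit`, which is why previewing the dependents first (for example with flatland_get_driver) is a sensible precaution before setting cascade=True.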

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare destructiveHint=true. Description adds context about default fail behavior and cascade removal of downstream dependents. However, it doesn't fully detail the recursive nature of cascade or permissions required, so moderate added value.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three short sentences with zero waste. The first defines the purpose, the second the default failure, and the third the cascade behavior. Efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers core removal behavior and main edge case (referenced drivers). Output schema is present but not detailed here. Minor gaps like rollback or confirmation are acceptable for a simple tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage. The description largely restates the schema for cascade=True. Baseline of 3 is appropriate as the description adds little beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'Remove a driver' – a specific verb and resource. Distinguishes from siblings like flatland_disable_driver, flatland_get_driver, etc. The cascade option further clarifies scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes default failure when referenced and cascade=True to remove dependents. Lacks explicit mention of when to use flatland_disable_driver as an alternative, but the guidance is clear and actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_save_model: Save Model (A)

Idempotent

Persist the current in-memory model to disk as JSON in the store directory (~/.flatland/models/). Pass name to save under an alternate file stem (e.g. name="scenario_v2" writes scenario_v2.json); omit to save under the model's own name. After first save, model auto-saves on every mutation to prevent data loss.

name is restricted to alphanumeric/underscore/hyphen to prevent filesystem-path injection. The save location is always inside the validated store directory — filesystem paths are not accepted.

ParametersJSON Schema

Name	Required	Description	Default
`name`	No	Optional alternate file stem (alphanumeric/underscore/hyphen only) to save under, e.g. 'scenario_v2'; omit to save under the model's own name.
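
To see why restricting the stem prevents path injection, the sketch below derives the save path the way the description implies. The regex is an assumed reading of "alphanumeric/underscore/hyphen", the helper name is hypothetical, and applying the same check to the model's own name is also an assumption; the server's actual validation may differ.

```python
import re
from pathlib import PurePosixPath

# Store directory named in the description, left unexpanded for illustration.
STORE_DIR = PurePosixPath("~/.flatland/models")
# Assumed reading of "alphanumeric/underscore/hyphen".
STEM_RE = re.compile(r"[A-Za-z0-9_-]+")


def resolve_save_path(model_name, name=None):
    """Return the JSON path a save would target (hypothetical helper)."""
    stem = model_name if name is None else name
    if not STEM_RE.fullmatch(stem):
        # Slashes and dots are rejected, so the path cannot leave STORE_DIR.
        raise ValueError(f"invalid file stem: {stem!r}")
    return STORE_DIR / f"{stem}.json"


print(resolve_save_path("base_case"))                      # ~/.flatland/models/base_case.json
print(resolve_save_path("base_case", name="scenario_v2"))  # ~/.flatland/models/scenario_v2.json
```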

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the idempotentHint annotation, the description discloses auto-save on mutation, filesystem path injection prevention, and the fixed store directory. This adds valuable behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured, front-loading the primary action and then adding necessary constraints and auto-save detail. No superfluous sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter and an existing output schema, the description adequately covers the action, auto-save, and security. It lacks mention of the output, but the schema fills that gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers the parameter (100% coverage), but the description adds an example usage, the effect of omission, and character restrictions. This clarifies the parameter's purpose beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'persist' and the resource 'model' along with the file format (JSON) and location. This distinguishes it from sibling tools like flatland_load_model or flatland_delete_model.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the `name` parameter and mentions auto-save behavior, giving practical guidance. However, it does not explicitly contrast with export or compile tools, missing clear exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_seam_lint_cross_namespace_writesSeam Lint: Cross-Namespace WritesA

Read-only

Seam linter: flag drivers writing into a namespace they do not own (read-only).

Always checks the structural invariant that a driver's declared namespace matches its canonical key (a corrupt-IR tripwire). When ownership (namespace -> owner-id) is supplied, also flags writer-drivers landing in a namespace the manifest never declared an owner for. Never mutates; never compiles.
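A rough sketch of the two checks described above, under stated assumptions: the canonical key is taken to be `namespace + "." + local id` (matching the "ndo.rd.velocity" example elsewhere on this page), every driver is treated as a writer, and the record shape and finding codes are hypothetical.

```python
def lint_cross_namespace_writes(drivers, ownership=None):
    """Return (finding_code, canonical_key) tuples; never mutates its inputs.

    drivers: list of dicts with 'key', 'namespace', 'local_id' (assumed shape).
    ownership: optional dict mapping namespace -> owner-id.
    """
    findings = []
    for driver in drivers:
        # Invariant check: always runs, with or without a manifest.
        expected_key = f"{driver['namespace']}.{driver['local_id']}"
        if driver["key"] != expected_key:
            findings.append(("namespace_key_mismatch", driver["key"]))
        # Manifest check: only runs when an ownership mapping is supplied.
        if ownership is not None and driver["namespace"] not in ownership:
            findings.append(("undeclared_namespace", driver["key"]))
    return findings
```

Keeping the invariant check unconditional mirrors the description: the manifest adds a second class of findings but never switches the first one off.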

ParametersJSON Schema

Name	Required	Description	Default
`ownership`	No	Optional manifest mapping namespace -> owner-id; also flags writer-drivers landing in a namespace no owner was declared for. Always checks the namespace/canonical-key invariant regardless.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotation declares readOnlyHint=true, and the description adds that it 'never mutates; never compiles', providing additional behavioral context beyond the annotation. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A short description with no wasted words. The first sentence immediately states the tool's purpose, and the remaining sentences elaborate on behavior. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no required parameters, and an output schema exists (not shown in the tool definition but indicated by context). The description fully explains the tool's behavior and what it checks. No gaps for a linter tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds extra context about the ownership parameter, explaining its role and the default behavior (always checks namespace/canonical-key invariant). This goes beyond the schema description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Seam linter: flag drivers writing into a namespace they do not own' and specifies the two invariants checked. This distinguishes it from sibling lint tools like flatland_seam_lint_dangling_imports and flatland_seam_lint_single_writer, which have different focuses.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool: to check cross-namespace write invariants. It does not explicitly compare to alternatives, but the context of sibling tools implies it is one of several linters. The description could be improved by contrasting with other seam lint tools, but the purpose is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_seam_lint_dangling_importsSeam Lint: Dangling ImportsA

Read-only

Seam linter: flag alias 'imports' whose target driver is missing (read-only).

An alias driver is a cross-namespace pointer (the model's import); a target that does not exist in the model is a dangling import that would surface as unresolved_alias at compile. Never mutates; never compiles.
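The check described above amounts to a lookup of each alias target in the set of known drivers. A minimal sketch, assuming drivers are held in a dict keyed by canonical key and that aliases carry a hypothetical `alias_of` field:

```python
def lint_dangling_imports(drivers):
    """Return canonical keys of alias drivers whose target is not in the model.

    drivers: dict mapping canonical key -> driver dict; an alias driver is
    assumed to carry an 'alias_of' field naming its target (hypothetical shape).
    """
    return sorted(
        key
        for key, driver in drivers.items()
        if driver.get("alias_of") is not None and driver["alias_of"] not in drivers
    )
```

In this sketch an alias pointing at a key that is absent from the dict is reported, which corresponds to the case the description says would surface as unresolved_alias at compile.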

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the readOnlyHint annotation, the description explains that missing targets cause 'unresolved_alias' at compile and emphasizes immutability. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three concise sentences: purpose, definition, and behavior. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no parameters and minimal annotations, the description fully explains the tool's function, behavior, and relevance. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist; description correctly avoids parameter details. Baseline score of 4 applies per scoring rules for 0 parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('flag alias imports whose target driver is missing') and the resource, distinguishing it from sibling linters like cross_namespace_writes and single_writer.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Never mutates; never compiles,' indicating a read-only lint check. Does not explicitly list when-not-to-use, but the context of linting is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_seam_lint_single_writerSeam Lint: Single WriterA

Read-only

Seam linter: flag namespaces written by more than one owner (read-only).

ownership optionally maps namespace -> owner-id. When omitted (single tenant / no manifest) every namespace is its own writer and no violations are reported; the writer inventory is still returned. With a manifest, a namespace whose writer-drivers resolve to >1 owner — or to an owner the manifest never declared — is a violation. Never mutates; never compiles.

ParametersJSON Schema

Name	Required	Description	Default
`ownership`	No	Optional manifest mapping namespace -> owner-id; flags namespaces written by more than one owner or by an undeclared owner. Omit for single-tenant (no violations reported).

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond the readOnlyHint annotation, the description explicitly states 'Never mutates; never compiles.' It also clarifies that violations only occur when a manifest is provided, adding behavioral context that annotations alone don't cover. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is short, front-loads the purpose, and every sentence adds key information. No redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool is simple with one optional parameter. The description covers its behavior thoroughly, and given an output schema exists (not shown), no further explanation of return values is needed. Complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers the 'ownership' parameter with 100% description, but the description adds context about single-tenant vs. manifest scenarios and the effect of omitting the parameter. This adds semantic value beyond the schema, earning a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it lints for namespaces written by more than one owner, using specific verbs ('lint', 'flag') and a resource ('namespaces'). It distinguishes from sibling lint tools by focusing on single-writer violations, and the 'read-only' note further clarifies intent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use: to check for multiple owners via optional manifest. It details behavior with and without 'ownership', providing clear context. While it doesn't explicitly list alternatives, the sibling tools imply other linters, so the guidance is adequate but not exhaustive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_sensitivitySensitivity AnalysisA

Read-only

Perturb each assumption ±N%, recompute, measure impact on target. Returns ranked list with elasticity.
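One common way to compute such a ranking is sketched below. It is an illustration, not Flatland's implementation: the model is stood in for by a plain function, elasticity is assumed to be the central-difference ratio of percentage change in the target to percentage change in the assumption, and the row field names are hypothetical.

```python
def sensitivity(assumptions, compute_target, perturbation=0.10):
    """Rank assumptions by the absolute elasticity of the target to each one.

    assumptions: dict of assumption name -> numeric value.
    compute_target: function taking such a dict and returning the target value.
    Assumes the base target value is non-zero (elasticity is undefined otherwise).
    """
    base = compute_target(assumptions)
    rows = []
    for name, value in assumptions.items():
        # Recompute with only this assumption moved up, then down, by the same fraction.
        high = compute_target({**assumptions, name: value * (1 + perturbation)})
        low = compute_target({**assumptions, name: value * (1 - perturbation)})
        # % change in target divided by % change in input (total swing is 2 * perturbation).
        elasticity = ((high - low) / base) / (2 * perturbation)
        rows.append({"assumption": name, "low": low, "high": high, "elasticity": elasticity})
    rows.sort(key=lambda row: abs(row["elasticity"]), reverse=True)
    return rows
```

For a toy profit model `price * units - fixed_cost`, price and units would rank above fixed cost because a given percentage move in either one shifts profit by a larger percentage.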

ParametersJSON Schema

Name	Required	Description	Default
`target`	Yes	Canonical key of the output driver whose sensitivity to each assumption is measured.
`scenario`	No	Name of the scenario to evaluate under; 'base' for the unmodified model. Defaults to 'base'.	base
`perturbation`	No	Fractional perturbation applied to each assumption (e.g. 0.10 = +/-10%). Defaults to 0.10.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, so the agent knows this is a safe read operation. The description adds behavioral context: it perturbs assumptions, recomputes, and returns elasticity. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two short, front-loaded sentences that efficiently convey the core action and output. Every word adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema, output schema existence, and clear annotations, the description is sufficiently complete. It covers the action, scope, and output format. Minor gap: does not explicitly state that the model is not permanently modified, but readOnlyHint covers that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all three parameters with descriptions. The description adds general context about sensitivity analysis but does not significantly enhance parameter meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the action (perturb assumptions, recompute, measure impact) and the output (ranked list with elasticity). Distinguishes from sibling tools like flatland_compile or flatland_diff_scenarios which serve different purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives such as flatland_diff_scenarios. The description does not mention prerequisites or cases where this tool is inappropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_trace_downstreamTrace DownstreamA

Read-only

Walk the graph forward and return all descendants.
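Walking the graph forward is a standard reachability traversal. The sketch below uses breadth-first search and assumes the dependency graph is available as a dict from each driver key to the keys that directly depend on it; that representation and the function name are illustrative only.

```python
from collections import deque


def trace_downstream(dependents, name):
    """Return every driver reachable by following 'depends on me' edges from name.

    dependents: dict mapping a driver key -> list of keys that read it directly
    (assumed representation of the model graph).
    """
    seen = set()
    queue = deque([name])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    # The starting driver is not its own descendant, even if a cycle leads back to it.
    seen.discard(name)
    return sorted(seen)
```

The `seen` set keeps the walk linear in the size of the graph and guarantees termination even if the input happens to contain a cycle.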

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Canonical key of the driver to trace; returns all descendants (everything downstream that depends on it).

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark the tool as read-only. The description adds behavioral context by specifying 'walk the graph forward' and 'return all descendants', clarifying the operation direction and result nature beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words. It is front-loaded with the action and immediately understandable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the existence of an output schema and that the tool is simple (read-only, one parameter), the description combined with schema provides adequate information. However, it could mention behavior on invalid input or additional constraints for completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the parameter description in the schema already explains the 'name' field well. The tool description itself adds no extra parameter details, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool walks the graph forward and returns all descendants, using a specific verb and resource. The stated direction clearly distinguishes it from siblings like flatland_trace_upstream.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives such as flatland_trace_upstream or when not to use it. The usage context is implied but not stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_trace_upstreamTrace UpstreamA

Read-only

Walk the graph backward and return all ancestors (full causal chain).
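Collecting the full causal chain is the mirror image of a forward walk: follow each driver's inputs until no new ones appear. A minimal sketch, assuming the graph is given as a dict from each driver key to the keys its formula reads (an illustrative representation, not Flatland's internal one):

```python
def trace_upstream(inputs, name):
    """Return every driver that feeds name, directly or transitively.

    inputs: dict mapping a driver key -> list of keys its formula reads
    (assumed representation of the model graph).
    """
    ancestors = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for parent in inputs.get(current, ()):
            if parent not in ancestors:
                ancestors.add(parent)
                stack.append(parent)
    # Exclude the starting driver itself from its own ancestor set.
    ancestors.discard(name)
    return sorted(ancestors)
```

Drivers with no entry in `inputs` are treated as assumptions (leaves), so the walk stops there.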

ParametersJSON Schema

Name	Required	Description	Default
`name`	Yes	Canonical key of the driver to trace; returns all ancestors (the full causal chain feeding it).

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A3.8/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate the tool is read-only. The description adds that it returns 'all ancestors (full causal chain)', but does not disclose potential depth limits, performance implications, or edge cases. It adds some context but is not extensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, concise sentence that immediately communicates the action and result. No unnecessary words. Front-loaded with the core behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a single parameter and an output schema (not shown), the description provides sufficient information to understand the tool's purpose. It could be considered complete for a straightforward trace operation, but lacks any mention of output structure or special behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the parameter 'name' is already well-described in the schema. The tool description does not add additional meaning beyond what the schema provides. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('walk backward') and specifies the resource ('graph') and what is returned ('all ancestors, full causal chain'). It clearly distinguishes from the sibling 'flatland_trace_downstream' by indicating direction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for finding causal ancestors but does not provide explicit guidance on when to use this tool versus alternatives like 'flatland_trace_downstream'. No exclusions or when-not advice is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_update_driverUpdate DriverA

Idempotent

Update an existing driver's value, formula, assertions, or metadata.

NDO-DOGFOOD pre-flight #1 (2026-05-06): namespace is an optional disambiguator. When set, name may be the bare local id ("velocity") and the tool resolves the canonical key ("ndo.rd.velocity") for lookup. Passing the canonical key directly also works. Updating the namespace itself is NOT supported — that would change the canonical key and break references; remove + re-add instead. Formula updates resolve bare refs within the existing driver's namespace, mirroring add_computed.
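The name resolution described above can be sketched as follows. This assumes the canonical key is simply the namespace joined to the bare local id with a dot, as in the "velocity" example; `resolve_canonical_key` is a hypothetical helper, and accepting a canonical key together with its own namespace is an assumption rather than documented behavior.

```python
def resolve_canonical_key(name, namespace=None):
    """Resolve the lookup key for a driver from 'name' and an optional namespace."""
    if namespace is None:
        # No disambiguator: 'name' is expected to be the canonical key already.
        return name
    if name.startswith(namespace + "."):
        # Assumption: a canonical key passed alongside its namespace is used as-is.
        return name
    # Bare local id: qualify it with the supplied namespace.
    return f"{namespace}.{name}"
```

The namespace only affects lookup here; consistent with the description, nothing in this sketch changes the namespace of an existing driver.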

ParametersJSON Schema

Name	Required	Description
`name`	Yes	Canonical key of the driver to update, or the bare local id when 'namespace' is supplied to disambiguate.
`tags`	No	Replacement list of tags; omit to leave unchanged.
`type`	No	New Flatland type (known type or open-enum string); omit to leave unchanged.
`label`	No	New display label; omit to leave unchanged.
`value`	No	New value for an assumption driver, validated against its type; omit to leave unchanged.
`formula`	No	New formula for a computed driver (same DSL as add_computed); omit to leave unchanged.
`namespace`	No	Optional disambiguator (e.g. 'ndo.rd') so 'name' may be the bare local id; the namespace itself cannot be changed via this tool.
`assertions`	No	Replacement list of assertions, each an object with 'condition' and 'label'; omit to leave unchanged.
`description`	No	New free-text description; omit to leave unchanged.

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare idempotentHint=true, and the description does not contradict this. It adds behavioral context: namespace cannot be changed, formula resolution mirrors add_computed. Does not discuss side effects on dependent drivers, but the idempotent annotation mitigates concerns.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is front-loaded with purpose, followed by a focused note on namespace behavior. Each sentence adds value, though slightly verbose. Good structure for parsing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all 9 parameters with schema, includes namespace resolution and formula mirroring. Output schema exists, so return values are handled. Lacks mention of prerequisite (driver existence) and error conditions, but is mostly complete given the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline at 3. The description adds meaning by explaining namespace disambiguation, the effect of omitting parameters (leave unchanged), and the relationship between formula and add_computed. Exceeds baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Update an existing driver's value, formula, assertions, or metadata.' It specifies the verb (update) and resource (driver), and distinguishes from sibling add/remove tools by focusing on modification. No tautology or misleading statements.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on namespace resolution and warns against updating the namespace itself, recommending remove/re-add instead. However, it does not broadly contrast with other update-like siblings such as disable_driver or remove_driver.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

flatland_validateValidate ModelA

Read-only

Run type checks and assertions without full recomputation. Quick health check.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Output Schema

ParametersJSON Schema

Name	Required	Description
No output parameters

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already convey readOnlyHint=true, so the read-only nature is known. The description adds value by specifying that it runs type checks and assertions, and that it avoids full recomputation. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with two short sentences that immediately convey the core action. Every word adds value, and it is well front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite the simple nature (no parameters), the description is complete: it explains what the tool does and its lightweight nature. The presence of an output schema means return values are handled elsewhere, so no omission.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has zero parameters, so the schema coverage is 100% trivially. The description does not need to add parameter info. Baseline for 0 parameters is 4, and the description is sufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Run type checks and assertions without full recomputation. Quick health check.' It uses specific verbs and resources, distinguishing it from siblings like flatland_compile which performs full recomputation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (quick health check) and hints at alternatives ('without full recomputation' suggests full compilation is another option). It does not explicitly state when not to use or list alternatives, but the context from sibling tools fills the gap.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
