Negotiation Copilot for Agents (SNHP)

Name: Negotiation Copilot for Agents (SNHP)
Author: ryuxik

by io.github.ryuxik

Server Details

Math-optimal negotiation moves for AI agents, in plain dollars (single + multi-issue).

Status: Unhealthy
Last Tested: 2026-07-22 00:34
Transport: Streamable HTTP
URL
Repository: ryuxik/snhp
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client

Glama

MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

B3.1/5.0

Tool DescriptionsB

Average 3.8/5 across 34 of 34 tools scored. Lowest: 2.1/5.

Server CoherenceB

Disambiguation3/5

Many tools have overlapping purposes, especially the multiple negotiation and auction tools. While descriptions are detailed, an agent could easily confuse `gt_negotiate_turn` with `gt_negotiation_buy_next_offer` or the various auction tools. Some tools like `gt_a2a_*` are distinct, but overall there is moderate ambiguity.

Naming Consistency2/5

Naming is inconsistent: some tools use the `gt_` prefix while others use `nextmove_` or no prefix. Within `gt_`, there is variation like `gt_negotiate_*` and `gt_negotiation_*`. Patterns such as snake_case and descriptive names are mixed, making the set feel chaotic.

Tool Count3/5

34 tools is on the high side for a negotiation copilot, potentially overwhelming. However, the broad domain covering negotiation, auctions, matching, and peer verification partially justifies the count. Many tools appear redundant, so a smaller, more curated set might be more effective.

Completeness4/5

The server covers a wide range of negotiation and mechanism design scenarios: single-issue, multi-issue, auctions, posted price, matching, and peer verification. Minor gaps exist, such as lack of tools for arbitration or conflict resolution, but the core functionality is well-covered.

Available Tools

34 tools

gt_a2a_build_peer_proofAInspect

STEP 1 of the A2A flow — sign a per-negotiation peer proof, LOCALLY.

Bind your registered identity (operator_attestation_jwt from gt_a2a_register_operator) to THIS negotiation_id and role ('seller'/'buyer') using your operator private key (base64 of the raw 32-byte Ed25519 key). The key never leaves this process — run this MCP server on your own host. The proof is short-lived and can't be replayed to another negotiation or role. NEXT: send it to the counterparty, get theirs, then gt_a2a_open_session with both.

ParametersJSON Schema

Name	Required	Description	Default
`role`	Yes
`operator_id`	Yes
`ttl_seconds`	No
`negotiation_id`	Yes
`private_key_b64`	Yes
`operator_attestation_jwt`	Yes

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the key never leaves the process, the proof is short-lived (implied ttl), cannot be replayed, and is signed locally using the operator private key. However, it does not describe failure modes or error handling, which keeps it from a perfect score.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded paragraph of about 66 words. It efficiently conveys purpose, key parameters, security context, and next steps without any wasteful or redundant sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description would ideally describe the return value (likely a JWT). However, it clearly places the tool in a multi-step flow, explains what it produces (a peer proof), and tells the agent what to do next. It lacks output format details, which slightly reduces completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description adds meaning for key parameters: operator_attestation_jwt (source), negotiation_id (binding), role (seller/buyer), private_key_b64 (encoding). It does not explain operator_id or explicitly mention ttl_seconds beyond 'short-lived', so not fully comprehensive.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'sign a per-negotiation peer proof, LOCALLY' with clear verb and resource. It identifies as 'STEP 1 of the A2A flow', distinguishing it from sibling tools like gt_a2a_open_session and gt_a2a_register_operator, which are other steps in the same flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly provides sequence: 'STEP 1... NEXT: send it to the counterparty, get theirs, then gt_a2a_open_session with both.' This tells the agent when to use this tool and what follows, effectively differentiating it from other A2A tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_next_offerAInspect

STEP 4 of the A2A flow — your next move, using the SESSION's verified peer_mode.

role is 'seller' or 'buyer'. The recommender uses the session's server-derived peer_mode (not a self-asserted flag), so the cooperation premium applies only on a genuinely verified session. my_reservation and the offer histories are in NORMALIZED utility [0,1] (map dollars the way gt_negotiate_turn does, or use gt_negotiate_turn/gt_negotiate_bundle for the math and this path for the premium + settlement). NEXT, once you agree on a price: gt_a2a_settle.

ParametersJSON Schema

Name	Required	Description	Default
`role`	Yes
`prior`	No
`session_id`	Yes
`pareto_knob`	No
`my_reservation`	Yes
`deadline_rounds`	No
`my_offer_history`	Yes
`opponent_offer_history`	Yes

Tool Definition Quality

A3.6/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It reveals behavioral traits like using server-derived peer_mode and normalized utility scores, but does not disclose output structure, destructiveness, rate limits, or error behavior, leaving significant gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single focused paragraph of ~80 words, front-loading the purpose and key context. Every sentence adds value, though it could be more structured with bullet points for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 params, no annotations, no output schema, many siblings), the description provides moderate context (A2A flow, normalized utilities) but lacks details on output, optional parameters, and error handling, leaving the agent with unanswered questions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains that 'my_reservation' and offer histories are in normalized [0,1] utility, covering 5 of 8 parameters partially. Parameters like 'prior', 'pareto_knob', and 'deadline_rounds' are not described, leaving gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is 'STEP 4 of the A2A flow' for making the next offer using a verified session and normalized utilities. It distinguishes from sibling tools like gt_negotiate_turn/bundle by noting this path is for premium+settlement, but could be more explicit about when to use each sibling tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly positions the tool as step 4 and directs the user to use gt_negotiate_turn/bundle for the math and gt_a2a_settle after agreement. It provides clear context for when to use this tool, though it lacks 'when not to use' statements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_open_sessionAInspect

STEP 3 of the A2A flow — verify BOTH peer proofs and open a session.

Submit the seller_proof and buyer_proof (yours + the counterparty's from STEP 2). Returns session_id and peer_mode. peer_mode is server-derived and is True ONLY if both proofs verify (at/above require_level), are for their roles, are unexpired, and name DISTINCT operators — so the premium can't be claimed by lying. Pass require_level='domain' to demand domain-verified counterparties. NEXT: gt_a2a_next_offer with the returned session_id.

ParametersJSON Schema

Name	Required	Default
`buyer_proof`	Yes
`seller_proof`	Yes
`require_level`	No	self
`negotiation_id`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explains the return values (session_id, peer_mode) and describes the verification logic: both proofs must verify at/above require_level, be for correct roles, unexpired, and from distinct operators. It also explains the effect of the require_level parameter ('domain' demands domain-verified counterparties). This provides good behavioral insight beyond mere input/output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: about four sentences, front-loaded with the step number and core action. Every sentence adds value—verification logic, return values, next step. No redundant or extraneous information. It is well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 4 parameters with nested objects and no output schema, the description provides essential context: expected input (proofs from step 2), behavior (verification conditions), output (session_id and peer_mode), and next step. It lacks detailed field descriptions for the nested objects, but for a flow step tool, it is sufficiently complete. The description could mention error scenarios or edge cases, but overall meets the needs for an AI agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must add meaning. It explains seller_proof and buyer_proof as 'yours + the counterparty's from STEP 2', and describes require_level with enumeration values. However, it does not detail the structure of the proof objects (nested with additionalProperties), nor explain negotiation_id beyond implying it connects to previous steps. The description adds some semantics but not comprehensive parameter guidance.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it is 'STEP 3 of the A2A flow — verify BOTH peer proofs and open a session.' The verb 'verify' and 'open' clearly indicate the action, and 'session' is the resource. It distinguishes from siblings by positioning in the A2A flow and naming the next tool, making purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description instructs to submit seller_proof and buyer_proof from STEP 2, and indicates the next step (gt_a2a_next_offer). It implies usage context as part of a sequence, but does not explicitly state when not to use it or alternative tools. Sibling tools like gt_a2a_build_peer_proof suggest this follows proof building, but the description could be clearer about prerequisites or scenarios to avoid.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_register_operatorAInspect

USE THE A2A FLOW ONLY WHEN the counterparty ALSO runs SNHP; it unlocks a cooperation premium (more joint surplus between verified peers) plus a signed, settleable AP2 deal record. Against an unknown counterparty, just use gt_negotiate_turn / gt_negotiate_bundle — none of this is needed.

THE FLOW (you and the counterparty each have an Ed25519 keypair): 0. gt_a2a_register_operator -> your signed identity attestation (this) [optional] gt_a2a_request_domain_challenge + gt_a2a_verify_domain -> upgrade to sybil-resistant domain identity

gt_a2a_build_peer_proof -> a per-negotiation proof (signs LOCALLY)
exchange proofs with the counterparty (your channel / an A2A message)
gt_a2a_open_session -> session_id + peer_mode (TRUE iff both verify)
gt_a2a_next_offer -> recommendation using the SESSION's peer_mode
gt_a2a_settle -> a signed AP2 Cart Mandate (the deal record)

This step: register operator_id with your base64 Ed25519 PUBLIC key; returns a trust-anchor-signed attestation JWT to present in peer proofs.

ParametersJSON Schema

Name	Required	Description	Default
`operator_id`	Yes
`display_name`	No
`public_key_b64`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses that this step creates a 'signed identity attestation' and returns a 'trust-anchor-signed attestation JWT'. It also describes optional domain verification steps. However, it doesn't mention any potential side effects (e.g., whether registration is persistent or one-time), but since it's a registration action, the behavior is sufficiently clear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with purpose and usage, which is good. However, the enumerated flow (steps 0-5) includes steps beyond this tool, which adds extraneous detail. Could be more concise by focusing only on this step and referencing the full flow elsewhere.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of the A2A flow and the tool's role as step 0, the description provides sufficient context: why this tool exists, when to use it, and what it returns (a JWT). The lack of an output schema is mitigated by the description. It explains the optional domain upgrade and ties into sibling tools, making it complete for a registration step.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description must compensate. It clarifies that 'public_key_b64' is a base64 Ed25519 PUBLIC key, which adds meaning. However, 'operator_id' and 'display_name' are not elaborated beyond the schema's type info. The description partially compensates for the missing schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Purpose is explicitly stated: 'Register your operator identity — STEP 0 of the verified-peer deal flow.' It clearly identifies the tool's action (register) and resource (operator identity), and distinguishes it from sibling tools by marking it as step 0 in a specific flow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear when-to-use guidance: 'USE THE A2A FLOW ONLY WHEN the counterparty ALSO runs SNHP' and explicitly names alternatives ('Against an unknown counterparty, just use gt_negotiate_turn / gt_negotiate_bundle'). Also explains the broader flow and optional subsequent steps.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_request_domain_challengeCInspect

Get the DNS-TXT record to publish to prove control of domain (sybil- resistant, domain-level identity).

ParametersJSON Schema

Name	Required	Description	Default
`domain`	Yes
`public_key_b64`	Yes

Tool Definition Quality

C2.7/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose any behavioral traits such as idempotency, rate limits, authentication requirements, or side effects. It only describes the high-level output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, efficient and front-loaded with key information. However, it sacrifices necessary details for brevity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations, output schema, and only 0% schema parameter coverage, the description fails to provide enough context. It does not explain the return value format, prerequisites, or the role of public_key_b64.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%. The description explains the 'domain' parameter implicitly but completely omits the 'public_key_b64' parameter, leaving its purpose and format unclear.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Get the DNS-TXT record') and the purpose ('to prove control of domain'), with added context of sybil-resistance. It distinguishes itself from siblings like gt_a2a_verify_domain which likely uses the result.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies it's a step in proving domain control but provides no explicit when-to-use, prerequisites, or comparison to alternatives (e.g., gt_a2a_verify_domain).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_settleAInspect

STEP 5 (final) of the A2A flow — mint the signed deal record.

Once both sides agree on agreed_price, this emits a signed AP2 Cart Mandate (a non-repudiable VC-JWT naming both verified parties) — and an Intent Mandate too if you pass buyer_max_price. Refused unless the session is peer_mode=True (both verified and distinct), so a Cart Mandate always names two real, verified parties. No escrow; the mandate is the settleable record.

ParametersJSON Schema

Name	Required	Default
`item`	No
`terms`	No
`currency`	No	USD
`session_id`	Yes
`agreed_price`	Yes
`buyer_max_price`	No

Tool Definition Quality

A3.7/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description must disclose behavioral traits. It mentions that the tool emits a signed AP2 Cart Mandate and optionally an Intent Mandate, and that it is 'Refused unless ... peer_mode=True'. It also notes 'No escrow'. However, it does not describe failure modes, idempotency, or other side effects beyond the emitted mandates. This is adequate but not comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is six sentences and front-loads the purpose. It is concise without unnecessary fluff. However, heavy use of jargon (AP2 Cart Mandate, VC-JWT) may reduce clarity for some agents, but it remains structurally efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 6 parameters with 0% schema coverage and no output schema, the description should provide more detail. It covers the main behavioral aspects and conditions but fails to document half the parameters or explain return format. This makes it only minimally complete for an AI agent relying solely on this text.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must add meaning for all 6 parameters. It explicitly explains the role of 'agreed_price' (what both sides agree on) and 'buyer_max_price' (triggers Intent Mandate), but does not mention 'session_id', 'item', 'terms', or 'currency'. This leaves significant gaps in parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as 'STEP 5 (final) of the A2A flow' and its purpose is to 'mint the signed deal record'. It specifies the outputs (Cart Mandate and optional Intent Mandate) and distinguishes from siblings by positioning itself as the final settlement step in the A2A process.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states that settlement occurs 'Once both sides agree on agreed_price' and conditions refusal unless 'peer_mode=True (both verified and distinct)'. This provides clear context for when to use. However, it does not explicitly state when not to use or compare with alternative tools, but the flow context (STEP 5) implies ordering.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_a2a_verify_domainCInspect

Verify the published DNS-TXT challenge and register domain as a domain-verified operator.

ParametersJSON Schema

Name	Required	Description	Default
`domain`	Yes
`display_name`	No
`public_key_b64`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should disclose side effects, failure modes, or idempotency. It only states 'verify and register', omitting what happens on failure, whether registration is idempotent, or if it modifies state beyond registration.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence, making it concise. However, it sacrifices completeness for brevity; a slightly longer description with structured info would be more helpful.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 3 parameters, no output schema, and no annotations, the description is critically incomplete. It fails to explain the tool's role in the broader workflow (e.g., that it should follow request_domain_challenge) or what the return value indicates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no information about the three parameters (domain, display_name, public_key_b64). The agent must guess their roles from names alone, which is insufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's action: verify a DNS-TXT challenge and register the domain as a domain-verified operator. It distinguishes from siblings like request_domain_challenge (which asks for a challenge) and register_operator (which may be a different registration step). However, it could be more precise about what 'verify' entails.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., must have requested a challenge first) or when not to use it. Users are left to infer the workflow from sibling names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_auction_format_recommendationCInspect

Recommend format from {first_price, vickrey, english} given weights.

ParametersJSON Schema

Name	Required	Description	Default
`weights`	No
`n_bidders`	Yes
`seller_valuation`	Yes
`bidder_value_prior`	Yes

Tool Definition Quality

C2.1/5.0

Behavior1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. The description is misleading: it implies only 'weights' matters, but the schema requires three other parameters (bidder_value_prior, n_bidders, seller_valuation). No disclosure of return value, side effects, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness2/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short (one sentence, 10 words), but it sacrifices clarity for brevity. It fails to convey crucial information about required inputs, making it under-specified rather than concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters (3 required, nested objects) and no output schema, the description is severely incomplete. It does not explain inputs, output format, or usage context, leaving the agent with insufficient information to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description must explain parameters. It only mentions 'weights' (an optional parameter) and ignores the three required parameters (bidder_value_prior, n_bidders, seller_valuation). No explanation of what 'weights' means or how parameters interact.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (recommend), the resource (auction format from a specific set), and mentions the key input (weights). It differentiates from sibling tools like optimal_bid or simulate by focusing on format recommendation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like gt_auction_optimal_bid or gt_auction_simulate. No when/when-not scenarios or prerequisites are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_auction_optimal_bidAInspect

The bid to place when you're BIDDING in an auction, in plain dollars.

USE THIS WHEN: you're a bidder and want the bid that maximizes your expected surplus without overpaying. NOT for running an auction (use gt_auction_optimal_reserve) or 1:1 haggling (use gt_negotiate_turn).

Provide: auction_format ("first_price" sealed bid, "second_price_vickrey", or "english_ascending"); my_valuation (what the item is worth to YOU, in $); n_competing_bidders (how many OTHER bidders, not counting you); and competitor_value_prior — a rough model of what rivals will pay, e.g. {"family":"uniform","params":{"low":0,"high":6000}} (or {"family":"lognorm","params":{"mu":8.5,"sigma":0.4}}). Estimate it if unknown. Returns {optimal_bid, expected_surplus, win_probability, dominant_strategy, rationale} — bid and surplus in the SAME $ you passed in.

Example: a domain worth $5,000 to you, 4 rivals who'd pay up to ~$6,000, in a sealed first-price auction -> gt_auction_optimal_bid(auction_format="first_price", my_valuation=5000, n_competing_bidders=4, competitor_value_prior={"family":"uniform","params":{"low":0,"high":6000}}) -> optimal_bid ~$4,000, win_probability ~0.48.

ParametersJSON Schema

Name	Required	Description	Default
`my_valuation`	Yes
`reserve_price`	No
`risk_aversion`	No
`auction_format`	Yes
`n_competing_bidders`	Yes
`competitor_value_prior`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It explains outputs (optimal_bid, expected_surplus, win_probability, etc.) and units (same $). It implies this is a pure computation tool with no side effects, though it doesn't explicitly state it's read-only or non-destructive. The description is quite transparent about what the tool does and returns, but could mention that it does not modify any state.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: first line defines purpose, then usage conditions, then parameter list with explanations, and finally an example. It is not overly long (about 15 lines) and every sentence adds value. Some redundancy could be trimmed (e.g., 'in plain dollars' appears twice), but overall it is efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 6 parameters, a nested object, and no output schema, the description covers the input at a high level but omits two optional parameters. It lists output fields but does not provide types or detailed structure. The example helps but doesn't compensate for missing details on reserve_price and risk_aversion. The tool is moderately complex; the description is adequate but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so description must compensate. It explains all four required parameters: auction_format (with enum), my_valuation (worth to you), n_competing_bidders (others), competitor_value_prior (with example). However, it omits the two optional parameters (reserve_price, risk_aversion) present in the schema, failing to fully compensate for schema coverage. The description adds significant value for the required params but misses the optional ones.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it computes the optimal bid for a bidder in an auction, with specific verb 'bid to place' and resource 'auction'. It explicitly distinguishes from sibling tools for running an auction (gt_auction_optimal_reserve) and haggling (gt_negotiate_turn), providing clear differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: 'USE THIS WHEN: you're a bidder...NOT for running an auction (use gt_auction_optimal_reserve) or 1:1 haggling (use gt_negotiate_turn)'. It also includes a detailed example showing exact parameter values and expected output, making usage crystal clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_auction_optimal_reserveAInspect

Set the revenue-optimal RESERVE PRICE (minimum bid you'll accept) for an auction.

USE THIS WHEN: you're running an auction or sale with multiple bidders and need the floor price that maximizes your expected revenue. NOT for one-on-one haggling (use gt_negotiate_turn for that).

Provide: n_bidders (how many bidders), seller_valuation (what the item is worth to YOU, in $), and bidder_value_prior — a rough model of what bidders will pay, e.g. {"family":"uniform","params":{"low":2000,"high":8000}}. Estimate it if unknown. Returns the reserve price and expected revenue.

Example: a painting, ~5 bidders, worth $1,000 to you, bidders likely pay $2,000–$8,000 -> gt_auction_optimal_reserve(n_bidders=5, seller_valuation=1000, bidder_value_prior={"family":"uniform","params":{"low":2000,"high":8000}}).

ParametersJSON Schema

Name	Required	Description	Default
`n_bidders`	Yes
`seller_valuation`	Yes
`bidder_value_prior`	Yes

Tool Definition Quality

A4.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations, so description carries full burden. It explains inputs and outputs (reserve price and expected revenue). Lacks explicit statement on side effects or destuctiveness, but as a calculation tool it's likely benign. Could mention assumptions or limits, but still good.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured: one-line purpose, then use case, parameter explanations, and example. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description explains return values. Covers all needed context for correct invocation: when, how, parameters, example.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, yet description thoroughly explains each parameter: n_bidders, seller_valuation, and bidder_value_prior with model and example. Adds significant value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States explicitly: 'Set the revenue-optimal RESERVE PRICE (minimum bid you'll accept) for an auction.' Clear verb+resource, and distinguishes from negotiation and other auction tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use ('running an auction or sale with multiple bidders'), when-not-to-use ('NOT for one-on-one haggling'), and direct alternative ('use gt_negotiate_turn for that'). Also includes a concrete example.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_auction_simulateCInspect

Monte Carlo auction revenue + efficiency.

ParametersJSON Schema

Name	Required	Description	Default
`seed`	No
`bidder_priors`	Yes
`n_simulations`	No
`reserve_price`	Yes
`auction_format`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description should disclose behavioral traits. It fails to mention return values, side effects, or that results are stochastic and depend on the seed parameter. The agent has insufficient information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with one short sentence, but it sacrifices essential information. It is not necessarily wasteful, but it is incomplete, failing to earn its place by providing adequate value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 5 parameters, 3 required, and no output schema, the description is grossly incomplete. It does not explain what the simulation returns, how to interpret results, or how to configure bidder_priors, making it hard for an agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, and the description adds no meaning to any parameter. The agent must rely solely on the schema, which lacks descriptions for properties like bidder_priors or n_simulations.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it performs Monte Carlo simulation for auction revenue and efficiency, which is a specific verb and resource. However, it does not distinguish from sibling auction tools like gt_auction_optimal_bid or gt_auction_optimal_reserve, which also relate to auction outcomes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. No context on prerequisites, use cases, or exclusions is given, leaving the agent without decision support.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_mechanism_gale_shapleyAInspect

Match two groups by their rankings so no pair would rather swap (a STABLE matching).

USE THIS WHEN: you're assigning two sides to each other by mutual preference — interns<->teams, students<->schools, mentors<->mentees — and want a result with no "blocking pair" (no person+slot that both prefer each other over what they got).

Provide proposers and receivers, each a list of {"id": name, "preferences": [ids of the OTHER side, most-wanted first]}. Receivers may add "capacity" (default 1) to accept several. Returns {matching (name -> name), unmatched_proposers, blocking_pairs (empty list = provably stable), n_proposals}. NOTE: the result is PROPOSER-optimal, so put the side you want to favor in proposers.

Example: gt_mechanism_gale_shapley( proposers=[{"id":"Ana","preferences":["Growth","Core"]}, {"id":"Ben","preferences":["Core","Growth"]}], receivers=[{"id":"Growth","preferences":["Ben","Ana"]}, {"id":"Core","preferences":["Ana","Ben"]}]) -> matching {"Ana":"Growth","Ben":"Core"}, blocking_pairs [].

ParametersJSON Schema

Name	Required	Description	Default
`proposers`	Yes
`receivers`	Yes

Tool Definition Quality

A4.7/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description covers behavioral aspects: returns matching, unmatched_proposers, blocking_pairs, n_proposals; notes that empty blocking_pairs means stable; explains proposer-optimality. It does not mention error handling or validation, which would improve it further.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: purpose sentence, usage guidance, parameter details, return description, and a full example. Every sentence adds value, and it is concise without being too brief.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters and no output schema, the description is quite complete: it explains inputs, outputs, and provides an example. Minor improvement would be to mention edge cases like incomplete preferences or invalid capacities, but overall it is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description fully explains the structure of proposers and receivers: each is a list of objects with 'id' and 'preferences' fields, and receivers can have 'capacity'. This adds all needed semantics beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Match two groups by their rankings so no pair would rather swap (a STABLE matching).' It uses a specific verb and resource, and distinguishes from sibling tools that deal with auctions, negotiations, or other mechanisms.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'USE THIS WHEN' and provides concrete examples (interns-teams, students-schools). It also explains the proposer-optimal property, helping the agent decide which side to favor.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_mechanism_optimal_auction_designBInspect

Myerson revenue-optimal mechanism for asymmetric IPV. Per-bidder reserves; collapses to second-price-with-reserve under symmetric IPV.

ParametersJSON Schema

Name	Required	Default
`seed`	No
`objective`	No	revenue
`bidder_priors`	Yes
`n_simulations`	No
`seller_valuation`	Yes

Tool Definition Quality

B3/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are present, so the description must carry the full burden. It mentions per-bidder reserves and collapse to second-price, but lacks details on computational cost, side effects, or what the output represents. Behavioral traits are insufficiently disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very short—one sentence and a fragment—which is concise. However, it lacks structure (no headings or bullet points). It earns its place by being informative but could be better organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 5 parameters, no output schema, and no annotations, the description is incomplete. It does not explain return values, how to interpret results, or any dependencies. For a tool of moderate complexity, more context is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, meaning the description provides no parameter details. It does not explain 'bidder_priors', 'seller_valuation', 'objective', 'seed', or 'n_simulations'. The description adds no value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it implements Myerson's revenue-optimal mechanism for asymmetric IPV with per-bidder reserves, and mentions the special case of symmetric IPV. It distinguishes from sibling tools like gt_auction_simulate and gt_mechanism_posted_price_optimal.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit when-to-use or when-not-to-use guidance is given. It implies usage for optimal auction design under IPV, but does not discuss alternatives or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_mechanism_posted_price_optimalAInspect

Best price (and markdown schedule) to clear a FIXED stock by a DEADLINE, in plain dollars.

USE THIS WHEN: you must sell a fixed number of units before a cutoff and demand arrives over time — event tickets, perishable inventory, end-of-life stock. NOT for 1:1 haggling (gt_negotiate_turn) or auctions (gt_auction_*).

Provide: inventory (units to sell); horizon_seconds (selling window in SECONDS — 14 days = 14243600 = 1209600); arrival_rate_per_second (expected shoppers per second = expected total shoppers / horizon_seconds); and buyer_arrival_prior — a rough model of willingness-to-pay, e.g. {"family":"uniform","params":{"low":40,"high":150}}. Returns {static_price (one good fixed price), static_expected_revenue, dynamic_schedule (list of {t_seconds, recommended_price} markdown waypoints), sellthrough_rate, rationale} — all prices in the SAME $ as your prior.

Example: 200 tickets, 14-day window, ~600 shoppers willing to pay $40-$150 -> gt_mechanism_posted_price_optimal(inventory=200, horizon_seconds=1209600, arrival_rate_per_second=600/1209600, buyer_arrival_prior={"family":"uniform","params":{"low":40,"high":150}}) -> static_price ~$112, schedule marks down $114 -> ~$76 as the deadline nears.

ParametersJSON Schema

Name	Required	Description	Default
`seed`	No
`inventory`	Yes
`n_simulations`	No
`horizon_seconds`	Yes
`buyer_arrival_prior`	Yes
`arrival_rate_per_second`	Yes

Tool Definition Quality

A4.8/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It explains what the tool computes and returns (static_price, dynamic_schedule, etc.) and constraints (fixed stock, deadline). It does not mention side effects, auth needs, or rate limits, but given the tool type (mechanism design), it is likely safe and read-only. The transparency is good but could explicitly state read-only behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: purpose sentence, usage guidelines, parameter instructions, return values, and example. Each sentence earns its place. It is front-loaded with purpose and not overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 params, nested object, no output schema), the description is remarkably complete. It covers all required parameters, explains the optional parameters indirectly, details the return fields, and provides a full example. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description fully compensates by explaining inventory, horizon_seconds, arrival_rate_per_second, and buyer_arrival_prior in prose. The example shows how to compute arrival_rate_per_second. The nested object buyer_arrival_prior is explained with a concrete example. The description adds significant meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Best price (and markdown schedule) to clear a FIXED stock by a DEADLINE, in plain dollars.' It uses specific verbs (clear, sell) and identifies the resource (fixed stock, deadline). It distinguishes from sibling tools about auctions and negotiation by explicitly naming them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: 'USE THIS WHEN: you must sell a fixed number of units before a cutoff and demand arrives over time — event tickets, perishable inventory, end-of-life stock. NOT for 1:1 haggling (gt_negotiate_turn) or auctions (gt_auction_*).' It gives clear context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_bundleAInspect

Negotiate SEVERAL linked issues at once by logrolling — in plain terms.

USE THIS WHEN: a deal has more than one issue on the table and they trade off — a job offer (base + equity + signing), a SaaS contract (price + seats + term + SLA), any package deal. It concedes on the issues you care about LESS (and the other side cares about MORE) to win the ones you care about most — a trade that beats splitting every issue down the middle. For a single PRICE, use gt_negotiate_turn instead.

Provide issues: a list of {"name", "options" (the choices), "my_utility" (how good each option is to YOU — one number per option, any scale), "their_utility" (how good each option is to THEM — their preference direction)}. Optionally my_priorities ({issue_name: weight}, how much each issue matters to you) and their_offers (their packages so far as {issue_name: option}, oldest first — this is what lets it INFER their priorities). Returns {action, recommended_offer (issue -> option), message, my_utility, their_expected_utility, inferred_their_priorities, trade_logic, fit, confidence, acceptance_probability}.

Validated (separately from the single-issue +12%): returns a Pareto-efficient package that beats naive "split-every-issue-down-the-middle" bargaining by ~40% joint surplus (300 random 4-issue profiles). HONEST CAVEAT: the priority INFERENCE layered on top is weak (recovery r≈0.3) and currently adds only ~1% (and can be slightly NEGATIVE against some opponents) over the same engine run with no inference — so the proven value today is the efficient-package search, not (yet) the logrolling edge.

Optional timing refinement: pass rounds_left (bargaining rounds remaining) with compute_ms > 0 to spend that many ms of Monte-Carlo rollouts choosing WHICH package to hold for as the other side concedes over the rounds — a firmer package closes later (discounted) than a generous one. 0 = the instant closed-form package; the reply then carries a compute block. Modest by design (never worse than the closed form in-model; helps on a minority of deals).

Example: a SaaS contract — you most want a low price_per_seat, can flex on seats/term/SLA. gt_negotiate_bundle(issues=[ {"name":"price_per_seat","options":["$50","$40","$30"],"my_utility":[0,0.5,1],"their_utility":[1,0.5,0]}, {"name":"sla","options":["99%","99.9%"],"my_utility":[0,1],"their_utility":[1,0]} ...], my_priorities={"price_per_seat":0.55,"sla":0.1,...}, their_offers=[...]) -> a full package that gives ground on SLA to hold the price.

ParametersJSON Schema

Name	Required	Description	Default
`issues`	Yes
`my_batna`	No
`compute_ms`	No
`rounds_left`	No
`their_offers`	No
`my_priorities`	No
`their_batna_estimate`	No

Tool Definition Quality

A4.8/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description fully details behavior: returns Pareto-efficient package, admits weak priority inference (recovery r≈0.3, adds ~1%), explains timing refinement trade-offs. It is honest and thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured with sections (USE THIS WHEN, caveat, optional timing, example). Every sentence adds value, though it could be slightly more compact without losing clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, no output schema, and no annotations, the description fully covers return values (action, recommended_offer, message, utilities, etc.), caveats, and usage scenarios. An example ties everything together, making it very complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 0% schema coverage, the description explains the core `issues` parameter in detail with a SaaS example, including utility mapping and trade logic. Other parameters like `my_batna` and `their_batna_estimate` are mentioned but not explained, leaving minor gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool negotiates multiple linked issues using logrolling, provides examples (job offer, SaaS contract), and distinguishes from the single-issue sibling gt_negotiate_turn. It uses specific verbs and resources, making the purpose unmistakable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'USE THIS WHEN' for deals with multiple trade-offs, contrasts with gt_negotiate_turn for single prices, and explains optional timing refinement usage. It gives clear context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_close_sessionBInspect

Close a pondering session and cancel any in-flight background speculation.

ParametersJSON Schema

Name	Required	Description	Default
`session_id`	Yes

Tool Definition Quality

B3.4/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses that closing cancels in-flight speculation, but does not mention side effects like reversibility, state changes, or resource release. Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence with no wasted words. Perfectly concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple close operation, the description tells the basic action and side effect, but lacks information about return value, error conditions, and whether the session must be open. With no output schema, more context would help.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter 'session_id' exists in the schema with just type and title. The description adds no information about the parameter's meaning, source, or expected format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'close' and the resource 'a pondering session', with an additional detail about canceling in-flight speculation. This distinctly differentiates it from sibling tools like gt_negotiate_open_session.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives. It does not mention prerequisites, conditions, or when not to use (e.g., if the session is already closed).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_open_sessionAInspect

Open a stateful price-negotiation session that PONDERS on the other side's clock.

Unlike one-shot gt_negotiate_turn, a session remembers the running history and — after each propose/respond — speculates in the BACKGROUND over the counter-offers the other side is likely to make, pre-solving your reply to each. So while you're blocked waiting for their response, idle compute is already searching; when their counter arrives, gt_negotiate_respond often returns an instant, deeply-searched move. side='sell'/'buy', walk_away/target in dollars (same meaning as gt_negotiate_turn), compute_ms = rollout budget per move. Returns {session_id}. NEXT: gt_negotiate_propose to make your opening move.

ParametersJSON Schema

Name	Required	Default
`item`	No	this
`side`	Yes
`target`	Yes
`walk_away`	Yes
`compute_ms`	No
`rounds_left`	No

Tool Definition Quality

A4.4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes session history, background speculation, and faster responses. Explains parameters and return value. However, does not mention session lifecycle (need to close) or resource implications beyond compute_ms. Without annotations, description carries full burden—minor gap in lifecycle disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with strong opening line, explanatory paragraph, parameter mapping, and next-step hint. Every sentence adds value, no redundancy. Appropriate length for a complex tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Explains background computation well and specifies return {session_id}. Lacks mention of session closure (sibling tool exists) and potential pitfalls like resource limits. Given no output schema, covers core functionality but misses lifecycle context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage. Description explains side, walk_away, target, compute_ms with examples. But item and rounds_left are not described. Adds meaning for most required parameters but not all, leaving gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it opens a stateful negotiation session, distinguishes from one-shot gt_negotiate_turn, and specifies it is the first step followed by gt_negotiate_propose. The verb 'open' and resource 'session' are specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly contrasts with gt_negotiate_turn, implies when to use (stateful, background speculation), and directs to next step (gt_negotiate_propose). Provides clear context for when this tool is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_proposeCInspect

Make your next move in a pondering session and kick off background speculation.

Returns the same dict as gt_negotiate_turn (action, recommended_price, message, compute, ...). Immediately after returning, the session searches your replies to the counter-offers it expects — on the counterparty's clock. NEXT: when they reply, gt_negotiate_respond(session_id, their_offer).

ParametersJSON Schema

Name	Required	Description	Default
`compute_ms`	No
`session_id`	Yes

Tool Definition Quality

C2.9/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description discloses that the tool triggers background speculation, searches replies on counterparty's clock, and returns a dict akin to gt_negotiate_turn. It does not mention authentication, rate limits, or side effects beyond implied state tracking, leaving gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is three sentences, concise overall, but the first sentence uses metaphorical language ('pondering', 'kick off') that could be more direct. No unnecessary repetition, but structure is adequate.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and only two parameters, the description is incomplete. It covers workflow but lacks parameter semantics and full behavioral details like what 'background speculation' entails or how the session state changes.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, and the description provides no meaning for the two parameters (session_id, compute_ms). The agent cannot infer their purpose or how to use them effectively from the description alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states it is for 'making your next move in a pondering session' and initiating background speculation, clearly distinguishing it from siblings like gt_negotiate_respond (used after counterparty replies). However, 'pondering session' is somewhat vague and could be more precise.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage as the next move in a negotiation session and directs to use gt_negotiate_respond after counterparty replies, providing some workflow context. However, it lacks explicit when-not or alternatives among the many negotiation sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_respondAInspect

Feed the counterparty's latest dollar offer and get your next move.

If their offer is roughly what the session anticipated, the deeply-searched reply is already cached and returned instantly (the reply's "_pondered" field is True); otherwise a fresh warm-started search runs. Same return shape as gt_negotiate_turn.

ParametersJSON Schema

Name	Required	Description	Default
`compute_ms`	No
`session_id`	Yes
`their_offer`	Yes

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description reveals caching behavior, the '_pondered' field, and warm-started search. It also notes the same return shape as gt_negotiate_turn. Missing details on prerequisites like an open session or side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with the core action. Every sentence adds value: purpose, caching behavior, and return shape reference. No word waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description hints at return shape by referencing gt_negotiate_turn but doesn't describe that shape. Missing error conditions, prerequisites (like an open session), and any limitations. Adequate but not fully complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must explain parameters. It clarifies 'their_offer' as a dollar amount, but 'session_id' and 'compute_ms' are not explained. The description adds minimal semantic value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Feed...and get your next move') and identifies the resource (counterparty's latest dollar offer). It clearly distinguishes from sibling tools like gt_negotiate_propose and references gt_negotiate_turn for return shape.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states when to use (after receiving counterparty's offer) and differentiates via caching behavior and reference to gt_negotiate_turn. However, it doesn't explicitly exclude scenarios or mention alternatives for proposing vs. responding.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiate_turnAInspect

Get the math-optimal next move in a price negotiation — in plain dollars.

USE THIS WHEN: you're haggling over a single PRICE across multiple back-and- forth rounds and want a better outcome than winging it. Validated edge: ~12% better head-to-head (measured on this recommender, n=20 paired LLM negotiations, 95% CI +6.5-17.4%, p<0.0001). NOT FOR: one-shot or fixed prices (it'll tell you to just negotiate directly); multi-issue bundles (use gt_negotiate_bundle — it logrolls across several linked issues); or non-price decisions like accept-vs-decline a job offer (just reason it through).

You provide only what you already know — no game theory: side "sell" or "buy" walk_away your reservation in dollars (seller=floor/minimum, buyer=ceiling/max) target your aspiration in dollars (seller=high, buyer=low) counterparty_offers their offers so far, in dollars, oldest first rounds_left (optional, default 8) roughly how many back-and-forths remain compute_ms (optional, default 0; EXPERIMENTAL) milliseconds of Monte-Carlo rollouts to spend refining the move. 0 = instant closed form. Validated to show NO realized edge over the closed form (n=400, mc_validation.py) — kept off by default as a research mechanism, not a quality improvement. The reply carries a "compute" block

You get back, in dollars: {"action": "counter"|"accept"|"walk", "recommended_price": 5387.0, "message": "...the best I can do is $5,387.00", "fit": {...}, "expected_settlement": 4943.5, "confidence": 0.62}

WORKED EXAMPLE — selling a contract, floor $4,000, hope $6,000, the buyer has bid $4,200 then $4,500: gt_negotiate_turn(side="sell", walk_away=4000, target=6000, counterparty_offers=[4200, 4500], rounds_left=6) -> counter ~$5,387 with a ready-to-send message; ACCEPT once their bid crosses the optimal target; WALK if they stay below your floor near the deadline.

Works against ANY counterparty with zero setup. (The verified-peer cooperation premium is the separate, advanced gt_a2a_* flow.)

ParametersJSON Schema

Name	Required	Default
`item`	No	this
`side`	Yes
`target`	Yes
`walk_away`	Yes
`compute_ms`	No
`rounds_left`	No
`my_previous_offers`	No
`counterparty_offers`	No

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description bears full burden. It explains the tool's behavior: it outputs a counter, accept, or walk action with a recommended price and message. It notes that compute_ms has no realized edge, and that the tool works against any counterparty. However, it does not discuss side effects, rate limits, or required permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long but well-structured with sections for purpose, usage, parameters, output format, and a worked example. Every sentence adds value, but it could be slightly more concise, particularly in the worked example and validated edge detail.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 parameters, no output schema, and no annotations, the description is quite thorough. It includes a parameter list, output format, and a worked example. However, it does not fully explain all output fields (e.g., 'fit' and 'expected_settlement') and omits two parameters from the description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description compensates by explaining key parameters (side, walk_away, target, counterparty_offers, rounds_left, compute_ms) with context and a worked example. However, it fails to mention 'item' and 'my_previous_offers' from the schema, leaving gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool's purpose: 'Get the math-optimal next move in a price negotiation — in plain dollars.' It clearly distinguishes from siblings by specifying that it is for single-price negotiations, while gt_negotiate_bundle is for multi-issue bundles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidelines: use when haggling over a single price across multiple rounds, not for one-shot or fixed prices, multi-issue bundles (use gt_negotiate_bundle), or non-price decisions (just reason it through). It also mentions a validated edge with confidence intervals.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_buy_next_offerCInspect

Buy-side next-offer recommendation with a defense bundle.

Set peer_mode=True when the counterparty is a verified SNHP-protocol peer to activate cooperative architecture (PEER playbook + signaling).

If anchor_attack_detection is in defenses, supply market_prior {mu, sigma}. Returns recommended offer + warnings + defense actions.

ParametersJSON Schema

Name	Required	Description	Default
`defenses`	No
`peer_mode`	No
`pareto_knob`	No
`market_prior`	No
`my_reservation`	Yes
`deadline_rounds`	Yes
`my_offer_history`	Yes
`seller_offer_history`	Yes

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description bears full burden. Discloses that setting peer_mode activates cooperative architecture and that market_prior is needed with anchor_attack_detection, and returns offer + warnings + defense actions. But lacks details on side effects, destruction, rate limits, auth needs, or return format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with main purpose. Fairly concise, though some details could be integrated more tightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 parameters, no output schema, and complex negotiation context, the description is insufficient. It explains returns at a high level but omits parameter purposes and overall workflow, leaving gaps for an agent to use it confidently.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so description must compensate. It explains peer_mode and market_prior conditions, but leaves other 6 parameters (my_reservation, deadline_rounds, offer histories, pareto_knob, defenses) unexplained. Partial compensation only.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a 'buy-side next-offer recommendation with a defense bundle', specifying the resource (buy-side offer) and action. It implicitly differentiates from the sell-side sibling 'gt_negotiation_sell_next_offer' via 'buy-side', but does not explicitly distinguish from other negotiation tools like 'gt_negotiate_propose'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides guidance on when to set peer_mode (when counterparty is verified SNHP peer) and when to supply market_prior (if anchor_attack_detection is in defenses). However, no explicit when-not-to-use or alternatives to sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_declare_first_strikeBInspect

Cryptographically commit to a buyer reservation. Returns an EdDSA-signed JWT attestation the buyer shows the seller. Use SHA-256 over (reservation || nonce || salt || buyer_id || seller_id), base64url-encoded without padding, as reservation_hash.

ParametersJSON Schema

Name	Required	Description	Default
`buyer_id`	Yes
`seller_id`	Yes
`deadline_iso`	Yes
`reservation_hash`	Yes
`binding_ttl_seconds`	Yes

Tool Definition Quality

B3.3/5.0

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully convey behavior. It states that the tool 'commits' and returns an attestation, implying a binding action. However, it does not explicitly describe side effects like irreversibility, authorization requirements, or the need for a subsequent reveal step. The description is partially transparent but leaves important behavioral details implicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences long with no redundant information. It efficiently states the purpose, output, and a key parameter construction. While it could be more structured (e.g., listing parameters), it remains concise and front-loaded with the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the cryptographic nature and the absence of an output schema, the description lacks context about the overall protocol (e.g., the need for a reveal step, the binding nature of the commitment). It does explain the hash construction and output format, but the missing process context limits completeness for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must add value. It fully explains how to compute the `reservation_hash` parameter (SHA-256 over specific fields, base64url-encoded). For the other four parameters (buyer_id, seller_id, deadline_iso, binding_ttl_seconds), no additional semantics are provided beyond their names. This partial coverage justifies a score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action: 'Cryptographically commit to a buyer reservation.' It specifies the resource (buyer reservation), the cryptographic nature, and the output (EdDSA-signed JWT attestation). This effectively distinguishes it from sibling tools like gt_negotiation_reveal_first_strike, which likely handles the reveal step.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives such as gt_negotiate_propose or gt_negotiation_reveal_first_strike. There is no mention of prerequisites, the two-step commit-reveal process, or any context about the negotiation flow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_detect_anchor_attackBInspect

Z-score the opponent's opening offer against a market prior {mu, sigma}. Recommends ignore / counter_with_market / walk_away.

ParametersJSON Schema

Name	Required	Description	Default
`market_prior`	Yes
`opponent_offer_history`	Yes

Tool Definition Quality

B3.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It mentions Z-scoring and recommendations but does not specify whether the tool is read-only, modifies any state, or what the exact output format is. The lack of output schema and behavioral details leaves significant ambiguity.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with only 16 words across two sentences, each serving a purpose: the first describes the method, the second the output recommendation. No unnecessary words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema and sibling tools covering similar domains, the description should explain the return format of the recommendation, potential errors, and edge cases. It only mentions three possible recommendations without detail, leaving the agent uncertain about how to interpret the result.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description adds some meaning: it hints that 'market_prior' should contain 'mu' and 'sigma', and that 'opponent_offer_history' is the opponent's opening offer. However, it does not fully specify the structure of the market_prior object beyond {mu, sigma}, nor any constraints on the offer history.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs a statistical Z-score analysis of an opponent's offer against a market prior and recommends a response action. This distinguishes it from sibling negotiation tools, which are more about proposing, responding, or closing sessions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when an opening offer is received and a decision on how to counter is needed, but it does not explicitly state when to avoid using this tool or compare it to alternatives like direct offer acceptance or counter-proposal tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_reveal_first_strikeCInspect

Reveal a previous first-strike to obtain the binding offer.

ParametersJSON Schema

Name	Required	Description	Default
`salt`	Yes
`nonce`	Yes
`reservation`	Yes
`commitment_id`	Yes

Tool Definition Quality

C2.4/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description lacks behavioral details such as side effects, security implications, or whether the first-strike is consumed. Since there are no annotations, the description carries full burden for transparency but is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is very concise with a single sentence, which is efficient but may be overly terse given the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness1/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 required params, no schema descriptions, no output schema, no annotations), the description is severely incomplete. It fails to provide essential context for correct tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage and no parameter explanations in the description, an AI agent cannot understand the meaning of commitment_id, reservation, nonce, and salt.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (reveal) and the resource (previous first-strike), and specifies the outcome (obtain binding offer). It distinguishes from sibling tools like gt_negotiation_declare_first_strike.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No usage guidance is provided. The description does not mention when to use this tool versus alternatives, nor any prerequisites or postconditions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_sell_next_offerAInspect

Sell-side next-offer recommendation.

Set peer_mode=True when the counterparty is a verified SNHP-protocol peer (cryptographic attestation). Activates the cooperative architecture: max-self signaling rounds 0-1, then cubic descent toward the PEER floor (0.55). Adds bilateral cooperation premium ~+7% (CI [+2.8%, +11.8%], n=20 LLM tournament, p=0.058 borderline — N=50 confirmation pending).

Without peer_mode (single-side adoption): SNHP customer beats vanilla counterparty by +12.1% head-to-head margin (CI [+6.5%, +17.4%], p<0.0001).

pareto_knob ∈ [0, 1] (only used when peer_mode=False) interpolates between deal-rate-max (0) and H2H-margin-max (1). Returns the recommended offer (in our utility space), acceptance probability, expected payoff, and the inferred posterior over the buyer's WTP.

ParametersJSON Schema

Name	Required	Description	Default
`peer_mode`	No
`pareto_knob`	No
`my_reservation`	Yes
`buyer_wtp_prior`	No
`deadline_rounds`	Yes
`my_offer_history`	Yes
`opponent_offer_history`	Yes

Tool Definition Quality

A4.5/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses cooperative architecture, algorithm details (cubic descent, floor), performance statistics (confidence intervals, p-values), and pareto_knob interpolation. Clearly states return elements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with a clear one-line summary followed by detailed paragraphs. Information is dense but not overly verbose. Minor improvement possible by trimming statistical details or rephrasing.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters and no output schema, description provides detailed behavioral context and lists return values. However, lacks explanation for several critical parameters and does not specify the structure of the return object (e.g., type or order).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, requiring description to clarify parameters. Explains peer_mode and pareto_knob in detail, but does not describe my_reservation, opponent_offer_history, deadline_rounds, buyer_wtp_prior, or my_offer_history, which are essential for use.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Explicitly states 'Sell-side next-offer recommendation,' clearly specifying verb and resource. Distinguishes from sibling tools like 'gt_negotiation_buy_next_offer' and 'gt_negotiate_propose' through context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit conditions: set peer_mode=True for verified SNHP-protocol peers, and details behavior with and without peer_mode. Explains pareto_knob usage only when peer_mode=False, guiding parameter selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

gt_negotiation_trust_anchor_public_keyCInspect

ASCII PEM of the server's first-strike attestation public key.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

C2.2/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden but only reveals the key format and role. It does not disclose whether the tool is read-only, requires a session, or has side effects. The agent cannot infer its safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (eight words), but it lacks a verb and feels incomplete. Conciseness without clarity is insufficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, no annotations, and a complex domain with 25 sibling tools, the description is far too minimal. It does not explain what happens when the tool is invoked or how it fits into a negotiation workflow.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has zero parameters, and schema coverage is 100%. The description adds no parameter information because none is needed; baseline 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description is a noun phrase ('ASCII PEM of the server's first-strike attestation public key') that does not specify an action. It reads like a definition rather than a tool function, leaving ambiguity whether the tool retrieves, validates, or sets this key.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus its many siblings. There is no mention of prerequisites, conditions, or typical invocation context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nextmove_adviseAInspect

A move inside your paid session — no additional charge (the $2 at nextmove_open covered the negotiation). Pass the FULL offer history each time, oldest first. Returns move, exact price, ready-to-send message, and the receipt (why[], context_hash, deterministic compute block).

ParametersJSON Schema

Name	Required	Description	Default
`api_key`	Yes
`my_offers`	No
`session_id`	Yes
`rounds_left`	No
`their_offers`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: zero additional charge, required offer history, and return structure including a deterministic compute block. It doesn't cover error handling or rate limits, but is sufficient for a negotiation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each adding critical value: cost context, usage rule, and return fields. No unnecessary words; information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations or output schema, the description provides a clear purpose, usage constraint, and return summary. It covers key aspects but omits parameter details for some fields and does not mention error scenarios.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description adds meaning by requiring full offer history oldest first, which explains their_offers and my_offers. However, it does not explain api_key, session_id, or rounds_left, leaving gaps for half the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: making a move inside a paid session at no extra charge. It specifies that it returns a move, price, message, and receipt, distinguishing it from sibling tools like nextmove_open and nextmove_request.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says to use this tool inside a paid session and to pass the full offer history oldest first. It implies when-not-to-use (not for opening a session) and provides clear context, though it does not explicitly list alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nextmove_catalogAInspect

The NEXTMOVE vending machine: what's on the shelf and how to pay. One SKU: a paid NEGOTIATION SESSION — $2 covers every move of one negotiation (up to 10, 7-day window), category-tuned, receipted, and deterministic (fixed 400k-rollout budget + seed: same context in, bit-identical advice out, auditable via context_hash).

Free vs paid, honestly: gt_negotiate_turn on this same server is free — generic, no receipt, no category tuning, wall-clock compute (non-deterministic). Pay when you want the tuned, auditable, replayable version with the drafted message and the receipt.

Don't see your category? nextmove_request — unmet requests decide what gets stocked next.

ParametersJSON Schema

Name	Required	Description	Default
No parameters

Tool Definition Quality

A4.6/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses determinism (bit-identical advice), auditability via context_hash, pricing, receipt, category tuning, and the 7-day/10-move window. It does not explicitly state side effects, but as a catalog query it is likely read-only. The behavioral traits are well covered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is 7 sentences across three paragraphs, front-loading the main purpose. Every sentence adds value, but it is slightly verbose. Could be tightened while retaining all information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains what the catalog contains: the paid SKU details, comparison with free alternative, and mention of category requests. It also references sibling tools for context. Complete for a catalog tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are 0 parameters in the input schema, so baseline is 4. The description adds context about what the catalog contains (SKU details, payment, etc.), which enhances understanding beyond the empty schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly defines the tool as a vending machine catalog ('what's on the shelf and how to pay'), specifies the single SKU (paid negotiation session), and distinguishes it from sibling tools like nextmove_request and gt_negotiate_turn. The verb+resource is explicit and differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly contrasts free vs paid: gt_negotiate_turn is free for generic negotiation, while this tool's product is paid for tuned/auditable/replayable advice. Also directs users with unmet categories to nextmove_request. Clear when to use which.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nextmove_closeBInspect

Mark your negotiation finished (you accepted or walked). Optional — sessions also expire on their own — but closing timestamps the outcome, which helps the machine learn real round-counts per category. Tell us how it ended via nextmove_request if you like; settlement reports calibrate the engine's predictions.

ParametersJSON Schema

Name	Required	Description	Default
`api_key`	Yes
`session_id`	Yes

Tool Definition Quality

B3.1/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must fully disclose behavioral traits. It mentions the action is optional and that sessions expire, but it does not reveal whether the tool is destructive, what happens upon closing, authentication requirements (api_key is not explained), rate limits, or if there are any side effects. This lack of information is significant for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact, with a clear first sentence stating the purpose and a second sentence adding optionality and related tools. While the second sentence is slightly verbose, the structure is front-loaded and efficient. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simplicity of the tool and lack of output schema, the description covers the main purpose and provides some guidance. However, the absence of parameter descriptions and behavioral details makes it incomplete for an AI agent to use confidently. The description is adequate but has clear gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning the input schema provides only parameter titles without descriptions. The description does not explain the parameters at all; it only implicitly references session_id but does not describe api_key or session_id usage. This leaves the agent without necessary context for correct invocation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Mark your negotiation finished') and the resource ('negotiation session'). It also explains the benefit (timestamping for ML) and mentions an alternative tool, distinguishing it from sibling tools like nextmove_open and nextmove_request.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description indicates that closing is optional and sessions expire on their own, but recommends closing for ML benefits. It also points to nextmove_request for settlement reports, providing clear context and an alternative. However, it does not explicitly state when not to use the tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nextmove_openAInspect

PAID ($2 once, from your credit balance): open a negotiation session covering EVERY move of this negotiation (up to 10 moves, 7 days). category: resale | supply | retail. side: buy | sell. walk_away = your true floor (sell) / ceiling (buy) — private, never crossed. Pass their_offers to get the first move back immediately with the session. Subsequent moves: nextmove_advise with the session_id — no further charge.

ParametersJSON Schema

Name	Required	Description	Default
`seed`	No
`side`	Yes
`target`	Yes
`api_key`	Yes
`category`	Yes
`my_offers`	No
`walk_away`	Yes
`rounds_left`	No
`their_offers`	No

Tool Definition Quality

A4.3/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses key behaviors: paid tool, private walk_away never crossed, maximum moves and timeframe. No annotations, so description carries full burden; covers most aspects except error handling or expiration details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single dense paragraph with clear structure: cost first, then purpose, then parameter roles, then flow. Slightly verbose but efficient for the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, cost, limitations, workflow, and key parameters. No output schema or return details, but the description implies immediate move response. Missing explanation for seed and rounds_left.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, description adds meaning to required params (category, side, walk_away, target) and optional their_offers. Seed and rounds_left are not explained, but overall compensation is strong.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states action ('open a negotiation session'), resource ('EVERY move of this negotiation'), and boundaries (up to 10 moves, 7 days). Distinguishes from sibling tools like nextmove_advise for subsequent moves.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mentions when to use (immediately to start), cost ($2), and how to proceed after (use nextmove_advise). Implicitly excludes other tools by describing the workflow. No explicit 'when not to use' but context is sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

nextmove_requestAInspect

Ask the vending machine for ANYTHING it doesn't stock — a negotiation category, a different game, a capability. Free. Every request is logged verbatim (size-capped) and decides what gets stocked next: the null-query log is how the shelf learns.

ParametersJSON Schema

Name	Required	Description	Default
`text`	Yes
`api_key`	No

Tool Definition Quality

A4/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden and discloses key behaviors: requests are logged verbatim, size-capped, and used for stocking decisions. It does not cover authentication or rate limits, but the core behaviors are transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loading the main action and then explaining the logging behavior. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the main purpose and behavioral aspects but does not mention the tool's return value or error conditions. Given the absence of an output schema and annotations, additional details about what the tool returns would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains the 'text' parameter (the request) but completely ignores the 'api_key' parameter, leaving its purpose unclear.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: asking the vending machine for something it doesn't stock. It uses a specific verb ('ask') and resource ('vending machine'), and distinguishes from sibling tools (e.g., nextmove_catalog, nextmove_advise) by focusing on requests for unstocked items.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (when you want something not stocked) and mentions logging and learning. It does not explicitly state when not to use or compare to alternatives, but the context of 'ANYTHING it doesn't stock' provides clear guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

offer_profile_menuAInspect

Profile a seller's menu: classify every dimension FREE or LEVER — where the negotiation surface actually is.

USE THIS WHEN: you (or the agent you're buying from / selling for) have a menu of configurable items and want to know which dimensions are worth negotiating on. FREE = zero cost gradient — a costless customization (sweetness, cup choice); the buyer just gets their favorite, it is never a price lever. LEVER = changing the option moves the seller's effective cost — a real negotiation surface (which item, quantity, add-ons).

spec is the declarative JSON menu spec (the same format everywhere in SNHP — the /v1/offer/* HTTP endpoints and the JS engine accept it too):

{"name": "corner coffee cart", "dims": [ {"id": "item", "kind": "choice", "options": [ {"id": "oat-latte", "price_delta": 5.25, "unit_cost": 1.20}, {"id": "drip", "price_delta": 3.00, "unit_cost": 0.40}]}, {"id": "extras", "kind": "addon", "options": [...]}, {"id": "cup", "kind": "preference", "options": [ {"id": "for-here"}, {"id": "to-go"}]}, {"id": "pickup", "kind": "fulfillment", "options": [ {"id": "now", "immediate": true, "slot_ticks": 0}, {"id": "in-20", "immediate": false, "slot_ticks": 2}]}, {"id": "qty", "kind": "quantity", "qty_cap": 3}], "cost": ["const"]}

Dimension kinds: choice (pick exactly one), addon (pick a subset), preference (pick one, costless taste), fulfillment (timing slot), quantity (integer 1..qty_cap). Option fields: price_delta (contribution to LIST price), unit_cost, and optionally stock_limited, perishable, salvage, immediate, slot_ticks. Cost stack tokens: "const", "salvage_on_expiry" (perishables at salvage when expiring), "scarcity_shadow" (finite stock displaces list sales), {"batch_economies": {"setup": 1.0, "marginal": 0.2}}.

Optional state is the shop moment the cost model reads: {"inventory": {"cold-brew": 6}, "expected_demand": {"cold-brew": 10}, "expiring": ["croissant"], "capacity": {"2": 6}, "tick": 0}

Returns {dims: [{dim, kind, verdict, cost_spread, why}], verdicts, note}. Returns {"error": "..."} on a malformed spec.

ParametersJSON Schema

Name	Required	Description	Default
`spec`	Yes
`state`	No

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It explains the input spec format in detail, the cost stack tokens, and the output structure including possible error. It discloses the logic of classification based on cost gradients, but does not mention performance or limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured with clear sections. It front-loads the purpose and provides detailed information. Every sentence adds value, though some repetition could be trimmed. It is appropriately sized for the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters, no output schema, and no annotations, the description is highly complete. It covers input format, optional state, output structure, and even includes an example. It fully equips the agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description must add meaning. It thoroughly explains the `spec` parameter with examples, dimension kinds, option fields, and cost stack tokens. It also explains the `state` parameter with an example. This adds significant value beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Profile a seller's menu: classify every dimension FREE or LEVER.' It explains what the tool does and the output it produces. The sibling tools include negotiation and auction tools, and this tool's unique function of classifying menu dimensions is well-distinguished.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use: 'USE THIS WHEN: you ... have a menu of configurable items and want to know which dimensions are worth negotiating on.' It explains the concepts of FREE and LEVER. While it does not provide explicit when-not or alternative tools, the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

offer_quoteAInspect

Price a buyer's cart on a menu — a Nash-split, DISCOUNT-ONLY quote (never above list).

USE THIS WHEN: you want the engine's advisory price for a configurable cart on a seller's menu — which configuration to sell/buy and at what price, given the shop's live state. The engine searches every valid configuration (or prices the pinned config), anchors the no-deal point on the buyer's best full-price menu order, and splits only the NEWLY-CREATED surplus. HARD GUARANTEE: discount-only — the price is never above the menu's list value (never_above_list: true in the response, enforced in code). A buyer who would have paid list never gets a discount out of the seller's standing margin.

spec: the same declarative JSON menu spec as offer_profile_menu. state (optional): the shop moment — {"inventory": {...}, "expected_demand": {...}, "expiring": [...], "capacity": {...}}. buyer: {"values": {"item": {"oat-latte": 6.0}}, # $/unit by dim->option "qty_decay": 0.9, # each extra unit worth 90% of the previous "outside": 0.0, # outside-option surplus in $ "balk": 0.3, # chance a walk-in bails at the queue "defer": {"2": 0.1}} # slot_ticks -> $ cost of waiting config (optional): pin the cart — {"item": "oat-latte", "extras": ["vanilla"], "pickup": "in-20", "qty": 2}; addon dims take a list, quantity an int. Omit to search every valid cart. Options: quote_lookers=False refuses buyers who'd never pay list (the incentive-compatibility floor); min_price_frac=0.8 never quotes below 80% of list; qty_appetite=True blocks upsell units the buyer values below their cost; seller_weight tilts the Nash split (0.5 = symmetric).

Returns {"outcome": "negotiated"|"at_list"|"walk", "quote": {config, price, listv, save, cost, value, seller_gain, buyer_gain, feasible, why} | null, "never_above_list": true, "advisory": true, note}. Quotes are SIMULATED advisory engine output, not a binding offer from any seller. Returns {"error": "..."} on malformed input.

ParametersJSON Schema

Name	Required	Description	Default
`spec`	Yes
`buyer`	Yes
`state`	No
`config`	No
`qty_appetite`	No
`quote_lookers`	No
`seller_weight`	No
`min_price_frac`	No

Tool Definition Quality

A4.7/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It thoroughly discloses: discount-only enforcement, Nash-split mechanics, advisory non-binding nature, error handling, and the 'never_above_list' guarantee.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Long but well-structured: one-line purpose, usage guidance, parameter specs, return format. Could trim minor redundancy but each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 params, nested objects, no output schema, the description is remarkably complete: explains return object structure, error case, and advisory nature. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description compensates fully. Each of 8 parameters is explained with examples, types, defaults, and behavior (e.g., buyer object keys, config format, boolean options).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a clear verb+resource: 'Price a buyer's cart on a menu — a Nash-split, DISCOUNT-ONLY quote (never above list).' This immediately distinguishes it from sibling tools like offer_profile_menu and negotiation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Includes 'USE THIS WHEN' section that explains the exact scenario (advisory price for configurable cart). However, it does not explicitly state when NOT to use or mention alternatives like GT negotiation tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_dealAInspect

Score a settled package against the exact Pareto frontier — the SNHP leaderboard metric ("dollars left on the table") for YOUR negotiation.

Args: issues: one dict per issue: {"name": str, "options": [labels], "my_utility": [per-option value to me], "their_utility": [per-option value to them]} — both sides' TRUE per-option values. my_weights: {issue_name: weight} — my true priorities (any scale). their_weights: {issue_name: weight} — their true priorities. package: the settled deal, {issue_name: option_label}. notional: deal size in dollars for the dollars-left framing.

Returns realized joint welfare, the frontier best, the naive middle-split baseline, frontier capture, logroll capture, and dollars_left_on_table.

ParametersJSON Schema

Name	Required	Description	Default
`issues`	Yes
`package`	Yes
`notional`	No
`my_weights`	Yes
`their_weights`	Yes

Tool Definition Quality

A4.2/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Given no annotations, the description bears full burden. It details the computation (realized joint welfare, frontier best, etc.) and returns several metrics. While it doesn't explicitly state non-destructiveness or authorization needs, the behavior is clearly a read-only scoring operation without side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the main purpose in the first sentence. The docstring format provides structured parameter descriptions. While slightly lengthy, every sentence adds value; minor redundancy could be trimmed but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with nested objects and no output schema, the description adequately explains inputs and mentions return values (realized joint welfare, etc.). It could include more detail on output format or constraints (e.g., utility consistency), but provides sufficient context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema coverage, the description compensates thoroughly by explaining each parameter's structure and purpose, including the nested 'issues' dict with fields like 'name', 'options', 'my_utility', 'their_utility', and the role of 'my_weights', 'their_weights', 'package', and 'notional'. This adds significant meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Score a settled package against the exact Pareto frontier' using a specific metric ('dollars left on the table'). It uses a specific verb ('score') and resource ('settled package'), and distinguishes itself from sibling negotiation tools that deal with proposing, responding, or opening sessions.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage after a negotiation settlement by stating 'Score a settled package', but lacks explicit guidance on when to use or when not to use this tool versus alternatives. No exclusions or alternative tools are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.

Server Details

Available Tools

Discussions

Your Connectors

Resources