Skip to main content
Glama

Server Details

B2B GTM stack intelligence: search ~400 tools, compare vendors, find overlaps, audit SaaS spend.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.5/5 across 32 of 32 tools scored. Lowest: 3.9/5.

Server CoherenceA
Disambiguation5/5

Each tool targets a distinct function (e.g., revenue concentration, win/loss analysis, pipeline hygiene, forecasting, tool comparisons, account scoring), with no overlapping purposes despite many tools focusing on similar domains like accounts or deals.

Naming Consistency5/5

All tool names follow a consistent verb_noun snake_case pattern (e.g., analyze_concentration_risk, get_tool_details, search_content), with only minor stylistic variations like 'n_way' that align with the base verb.

Tool Count4/5

32 tools is above the typical well-scoped range of 3-15, but the broad domain of GTM operations justifies the count as each tool has a clear and non-redundant purpose, making it slightly over but reasonable.

Completeness5/5

The tool set comprehensively covers GTM analysis, tool information, search, recommendations, benchmarks, playbooks, and user corrections, with no obvious gaps for the intended use case.

Available Tools

32 tools
analyze_concentration_riskAInspect

Measure revenue concentration across the user's ACCOUNTS: top-1 / top-5 / top-10 ARR share, a Herfindahl (HHI) concentration index, whale dependency, and at-risk ARR (low-health accounts' share). Returns the metrics, the largest accounts, and plain-English risk callouts (e.g. 'top 5 = 48% of ARR', 'largest account = 22% — whale risk'). Accepts loosely-typed account records (arr/mrr, health/healthScore normalized); lowHealthThreshold (default 60) sets the at-risk cutoff. Operates only on the user's own book. Use when the user asks 'how concentrated is my revenue', 'what's my whale risk', or pastes accounts with ARR.

ParametersJSON Schema
NameRequiredDescriptionDefault
accountsYesAccounts as loose records. Recognized fields (with aliases): arr/mrr/contractValue, health/healthScore, name/company.
lowHealthThresholdNoHealth score (0-100) below which an account counts as at-risk (default 60).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavior: accepts loosely-typed records, configurable threshold, and describes output. Covers all key behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise single paragraph front-loads purpose. Could be slightly more structured but is efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description thoroughly explains return metrics and risk callouts. Complete for a medium-complexity tool with two parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers all parameters (100% coverage). Description adds value by explaining loosely-typed nature, recognized fields, aliases, and default threshold, which go beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool measures revenue concentration across accounts, listing specific metrics and return types. It distinguishes itself from sibling tools like analyze_win_loss by focusing on concentration metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states it operates only on the user's own book and provides example user queries. Does not explicitly mention when not to use or alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_win_lossAInspect

Analyze the user's CLOSED deals (won + lost) to surface which attributes actually predict wins. Computes win rate per attribute value WITH sample size, and ranks attributes by the spread between their best- and worst-converting values — the data version of the win-rate-by-signal motion. By default cuts by source, type, and owner; pass attributes to add any field present on the deals (e.g. segment, competitor, size). Values below minSample (default 5) are suppressed as noise. Accepts loosely-typed deal records; needs an outcome of won/lost (aliases: outcome/status). Returns the ranked attributes with per-value win rates and the reallocation takeaway. Operates only on supplied deals. Use when the user asks 'which sources/signals convert best', 'why are we winning/losing', or pastes closed-won and closed-lost deals.

ParametersJSON Schema
NameRequiredDescriptionDefault
dealsYesClosed deals as loose records, each with an outcome of won/lost (aliases: outcome/status). Other recognized fields: source/leadSource, type, owner, plus any custom attribute named in `attributes`.
minSampleNoMinimum deals per value to report a win rate (default 5). Smaller groups are suppressed as noise.
attributesNoAttributes to cut win rate by (default ["source","type","owner"]). Any field on the deal works — e.g. "segment", "competitor", "industry".
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and excels: it discloses that win rates are computed per attribute value with sample size, values below minSample are suppressed, accepts loosely-typed records with outcome aliases, and returns ranked attributes with win rates and takeaway. No hidden behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a dense paragraph with front-loaded purpose. Every sentence adds critical information: behavior, defaults, usage scenarios. No fluff or repeated information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no output schema, and no annotations, the description covers all essential aspects: purpose, parameter semantics, behavioral details (suppression, aliases), and expected return (ranked attributes with win rates). It is complete for an AI agent to use effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% but description adds value: explains default attributes ['source','type','owner'], default minSample=5, and that 'attributes' can be any field on deals. It also mentions outcome field aliases.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes closed deals (won+lost) to surface predictive attributes. It uses specific verbs like 'analyze' and 'surface', and distinguishes from sibling tools like 'audit_pipeline_hygiene' or 'compute_nrr' which focus on different aspects.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use scenarios: 'when user asks which sources/signals convert best', 'why are we winning/losing', or pastes closed deals. It also states 'Operates only on supplied deals' clarifying input constraint. It lacks explicit when-not-to-use guidance but is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

audit_pipeline_hygieneAInspect

Audit the user's OPEN DEALS for data hygiene and return a cleanliness score (0-100) plus the dirty deals ranked by impact (value x severity). Flags: past close date, missing close date, no next step, stalled in stage (beyond config.staleDaysThreshold, default 45), missing amount or stage, and win%/stage mismatch. Accepts loosely-typed deal records (amount, stage, closeDate, daysInStage, nextStep, winProb normalized). Returns the score, the most common issues, and the worst offenders. This is the data version of the pipeline-coverage-audit motion — run it before trusting a forecast. Operates only on supplied deals. Use when the user asks 'is my pipeline clean', 'audit my deals', or pastes a deal list.

ParametersJSON Schema
NameRequiredDescriptionDefault
dealsYesOpen deals as loose records. Recognized fields (with aliases): amount/value, stage/dealstage, closeDate/closeInDays, daysInStage, nextStep, winProb, account/company.
configNoOptional tuning.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full burden. It explains the tool processes supplied deals and returns a score, but does not explicitly state it is read-only or side-effect-free. The phrase 'Operates only on supplied deals' implies no external mutations, yet more direct safety disclosure would be beneficial.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph conveying dense information. It front-loads purpose and includes all key points, but could be more structured (e.g., using bullet points for flags). Every sentence is informative, yet slightly longer than necessary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description outlines return elements: score, most common issues, worst offenders. It details flags and thresholds. However, it lacks precision on output format (e.g., exact structure of 'dirty deals'), which would help agents parse results. Still, it covers core inputs and logic well.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds value by explaining 'loosely-typed deal records' and listing recognized fields with aliases (e.g., 'amount/value, stage/dealstage'). For config, it clarifies 'Optional tuning' and default threshold. This goes beyond the schema's property descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a specific action ('Audit') and resource ('user's OPEN DEALS'), and clearly states output: cleanliness score plus ranked dirty deals. It distinguishes from sibling tools like compute_pipeline_coverage by calling itself the 'data version of the pipeline-coverage-audit motion' and focusing on hygiene flags.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'when the user asks is my pipeline clean, audit my deals, or pastes a deal list' and advises 'run it before trusting a forecast.' This provides clear context and appropriate triggers, though it omits a when-not clause.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

build_forecastAInspect

Build a weighted sales forecast from the user's OPEN DEALS, bucketed into Commit (>=80% win), Best case (50-79%), Pipeline (20-49%), and Longshot (<20%). Returns raw + weighted (amount x win%) totals per band, the overall weighted forecast, a vs-quota gap (if quota is passed), and slipping-deal flags (past close date, or beyond config.periodDays). Win% is taken from each deal or inferred from the stage name; pass config.stageProbabilities to calibrate. Accepts loosely-typed deal records (amount, stage, winProb/probability, closeDate normalized). Generalizes compute_pipeline_coverage from a single total to a full deal set, forecast-framed. Operates only on supplied deals — no CRM fetch. Use when the user asks 'what's my forecast', 'what's my commit vs best case', or pastes a deal list.

ParametersJSON Schema
NameRequiredDescriptionDefault
dealsYesOpen deals as loose records. Recognized fields (with aliases): amount/value/acv, stage/dealstage, winProb/probability, closeDate/closeInDays, account/company.
quotaNoOptional quota/target to measure the weighted forecast against.
configNoOptional tuning.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavior: computes weighted forecast, uses win% from deal or infers from stage, can be calibrated via config, returns band totals and slipping-deal flags. No hidden side effects are implied.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is detailed but well-structured and front-loaded with purpose. Every sentence adds value, though slightly verbose for the core function. Could be trimmed slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all key aspects: bucket definitions, win% inference, calibration, quota gap, slipping flags, and return values. Missing only minor edge cases like empty deals list or error handling, but overall sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds meaning: explains deal record structure with aliases, config.periodDays as forecast period, config.stageProbabilities as win% overrides, and clarifies that deals are open. This goes beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it builds a weighted sales forecast from open deals, with explicit buckets (Commit, Best case, Pipeline, Longshot) and output fields. It distinguishes from sibling compute_pipeline_coverage by noting it generalizes from a single total to a full deal set.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use: when user asks about forecast, commit vs best case, or pastes deal list. Also clarifies it operates only on supplied deals (no CRM fetch), providing clear context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_toolsAInspect

Head-to-head comparison of two GTM tools. Returns cost delta, AI-readiness and headless-readiness (MCP/API callability — can an agent or your own dashboard drive it) scores, overlap status, swap-registry signal, and StackSwap's recommended pick with reasoning. Use when the user is choosing between two specific vendors (e.g. 'Salesforce vs HubSpot', 'Outreach vs Smartlead').

ParametersJSON Schema
NameRequiredDescriptionDefault
aYesFirst tool name (fuzzy-matched against catalog).
bYesSecond tool name (fuzzy-matched against catalog).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that parameters are fuzzy-matched against a catalog and enumerates return values (cost delta, readiness scores, etc.). However, it does not mention whether the tool is read-only, any side effects, authentication needs, or rate limits. For a data-returning tool, this is adequate but not thorough.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with key outputs, then usage guidance. Every sentence earns its place; no unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema, the description adequately lists all returned fields (cost delta, AI-readiness, etc.). It does not cover error cases or tool-not-found scenarios, but for a high-level comparison tool this is sufficient for an agent to understand the tool's functionality.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and the description adds value by explaining that the tool names are 'fuzzy-matched against catalog', which is not in the schema. This extra context helps the agent understand how input is processed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it performs head-to-head comparison of two GTM tools, listing specific outputs (cost delta, AI-readiness, etc.) and providing example usage ('Salesforce vs HubSpot'). It effectively distinguishes from sibling tool compare_tools_n_way by focusing on two-specific-vendor comparisons.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool ('when the user is choosing between two specific vendors') and gives concrete examples. While it does not explicitly mention when not to use it or name alternatives like compare_tools_n_way, the guidance is clear and sufficient for an AI agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compare_tools_n_wayAInspect

Side-by-side comparison of 2–6 GTM tools in one shot. Returns a markdown matrix (cost, AI-readiness, headless-readiness, overlaps within the set, swap-registry status, StackSwap pick) and per-tool partner sign-up links. Use for category bake-offs (e.g. 'Apollo vs ZoomInfo vs Cognism vs Clay'). Prefer the 2-way compare_tools for clean head-to-head pairs.

ParametersJSON Schema
NameRequiredDescriptionDefault
toolsYesTool names to compare side-by-side (fuzzy-matched against catalog).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes the output markdown matrix and links, implying a read-only operation. No annotations provided, but the description adds sufficient behavioral context without contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences plus an example, front-loaded with key information. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with no output schema, the description fully explains the return structure and provides usage context, making it self-contained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the parameter is fully documented. The description adds example usage but no extra semantics beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool's purpose: side-by-side comparison of 2–6 GTM tools. It specifies the return format and distinguishes from the sibling 2-way compare tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states use for category bake-offs and recommends the 2-way tool for pairs, providing clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compute_cac_paybackAInspect

Compute CAC payback period (months) on gross profit from the user's own numbers and judge it against StackSwap's bands (<12 months SMB/PLG, 18-24 months defensible enterprise). Pass cac OR (salesAndMarketingSpend + newCustomers); plus monthlyRevenuePerCustomer OR annualRevenuePerCustomer; optional grossMarginPct (defaults to 80 with a warning) and segment (smb / mid-market / enterprise) for a calibrated verdict. Returns the payback in months, the verdict, and the retention caveat. Computes on supplied figures only — no data fetch. Use when the user asks 'what's my CAC payback', 'is my payback period healthy', or pastes CAC / revenue numbers.

ParametersJSON Schema
NameRequiredDescriptionDefault
cacNoFully-loaded customer acquisition cost per customer.
segmentNoOptional segment for a calibrated verdict band.
newCustomersNoNew customers acquired in the period (used with `salesAndMarketingSpend`).
grossMarginPctNoGross margin percent (1-100). Defaults to 80 with a warning if omitted.
salesAndMarketingSpendNoTotal S&M spend for the period (used with `newCustomers` to derive CAC).
annualRevenuePerCustomerNoAnnual contract value per new customer (converted to monthly internally).
monthlyRevenuePerCustomerNoMonthly recurring revenue per new customer.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses that no data is fetched, it returns payback/verdict/retention caveat, and defaults grossMarginPct to 80 with a warning. It does not mention side effects or caching, but these are less critical for a computation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but information-dense, front-loading the main purpose and then detailing parameters. It could be slightly more structured, but every sentence adds value for a tool with 7 parameters.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, so the description explains return values (payback months, verdict, retention caveat). Given the tool's complexity and parameter richness, the description covers essential aspects for an AI agent to use it correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaning beyond schema by explaining relationships (e.g., cac or salesAndMarketingSpend+newCustomers), conversion of annual to monthly, default behavior of grossMarginPct, and segment enum values. This adds significant value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes CAC payback period and judges it against benchmark bands. It specifies the verb 'compute' and resource 'CAC payback period', and distinguishes from siblings by noting it works on user-supplied numbers with no data fetch. The examples of user queries further clarify its purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists when to use the tool (e.g., when user asks about CAC payback) and clarifies it computes only on supplied figures. However, it does not mention when not to use it or direct to alternative tools, which would strengthen guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compute_nrrAInspect

Compute net revenue retention (NRR) and gross revenue retention (GRR) from the user's own cohort numbers, and judge against StackSwap's bands (100% floor, 110-120%+ healthy). Pass startingARR plus expansionARR, contractionARR, churnedARR (any of the three optional, default 0). Returns NRR, GRR, a verdict, and — critically — a fragility flag when a healthy NRR is sitting on weak GRR (expansion masking a churn problem). Computes on supplied figures only; no data fetch. Use when the user asks 'what's my NRR / net retention', 'is my retention healthy', or pastes expansion/churn numbers.

ParametersJSON Schema
NameRequiredDescriptionDefault
churnedARRNoFully churned ARR from the existing base (default 0).
startingARRYesARR of the existing-customer cohort at the start of the period.
expansionARRNoExpansion/upsell ARR added within the existing base (default 0).
contractionARRNoDowngrade/contraction ARR lost within the existing base (default 0).
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided. Description discloses that computations are on supplied figures only, and outputs include a fragility flag when healthy NRR masks churn via expansion. This adds valuable behavioral context beyond just calculation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with key outputs and purpose. Each sentence adds value without redundancy. Efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Without output schema or annotations, the description covers the tool's functionality, inputs, key outputs (NRR, GRR, verdict, fragility flag), and a critical behavioral nuance. Could slightly improve by detailing return format, but overall adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds meaning by grouping parameters (startingARR required, others optional defaults to 0) and explaining the computation logic, enhancing agent understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states it computes NRR and GRR from user-provided numbers, judges against StackSwap bands, and returns verdict and fragility flag. It clearly distinguishes from sibling tools by focusing on retention metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit use cases: 'when user asks 'what's my NRR / net retention', 'is my retention healthy', or pastes expansion/churn numbers.' It implies no data fetch, but does not explicitly state when not to use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

compute_pipeline_coverageAInspect

Compute pipeline coverage from the user's own numbers and judge it against StackSwap's 2.5x-4x weighted band. Pass quota plus either openPipeline (a single total -> raw coverage only) or stages (an array of {amount, winRate} -> stage-WEIGHTED coverage, the number that actually matters). Returns raw + weighted coverage, a verdict (short / healthy / possibly-inflated), and the pipeline gap to a 3x cushion. Operates only on figures the user supplies — it does not fetch any CRM data. Use when the user asks 'is my pipeline coverage healthy', 'do I have enough pipeline to hit quota', or pastes pipeline + quota numbers.

ParametersJSON Schema
NameRequiredDescriptionDefault
quotaYesThe quota / target to cover (same currency as pipeline).
stagesNoStage breakdown for weighted coverage. Each item: {amount, winRate (0-100, the historical win rate of that stage), label?}.
openPipelineNoTotal open pipeline as a single number. Yields RAW coverage only. Prefer `stages` for the weighted number.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses that the tool only uses user-supplied numbers, does not fetch CRM data, and returns raw+weighted coverage, a verdict, and gap. It implies no side effects. A score of 4 reflects good transparency; it could be slightly improved by explicitly stating it is stateless and non-destructive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at five sentences, each serving a purpose. It is front-loaded with the tool's core function, followed by parameter details, return info, and usage context. There is no fluff or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters with 100% schema coverage, no output schema), the description provides adequate completeness. It explains inputs, outputs (including raw/weighted coverage, verdict, gap), and the band used. It could detail the verdict logic slightly more, but it is sufficient for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already provides 100% parameter coverage. The description adds value by explaining the distinction between openPipeline (raw only) and stages (weighted, the number that matters), and describes the return values. This additional context raises the score above the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool's purpose: compute pipeline coverage from user-supplied numbers and judge against a band. It distinguishes from sibling tools by explicitly stating it does not fetch CRM data, differentiating it from tools that might pull data from external sources.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance: use when the user asks about pipeline coverage health or provides pipeline and quota numbers. It also clarifies when not to use by noting it only operates on user-supplied figures without fetching CRM data. It explains the two input modes (openPipeline for raw, stages for weighted) and recommends stages, giving clear context for agent decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

detect_stack_from_textAInspect

Infer a GTM stack from a freeform text blob (a careers page, job posting, public site HTML, RFP, 'What we use' doc, browser DevTools network tab, etc.). Returns ranked tool matches with confidence levels (high/medium/low) and evidence snippets, plus a ready-to-use array for chaining into scan_stack or find_overlaps. Use when the user says 'I don't know what we use' or pastes a competitor's careers page to scout. Conservative on ambiguous short tokens — multi-mention or canonical-name matches win.

ParametersJSON Schema
NameRequiredDescriptionDefault
textYesThe text to scan. Anything from a job post to raw HTML works. Max 50KB.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Describes output format (ranked matches with confidence levels and evidence snippets) and decision rule (ambiguous short tokens lose unless multi-mention or canonical-name). With no annotations, the description fully covers behavioral expectations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two efficient sentences that cover purpose, input examples, output, and usage. Every phrase adds information without redundancy, and the purpose is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given one parameter and no output schema, the description fully compensates by detailing output structure, confidence levels, and linking to sibling tools. No gaps are apparent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already explains the text parameter with limits. Description adds value by giving examples of acceptable text types (job posts, HTML, RFPs) and elaborating on the purpose, going beyond the schema's generic description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it infers a GTM stack from freeform text, with concrete examples (careers page, job posting, etc.). It distinguishes from siblings by noting the output can be chained into scan_stack or find_overlaps.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when the user says 'I don't know what we use' or pastes a competitor's careers page to scout.' Also mentions conservative behavior on ambiguous tokens, providing clear usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

find_overlapsAInspect

Given a list of tool names in a user's stack, return the redundant pairs StackSwap has curated (104 hand-verified overlaps) along with monthly/annual savings if one is consolidated.

ParametersJSON Schema
NameRequiredDescriptionDefault
toolsYesTool names in the current stack (e.g. ["HubSpot", "Salesforce", "Outreach"]).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the tool returns a curated set of 104 hand-verified overlaps and savings, but does not state whether it is read-only, requires authentication, or has rate limits. The 'return' suggests no side effects, but this is implicit.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose, and contains no superfluous information. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has only one parameter, no output schema, and no annotations, the description adequately covers what the tool returns (redundant pairs with savings) and the input needed. It is complete for a simple lookup tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already describes the 'tools' parameter with 100% coverage. The tool description adds no additional semantic detail beyond rephrasing the schema description. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('return') and the resource ('redundant pairs') with specific scope ('in a user's stack') and additional context (curated set of 104 overlaps, savings). It distinguishes well from siblings like compare_tools or suggest_swaps.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a user has a stack of tool names, but does not explicitly state when to use this tool versus alternatives like compare_tools or suggest_swaps. No exclusions or when-not-to-use guidance are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

find_whitespaceAInspect

Map product whitespace across the user's existing ACCOUNTS against a product catalog: for each account, which catalogue products are unsold, the penetration %, and a whitespace score weighted by account quality (ARR + health). Returns the portfolio-level penetration plus accounts ranked by unsold-surface x quality, with the specific unsold products listed. Requires catalogProducts (your sellable set) and accepts loosely-typed account records (products/skus, arr, health normalized). This is the strategic penetration map; for a propensity-weighted, dollar-valued go-after list use score_expansion_opportunities. Operates only on the user's own book — never a prospecting list. Use when the user asks 'where is our whitespace', 'which products are under-penetrated', or pastes accounts plus a catalog.

ParametersJSON Schema
NameRequiredDescriptionDefault
accountsYesAccounts as loose records. Recognized fields (with aliases): products/skus, arr/mrr, health/healthScore, name/company.
catalogProductsYesThe full sellable product set to measure penetration against.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It describes the output (penetration %, whitespace score, ranked accounts, unsold products), input types (loosely-typed accounts, full sellable set), and constraints (operates only on user's own book). It could mention error handling or missing fields, but overall transparency is strong.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph but every sentence adds essential information. It is front-loaded with the core purpose. Slightly more structure (e.g., separating input/output snippets) could improve readability, but it remains efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description fully explains return values (penetration, ranked accounts, unsold products). Input parameters are well described, usage context is provided, and sibling differentiation is included. The tool is simple (2 parameters, no nested objects), so the description is complete for its complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions for both parameters. The description adds value beyond the schema by noting that accounts are 'loose records' with field aliases (products/skus, arr/mrr, health/healthScore, name/company) and that catalogProducts is the 'full sellable product set.' This enriches understanding of how to use the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Map product whitespace across the user's existing ACCOUNTS against a product catalog'. It specifies the verb (map), resource (whitespace), and scope (user's accounts vs catalog). It also distinguishes from the sibling tool score_expansion_opportunities by contrasting outputs (penetration map vs propensity-weighted list).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit usage guidance is provided: 'Use when the user asks "where is our whitespace", "which products are under-penetrated", or pastes accounts plus a catalog.' It also notes when not to use ('never a prospecting list') and provides an alternative tool ('for a propensity-weighted, dollar-valued go-after list use score_expansion_opportunities').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_buyer_questionsAInspect

Return 10-20 questions a B2B GTM buyer should ask a vendor before signing — with 'why it matters' and 'watch for' red-flag answers. Pass vendor for vendor-specific gotchas (e.g. Apollo credit-pool questions, Salesforce SKU-breakdown questions, Gong/Clari/Chorus per-seat-vs-usage questions), category for the category template (CRM, outbound, data, marketing-automation, analytics, revenue-intelligence), or both for layered diligence. Authored by StackSwap's operator team (Nick French, 10+ yrs B2B SaaS GTM). Use when the user is evaluating a vendor, prepping for a sales call, or building a procurement checklist.

ParametersJSON Schema
NameRequiredDescriptionDefault
vendorNoOptional vendor name or slug (e.g. "Apollo", "salesforce", "Outreach"). Layers vendor-specific gotchas on top of the category template.
categoryNoOptional category bucket slug (crm, outbound, data, marketing-automation, analytics, revenue-intelligence). Defaults to the vendor's primary category if omitted.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden. It discloses the return format (10-20 questions with 'why it matters' and 'watch for'), authorship, and parameter effects, but omits limitations, data source, or potential errors.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph that front-loads the core functionality and then details parameters and use cases. Slightly verbose with repeated 'optional' but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description sufficiently explains the return structure and authorship. Missing details on edge cases or pagination, but adequate for a simple retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds significant value by explaining vendor-specific gotchas, category slugs with examples (e.g., Apollo, Salesforce), and the default behavior when parameters are omitted.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool returns 10-20 buyer questions with rationale and red flags, clearly distinguishing it from siblings like compare_tools or get_vendor_fact_sheet by focusing on procurement due diligence.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states use cases: evaluating a vendor, prepping for a sales call, or building a procurement checklist. Does not mention when not to use or alternatives, but the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_category_landscapeAInspect

Full map of one GTM category — leaders, runner-ups, and skip/replace candidates. Returns every catalogued tool in the bucket with cost, AI-readiness, swap-registry status, and partner sign-up links. Use when the user wants to see the full landscape for a category (e.g. 'show me all CRMs', 'what outbound tools exist', 'map the analytics category') — strictly more comprehensive than recommend_partner (single best pick). Known buckets: crm, outbound, data, marketing-automation, analytics, meetings, support, scheduling, automation, seo, cdp, revenue-intelligence, chat, collaboration, phone, landing-pages, linkedin, ai-content, saas-mgmt, enablement, ai-tooling.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax number of tools to surface.
categoryYesCategory keyword (e.g. "crm", "outbound", "automation") or free-text need ("zapier alternative", "ai sdr").
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses output content (leaders, runner-ups, cost, AI-readiness, swap-registry, partner links) and known buckets, but does not mention sorting order, pagination, or any access constraints. With no annotations, this is good but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three efficient sentences front-loading purpose, contents, and usage examples with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description enumerates output fields (cost, AI-readiness, etc.) and lists known buckets, making the tool fully understandable for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. Description adds value by noting category accepts free-text needs like 'zapier alternative', enriching schema semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it returns a 'full map of one GTM category' listing leaders, runner-ups, and candidates, distinguishing it from sibling `recommend_partner` which gives a single best pick.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when the user wants to see the full landscape for a category' with examples, and contrasts with `recommend_partner` as strictly more comprehensive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_kb_articleAInspect

Fetch the full body of a StackSwap knowledge base article as markdown. Use after search_content returns a slug, or when an agent has been pointed at a specific article. Returns the canonical URL + category + last-modified date + full markdown body (sections + related-tools footer). Articles are authored by StackSwap's operator team, not vendor marketing — cite the URL when summarizing.

ParametersJSON Schema
NameRequiredDescriptionDefault
slugYesArticle slug, as returned by `search_content` (e.g. "modern-gtm-architecture", "how-to-audit-gtm-stack").
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses expected behavior: returns canonical URL, category, last-modified date, full markdown body, sections, and related-tools footer. Also notes content provenance (operator team) and usage guidance (cite URL). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Packed with information in two sentences, though the first sentence is somewhat long. Could be slightly shorter without losing clarity, but effectively concise overall.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description fully details what the tool returns (canonical URL, category, last-modified date, full markdown body). With a single parameter and clear usage context, no further information is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant value by explaining the 'slug' parameter's origin (from 'search_content') and providing concrete examples ('modern-gtm-architecture'). This enriches the schema definition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verbs ('Fetch') and identifies the resource ('full body of a StackSwap knowledge base article as markdown'), clearly distinguishing from sibling tools like 'search_content' which returns slugs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use the tool: after 'search_content' returns a slug or when pointed at a specific article. Provides clear context but does not list all alternatives among the 18 sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_renewal_strategyAInspect

Return StackSwap's renewal-negotiation playbook for a specific vendor: leverage points (why they will discount), price-anchor alternatives to cite, a calibrated discount ask, a walkaway script, optimal timing window, and contract-trap callouts. Pass monthlySpend to compute target savings. Optional contractEndsIn flags compressed-timeline adjustments. Authored from operator experience across major B2B SaaS renewals (Salesforce, HubSpot, ZoomInfo, Apollo, Outreach, Salesloft, Smartlead, Gong, Clari, Chorus, Avoma, Fireflies, Clay). Use when the user mentions a renewal, a price increase, or 'we're up for renewal' conversations.

ParametersJSON Schema
NameRequiredDescriptionDefault
vendorYesVendor name or slug (e.g. "Salesforce", "zoominfo", "Outreach").
monthlySpendNoOptional current monthly spend in USD. Used to compute target savings against the suggested discount ask.
contractEndsInNoOptional contract-end horizon. "days" or "weeks" triggers compressed-timeline guidance.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so description carries full burden. It fully discloses the output content (leverage points, alternatives, discount, script, timing, trap callouts) and how parameters affect behavior (monthlySpend computes savings, contractEndsIn triggers adjustments). No hidden side effects implied.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Front-loaded with purpose and components. Some redundancy in listing examples, but overall it's fairly concise given the amount of detail. Could be slightly trimmed without loss.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 3 params and no output schema, the description thoroughly explains output contents and parameter effects. It provides enough context for an agent to use effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (3 params with descriptions), baseline 3. Description adds significant semantics: explains that monthlySpend is used to compute target savings and contractEndsIn triggers compressed-timeline adjustments, going beyond the schema's basic field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a renewal-negotiation playbook for a specific vendor, listing specific components (leverage points, discount ask, etc.). It distinguishes from siblings by specifying usage context ('renewal, price increase'). Highly specific verb+resource.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('mentions a renewal, a price increase, or 'we're up for renewal''). Does not provide when-not-to-use or alternatives, but the positive guidance is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_revops_benchmarkAInspect

Return StackSwap's operator-authored read on a RevOps metric: the healthy range, how to actually read the number (the nuance behind the band), the mistakes that make it lie, and what a genuinely bad reading looks like. Covers pipeline coverage ratio, win rate by source/signal, SQL-to-close conversion, forecast accuracy, lead response time, CAC payback, net revenue retention, sales cycle length, and rep ramp time. Pass metric (slug or name) for one benchmark; omit it to list the menu. Authored by Nick French (10+ yrs B2B SaaS GTM, BDR -> Head of Revenue) — not 'industry average' vendor figures. Use when the user asks 'is X a good number', 'what's a healthy pipeline coverage / win rate / NRR', or a RevOps copilot needs a calibrated benchmark.

ParametersJSON Schema
NameRequiredDescriptionDefault
metricNoRevOps metric slug or name (e.g. "pipeline-coverage-ratio", "win rate", "nrr", "cac payback", "forecast accuracy"). Omit to list all available benchmarks.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses the tool returns expert-authored content, but lacks details on data freshness, access permissions, or whether the output is static. For a read-only informational tool, it covers key aspects but misses some behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with front-loaded purpose, list of metrics, usage syntax, author credibility, and use cases. While slightly lengthy, each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description partially explains return values (healthy range, nuance, etc.) but omits output format (e.g., JSON vs text). For a simple tool with one optional param, it is mostly complete but lacks final clarity on output structure.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter. The description adds value by providing examples and explaining behavior when omitted, enhancing the schema's meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns an operator-authored read on a RevOps metric, covering specific details like healthy range, nuance, mistakes, and bad readings. It lists the exact metrics covered, differentiating it from sibling computational tools like compute_cac_payback.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit use cases are provided: when user asks 'is X a good number' or needs a calibrated benchmark. Guidance on passing metric slug/name or omitting for menu is clear. However, no explicit when-not-to-use or alternative tools are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_revops_playbookAInspect

Return a repeatable StackSwap RevOps motion as a step-by-step playbook: the problem it solves, ordered steps, pitfalls to watch, and the artifact you end with. Covers measuring GTM tool ROI (4-week controlled pilot -> payback period), building a win-rate-by-signal analysis (which signals predict wins), auditing pipeline coverage (weighted, de-junked), running an accurate forecast cadence, defining deal-stage exit criteria, and auditing the GTM stack for consolidation. Pass playbook (slug or name) for one; omit it to list the menu. Operator-authored process, not a generic best-practices list. Use when the user wants to RUN a RevOps motion — 'how do I prove this tool's ROI', 'how do I figure out which signals convert', 'how do I clean up my pipeline forecast'.

ParametersJSON Schema
NameRequiredDescriptionDefault
playbookNoRevOps playbook slug or name (e.g. "tool-roi-measurement", "win-rate-by-signal-analysis", "pipeline-coverage-audit", "forecast-cadence"). Omit to list all available playbooks.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description bears full burden. It describes the tool as returning static content (playbook) without side effects, and clarifies it's operator-authored rather than generic. This sufficiently discloses behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core function and includes useful detail about covered playbooks. It could be trimmed slightly but remains efficient and informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with one optional parameter and no output schema, the description fully explains what is returned (problem, steps, pitfalls, artifact), how to use it, and its source. It leaves no critical questions unanswered.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a detailed description. The tool description adds value by listing the specific playbook topics covered (e.g., TOI measurement, win-rate analysis), enriching understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool returns a step-by-step playbook for specific RevOps motions, listing covered topics and distinguishing it as operator-authored. The purpose is concrete and distinct from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use it ('when the user wants to RUN a RevOps motion') and provides example queries. It also explains how to choose between passing a playbook slug or omitting it. While it doesn't name sibling alternatives, the guidance is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_tool_detailsAInspect

Full StackSwap profile for a single tool: cost (catalog + per-seat with confidence; vendor fact sheet wins when fresh), AI-readiness score, category, common overlaps, swap-registry status, and partner sign-up link. Use when the user wants depth on one tool (more than search_tools' name + cost).

ParametersJSON Schema
NameRequiredDescriptionDefault
nameYesTool name (fuzzy-matched against catalog).
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description implicitly indicates read-only behavior by listing returned data fields. However, it does not explicitly state that the tool is non-destructive or has no side effects. For a retrieval tool, this is adequate but could be more transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is a single, dense sentence that efficiently conveys all key information. It is front-loaded with the tool's output and use case. Slightly less concise due to the list of fields, but no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description sufficiently explains the return value by listing fields. For a single-parameter tool with no complex behavior, the description is complete. It could mention handling of missing tools, but not necessary.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description for the parameter 'name'. The description adds no further semantics beyond the schema, so baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly names the tool's purpose (Full StackSwap profile) and lists specific data fields (cost, AI-readiness score, category, overlaps, etc.). It clearly differentiates from sibling 'search_tools' by stating depth beyond name+cost.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear when-to-use guidance ('use when the user wants depth on one tool') and contrasts with sibling 'search_tools'. Could be slightly improved by adding explicit when-not-to-use, but the existing usage guidance is strong.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_vendor_fact_sheetAInspect

Return the full vendor fact sheet (per GTM Decision Schema v1.0.0) for a tool, when one exists. Includes pricing tiers with gotchas, integration depth scores, AI capabilities + customer-data-for-training disclosure, affiliate program terms, and self-disclosed conflicts (vendor-claim vs user-reported). Provenance is labeled (vendor / stackswap / community) and freshness is computed against a 90-day window. Use when an agent or buyer needs the structured machine-readable view of a tool — strictly more detail than get_tool_details. Returns a not-found message + a pointer to /vendors/submit-fact-sheet when no fact sheet exists.

ParametersJSON Schema
NameRequiredDescriptionDefault
toolYesTool name or slug (e.g. "Apollo.io", "apollo", "Smartlead"). Fuzzy-matched against the catalog.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description fully carries the burden. It specifies what the fact sheet includes (pricing tiers, integration scores, AI disclosures, etc.), provenance labeling, freshness window, and not-found behavior. Comprehensive and clear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured and informative, listing components and behavior. Slightly long but each sentence contributes value. Front-loaded with main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the low complexity (one string param, no output schema), the description adequately explains return format, behavior on not-found, and provenance. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description of the single parameter. Description adds value by mentioning fuzzy-matching against the catalog, which goes beyond the schema. Overall, adds meaningful context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it returns a full vendor fact sheet and distinguishes itself from sibling `get_tool_details` by noting it provides strictly more detail. Verb 'return' with resource 'vendor fact sheet' is specific.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when an agent or buyer needs the structured machine-readable view of a tool — strictly more detail than `get_tool_details`', providing clear context for when to use this tool over a sibling. Lacks explicit when-not but is strong overall.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prioritize_pipelineAInspect

Rank a set of OPEN DEALS the user brings (from their CRM, a CSV, a warehouse query) by expected value (amount x win%) with a velocity penalty for stalled deals, and bucket them into Work now / Soon / Watch with reasons and risk flags (past close date, no next step, stalled in stage). Accepts loosely-typed deal records — common field aliases (amount/value/acv, stage/dealstage, probability/winProb, closeDate/closeInDays, daysInStage, nextStep) are normalized automatically; win% is inferred from the stage name when absent. Optional quota adds a weighted-coverage line; optional config.stageProbabilities overrides the stage->win% map. Operates only on supplied deals — it never returns net-new prospects. Use when the user asks 'which deals should my reps work first', 'prioritize my pipeline', or pastes a deal list.

ParametersJSON Schema
NameRequiredDescriptionDefault
dealsYesOpen deals as loose records. Recognized fields (with aliases): amount/value/acv, stage/dealstage, winProb/probability, closeDate/closeInDays, daysInStage, nextStepDate/nextStepInDays, account/company, owner.
quotaNoOptional quota/target to compute weighted coverage against.
configNoOptional tuning.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool normalizes field aliases, infers win% from stage, and accepts optional quota and config overrides. It does not specify the output format, but given no output schema, this is a minor gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that front-loads the main action. It is slightly dense but every sentence adds value, and there is no wasted text. It could benefit from better structure (e.g., bullet points) but is not overly verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of 3 parameters (one required, with nested objects) and no output schema, the description covers purpose, usage, parameter details, and behavioral constraints. It is complete enough for an agent to correctly decide when to use this tool and how to invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds significant value beyond the schema by explaining field aliases for deals, the purpose of quota (weighted coverage), and how config.stageProbabilities overrides the stage-to-win% map.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool ranks open deals by expected value with a velocity penalty and buckets them into Work now/Soon/Watch, with reasons and risk flags. It explicitly distinguishes from siblings by being the only prioritization tool among many pipeline analysis tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage cues: 'Use when the user asks which deals should my reps work first, prioritize my pipeline, or pastes a deal list.' It also clarifies that the tool only operates on supplied deals and does not return new prospects, giving clear when-to-use and when-not-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rank_renewals_at_riskAInspect

Rank the user's existing ACCOUNTS by renewal risk x ARR (dollars at risk), so the team works the saves that matter. Risk blends customer health, engagement recency (days since last activity), and seat adoption (seatsUsed/seats); exposure = ARR x risk score. Accepts loosely-typed account records — aliases like arr/mrr/contractValue, health/healthScore, lastActivityDays, renewalDate/renewalInDays, seats/seatsUsed are normalized. Optional windowDays (default 120) filters to renewals coming due when renewal dates are supplied. Returns a ranked save list with the risk drivers, ARR at stake, and a suggested play per account. Operates only on the user's own book — never a prospecting list. Use when the user asks 'which renewals are at risk', 'where is my ARR exposed', or pastes a customer list with renewal dates.

ParametersJSON Schema
NameRequiredDescriptionDefault
accountsYesExisting customer accounts as loose records. Recognized fields (with aliases): arr/mrr/contractValue, health/healthScore, lastActivityDays, renewalDate/renewalInDays, seats, seatsUsed, name/company, owner/csm.
windowDaysNoRenewal horizon in days to filter to (default 120). Applies only when renewal dates are supplied.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully covers behavioral traits: risk blending (customer health, recency, seat adoption), exposure formula, acceptance of loose types with alias normalization, and output details (ranked list with risk drivers, ARR, suggested play). It also specifies scope ('user's own book'). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a clear purpose sentence upfront, followed by risk explanation, parameter details, and output summary. It is relatively concise given the complexity, though the sentence about aliases is a list that could be slightly condensed. Overall, every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (risk calculation, multiple fields, aliases, optional window) and no output schema, the description covers all necessary aspects: what it does, how it works, input format, output structure, and usage scope. It also addresses edge cases like windowDays only applying when renewal dates are supplied. Complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage, but the description adds significant value beyond the schema. For 'accounts', it lists recognized fields and aliases (arr/mrr/contractValue, etc.). For 'windowDays', it explains the default (120) and condition ('applies only when renewal dates are supplied'). This clarifies usage beyond structural definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Rank the user's existing ACCOUNTS by renewal risk x ARR'. It uses a specific verb ('rank') and resource ('accounts'), and provides a clear distinction from siblings by noting it operates only on the user's own book, not a prospecting list. This makes the purpose immediately understandable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit trigger phrases ('when the user asks 'which renewals are at risk'', 'where is my ARR exposed') and clarifies that it operates only on existing accounts. However, it does not explicitly state when not to use this tool vs. siblings, though the context is implied. This is nearly thorough but misses explicit exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recommend_partnerAInspect

Given a need (e.g. 'outbound', 'CRM', 'automation'), return StackSwap's recommended affiliate partner(s) with sign-up URL and positioning.

ParametersJSON Schema
NameRequiredDescriptionDefault
categoryYesCategory keyword or need description (e.g. "outbound", "CRM for small team", "Zapier alternative").
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses that the tool returns recommendations with sign-up URL and positioning. No side effects or permissions are mentioned, but for a read-only recommendation tool, this is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose, and contains no extraneous information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description fully addresses the tool's functionality given its simple input/output. No output schema exists, but the description explains what is returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with a clear description of the 'category' parameter. The description adds context by explaining the output format, which goes beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: given a need, return recommended affiliate partner(s) with sign-up URL and positioning. It distinguishes itself from siblings like 'recommend_stack' by specifying 'affiliate partner' rather than a stack of tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a user has a need for an affiliate partner recommendation (e.g., 'outbound', 'CRM', 'automation'). It does not explicitly mention when not to use or provide alternatives, but the context is clear given the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

recommend_stackAInspect

StackSwap's reference starter stack for a given industry vertical. Returns a curated tool list with per-tool cost, total monthly/annual spend, AI-readiness and headless-readiness scores, and partner sign-up links. Use for greenfield 'what stack should I buy?' queries — distinct from scan_stack (audits an existing stack) and recommend_partner (single category).

ParametersJSON Schema
NameRequiredDescriptionDefault
budgetNoOptional monthly budget cap in USD. If exceeded, the response flags the overage but does not auto-swap tools.
industryYesIndustry vertical. Recognised: 'SaaS / Tech', 'Marketing Agency', 'Finance / Fintech', 'Consulting'. Common slugs (b2b_saas, fintech, agency) and aliases also accepted; unknown values map to closest match.
teamSizeNoTeam-size band for cost modeling. Defaults to 11-25.11-25
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It describes outputs but does not explicitly state side effects or permissions. For a recommendation tool, the absence of destructive behavior is implied but not confirmed. Adequate but not exhaustive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two well-structured sentences: first describes output, second gives usage context and sibling differentiation. No fluff, every phrase earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, output, usage, and parameter hints. Lacks output schema, but description sufficiently describes return type. No mention of rate limits or auth, but acceptable for a low-risk recommendation tool. Mostly complete given the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds value by explaining budget overage behavior (flags, no auto-swap) and industry parameter flexibility (common slugs, aliases, fallback). TeamSize parameter is adequately described in schema. Adds meaningful nuance beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool returns a curated tool list with cost and readiness scores for a given industry. It explicitly distinguishes from sibling tools scan_stack and recommend_partner, providing a specific verb-resource pairing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states use case: 'greenfield what stack should I buy? queries' and contrasts with scan_stack (audits existing stack) and recommend_partner (single category). Provides clear when-to-use and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scan_stackAInspect

Run a preview StackScan: pass a list of tools + team size + industry, get back current spend, optimized spend, monthly/annual recoverable, headless gaps (tools with no MCP/API connection an owned head can call), and the top 5 replace/remove opportunities. Includes a link to the full paid audit on stackswap.ai.

ParametersJSON Schema
NameRequiredDescriptionDefault
toolsYesTool names in the user's current stack.
industryNoIndustry slug or label. Recognised slugs: b2b_saas, revenue_sales_tech, marketing_tech, revops_operations, real_estate_tech, fintech_financial, dev_tools_plg, hr_recruiting_tech, agency_consultancy, healthtech, edtech, other. Unknown values default to 'other'.
teamSizeNoTeam-size band. Defaults to 11-25 when omitted.11-25
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses what the tool returns (spend, recoverable, gaps, opportunities, audit link) and implies a read-only preview. It does not mention side effects or permissions, but the 'preview' label and explicit outputs provide sufficient transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences long, front-loaded with the action, and each sentence adds value: action, outputs, and link. No unnecessary words or repetition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists all expected return components (spend, optimized, recoverable, gaps, opportunities, link). It is functional but does not detail the format (e.g., numbers vs strings) or error handling. Adequate for a preview tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the description adds little beyond what's already in the schema. It mentions 'tools + team size + industry' but does not provide examples or clarify format. Baseline of 3 is appropriate as description complements but does not significantly enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states a specific verb ('Run a preview StackScan') and resource ('a list of tools + team size + industry'), clearly distinguishing it from sibling tools like compare_tools or detect_stack_from_text. It explicitly lists the outputs: current spend, optimized spend, recoverable, gaps, and opportunities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for initial stack assessment but does not explicitly state when to use this tool vs alternatives (e.g., for a full audit vs. compare_tools). No when-not-to-use or alternative suggestions are provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_account_fitAInspect

Score and rank the user's OWN accounts by StackSignal-style fit: a 0-100 composite blending ICP Match (firmographic fit to a supplied ICP), Intent (engagement/pipeline signals on the account), and an optional Stack Fit layer. Pass config.icp (segments / industries / minArr) to drive ICP Match — without it, scoring falls back to Intent only and says so. Stack Fit stays dormant unless config.icp.idealStack is supplied. Accepts loosely-typed account records (aliases for segment, industry, arr, lastActivityDays, openPipeline, techStack are normalized). Returns the book ranked highest-fit-first with each layer's sub-score. This scores the accounts the user already owns (the StackSignal product) — it does NOT return net-new accounts to buy. Use when the user asks 'which of my accounts best fit our ICP', 'rank my book by fit', or pastes accounts plus an ICP definition.

ParametersJSON Schema
NameRequiredDescriptionDefault
configNoScoring configuration.
accountsYesAccounts to score as loose records. Recognized fields (with aliases): segment/tier, industry/vertical, arr/mrr, lastActivityDays, openPipeline, techStack/tools, name/company.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Given no annotations, the description carries full burden. It reveals that the tool only scores owned accounts, normalizes aliases, falls back to Intent-only if ICP missing, and returns a ranked list with sub-scores. Missing some edge cases but overall transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured, front-loading the purpose and composition. Each sentence adds necessary context without redundancy. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with multiple layers, nested config, loose input, and no output schema, the description comprehensively covers behavior, input shape, scoring logic, and output format. No gaps for the agent to infer.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining how `config.icp` drives scoring, that `idealStack` activates Stack Fit, and that accounts are loosely-typed with alias recognition. This goes beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool scores and ranks the user's OWN accounts using a composite fit score (0-100) blending ICP Match, Intent, and optional Stack Fit. It explicitly distinguishes from net-new account tools, saying 'does NOT return net-new accounts to buy'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage scenarios: 'Use when the user asks 'which of my accounts best fit our ICP', 'rank my book by fit', or pastes accounts plus an ICP definition.' It also explains fallback behavior (Intent only if no ICP) and when Stack Fit activates (if idealStack supplied).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

score_expansion_opportunitiesAInspect

Rank the user's existing ACCOUNTS by expansion (upsell + cross-sell) opportunity, returning the specific lever and a modeled dollar value per account. Propensity blends customer health, seat utilization (high utilization = needs more seats), and product whitespace (missing catalog products). Pass config.catalogProducts (your sellable product set) to compute cross-sell gaps, config.pricePerSeat to value seat expansion, and tuning knobs (crossSellUpliftPerProduct, maxedUtilization). Accepts loosely-typed account records (aliases for products, seats/seatsUsed, health, arr normalized). Returns accounts ranked by propensity x value with the exact expansion lever (e.g. '92% seat utilization — room for ~5 seats', 'missing ProductB — cross-sell ~$X'). Operates only on the user's own book — never a prospecting list. Use when the user asks 'where is my expansion revenue', 'which customers should we upsell', or pastes customers plus a product catalog.

ParametersJSON Schema
NameRequiredDescriptionDefault
configNoExpansion modeling configuration.
accountsYesExisting customer accounts as loose records. Recognized fields (with aliases): products/skus, seats, seatsUsed, health/healthScore, arr/mrr, name/company.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It explains the blending of propensity components, the return format with specific levers, and the loosley-typed input handling. Minor gap: no mention of data freshness or accuracy limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Multiple sentences but front-loaded with core purpose. Every sentence adds value, though could be slightly more concise. Overall well-structured and easy to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers parameters, usage, behavior, and return format. No output schema, but description includes example lever strings. Lacks explanation of error handling or defaults, but given complexity it is fairly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and description adds significant meaning: explains how each config parameter contributes (e.g., 'pass config.catalogProducts to compute cross-sell gaps') and documents field aliases for accounts.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool ranks accounts by expansion opportunity with a specific verb 'Rank' and resource 'accounts'. It distinguishes it from siblings like 'find_whitespace' by focusing on existing accounts and expansion levers.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly lists when to use: 'when the user asks where is my expansion revenue, which customers should we upsell'. Also clarifies scope: 'Operates only on the user's own book — never a prospecting list'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_contentAInspect

Full-text search across StackSwap's first-party GTM knowledge base — ~50 operator-narrative articles on stack architecture, AI-native swaps, RevOps, data ethics, and decision frameworks. Returns ranked articles with title, slug, category, summary, and URL. Use when the user asks a GTM strategy/architecture/methodology question that's been written about (e.g. 'how should I think about CRM migration', 'what's wrong with intent data', 'how to audit my stack'). Cite the URL in your reply. Pass slug to get_kb_article for the full body.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax number of results.
queryYesFree-text search query. Multi-word queries are scored on per-term hits.
categoryNoOptional category filter (slug or label). Known slugs: gtm-infrastructure, stack-design, ai-automation, data-ethics.
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must cover behavioral traits. It states it performs a read operation (full-text search returning ranked articles) and notes the knowledge base size (~50). However, it lacks details on pagination, ordering, or any limitations, which would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (4 sentences), front-loads the core purpose, and efficiently includes usage guidance, example queries, and a follow-up instruction. No unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 3 parameters, no output schema, and no annotations, the description sufficiently covers what the tool does, what it returns, when to use it, and how to proceed for full article access. It leaves no major gaps for agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description provides context about the knowledge base content and usage examples but does not add significant parameter information beyond what the schema already provides (e.g., query is free-text, limit defaults to 10).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs full-text search over a specific knowledge base (~50 articles) and lists return fields (title, slug, etc.). It distinguishes itself from sibling tools like get_kb_article by noting that slug should be passed there for full body.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use when the user asks a GTM strategy/architecture/methodology question that's been written about' and provides concrete examples. It also directs to pass slug to get_kb_article for full body, but does not mention alternatives like search_tools for broader searches.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_toolsAInspect

Search StackSwap's catalog of ~400 GTM tools by name. Returns each match with its catalogued monthly cost and, when applicable, a StackSwap partner sign-up link.

ParametersJSON Schema
NameRequiredDescriptionDefault
limitNoMax number of results to return.
queryNoSubstring to match against tool names (case-insensitive). Omit to list top tools.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries full burden. It discloses search behavior (substring, case-insensitive), return fields (cost, partner link), and the 'list top tools' fallback. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no wasted words. Purpose is front-loaded, and essential details are concisely provided.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains return format (monthly cost, partner link). Complexity is low with two parameters, and coverage is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description adds meaning beyond schema: explains substring matching, case-insensitivity, and the behavior when query is omitted. This enhances usability.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description specifies a clear action: search a catalog of ~400 GTM tools by name, returning monthly cost and partner sign-up link. It distinguishes itself from sibling tools like search_content by focusing on tool names and costs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use (search by name) and implicit behavior (omit query to list top tools), but does not explicitly mention when not to use or alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

segment_revenueAInspect

Break the user's ACCOUNTS into segments and show where revenue concentrates and where it grows vs leaks. Groups by groupBy (default 'segment'; any field works — industry, owner, tier) and reports per-group account count, total ARR, share of ARR, and average ARR. If accounts carry cohort-retention inputs (startingArr + expansionArr/contractionArr/churnedArr), it also computes per-segment NRR and GRR. Accepts loosely-typed account records (arr/mrr, segment/tier, industry normalized). Operates only on the user's own book. Use when the user asks 'which segments drive revenue', 'what's my NRR by segment', or pastes accounts with a segment field.

ParametersJSON Schema
NameRequiredDescriptionDefault
groupByNoField to group by (default "segment"). Any account field works, e.g. "industry", "owner", "tier".
accountsYesAccounts as loose records. Recognized fields (with aliases): arr/mrr, segment/tier, industry/vertical, owner; optional retention inputs startingArr, expansionArr, contractionArr, churnedArr.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations present, so description bears full burden. It describes inputs (loose records with field aliases), grouping behavior, computed metrics, and the limitation to user's own book. No side effects or destructive actions are implied, and the description is comprehensive for a read-only analysis tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is informative but slightly verbose. It front-loads the core action and then details parameters and use cases. Every sentence adds value, but it could be tightened by removing redundancy (e.g., 'loosely-typed' and 'normalized' are similar).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description explains return values (per-group count, ARR, share, average ARR, plus NRR/GRR with retention inputs). It covers input requirements and constraints (only own book). Complete for the tool's complexity and context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and includes descriptions for both parameters. The description adds value by specifying defaults (groupBy default 'segment'), field aliases, and optional retention inputs. It goes beyond schema repetition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool segments accounts and shows revenue concentration, growth, and leaks, with specific metrics (account count, ARR, etc.). It distinguishes from sibling tools by mentioning NRR/GRR segmentation, which overlaps with compute_nrr but specifies the context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Operates only on the user's own book' and provides example queries like 'which segments drive revenue'. It implies when not to use (e.g., not for other users' data) but does not explicitly list exclusions. Could be stronger with a 'when not to use' statement.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

submit_correctionAInspect

Submit a correction to the StackSwap catalog (pricing, feature list, gotcha, AI-readiness score, category, or other). Submissions queue for admin review and only propagate to user-facing surfaces after merge — they DO NOT immediately mutate the catalog. Use when the user notices a stale price, an inaccurate feature list, a gotcha that should be flagged, or wants to report a tool we don't cover. Two-way data flow that helps keep the catalog accurate. Returns a correction ID + reassurance that the submission is queued.

ParametersJSON Schema
NameRequiredDescriptionDefault
toolYesTool name (fuzzy-matched against catalog; new tools accepted too).
fieldYesWhich aspect of the catalog entry the correction targets.
source_urlNoOptional: a public URL that backs the correction (vendor pricing page, doc, etc.). Strong signal for fast approval.
current_valueNoOptional: what the catalog currently shows (for reviewer context).
proposed_valueYesThe corrected value as the user would have it shown.
reporter_contextNoOptional: free-text context (e.g. "Smartlead just raised the entry tier from $39 to $49 on their pricing page").
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Since no annotations are provided, the description carries full burden. It clearly states that submissions do not immediately mutate the catalog, queue for admin review, and propagate only after merge. Also mentions two-way data flow and returns a correction ID.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, front-loaded paragraph with three sentences that cover purpose, behavior, and return value without unnecessary fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return value (correction ID + reassurance) and the workflow. All parameters are described, and the overall process is clear.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds value by noting 'fuzzy-matched against catalog' for tool parameter and 'strong signal for fast approval' for source_url, enhancing the schema's meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it submits a correction to the StackSwap catalog, listing specific fields like pricing, feature list, etc. It distinguishes from sibling tools (which are mostly reads or comparisons) by being a write operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use: when the user notices stale price, inaccurate feature list, etc. Mentions the submission process is queued for review, but doesn't explicitly describe when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

suggest_swapsAInspect

For each tool supplied, return StackSwap's AI-native replacement recommendation (when one exists) with annual savings and reasoning. Skews toward legacy → modern swaps (Outreach → Smartlead, ZoomInfo → Apollo, etc.).

ParametersJSON Schema
NameRequiredDescriptionDefault
toolsYesTool names to evaluate for AI-native replacements.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description takes full responsibility for behavioral disclosure. It accurately conveys the tool's behavior: it returns recommendations with cost savings and reasoning, and it may return nothing if no replacement exists ('when one exists'). This is sufficient for a straightforward recommendation tool, though more detail on response structure (e.g., is it a single best match per tool?) would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences long, with the first sentence stating the core action and outputs, and the second providing an example bias. No extraneous information is included; every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one parameter, no output schema, no annotations), the description covers the essential function: what it does, what it returns, and its bias. It does not detail edge cases (e.g., multiple replacements per tool) or output format, but for typical use, it is adequate. Compared to the calibration example of a filtered-list tool, it is similarly complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema covers the single parameter 'tools' with 100% description coverage. The tool description adds meaning by explaining that each tool in the array is evaluated individually, aligning with the return of per-tool recommendations. This adds value beyond the schema, which only states 'Tool names to evaluate for AI-native replacements.' The connection between input and output is clarified.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states that the tool returns AI-native replacement recommendations with annual savings and reasoning for each tool supplied. It clarifies the skew toward legacy-to-modern swaps with concrete examples (Outreach → Smartlead, ZoomInfo → Apollo), making the purpose specific and distinct from sibling tools like compare_tools or get_tool_details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for evaluating tools for replacements but does not explicitly contrast with alternatives or state when not to use. Sibling tools like compare_tools or find_overlaps might serve different needs, but no guidance is provided on selecting between them. The 'skews toward legacy → modern swaps' phrase offers some context, but lacks explicit when-to-use/when-not-to-use instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.

Resources